Diffstat (limited to 'gitlab/issues_text/target_arm/host_missing/accel_TCG')
82 files changed, 2940 insertions, 0 deletions
diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/1034 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1034 new file mode 100644 index 000000000..e2f47c076 --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1034 @@ -0,0 +1,17 @@ +Erlang/OTP 25 JIT on AArch64 fails in user mode emulation +Description of problem: +Compiling Erlang/OTP 25 fails with segfault when using user mode (but works in system mode), the Erlang devs have tracked it down in [ErlangForums](https://erlangforums.com/t/otp-25-0-rc3-release-candidate-3-is-released/1317/24) and give this explanation: + +> Thanks, I’ve found a bug in QEMU that explains this. The gist of it is: +> +> Instead of interpreting guest code, QEMU dynamically translates it to the host architecture. When the guest overwrites code for one reason or another, the translation is invalidated and redone if needed. +> +> Our JIT:ed code is mapped in two regions to work in the face of W^X restrictions: one executable but not writable, and one writable but not executable. Both of these regions point to the same physical memory and writes to the writable region are “magically” reflected in the executable one. +> +> I would’ve expected QEMU to honor the IC IVAU / ISB instructions we use to tell the processor that we’ve altered code at a particular address, but for some reason QEMU just ignores them 3 and relies entirely on trapping writes to previously translated code. +> +> In system mode QEMU emulates the MMU and sees that these two regions point at the same memory, and has no problem invalidating the executable region after writing to the writable region. +> +> In user mode it instead calls mprotect(..., PROT_READ) on all code regions it has translated, and invalidates translations in the signal handler. The problem is that we never write to the executable region – just the writable one – so the code doesn’t get invalidated. 
+ +There doesn't seem to a open or closed QEMU bug that actually describes this problem. diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/1054 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1054 new file mode 100644 index 000000000..06f16ad58 --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1054 @@ -0,0 +1,30 @@ +Unable to start CirrOS 0.5.1 on QEMU 7.0 with -M virt and -cpu max +Description of problem: + +Steps to reproduce: +1. Fetch CirrOS image: ```wget https://github.com/cirros-dev/cirros/releases/download/0.5.1/cirros-0.5.1-aarch64-disk.img``` +2. Run QEMU: + ``` + qemu-system-aarch64 -drive file=cirros-0.5.1-aarch64-disk.img -M virt -m 2048 \ + -bios /usr/share/qemu-efi-aarch64/QEMU_EFI.fd -cpu max -nographic + ``` +Additional information: +When image boots, GRUB window appears for a second and then kernel/initramfs are loaded and booted: +``` +EFI stub: Booting Linux Kernel... +EFI stub: EFI_RNG_PROTOCOL unavailable, no randomness supplied +EFI stub: Using DTB from configuration table +EFI stub: Exiting boot services and installing virtual address map... +``` + +When everything is fine we can see kernel output: +``` +[ 0.000000] Booting Linux on physical CPU 0x0000000000 [0x411fd070] +[ 0.000000] Linux version 5.3.0-26-generic (buildd@bos02-arm64-028) (gcc version 7.4.0 (Ubuntu/Linaro 7.4.0-1ubuntu1~18.04.1)) #28~18.04.1-Ubuntu SMP Wed Dec 18 16:41:01 UTC 2019 (Ubuntu 5.3.0-26.28~18.04.1-generic 5.3.13) +[ 0.000000] efi: Getting EFI parameters from FDT: +[ 0.000000] efi: EFI v2.70 by EDK II +``` + +But on QEMU 7.0 with ```-M virt -cpu max``` we never get kernel output. 
+ +# diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/1057 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1057 new file mode 100644 index 000000000..82c426f9d --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1057 @@ -0,0 +1,23 @@ +AArch64: ISV is set to 1 in ESR_EL2 when taking a data abort with post-indexed instructions +Description of problem: +I think that I have a Qemu bug in my hands, but, I could still be missing something. Consider the following instruction: +`0x0000000000000000: C3 44 00 B8 str w3, [x6], #4` + +notice the last #4, I think this is what we would call a post-indexed instruction (falls into the category of instructions with writeback). As I understand it, those instructions should not have ISV=1 in ESR_EL2 when faulting. + +Here is the relevant part of the manual: + +``` +For other faults reported in ESR_EL2, ISV is 0 except for the following stage 2 aborts: +• AArch64 loads and stores of a single general-purpose register (including the register specified with 0b11111, including those with Acquire/Release semantics, but excluding Load Exclusive or Store Exclusive and excluding those with writeback). +``` + +However, I can see that Qemu sets ISV to 1 here. The ARM hardware that I tested gave me a value of ISV=0 for similar instructions. + +Another example of instruction: `0x00000000000002f8: 01 1C 40 38 ldrb w1, [x0, #1]!` +Steps to reproduce: +1. Run some hypervisor in EL2 +2. Create a guest running at EL1 that executes one of the mentioned instructions (and make the instruction fault by writing to some unmapped page in SLP) +3. Observe the value of ESR_EL2 on data abort + +Unfortunately, I cannot provide an image to reproduce this (the software is not open-source). But, I would be happy to help test a patch. 
diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/1062 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1062 new file mode 100644 index 000000000..b4a09b094 --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1062 @@ -0,0 +1,16 @@ +AArch64: SCR_EL3.RW behaves incorrectly for CPUs with no AArch32 +Description of problem: +In the ARM DDI 0487G.a, D13-3572, the SCR_EL3.RW bit is defined as RAO/WI if both EL2 and EL1 don't support Aarch32. However, the function `scr_write` in `target/arm/helper.c` does not reflect this behavior, even though it checks for Aarch32 EL1 support. + +This would break this EL3 code, which should run on cpu reset to attempt to return to EL1: +```asm +mov x1, #((1<<0)|(1<<2)|(1<<6)|(1<<7)|(1<<8)|(1<<9)) ; EL1h, DAIF masked +mov SPSR_EL3, x1 +adr x1, 1f +msr ELR_EL3, x1 +eret +1: +; something something +``` +Additional information: + diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/1130 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1130 new file mode 100644 index 000000000..ac09550f9 --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1130 @@ -0,0 +1,29 @@ +error on run qemu-system-aarch64 -icount shift=1,align=off,sleep=on -smp 2 +Description of problem: +This issue happen with the most recent version. +* Compile parameters: +``` +./configure --target-list=aarch64-softmmu --prefix=pwd/release --disable-werror --enable-lto --enable-capstone --enable-system --enable-fdt --disable-xen --disable-kvm --enable-plugins +``` +* run: +``` +qemu-system-aarch64 -nographic -machine virt -cpu cortex-a57 -icount shift=1,align=off,sleep=on -smp 2 -vnc :2 -m 4080 -kernel /home/yuzy/mywork/linux/linux-5.15.30/arch/arm64/boot/Image.gz -initrd /home/yuzy/mywork/build/rootfs.cpio.gz +``` +* error occurred: +``` +** +ERROR:../accel/tcg/tcg-accel-ops.c:79:tcg_handle_interrupt: assertion failed: (qemu_mutex_iothread_locked()) +Aborted (core dumped) +``` +Steps to reproduce: +1. 
run qemu-system-aarch64 -machine virt -cpu cortex-a57 -icount shift=1,align=off,sleep=on -smp 2 -m 4080 -kernel Image.gz -initrd rootfs.cpio.gz +2. it will assertion failed: (qemu_mutex_iothread_locked()) +Additional information: +The following two situations are good: +``` +qemu-system-aarch64 -machine virt -cpu cortex-a57 -icount shift=1,align=off,sleep=on -smp 1 -m 4080 -kernel Image.gz -initrd rootfs.cpio.gz +``` +``` +qemu-system-aarch64 -machine virt -cpu cortex-a57 -smp 2 -m 4080 -kernel Image.gz -initrd rootfs.cpio.gz +``` +I assume the issues are: gic diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/1153 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1153 new file mode 100644 index 000000000..745a09c01 --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1153 @@ -0,0 +1 @@ +arm: wrong syndrome reported for FP and SIMD traps to AArch32 Hyp diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/1154 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1154 new file mode 100644 index 000000000..60eae2809 --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1154 @@ -0,0 +1 @@ +arm: M-profile loads and stores done via helpers should enforce alignment restrictions diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/1177 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1177 new file mode 100644 index 000000000..f17e8f086 --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1177 @@ -0,0 +1,16 @@ +booting linux hangs with -cpu max or -cpu max,lpa2=off, but works with -cpu cortex-a57 +Description of problem: + +Steps to reproduce: +1. Snag mini.iso from http://ports.ubuntu.com/ubuntu-ports/dists/bionic-updates/main/installer-arm64/current/images/netboot/mini.iso +2. qemu-img create ubuntu-image.img 20G +3. dd if=/dev/zero of=flash1.img bs=1M count=64 +4. dd if=/dev/zero of=flash0.img bs=1M count=64 +5. 
dd if=/home/imp/git/qemu/00-build/pc-bios/edk2-aarch64-code.fd of=flash0.img conv=notrunc +6. Run the above command +7. Select install, watch the kernel hang. +8. Change -cpu max to -cpu cortex-a57 and it will work. -cpu max,lpa2=off also exhibits the problem +Additional information: +Just grabbed git and built it with ./configure in /home/imp/git/qemu/00-build. + +pm215 on irc suggested that it was an old EDK2 and a newer one is needed to cope with the newer CPU features in -cpu max diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/1204 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1204 new file mode 100644 index 000000000..b62fb60c5 --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1204 @@ -0,0 +1,29 @@ +AArch64 unaligned accesses are allowed by QEMU when SCTLR_EL3.A is 0, but SCTLR_EL3.M is also 0 +Description of problem: +As per the ARM ARM, when address translation is disabled and the access is not done from EL1/0 with HCR_EL2.DC set to 1, data accesses receive the 'Device-nGnRnE' memory attribute (D.8.2.10 The effects of disabling an address translation stage - DDi0487I.a, Page D8-5119). +Memory regions marked as Device do not support unaligned access. +Steps to reproduce: +Run the following snippet under EL3, and notice the last load instruction completes successfully (doesn't raise an alignment fault) +``` +.balign 8 +.global first_variable +first_variable: + .word 0x1 +.balign 4 +.global second_variable +second_variable: + .word 0x2 + +no_mmu_sctlr: .dword 0x0000000030C51834 + +.globl reproducer +reproducer: + ldr x1, no_mmu_sctlr // A=0,M=0 + msr sctlr_el3, x1 + dsb sy + isb + + ldr x0, =first_variable + ldr x1, [x0, #0] // Aligned - Success + ldr x1, [x0, #4] // Unaligned - Success??? 
(Should be failure) +``` diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/1208 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1208 new file mode 100644 index 000000000..a7426046e --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1208 @@ -0,0 +1,9 @@ +Raspberry Pi 4 Model B +Additional information: +There have been some attempts at implementing this a few years ago: see: +* https://gitlab.com/philmd/qemu/-/tree/raspi4_wip +* https://github.com/0xMirasio/qemu-patch-raspberry4 + + + +Thanks! diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/1293 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1293 new file mode 100644 index 000000000..7abe62b81 --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1293 @@ -0,0 +1 @@ +Trusted Firmware stopped booting on SBSA-ref diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/1347 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1347 new file mode 100644 index 000000000..046a65385 --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1347 @@ -0,0 +1,23 @@ +qemu-system-arm segfaults: arm_v7m_tcg_ops.restore_state_to_opc is NULL +Description of problem: +gdb backtrace: +``` +#0 0x0000000000000000 in ?? 
() +#1 0x0000555555eda714 in cpu_restore_state_from_tb (cpu=0x5555570020e0, tb=0x7fffb8f6ce80, host_pc=140735277023274) at ../accel/tcg/translate-all.c:311 +#2 0x0000555555eda785 in cpu_restore_state (cpu=0x5555570020e0, host_pc=140735277023274) at ../accel/tcg/translate-all.c:335 +#3 0x0000555555d01323 in arm_cpu_do_transaction_failed (cs=0x5555570020e0, physaddr=1073885184, addr=1073885184, size=4, access_type=MMU_DATA_LOAD, mmu_idx=1, attrs=..., response=1, retaddr=140735277023274) at ../target/arm/tlb_helper.c:199 +#4 0x0000555555ee4118 in cpu_transaction_failed (cpu=0x5555570020e0, physaddr=1073885184, addr=1073885184, size=4, access_type=MMU_DATA_LOAD, mmu_idx=1, attrs=..., response=1, retaddr=140735277023274) at ../accel/tcg/cputlb.c:1344 +#5 0x0000555555ee42aa in io_readx (env=0x555557003f50, full=0x5555580f26c0, mmu_idx=1, addr=1073885184, retaddr=140735277023274, access_type=MMU_DATA_LOAD, op=MO_32) at ../accel/tcg/cputlb.c:1380 +#6 0x0000555555ee59f2 in load_helper (env=0x555557003f50, addr=1073885184, oi=33, retaddr=140735277023274, op=MO_32, code_read=false, full_load=0x555555ee5dbf <full_le_ldul_mmu>) at ../accel/tcg/cputlb.c:1970 +#7 0x0000555555ee5e12 in full_le_ldul_mmu (env=0x555557003f50, addr=1073885184, oi=33, retaddr=140735277023274) at ../accel/tcg/cputlb.c:2070 +#8 0x0000555555ee5e44 in helper_le_ldul_mmu (env=0x555557003f50, addr=1073885184, oi=33, retaddr=140735277023274) at ../accel/tcg/cputlb.c:2077 +#9 0x00007fff7c31c0be in code_gen_buffer () +#10 0x0000555555ed15b8 in cpu_tb_exec (cpu=0x5555570020e0, itb=0x7fffb8f6ce80, tb_exit=0x7fff7a3fc068) at ../accel/tcg/cpu-exec.c:438 +#11 0x0000555555ed2185 in cpu_loop_exec_tb (cpu=0x5555570020e0, tb=0x7fffb8f6ce80, pc=2824872, last_tb=0x7fff7a3fc080, tb_exit=0x7fff7a3fc068) at ../accel/tcg/cpu-exec.c:868 +#12 0x0000555555ed2545 in cpu_exec (cpu=0x5555570020e0) at ../accel/tcg/cpu-exec.c:1032 +#13 0x0000555555ef3329 in tcg_cpus_exec (cpu=0x5555570020e0) at ../accel/tcg/tcg-accel-ops.c:69 +#14 
0x0000555555ef39ca in mttcg_cpu_thread_fn (arg=0x5555570020e0) at ../accel/tcg/tcg-accel-ops-mttcg.c:95 +#15 0x00005555560b1e87 in qemu_thread_start (args=0x5555571358e0) at ../util/qemu-thread-posix.c:505 +#16 0x00007ffff7fb6cbe in start (p=0x7fff7a3fc1e0) at src/thread/pthread_create.c:195 +#17 0x00007ffff7fc3e7b in __clone () at src/thread/x86_64/clone.s:22 +``` diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/1400 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1400 new file mode 100644 index 000000000..95e70b927 --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1400 @@ -0,0 +1 @@ +helper_access_check_cp_reg() raising Undefined Instruction on big-endian host diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/1412 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1412 new file mode 100644 index 000000000..4a807c443 --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1412 @@ -0,0 +1,5 @@ +QEMU segfault (null pointer dereference) in sve_probe_page from ldff1* instructions +Description of problem: +After upgrading to QEMU v7.2.0 from v7.1.0, when executing any SVE ldff1* instructions with a faulting address, QEMU crashes due to a null pointer dereference at target/arm/sve_helper.c:5364 + +I believe this was introduced in b8967ddf393aaf35fdbc07b4cb538a40f8b6fe37 (@rth7680), since in that commit `full` is dereferenced before the `flags & TLB_INVALID_MASK` check at line 5369, and full is set to null by `probe_access_full` when `TLB_INVALID_MASK` is given. 
diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/1416 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1416
new file mode 100644
index 000000000..adb4d9180
--- /dev/null
+++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1416
@@ -0,0 +1,5 @@
+MTE tags are applied at page granularity (4K) instead of tag granularity (16)
+Description of problem:
+After upgrading to QEMU v7.2.0 from v7.1.0, when executing stg/ldg instructions on any address, QEMU behaves as if the instruction was executed on the page base of said address.
+
+I believe this was introduced in b8967ddf393aaf35fdbc07b4cb538a40f8b6fe37 (@rth7680), since in that commit `ptr_paddr` is changed to be calculated based on `CPUTLBEntryFull::phys_addr`, which contains the page base address, while beforehand it was calculated based on `host` which does have the page offset applied.
diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/1417 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1417
new file mode 100644
index 000000000..eabead075
--- /dev/null
+++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1417
@@ -0,0 +1,5 @@
+QEMU fails an assertion when hitting a breakpoint that is set on a tlb-missed 2-stage translated AArch64 memory
+Description of problem:
+After upgrading to QEMU v7.2.0 from v7.1.0, when hitting an instruction breakpoint on a memory address that is translated by 2 stages of translation, and is not already cached in the TLB, QEMU fails the assertion at target/arm/ptw.c:301 (`assert(fi->type != ARMFault_None);`).
+
+I believe this was introduced in f3639a64f602ea5c1436eb9c9b89f42028e3a4a8 (@rth7680), since in that commit the failure check for the return value of `get_phys_addr_lpae()` changed from checking for true (meaning failure) to checking for false (which actually means success).
diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/1498 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1498
new file mode 100644
index 000000000..90beb501c
--- /dev/null
+++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1498
@@ -0,0 +1,5 @@
+LDC, STC unimplemented in qemu-system-arm
+Description of problem:
+I used differential testing to compared the instruction consistency (ARMv7) between QEMU and raspberry pi 2B in system level and some inconsistency in LDC, SDC instruction was detected.
+
+The instructions run successfully in raspi2b, but cause undefined in QEMU. I found that LDC and SDC instructions remain unimplemented in QEMU.
diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/1499 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1499
new file mode 100644
index 000000000..c26e47ee7
--- /dev/null
+++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1499
@@ -0,0 +1,90 @@
+qemu-system-arm doesn't honour CPACR.ASEDIS, D32DIS
+Description of problem:
+We used differential testing to compared the instruction consistency (ARMv7) between QEMU and raspberry pi 2B in system level and some inconsistency in SIMD instruction was detected.
+
+We compiled the kernel with options `-mcpu=cortex-a7 -march=armv7ve -mfloat-abi=hard -mfpu=vfpv4 `. Some SIMD instructions are considered as **undefined** instructions in raspi2b, but run successfully in the QEMU.
+
+We checked that the CPACR.ASEDIS=1, which disables Advanced SIMD functionality, according to ARMv7-a manual B4.1.40. The manual says "All instruction encodings identified in the Alphabetical list of instructions on page A8-300 as being Advanced SIMD instructions, but that are not VFPv3 or VFPv4
+instructions, are UNDEFINED when accessed from PL1 and PL0 modes."
+ +Tested instruction samples are shown as follows: + +- VMAX_int_T1A1_A 11110010010010110000011010100100 0xf24b06a4 +- VMUL_scalar_A1_A 11110010101001001100100 001000011 0xf2a4c843 +- VADD_int_T1A1_A 11110010000111111010100000001100 0xf21fa80c + +... + +Some checks of the SIMD instructions may be needed before the execution of the instructions in function ` do_3same` etc. in target/arm/translate-neon.c. +Steps to reproduce: +1. Compile a kernel module to run the test instruction in PL1. +2. Hook a undefined handler in kernel module to catch the undefined instructions. A kernel module template we used to test is as follows + +```c +#include <linux/module.h> +#include <linux/kernel.h> +#include <asm/traps.h> + +MODULE_LICENSE("GPL"); +#pragma GCC optimize ("O0") +// instr is undefined instruction value +static int undef_instr_handler(struct pt_regs *regs, u32 instr) +{ + printk(KERN_INFO "get undefined instruction\n"); + // Just skip over to the next instruction. + regs->ARM_pc += 4; + return 0; // All fine! +} + +static struct undef_hook uh = { + .instr_mask = 0x0, // any instruction + .instr_val = 0x0, // any instruction + .cpsr_mask = 0x0, // any pstate + .cpsr_val = 0x0, // any pstate + .fn = undef_instr_handler +}; +int init_module(void) { + // Lookup wanted symbols. 
+ register_undef_hook(&uh); + __asm__ __volatile__("push {R0-R12}"); + __asm__ __volatile__( + ".global inialize_location\n" + "inialize_location:\n" + "mov r0, %[reg_init] \n" + "mov r1, %[reg_init] \n" + "mov r2, %[reg_init] \n" + "mov r3, %[reg_init] \n" + "mov r4, %[reg_init] \n" + "mov r5, %[reg_init] \n" + "mov r6, %[reg_init] \n" + "mov r7, %[reg_init] \n" + "mov r8, %[reg_init] \n" + "mov r9, %[reg_init] \n" + "mov r10, %[reg_init] \n" + "mov r11, %[reg_init] \n" + "mov r12, %[reg_init] \n" + : + : [reg_init] "n"(0) + ); + // =======TODO======= + // replace nop with test instruction + __asm__ __volatile__( + ".global inst_location\n" + "inst_location:\n" + "nop\n" + ); + // kgdb_breakpoint(); + __asm__ __volatile__( + ".global finish_location\n" + "finish_location:\n" + ); + __asm__ __volatile__("pop {R0-R12}"); + return 0; +} + +void cleanup_module(void) { + unregister_undef_hook(&uh); +} +``` +Additional information: + diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/1500 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1500 new file mode 100644 index 000000000..8d52d0c68 --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1500 @@ -0,0 +1,38 @@ +Some system/debug regisiters are inconsistent with real device in qemu-system-arm +Description of problem: +We used differential testing to compared the instruction consistency (ARMv7) between QEMU and raspberry pi 2B in system level and some inconsistency in system regisiter was detected. + +1. CCSIDR--Cache Size ID Registers + + **Inconsistency** + + - CCSIDR in QEMU: 0x701fe00a--Associativity: 2, Number of sets:256 + + - CCSIDR in Raspi2B: 0x700fe01a--Associativity: 4, Number of sets:128 + + **Tested Instruction sample** + + - MRC_T1A1_A 11101110001100000000111100010000 0xee300f10 + + According to ARMv7 Manual B4.1.19 encoding, the NumSets and Associativity are set different bewteen QEMU when emulating raspi2b and raspi2b. 
+ + The CCSIDR is set in the function`cortex_a7_initfn(Object *obj)` in target/arm/cpu_tcg.c for cortex_a7. + +2. DBGDRAR--Debug ROM Address Register + + **Inconsistency** + + - DBGDRAR in QEMU: 0x0 --Invalid + + - DBGDRAR in Raspi2B: 0x40020003--Valid + + According to ARMv7 Manual C11.11.16 encoding, the DBGDRAR in qemu is invalid. + + **Tested Instruction sample** + + - MRC_T1A1_A 11101110000100010001111000010000 0xee111e10 +Steps to reproduce: +1. Compile a kernel module to run the test instruction in PL1. +2. Use kgdb to get the register info +Additional information: + diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/1551 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1551 new file mode 100644 index 000000000..bfc398eea --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1551 @@ -0,0 +1,40 @@ +qemu-system-arm: ../accel/tcg/cpu-exec.c:917: cpu_loop_exec_tb: Assertion `icount_enabled()' failed. +Description of problem: +When starting the guest, the mentioned assertion is triggered very soon: +``` +qemu-system-arm: ../accel/tcg/cpu-exec.c:917: cpu_loop_exec_tb: Assertion `icount_enabled()' failed. +``` +I'm able to successfully boot the same image with QEMU 7.2.0. 
+ +The last output from the qemu logging with `-d guest_errors,in_asm,int,pcall,cpu` is +``` +---------------- +IN: +0x40209100: e92d4ff0 push {r4, r5, r6, r7, r8, sb, sl, fp, lr} +0x40209104: e28db020 add fp, sp, #0x20 +0x40209108: e24b3f49 sub r3, fp, #0x124 +0x4020910c: e24ddf43 sub sp, sp, #0x10c +0x40209110: e1a0e00f mov lr, pc +0x40209114: e3e0f0ff mvn pc, #0xff + +R00=4021000c R01=4020a5f8 R02=0000000f R03=40209100 +R04=40210018 R05=40210018 R06=4020c000 R07=40002000 +R08=00000000 R09=00000000 R10=00000000 R11=4020d7fc +R12=00000000 R13=4020d7f0 R14=4020074c R15=40209100 +PSR=2000011f --C- A sys32 +---------------- +IN: +0xffffff00: ee1d0f50 mrc p15, #0, r0, c13, c0, #2 + +R00=4021000c R01=4020a5f8 R02=0000000f R03=4020d6c8 +R04=40210018 R05=40210018 R06=4020c000 R07=40002000 +R08=00000000 R09=00000000 R10=00000000 R11=4020d7ec +R12=00000000 R13=4020d6c0 R14=40209118 R15=ffffff00 +PSR=2000011f --C- A sys32 +``` + +Please note that the L4Re OS uses `mvn pc, #0xff` to switch from EL1 to EL2 (system call). +Steps to reproduce: +1. Boot the attached image with the provided command line to trigger the assertion +Additional information: +I will attach the bootstrap image to this ticket. diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/1612 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1612 new file mode 100644 index 000000000..93eecac2d --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1612 @@ -0,0 +1,51 @@ +SVE first-faulting gather loads return incorrect data +Description of problem: +The results of `ldff1(b|h|w|d)` seem to be incorrect when `<Zt> == <Zm>`. The first element is duplicated throughout the vector while the FFR indicates that all elements were successfully loaded. This happens since https://gitlab.com/qemu-project/qemu/-/commit/50de9b78cec06e6d16e92a114a505779359ca532, and still happens on the latest master. +Steps to reproduce: +1. 
This assembly sequence loads data with an `ldff1d` instruction (and also loads the ffr). Note that with `ldff1d`, `<Zt> == <Zm>`. + +asmtest.s +``` + .type asmtest, @function + .balign 16 + .global asmtest +asmtest: + setffr + ptrue p0.d + index z1.d, #0, #1 + ldff1d z1.d, p0/z, [x0, z1.d, LSL #3] + rdffr p1.b + st1d {z1.d}, p0, [x1] + str p1, [x2] + ret +``` + +This harness for convenience intialises some data and checks the element at index 1, which should be 1. + +test.c +``` +#include <arm_sve.h> +#include <stdio.h> + +void asmtest(int64_t const * data, svint64_t * loaded, svbool_t * ffr); + +int main() { + const int64_t data[] = {42, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, + 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, + 22, 23, 24, 25, 26, 27, 28, 29, 30, 31}; + svint64_t loaded; + svbool_t ffr; + + asmtest(data, &loaded, &ffr); + + // Check value of element at index 1 + svuint64_t lanes = svindex_u64(0, 1); + svbool_t lane = svcmpeq_n_u64(svptrue_b64(), lanes, 1); + printf("%ld\n", svaddv_s64(lane, loaded)); +} +``` + +2. ```clang-15 -fuse-ld=lld -march=armv8-a+sve2 -target aarch64-unknown-linux-gnu -static *.c *.s -o svldffgathertest``` +3. ```qemu-aarch64 svldffgathertest``` - the value printed should be 1, but it can be seen that all values in the loaded vector are 42. +Additional information: +The above code was successfully tested on real SVE hardware. Normal gathers work fine in QEMU, as does a non-gather first-fault load. diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/1620 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1620 new file mode 100644 index 000000000..543af9ee0 --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1620 @@ -0,0 +1,94 @@ +SME FMOPA outer product instruction gives incorrect result +Description of problem: +The SME outer product instructions operate on tiles of elements. In the +below example we are performing an outer product of a vector of 1.0 +by itself. 
This naturally should produce a matrix filled with 1.0 +values, however if we read the values of the tile and printf them we +instead observe 0.0 values. + +Without digging into the underlying QEMU code this appears to be a bug +in how elements are set based on the tile number, since the same code +using za0.s rather than za1.s correctly reports all 1.0 values as output +as expected. + +main.c +``` +#include <stdio.h> + +void foo(float *dst); + +int main() { + float dst[16]; + foo(dst); + + // This should print: + // >>> 1.000000 1.000000 1.000000 1.000000 + // >>> 1.000000 1.000000 1.000000 1.000000 + // >>> 1.000000 1.000000 1.000000 1.000000 + // >>> 1.000000 1.000000 1.000000 1.000000 + for (int i=0; i<4; ++i) { + printf(">>> "); + for (int j=0; j<4; ++j) { + printf("%lf ", (double)dst[i * 4 + j]); + } + printf("\n"); + } +} +``` + +foo.S +``` +.global foo +foo: + stp x29, x30, [sp, -80]! + mov x29, sp + stp d8, d9, [sp, 16] + stp d10, d11, [sp, 32] + stp d12, d13, [sp, 48] + stp d14, d15, [sp, 64] + + smstart + + ptrue p0.s, vl4 + fmov z0.s, #1.0 + + // An outer product of a vector of 1.0 by itself should be a matrix of 1.0. + // Note that we are using tile 1 here (za1.s) rather than tile 0. + zero {za} + fmopa za1.s, p0/m, p0/m, z0.s, z0.s + + // Read the first 4x4 sub-matrix of elements from tile 1: + // Note that za1h should be interchangable here. 
+ mov w12, #0 + mova z0.s, p0/m, za1v.s[w12, #0] + mova z1.s, p0/m, za1v.s[w12, #1] + mova z2.s, p0/m, za1v.s[w12, #2] + mova z3.s, p0/m, za1v.s[w12, #3] + + // And store them to the input pointer (dst in the C code): + st1w {z0.s}, p0, [x0] + add x0, x0, #16 + st1w {z1.s}, p0, [x0] + add x0, x0, #16 + st1w {z2.s}, p0, [x0] + add x0, x0, #16 + st1w {z3.s}, p0, [x0] + + smstop + + ldp d8, d9, [sp, 16] + ldp d10, d11, [sp, 32] + ldp d12, d13, [sp, 48] + ldp d14, d15, [sp, 64] + ldp x29, x30, [sp], 80 + ret +``` +Steps to reproduce: +``` +$ clang -target aarch64-linux-gnu -march=armv9-a+sme test.c -O1 -static +$ ~/qemu/build/qemu-aarch64 ./a.out +>>> 0.000000 0.000000 0.000000 0.000000 +>>> 0.000000 0.000000 0.000000 0.000000 +>>> 0.000000 0.000000 0.000000 0.000000 +>>> 0.000000 0.000000 0.000000 0.000000 +``` diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/1658 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1658 new file mode 100644 index 000000000..d65804dcd --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1658 @@ -0,0 +1,60 @@ +Zephyr TF-M IPC example triggers failed assertion !arm_feature(env, ARM_FEATURE_M) on recent Qemu +Description of problem: +I can't run the TrustedFirmware-M IPC example in the Zephyr repo with recent Qemu (in particular v8.0.0). + +By bisecting, I got the last commit OK : v7.2.0-351-gfaa1451e7b + +``` +$ qemu-system-arm -M mps2-an521 -device loader,file=tfm_merged.hex -serial stdio +[INF] Beginning TF-M provisioning +[WRN] TFM_DUMMY_PROVISIONING is not suitable for production! This device is NOT SECURE +[Sec Thread] Secure image initializing! +Booting TF-M 8209cb2ed +Creating an empty ITS flash layout. +Creating an empty PS flash layout. +[INF][Crypto] Provisioning entropy seed... complete. +*** Booting Zephyr OS build zephyr-v3.3.0-4041-g7ba5ecf451ef *** +TF-M IPC on mps2_an521_ns +The version of the PSA Framework API is 257. +The PSA Crypto service minor version is 1. 
+Generating 256 bytes of random data: +71 03 DD 50 8E E5 00 C7 E0 61 7B EB 77 15 E9 38 +E9 A8 7D 0C 51 23 76 9F C3 61 E9 8B 8A 67 BD 14 +73 A3 2C 6E E5 8C E3 19 53 6B 50 55 A8 A7 F4 7B +56 03 60 AA 48 B6 DF 04 33 56 BE 84 43 FA 4E AC +D7 6E 2E 2E 1D 7E 46 69 D5 9B B0 42 5C 54 E4 09 +73 9E 4F 55 F8 3E 05 9E A3 DE 46 D3 E4 02 B0 9C +F3 21 9F 20 85 74 34 07 19 79 07 B8 02 B5 0E 90 +74 21 BE B5 09 4C D7 20 D8 43 F7 72 23 1C F0 3E +77 7B D3 70 29 72 69 D3 7F 1F 61 16 12 73 D5 89 +C5 8B D1 A3 7B 4B FD F5 11 C2 B1 9A C0 A5 F9 7B +16 3D 98 17 66 FE E9 F4 FE 37 76 62 E0 E6 83 99 +69 26 41 CD FF 0C 44 AC F9 F4 91 B8 CA 63 5E 1D +B9 C4 38 D6 0C 11 19 1B 94 BE C9 4F EC 2E 5A 05 +3F 72 5F 41 44 3C 91 39 AC 2D 50 75 DF FD D3 11 +39 F2 43 18 D7 69 B0 A3 99 0C C0 6E 83 84 1A A8 +B0 37 6C 8E 32 B2 8E 4F AA 12 97 09 09 87 D3 FD +qemu-system-arm: terminating on signal 2 +``` + +But after 452c67a427, for example v8.0.0-918-g6972ef1440, I get : + +``` +$ qemu-system-arm -M mps2-an521 -device loader,file=tfm_merged.hex -serial stdio +[INF] Beginning TF-M provisioning +[WRN] TFM_DUMMY_PROVISIONING is not suitable for production! This device is NOT SECURE +[Sec Thread] Secure image initializing! +Booting TF-M 8209cb2ed +Creating an empty ITS flash layout. +Creating an empty PS flash layout. +[INF][Crypto] Provisioning entropy seed... complete. +*** Booting Zephyr OS build zephyr-v3.3.0-4041-g7ba5ecf451ef *** +TF-M IPC on mps2_an521_ns +qemu-system-arm: ../target/arm/cpu.h:2396: arm_is_secure_below_el3: Assertion `!arm_feature(env, ARM_FEATURE_M)' failed. +Aborted +``` +Steps to reproduce: +1. Build the Zephyr tfm_merged.hex file from Zephyr 7ba5ecf451 https://github.com/zephyrproject-rtos/zephyr/commit/7ba5ecf451ef29f96b30dbe5f0e54c1865839093 : ``west -v build -p -b mps2_an521_ns ./samples/tfm_integration/tfm_ipc`` +2. 
Build qemu-system-arm and run : ``qemu-system-arm -M mps2-an521 -device loader,file=tfm_merged.hex -serial stdio`` +Additional information: +More information on building the Zephyr TF-M IPC example is available in the official repo: https://github.com/zephyrproject-rtos/zephyr/tree/main/samples/tfm_integration/tfm_ipc diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/1697 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1697 new file mode 100644 index 000000000..094982db9 --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1697 @@ -0,0 +1,19 @@ +qemu-arm -cpu cortex-m55 dummy_test qemu-arm: ../accel/tcg/user-exec.c:492: page_set_flags: Assertion `last <= GUEST_ADDR_MAX' failed. +Description of problem: +Basic testing fails for Cortex-M55. +Steps to reproduce: +1. Pull the newest qemu 8.0.50 + +2. Create a dummy test with only `return 0` in the main function + +3. Run ` arm-none-eabi-gcc -o dummy_test -O2 -g -mcpu=cortex-m55 dummy_test.cc --specs=rdimon.specs` and then `qemu-arm -cpu cortex-m55 dummy_test` + +`arm-none-eabi-gcc (Arm GNU Toolchain 12.2.MPACBTI-Rel1 (Build arm-12-mpacbti.34)) 12.2.1 20230214 +Copyright (C) 2022 Free Software Foundation, Inc. +This is free software; see the source for copying conditions. There is NO +warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.` + +`qemu-arm version 8.0.50 (v8.0.0-1739-g5f9dd6a8ce) +Copyright (c) 2003-2023 Fabrice Bellard and the QEMU Project developers` +Additional information: +This is a known problem, already reported in another issue: https://gitlab.com/qemu-project/qemu/-/issues/1528#note_1389268261.
diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/1704 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1704 new file mode 100644 index 000000000..79e3e2358 --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1704 @@ -0,0 +1,69 @@ +Booting arm64 Linux in TCG mode fails with "ERROR:../tcg/tcg.c:4317:temp_load: code should not be reached" +Description of problem: +Linux seems to boot successfully, but around loading/executing userspace, QEMU crashes with an error: + +``` +[ 4.047919] EXT4-fs (vda): mounted filesystem 59b147ee-5613-43a2-aab4-eaceb6e95be5 with ordered data mode. Quota mode: none. +[ 4.049630] VFS: Mounted root (ext4 filesystem) on device 254:0. +[ 4.055437] devtmpfs: mounted +[ 4.160039] Freeing unused kernel memory: 8256K +[ 4.161855] Run /sbin/init as init process +[ 4.547387] EXT4-fs (vda): re-mounted 59b147ee-5613-43a2-aab4-eaceb6e95be5. Quota mode: none. +** +ERROR:../tcg/tcg.c:4317:temp_load: code should not be reached +Bail out! ERROR:../tcg/tcg.c:4317:temp_load: code should not be reached +zsh: abort /home/mark/.opt/apps/qemu-v8.0.0-1645-ge6dd5e782b/bin/qemu-system-aarch64 -sm +``` +Steps to reproduce: +1. Run the provided qemu commandline +2. 
Wait for QEMU to crash +Additional information: +I attempted a bisect, which suggests that the first bad commit is: + +``` +[e6dd5e782becfe6d51f3575c086f5bd7162421d0] target/arm: Use tcg_gen_qemu_{ld, st}_i128 in gen_sve_{ld, st}r +``` + +The full bisect log is: + +``` +[mark@lakrids:~/src/qemu]% git bisect log +git bisect start +# good: [f7f686b61cf7ee142c9264d2e04ac2c6a96d37f8] Update version for 8.0.2 release +git bisect good f7f686b61cf7ee142c9264d2e04ac2c6a96d37f8 +# bad: [5f9dd6a8ce3961db4ce47411ed2097ad88bdf5fc] Merge tag 'pull-9p-20230608' of https://github.com/cschoenebeck/qemu into staging +git bisect bad 5f9dd6a8ce3961db4ce47411ed2097ad88bdf5fc +# good: [c1eb2ddf0f8075faddc5f7c3d39feae3e8e9d6b4] Update version for v8.0.0 release +git bisect good c1eb2ddf0f8075faddc5f7c3d39feae3e8e9d6b4 +# good: [1a42d9d472b61e4db2fb16800495d402cb9b94af] tcg/sparc64: Split out tcg_out_movi_s32 +git bisect good 1a42d9d472b61e4db2fb16800495d402cb9b94af +# good: [a30498fcea5a8b9c544324ccfb0186090104b229] tcg/riscv: Support CTZ, CLZ from Zbb +git bisect good a30498fcea5a8b9c544324ccfb0186090104b229 +# good: [759573d05b808344f7047f893d2dd095884dfa4d] test-cutils: Add coverage of qemu_strtod +git bisect good 759573d05b808344f7047f893d2dd095884dfa4d +# good: [dc2a070d125772fe30384596d4d4ce6d9950b004] hw/arm/allwinner-r40: add Clock Control Unit +git bisect good dc2a070d125772fe30384596d4d4ce6d9950b004 +# good: [c0dde5fc5ccce56b69095bc29af72987efd65d1e] accel/tcg: Fix undefined shift in store_whole_le16 +git bisect good c0dde5fc5ccce56b69095bc29af72987efd65d1e +# bad: [e58e55dd8d5777f8a58ce30cfe04a8023282eb80] meson: fix "static build" entry in summary +git bisect bad e58e55dd8d5777f8a58ce30cfe04a8023282eb80 +# bad: [5c13983e23de4095e2dfa8bc52333ef40ebe40db] target/arm: Sink gen_mte_check1 into load/store_exclusive +git bisect bad 5c13983e23de4095e2dfa8bc52333ef40ebe40db +# good: [6c4f229a2e0d6f882bae389ce0c5bdaea712ce0f] tests: avocado: boot_linux_console: Add test case for 
bpim2u +git bisect good 6c4f229a2e0d6f882bae389ce0c5bdaea712ce0f +# good: [e452ca5af88fc49b3026c2de0f1e65fd18d1a656] target/arm: Introduce finalize_memop_{atom,pair} +git bisect good e452ca5af88fc49b3026c2de0f1e65fd18d1a656 +# good: [d450bd0157be43d273116c3e3617883c8a0ac3d1] target/arm: Use tcg_gen_qemu_{st, ld}_i128 for do_fp_{st, ld} +git bisect good d450bd0157be43d273116c3e3617883c8a0ac3d1 +# bad: [e6dd5e782becfe6d51f3575c086f5bd7162421d0] target/arm: Use tcg_gen_qemu_{ld, st}_i128 in gen_sve_{ld, st}r +git bisect bad e6dd5e782becfe6d51f3575c086f5bd7162421d0 +# good: [e6073d88cc1fb43b00be16f79d9d6b0f9d2276f5] target/arm: Use tcg_gen_qemu_st_i128 for STZG, STZ2G +git bisect good e6073d88cc1fb43b00be16f79d9d6b0f9d2276f5 +# first bad commit: [e6dd5e782becfe6d51f3575c086f5bd7162421d0] target/arm: Use tcg_gen_qemu_{ld, st}_i128 in gen_sve_{ld, st}r +``` + +Each build step was performed with: + +``` + git clean -fdx && ./configure --prefix=/home/mark/.opt/apps/qemu-$(git describe --long HEAD) --enable-debug-info --disable-strip && make -j64 && make install +``` diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/1737 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1737 new file mode 100644 index 000000000..30dbe2092 --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1737 @@ -0,0 +1,49 @@ +qemu-aarch64: Incorrect result for ssra instruction when using vector lengths of 1024-bit or higher. 
+Description of problem: +``` +#include <arm_sve.h> +#include <stdio.h> + +#define SZ 32 + +int main(int argc, char* argv[]) { + svbool_t pg = svptrue_b64(); + uint64_t VL = svcntd(); + + fprintf(stderr, "One SVE vector can hold %li uint64_ts\n", VL); + + int64_t sr[SZ], sx[SZ], sy[SZ]; + uint64_t ur[SZ], ux[SZ], uy[SZ]; + + for (uint64_t i = 0; i < SZ; ++i) { + sx[i] = ux[i] = 0; + sy[i] = uy[i] = 1024; + } + + for (uint64_t i = 0; i < SZ; i+=VL) { + fprintf(stderr, "Processing elements %li - %li\n", i, i + VL - 1); + + svint64_t SX = svld1(pg, sx + i); + svint64_t SY = svld1(pg, sy + i); + svint64_t SR = svsra(SX, SY, 4); + svst1(pg, sr + i, SR); + + svuint64_t UX = svld1(pg, ux + i); + svuint64_t UY = svld1(pg, uy + i); + svuint64_t UR = svsra(UX, UY, 4); + svst1(pg, ur + i, UR); + } + + for (uint64_t i = 0; i < SZ; ++i) { + fprintf(stderr, "sr[%li]=%li, ur[%li]=%li\n", i, sr[i], i, ur[i]); + } + + return 0; +} +``` +Steps to reproduce: +1. Build the above C source using "gcc -march=armv9-a -O1 ssra.c", can also use clang. +2. Run with "qemu-aarch64 -cpu max,sve-default-vector-length=64 ./a.out" and you'll see the expected result of 64 (signed and unsigned) +3. Run with "qemu-aarch64 -cpu max,sve-default-vector-length=128 ./a.out" and you'll see the expected result of 64 for unsigned but the signed result is 0. This suggests the emulation of SVE2 ssra instruction is incorrect for this and bigger vector lengths. +Additional information: + diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/1740 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1740 new file mode 100644 index 000000000..d4db06c7d --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1740 @@ -0,0 +1,73 @@ +QEMU Abort in Cortex-M Exception raising +Description of problem: +When an exception should be raised on an ARM Cortex-M board, QEMU aborts.
+ +``` +$ qemu-system-arm --version +QEMU emulator version 8.0.2 + +$ qemu-system-arm -M stm32vldiscovery -device loader,file=/tmp/raw-hardfault.hex -d in_asm,exec,int +[...] +Trace 0: 0x7f2aa8000680 [00800400/00000110/00000110/ff200000] +---------------- +IN: +0x00000140: f64b 6eef movw lr, #0xbeef +0x00000144: f6cd 6ead movt lr, #0xdead +0x00000148: 4770 bx lr + +Linking TBs 0x7f2aa8000680 index 0 -> 0x7f2aa80007c0 +Trace 0: 0x7f2aa80007c0 [00800400/00000140/00000110/ff200000] +qemu-system-arm: ../qemu-8.0.2/target/arm/cpu.h:2396: arm_is_secure_below_el3: Assertion `!arm_feature(env, ARM_FEATURE_M)' failed. +``` + +Expected behavior: +``` +$ qemu-system-arm --version +QEMU emulator version 7.1.0 + +$ qemu-system-arm -M stm32vldiscovery -device loader,file=raw-hardfault.hex -d in_asm,exec,int +[...] +Trace 0: 0x7fb488000680 [00800400/00000110/00000110/ff000000] +---------------- +IN: +0x00000140: f64b 6eef movw lr, #0xbeef +0x00000144: f6cd 6ead movt lr, #0xdead +0x00000148: 4770 bx lr + +Linking TBs 0x7fb488000680 [00000110] index 0 -> 0x7fb488000780 [00000140] +Trace 0: 0x7fb488000780 [00800400/00000140/00000110/ff000000] +Taking exception 3 [Prefetch Abort] on CPU 0 +...at fault address 0xdeadbeee +...with CFSR.IACCVIOL +...BusFault with BFSR.STKERR +...taking pending nonsecure exception 3 +...loading from element 3 of non-secure vector table at 0xc +...loaded new PC 0x0 +``` +Steps to reproduce: +1. Run any Cortex-M firmware that raises an exception. (minimal example attached) +Additional information: +- Minimal Reproducer: +[raw-hardfault.hex](/uploads/113889116675b608e05748280d1db354/raw-hardfault.hex) +- Assert introduced in fcc7404eff24b4c8b322fb27ca5ae7f3113129c3. 
+- Stacktrace: +``` +#4 0x00007ffff6a483d6 in __assert_fail () from /usr/lib/libc.so.6 +#5 0x00007ffff73afe67 in arm_is_secure_below_el3 (env=0x55555712f9b0) at target/arm/cpu.h:2396 +#6 0x00007ffff73afedd in arm_is_el2_enabled (env=0x55555712f9b0) at target/arm/cpu.h:2448 +#7 0x00007ffff73afcd4 in arm_el_is_aa64 (env=0x55555712f9b0, el=0x1) at target/arm/cpu.h:2509 +#8 0x00007ffff73af68f in compute_fsr_fsc (env=0x55555712f9b0, fi=0x7fffffff7098, target_el=0x1, mmu_idx=0x1, ret_fsc=0x7fffffff6fe0) + at target/arm/tcg/tlb_helper.c:71 +#9 0x00007ffff73af483 in arm_deliver_fault (cpu=0x55555712d250, addr=0xdeadbeee, access_type=MMU_INST_FETCH, mmu_idx=0x1, fi=0x7fffffff7098) + at target/arm/tcg/tlb_helper.c:114 +#10 0x00007ffff73afa4c in arm_cpu_tlb_fill (cs=0x55555712d250, address=0xdeadbeee, size=0x1, access_type=MMU_INST_FETCH, mmu_idx=0x1, probe=0x0, retaddr=0x0) + at target/arm/tcg/tlb_helper.c:242 +#11 0x00007ffff74a3a1e in probe_access_internal (env=0x55555712f9b0, addr=0xdeadbeee, fault_size=0x1, access_type=MMU_INST_FETCH, mmu_idx=0x1, nonfault=0x0, phost=0x7fffffff71c8, + pfull=0x7fffffff71d0, retaddr=0x0) at accel/tcg/cputlb.c:1555 +#12 0x00007ffff74a4085 in get_page_addr_code_hostp (env=0x55555712f9b0, addr=0xdeadbeee, hostp=0x0) at accel/tcg/cputlb.c:1694 +#13 0x00007ffff7490c0f in get_page_addr_code (env=0x55555712f9b0, addr=0xdeadbeee) at include/exec/exec-all.h:748 +#14 0x00007ffff7490b2a in tb_htable_lookup (cpu=0x55555712d250, pc=0xdeadbeee, cs_base=0x800408, flags=0x110, cflags=0xff200200) at accel/tcg/cpu-exec.c:233 +#15 0x00007ffff748f719 in tb_lookup (cpu=0x55555712d250, pc=0xdeadbeee, cs_base=0x800408, flags=0x110, cflags=0xff200200) at accel/tcg/cpu-exec.c:270 +#16 0x00007ffff748f463 in helper_lookup_tb_ptr (env=0x55555712f9b0) at accel/tcg/cpu-exec.c:425 +#17 0x00007fff6800091c in code_gen_buffer () +``` diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/1742 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1742 new file 
mode 100644 index 000000000..a1e96d779 --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1742 @@ -0,0 +1,95 @@ +Arm64 kernel run with qemu-system-aarch64 crashes handling program using SVE and Streaming SVE modes +Description of problem: +The userspace program shown, which switches between SVE/SME states, crashes the kernel on task switch when running under qemu-system-aarch64. This does not reproduce on an Arm Fast Model, but I can't be sure that that is not a timing difference. + +The kernel appears to have no space allocated to save SVE state for this process, but also believes that it should save the state, where it then faults. +Steps to reproduce: +1. Compile the following program: +``` +#include <sys/prctl.h> + +int main() { + asm volatile("msr s0_3_c4_c7_3, xzr" /*smstart*/); + prctl(PR_SVE_SET_VL, 8 * 4); + asm volatile("msr s0_3_c4_c7_3, xzr" /*smstart*/); + while (1) {} // Wait to be preempted? + return 0; +} +``` +With: +``` +$ aarch64-unknown-linux-gnu-gcc main.c -o main.o -g -O3 -march=armv8.6-a+sve +``` +Compiler version does not matter I don't think, but in case: +``` +$ aarch64-unknown-linux-gnu-gcc --version +aarch64-unknown-linux-gnu-gcc (crosstool-NG 1.25.0.85_61c4cca) 10.4.0 +``` +It is a 10.4.0 built with CrossToolNG. + +2. Boot Linux and run the program in the emulated environment. I've found looping it to be more consistent: +``` +$ while true; do ./main.o; done +``` +Though sometimes it will crash after only one run. 
+Additional information: +Here is the output from the kernel: +``` +$ /mnt/virt_root/sme_crash/main.o +[ 190.813392] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 +[ 190.818912] Mem abort info: +[ 190.819255] ESR = 0x0000000096000046 +[ 190.819727] EC = 0x25: DABT (current EL), IL = 32 bits +[ 190.820391] SET = 0, FnV = 0 +[ 190.820757] EA = 0, S1PTW = 0 +[ 190.821145] FSC = 0x06: level 2 translation fault +[ 190.821635] Data abort info: +[ 190.821978] ISV = 0, ISS = 0x00000046, ISS2 = 0x00000000 +[ 190.822490] CM = 0, WnR = 1, TnD = 0, TagAccess = 0 +[ 190.822991] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 +[ 190.823645] user pgtable: 4k pages, 48-bit VAs, pgdp=00000000475f1000 +[ 190.824269] [0000000000000000] pgd=0800000047645003, p4d=0800000047645003, pud=0800000047641003, pmd=0000000000000000 +[ 190.826225] Internal error: Oops: 0000000096000046 [#1] PREEMPT SMP +[ 190.826996] Modules linked in: +[ 190.827748] CPU: 0 PID: 198 Comm: main.o Not tainted 6.4.0-01761-g6aeadf7896bf #1 +[ 190.828638] Hardware name: linux,dummy-virt (DT) +[ 190.829304] pstate: 234000c5 (nzCv daIF +PAN -UAO +TCO +DIT -SSBS BTYPE=--) +[ 190.830115] pc : sve_save_state+0x4/0xf0 +[ 190.831378] lr : fpsimd_save+0x184/0x1f0 +[ 190.831848] sp : ffff80008047bc70 +[ 190.832223] x29: ffff80008047bc70 x28: ffff0000036c49c0 x27: 0000000000000000 +[ 190.833182] x26: ffff0000036c4f58 x25: ffff0000036c49c0 x24: ffff0000036c5868 +[ 190.834045] x23: 0000000000000020 x22: ffff24441ea31000 x21: 0000000000000001 +[ 190.834894] x20: ffff00003fdc50b0 x19: ffffdbbc213940b0 x18: 0000000000000000 +[ 190.835759] x17: ffff24441ea31000 x16: ffff800080000000 x15: 0000000000000000 +[ 190.836593] x14: 000000000000026c x13: 0000000000000001 x12: 0000000000000020 +[ 190.837436] x11: 0000000000000000 x10: 0000000000000001 x9 : 0000000000000800 +[ 190.838323] x8 : ffff00003fdcffc0 x7 : ffff00003fdcff40 x6 : 0000000002da9c8c +[ 190.839149] x5 : 0000000000000001 x4 : 
0000000000000000 x3 : 0000000000000000 +[ 190.839976] x2 : 0000000000000001 x1 : ffff0000036c56a0 x0 : 0000000000000440 +[ 190.840936] Call trace: +[ 190.841406] sve_save_state+0x4/0xf0 +[ 190.841993] fpsimd_thread_switch+0x24/0xd4 +[ 190.842572] __switch_to+0x20/0x1d4 +[ 190.843043] __schedule+0x2a0/0xa7c +[ 190.843488] schedule+0x5c/0xc4 +[ 190.843912] do_notify_resume+0x1a4/0x474 +[ 190.844410] el0_interrupt+0xc4/0xd4 +[ 190.844855] __el0_irq_handler_common+0x18/0x24 +[ 190.845350] el0t_64_irq_handler+0x10/0x1c +[ 190.845824] el0t_64_irq+0x190/0x194 +[ 190.846661] Code: 54000040 d51b4408 d65f03c0 d503245f (e5bb5800) +[ 190.847545] ---[ end trace 0000000000000000 ]--- +[ 190.848125] note: main.o[198] exited with irqs disabled +``` + +I have looked at the kernel functions in the backtrace and they seem to be loading memory fine, so it's not obviously a code generation problem. The pointer loaded prior to the crash is definitely a null pointer. + +Removing any of the lines (`while (1) {}` aside) from the example seems to avoid the issue, but again, that could be timing. + +An important point here is that the kernel syscall ABI states that streaming mode will be exited on +a syscall. I have observed that this does happen as expected. This is why the test case does a syscall, then immediately goes back to streaming mode. And it is perhaps where the confusion starts. + +I have confirmed that SME is supported by the emulated CPU and other SME programs do run correctly. + +I initially thought this was to do with having many cores, but it also reproduces on a single core.
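The `Mem abort info` lines in the oops above are a decode of the `ESR` value, so they can be cross-checked by hand. The following is a minimal Python sketch, assuming the usual ESR_ELx layout (EC in bits [31:26], IL in bit 25, ISS in bits [24:0], and for data aborts WnR in bit 6 and the fault status code in bits [5:0]). `decode_esr` is a hypothetical helper written for this note, not a kernel or QEMU function.

```python
def decode_esr(esr: int) -> dict:
    """Split a 32-bit ESR_ELx value into the fields the kernel prints."""
    esr &= 0xFFFFFFFF          # ignore ISS2 and any other upper bits
    iss = esr & 0x1FFFFFF      # ISS: bits [24:0]
    return {
        "ec": (esr >> 26) & 0x3F,   # exception class, bits [31:26]
        "il": (esr >> 25) & 0x1,    # 1 = 32-bit instruction
        "iss": iss,
        "wnr": (iss >> 6) & 0x1,    # data aborts only: 1 = write
        "fsc": iss & 0x3F,          # data aborts only: fault status code
    }


# ESR value copied from the oops in this report.
fields = decode_esr(0x96000046)
print({name: hex(value) for name, value in fields.items()})
```

For `0x96000046` this gives EC 0x25, IL 1, ISS 0x46, WnR 1 and FSC 0x06, which agrees with the kernel's own decode: a write from the current EL that took a level 2 translation fault, as one would expect from a store through a null pointer.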
diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/1790 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1790 new file mode 100644 index 000000000..60af2f7bc --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1790 @@ -0,0 +1,29 @@ +[AARCH64] STGP instruction is not writing the value of the second register to memory +Description of problem: +My application is built with Clang 16 and the option -fsanitize=memtag-stack. +This means that MTE protection is enabled for the stack. +The local variables are tagged and the compiler often uses the STGP instruction "Store Allocation Tag and Pair of registers" to transfer the value of two 64-bit registers to memory. +The following instruction was not working as expected: + 18004: 69000895 stgp x21, x2, [x4] +The value of the second register x2 is not transferred to memory. +Only x21 is written. + +I think that the issue is in trans_STGP(). +We don't call finalize_memop_pair() like we do in the general trans_STP(). + +``` +diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c +index 7d0c8f79a7..f599f3e136 100644 +--- a/target/arm/tcg/translate-a64.c ++++ b/target/arm/tcg/translate-a64.c +@@ -3034,6 +3034,8 @@ static bool trans_STGP(DisasContext *s, arg_ldstpair *a) + + tcg_rt = cpu_reg(s, a->rt); + tcg_rt2 = cpu_reg(s, a->rt2); ++ mop = a->sz + 1; ++ mop = finalize_memop_pair(s, mop); + + assert(a->sz == 3); +``` + +With this fix, my OS (Kinibi) is now able to boot. diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/1799 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1799 new file mode 100644 index 000000000..3c3cfe609 --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1799 @@ -0,0 +1,177 @@ +Support running real-world Android on Arm by supporting one-register list for the POP (LDMIA) Thumb32 instruction. +Steps to reproduce: +1. Get any aarch64 Linux on QEMU for x86_64 running.
Make sure that Wayland is running. (For example, build PostmarketOS with "phosh" for aarch64 and install it.) +2. Install waydroid (e.g. `apk add waydroid`). +3. Install the LineageOS 18.1 for waydroid image (e.g. `waydroid init`). +4. Run the waydroid-container (e.g. `rc-service waydroid-container restart`). +5. Start the waydroid session (e.g. click on the "Waydroid" symbol on the graphical user interface). +6. Observe the waydroid log file (e.g. run `waydroid logcat`). +Additional information: +The output of the Android log (using `waydroid logcat`) will be akin: + +``` +23908 23908 D AndroidRuntime: >>>>>> START com.android.internal.os.ZygoteInit uid 0 <<<<<< +23908 23908 I AndroidRuntime: Using default boot image +23908 23908 I AndroidRuntime: Leaving lock profiling enabled +23908 23908 E cutils-trace: Error opening trace file: No such file or directory (2) +23908 23908 I zygote : option[0]=-Xzygote +23908 23908 I zygote : option[1]=exit +23908 23908 I zygote : option[2]=vfprintf +23908 23908 I zygote : option[3]=sensitiveThread +23908 23908 I zygote : option[4]=-verbose:gc +23908 23908 I zygote : option[5]=-XX:PerfettoHprof=true +23908 23908 I zygote : option[6]=-Xms8m +23908 23908 I zygote : option[7]=-Xmx512m +23908 23908 I zygote : option[8]=-XX:HeapGrowthLimit=192m +23908 23908 I zygote : option[9]=-XX:HeapMinFree=8m +23908 23908 I zygote : option[10]=-XX:HeapMaxFree=16m +23908 23908 I zygote : option[11]=-XX:HeapTargetUtilization=0.6 +23908 23908 I zygote : option[12]=-Xusejit:true +23908 23908 I zygote : option[13]=-Xjitsaveprofilinginfo +23908 23908 I zygote : option[14]=-XjdwpOptions:suspend=n,server=y +23908 23908 I zygote : option[15]=-XjdwpProvider:default +23908 23908 I zygote : option[16]=-Xopaque-jni-ids:swapable +23908 23908 I zygote : option[17]=-Xlockprofthreshold:500 +23908 23908 I zygote : option[18]=-Xcompiler-option +23908 23908 I zygote : option[19]=--instruction-set-variant=generic +23908 23908 I zygote : option[20]=-Xcompiler-option 
+23908 23908 I zygote : option[21]=--instruction-set-features=default +23908 23908 I zygote : option[22]=-Xcompiler-option +23908 23908 I zygote : option[23]=--generate-mini-debug-info +23908 23908 I zygote : option[24]=-Ximage-compiler-option +23908 23908 I zygote : option[25]=--runtime-arg +23908 23908 I zygote : option[26]=-Ximage-compiler-option +23908 23908 I zygote : option[27]=-Xms64m +23908 23908 I zygote : option[28]=-Ximage-compiler-option +23908 23908 I zygote : option[29]=--runtime-arg +23908 23908 I zygote : option[30]=-Ximage-compiler-option +23908 23908 I zygote : option[31]=-Xmx64m +23908 23908 I zygote : option[32]=-Ximage-compiler-option +23908 23908 I zygote : option[33]=--dirty-image-objects=/system/etc/dirty-image-objects +23908 23908 I zygote : option[34]=-Ximage-compiler-option +23908 23908 I zygote : option[35]=--instruction-set-variant=generic +23908 23908 I zygote : option[36]=-Ximage-compiler-option +23908 23908 I zygote : option[37]=--instruction-set-features=default +23908 23908 I zygote : option[38]=-Ximage-compiler-option +23908 23908 I zygote : option[39]=--generate-mini-debug-info +23908 23908 I zygote : option[40]=-Duser.locale=en-US +23908 23908 I zygote : option[41]=--cpu-abilist=armeabi-v7a,armeabi +23908 23908 I zygote : option[42]=-Xcore-platform-api-policy:just-warn +23908 23908 I zygote : option[43]=-Xfingerprint:waydroid/lineage_waydroid_arm64/waydroid_arm64:11/RQ3A.211001.001/48:userdebug/test-keys +23908 23908 I zygote : Core platform API reporting enabled, enforcing=false +23908 23908 D zygote : Time zone APEX ICU file found: /apex/com.android.tzdata/etc/icu/icu_tzdata.dat +23908 23908 D zygote : I18n APEX ICU file found: /apex/com.android.i18n/etc/icu/icudt66l.dat +23908 23908 I zygote : Using memfd for future sealing +23908 23908 W zygote : Using default instruction set features for ARM CPU variant (generic) using conservative defaults + 49 49 I tombstoned: received crash request for pid 23908 +23908 23908 F DEBUG : 
*** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** +23908 23908 F DEBUG : LineageOS Version: '18.1-20230723-VANILLA-waydroid_arm64' +23908 23908 F DEBUG : Build fingerprint: 'waydroid/lineage_waydroid_arm64/waydroid_arm64:11/RQ3A.211001.001/48:userdebug/test-keys' +23908 23908 F DEBUG : Revision: '0' +23908 23908 F DEBUG : ABI: 'arm' +23908 23908 F DEBUG : Timestamp: 2023-07-28 14:13:34+0000 +23908 23908 F DEBUG : pid: 23908, tid: 23908, name: main >>> zygote <<< +23908 23908 F DEBUG : uid: 0 +23908 23908 F DEBUG : signal 4 (SIGILL), code 1 (ILL_ILLOPC), fault addr 0x709443da (*pc=0x4000e8bd) +23908 23908 F DEBUG : r0 54647764 r1 3fb9709b r2 fffffe56 r3 4337ffff +23908 23908 F DEBUG : r4 707184b0 r5 3fdaaaaa r6 f295837e r7 00000001 +23908 23908 F DEBUG : r8 00000000 r9 f7986e00 r10 ffa33320 r11 ffa332e4 +23908 23908 F DEBUG : ip e9930ba4 sp ffa332cc lr 709443d5 pc 709443da +23908 23908 F DEBUG : +23908 23908 F DEBUG : backtrace: +23908 23908 F DEBUG : #00 pc 0007e3da /apex/com.android.art/javalib/arm/boot.oat (art_jni_trampoline+34) (BuildId: 4af94ec040111dd87be55d34780e36769428675c) +23908 23908 F DEBUG : #01 pc 000d39d5 /apex/com.android.art/lib/libart.so (art_quick_invoke_stub_internal+68) (BuildId: d0f40e4862987997ffa9c0a264e61174) +23908 23908 F DEBUG : #02 pc 004f0759 /apex/com.android.art/lib/libart.so (art_quick_invoke_static_stub+276) (BuildId: d0f40e4862987997ffa9c0a264e61174) +23908 23908 F DEBUG : #03 pc 0012ca93 /apex/com.android.art/lib/libart.so (art::ArtMethod::Invoke(art::Thread*, unsigned int*, unsigned int, art::JValue*, char const*)+166) (BuildId: d0f40e4862987997ffa9c0a264e61174) +23908 23908 F DEBUG : #04 pc 00240bbf /apex/com.android.art/lib/libart.so (art::interpreter::ArtInterpreterToCompiledCodeBridge(art::Thread*, art::ArtMethod*, art::ShadowFrame*, unsigned short, art::JValue*)+254) (BuildId: d0f40e4862987997ffa9c0a264e61174) +23908 23908 F DEBUG : #05 pc 002388df /apex/com.android.art/lib/libart.so (bool 
art::interpreter::DoCall<false, false>(art::ArtMethod*, art::Thread*, art::ShadowFrame&, art::Instruction const*, unsigned short, art::JValue*)+746) (BuildId: d0f40e4862987997ffa9c0a264e61174) +23908 23908 F DEBUG : #06 pc 004e44db /apex/com.android.art/lib/libart.so (MterpInvokeStatic+482) (BuildId: d0f40e4862987997ffa9c0a264e61174) +23908 23908 F DEBUG : #07 pc 000ce594 /apex/com.android.art/lib/libart.so (mterp_op_invoke_static+20) (BuildId: d0f40e4862987997ffa9c0a264e61174) +23908 23908 F DEBUG : #08 pc 003bdaa0 /system/framework/framework.jar +23908 23908 F DEBUG : #09 pc 0023182b /apex/com.android.art/lib/libart.so (art::interpreter::Execute(art::Thread*, art::CodeItemDataAccessor const&, art::ShadowFrame&, art::JValue, bool, bool) (.llvm.10727712076471079728)+254) (BuildId: d0f40e4862987997ffa9c0a264e61174) +23908 23908 F DEBUG : #10 pc 00238109 /apex/com.android.art/lib/libart.so (art::interpreter::ArtInterpreterToInterpreterBridge(art::Thread*, art::CodeItemDataAccessor const&, art::ShadowFrame*, art::JValue*)+144) (BuildId: d0f40e4862987997ffa9c0a264e61174) +23908 23908 F DEBUG : #11 pc 00239581 /apex/com.android.art/lib/libart.so (bool art::interpreter::DoCall<true, false>(art::ArtMethod*, art::Thread*, art::ShadowFrame&, art::Instruction const*, unsigned short, art::JValue*)+536) (BuildId: d0f40e4862987997ffa9c0a264e61174) +23908 23908 F DEBUG : #12 pc 004e7239 /apex/com.android.art/lib/libart.so (MterpInvokeStaticRange+372) (BuildId: d0f40e4862987997ffa9c0a264e61174) +23908 23908 F DEBUG : #13 pc 000ce894 /apex/com.android.art/lib/libart.so (mterp_op_invoke_static_range+20) (BuildId: d0f40e4862987997ffa9c0a264e61174) +23908 23908 F DEBUG : #14 pc 003bd9d4 /system/framework/framework.jar +23908 23908 F DEBUG : #15 pc 0023182b /apex/com.android.art/lib/libart.so (art::interpreter::Execute(art::Thread*, art::CodeItemDataAccessor const&, art::ShadowFrame&, art::JValue, bool, bool) (.llvm.10727712076471079728)+254) (BuildId: 
d0f40e4862987997ffa9c0a264e61174) +23908 23908 F DEBUG : #16 pc 00238109 /apex/com.android.art/lib/libart.so (art::interpreter::ArtInterpreterToInterpreterBridge(art::Thread*, art::CodeItemDataAccessor const&, art::ShadowFrame*, art::JValue*)+144) (BuildId: d0f40e4862987997ffa9c0a264e61174) +23908 23908 F DEBUG : #17 pc 00239581 /apex/com.android.art/lib/libart.so (bool art::interpreter::DoCall<true, false>(art::ArtMethod*, art::Thread*, art::ShadowFrame&, art::Instruction const*, unsigned short, art::JValue*)+536) (BuildId: d0f40e4862987997ffa9c0a264e61174) +23908 23908 F DEBUG : #18 pc 004e7239 /apex/com.android.art/lib/libart.so (MterpInvokeStaticRange+372) (BuildId: d0f40e4862987997ffa9c0a264e61174) +23908 23908 F DEBUG : #19 pc 000ce894 /apex/com.android.art/lib/libart.so (mterp_op_invoke_static_range+20) (BuildId: d0f40e4862987997ffa9c0a264e61174) +23908 23908 F DEBUG : #20 pc 003bc286 /system/framework/framework.jar +23908 23908 F DEBUG : #21 pc 0023182b /apex/com.android.art/lib/libart.so (art::interpreter::Execute(art::Thread*, art::CodeItemDataAccessor const&, art::ShadowFrame&, art::JValue, bool, bool) (.llvm.10727712076471079728)+254) (BuildId: d0f40e4862987997ffa9c0a264e61174) +23908 23908 F DEBUG : #22 pc 00238109 /apex/com.android.art/lib/libart.so (art::interpreter::ArtInterpreterToInterpreterBridge(art::Thread*, art::CodeItemDataAccessor const&, art::ShadowFrame*, art::JValue*)+144) (BuildId: d0f40e4862987997ffa9c0a264e61174) +23908 23908 F DEBUG : #23 pc 002388c7 /apex/com.android.art/lib/libart.so (bool art::interpreter::DoCall<false, false>(art::ArtMethod*, art::Thread*, art::ShadowFrame&, art::Instruction const*, unsigned short, art::JValue*)+722) (BuildId: d0f40e4862987997ffa9c0a264e61174) +23908 23908 F DEBUG : #24 pc 004e44db /apex/com.android.art/lib/libart.so (MterpInvokeStatic+482) (BuildId: d0f40e4862987997ffa9c0a264e61174) +23908 23908 F DEBUG : #25 pc 000ce594 /apex/com.android.art/lib/libart.so (mterp_op_invoke_static+20) (BuildId: 
d0f40e4862987997ffa9c0a264e61174) +23908 23908 F DEBUG : #26 pc 003b1c7c /system/framework/framework.jar +23908 23908 F DEBUG : #27 pc 0023182b /apex/com.android.art/lib/libart.so (art::interpreter::Execute(art::Thread*, art::CodeItemDataAccessor const&, art::ShadowFrame&, art::JValue, bool, bool) (.llvm.10727712076471079728)+254) (BuildId: d0f40e4862987997ffa9c0a264e61174) +23908 23908 F DEBUG : #28 pc 0023803d /apex/com.android.art/lib/libart.so (art::interpreter::EnterInterpreterFromEntryPoint(art::Thread*, art::CodeItemDataAccessor const&, art::ShadowFrame*)+120) (BuildId: d0f40e4862987997ffa9c0a264e61174) +23908 23908 F DEBUG : #29 pc 004d321b /apex/com.android.art/lib/libart.so (artQuickToInterpreterBridge+686) (BuildId: d0f40e4862987997ffa9c0a264e61174) +23908 23908 F DEBUG : #30 pc 000d8561 /apex/com.android.art/lib/libart.so (art_quick_to_interpreter_bridge+32) (BuildId: d0f40e4862987997ffa9c0a264e61174) +23908 23908 F DEBUG : #31 pc 0042dbaf /system/framework/arm/boot-framework.oat (android.graphics.ColorSpace$Rgb.isSrgb+446) (BuildId: 7ce3c24f3f20164927036fc8f58e1baa2a8f4020) +23908 23908 F DEBUG : #32 pc 0042cddf /system/framework/arm/boot-framework.oat (android.graphics.ColorSpace$Rgb.<init>+822) (BuildId: 7ce3c24f3f20164927036fc8f58e1baa2a8f4020) +23908 23908 F DEBUG : #33 pc 000d39d5 /apex/com.android.art/lib/libart.so (art_quick_invoke_stub_internal+68) (BuildId: d0f40e4862987997ffa9c0a264e61174) +23908 23908 F DEBUG : #34 pc 004f0627 /apex/com.android.art/lib/libart.so (art_quick_invoke_stub+282) (BuildId: d0f40e4862987997ffa9c0a264e61174) +23908 23908 F DEBUG : #35 pc 0012ca81 /apex/com.android.art/lib/libart.so (art::ArtMethod::Invoke(art::Thread*, unsigned int*, unsigned int, art::JValue*, char const*)+148) (BuildId: d0f40e4862987997ffa9c0a264e61174) +23908 23908 F DEBUG : #36 pc 00240bbf /apex/com.android.art/lib/libart.so (art::interpreter::ArtInterpreterToCompiledCodeBridge(art::Thread*, art::ArtMethod*, art::ShadowFrame*, unsigned short, 
art::JValue*)+254) (BuildId: d0f40e4862987997ffa9c0a264e61174) +23908 23908 F DEBUG : #37 pc 00239597 /apex/com.android.art/lib/libart.so (bool art::interpreter::DoCall<true, false>(art::ArtMethod*, art::Thread*, art::ShadowFrame&, art::Instruction const*, unsigned short, art::JValue*)+558) (BuildId: d0f40e4862987997ffa9c0a264e61174) +23908 23908 F DEBUG : #38 pc 004e6b7d /apex/com.android.art/lib/libart.so (MterpInvokeDirectRange+392) (BuildId: d0f40e4862987997ffa9c0a264e61174) +23908 23908 F DEBUG : #39 pc 000ce814 /apex/com.android.art/lib/libart.so (mterp_op_invoke_direct_range+20) (BuildId: d0f40e4862987997ffa9c0a264e61174) +23908 23908 F DEBUG : #40 pc 003bce74 /system/framework/framework.jar +23908 23908 F DEBUG : #41 pc 004e6cdd /apex/com.android.art/lib/libart.so (MterpInvokeDirectRange+744) (BuildId: d0f40e4862987997ffa9c0a264e61174) +23908 23908 F DEBUG : #42 pc 000ce814 /apex/com.android.art/lib/libart.so (mterp_op_invoke_direct_range+20) (BuildId: d0f40e4862987997ffa9c0a264e61174) +23908 23908 F DEBUG : #43 pc 003bce8c /system/framework/framework.jar +23908 23908 F DEBUG : #44 pc 004e6cdd /apex/com.android.art/lib/libart.so (MterpInvokeDirectRange+744) (BuildId: d0f40e4862987997ffa9c0a264e61174) +23908 23908 F DEBUG : #45 pc 000ce814 /apex/com.android.art/lib/libart.so (mterp_op_invoke_direct_range+20) (BuildId: d0f40e4862987997ffa9c0a264e61174) +23908 23908 F DEBUG : #46 pc 003be6b6 /system/framework/framework.jar +23908 23908 F DEBUG : #47 pc 0023182b /apex/com.android.art/lib/libart.so (art::interpreter::Execute(art::Thread*, art::CodeItemDataAccessor const&, art::ShadowFrame&, art::JValue, bool, bool) (.llvm.10727712076471079728)+254) (BuildId: d0f40e4862987997ffa9c0a264e61174) +23908 23908 F DEBUG : #48 pc 0023803d /apex/com.android.art/lib/libart.so (art::interpreter::EnterInterpreterFromEntryPoint(art::Thread*, art::CodeItemDataAccessor const&, art::ShadowFrame*)+120) (BuildId: d0f40e4862987997ffa9c0a264e61174) +23908 23908 F DEBUG : #49 pc 
004d321b /apex/com.android.art/lib/libart.so (artQuickToInterpreterBridge+686) (BuildId: d0f40e4862987997ffa9c0a264e61174) +23908 23908 F DEBUG : #50 pc 000d8561 /apex/com.android.art/lib/libart.so (art_quick_to_interpreter_bridge+32) (BuildId: d0f40e4862987997ffa9c0a264e61174) +23908 23908 F DEBUG : #51 pc 000d39d5 /apex/com.android.art/lib/libart.so (art_quick_invoke_stub_internal+68) (BuildId: d0f40e4862987997ffa9c0a264e61174) +23908 23908 F DEBUG : #52 pc 004f0759 /apex/com.android.art/lib/libart.so (art_quick_invoke_static_stub+276) (BuildId: d0f40e4862987997ffa9c0a264e61174) +23908 23908 F DEBUG : #53 pc 0012ca93 /apex/com.android.art/lib/libart.so (art::ArtMethod::Invoke(art::Thread*, unsigned int*, unsigned int, art::JValue*, char const*)+166) (BuildId: d0f40e4862987997ffa9c0a264e61174) +``` + + +Analyzing with `gdb` (by repeatedly calling `gdb -p "$(ps xua | grep zygote | grep -v grep | grep -v zygote64 | awk {'print $2'})"` until `gdb` attaches earlier to the current `zygote` process than the offending instruction is reached) reveals that the crash happens here: + +``` + 0x6fc373b0 <+944>: cmp r3, #223 @ 0xdf + 0x6fc373b2 <+946>: movs r6, r0 + 0x6fc373b4 <+948>: movs r0, r5 + 0x6fc373b6 <+950>: movs r0, r0 + 0x6fc373b8 <+952>: push {lr} + 0x6fc373ba <+954>: sub sp, #4 + 0x6fc373bc <+956>: vstr d0, [sp, #12] + 0x6fc373c0 <+960>: vstr d1, [sp, #20] + 0x6fc373c4 <+964>: mov r4, r0 + 0x6fc373c6 <+966>: ldr r2, [sp, #20] + 0x6fc373c8 <+968>: ldr r3, [sp, #24] + 0x6fc373ca <+970>: ldr r0, [sp, #12] + 0x6fc373cc <+972>: ldr r1, [sp, #16] + 0x6fc373ce <+974>: ldr.w r12, [r4, #20] + 0x6fc373d2 <+978>: blx r12 + 0x6fc373d4 <+980>: vmov d0, r0, r1 + 0x6fc373d8 <+984>: add sp, #4 +=> 0x6fc373da <+986>: ldmia.w sp!, {lr} + 0x6fc373de <+990>: bx lr +``` + +(note that the actual address changes for every instance of `zygote`, probably due to address-space layout randomization) + +The instruction at this location is 0xe8bd4000, as evidenced by: + +``` +(gdb) x/16hx 
0x6fc373da +0x6fc373da <oatexec+986>: 0xe8bd 0x4000 0x4770 0x2c0f 0x0006 0x0020 0x0000 0xb500 +0x6fc373ea <oatexec+1002>: 0xb081 0xed8d 0x0b03 0x4604 0x9803 0x9904 0xf8d4 0xc014 +``` + +The disassembly into `ldmia.w sp!, {lr}` is indeed correct. However, such an instruction [would be assembled](https://developer.arm.com/documentation/ddi0308/d/Thumb-Instructions/Alphabetical-list-of-Thumb-instructions/POP?lang=en) into `pop lr` and then into `ldr.w lr,[sp,#-4]`, which would be encoded differently. Hence, the assembly into this instruction was incorrect in the first place. + +It turns out that the assembly error is due to an error in the [`vixl` ARMv8 Runtime Code Generation Library](https://github.com/Linaro/vixl), which is also used by Android. This error [has been fixed by Feb 9, 2021](https://github.com/Linaro/vixl/commit/b0a2e281aebbf93e6ee521dcc40ba6dd2aa5124d). However, this fix has [not made it into Android 13](https://android.googlesource.com/platform/external/vixl/+log/02ab12aafeb5278d89184ae6a3ff3a7883b34c5e). Thus, at least Android 11, Android 12, Android 13 cannot run on current `qemu-system-aarch64`, while it should. + +Users of the Android emulator (also based on QEMU) do not seem to suffer from this bug because the Android QEMU [has bitrotted since the year 2018](https://android.googlesource.com/platform/external/qemu/+log/e7390f2265257d66093dfe858ce3a47b2e1de539/target/arm/translate.c) and hence has not seen any Arm emulation modernization in QEMU (e.g. the Tiny Code Generator) since, and only this modernization has exposed this bug in the first place. 
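As a cross-check of the encoding discussed in the Android report above, here is a minimal Python sketch that decodes the two halfwords `0xe8bd 0x4000` from the `gdb` dump as a Thumb-2 `LDM`/`LDMIA` (encoding T2). The function name `decode_thumb2_ldmia` and the returned dictionary are illustrative only (they do not come from QEMU or vixl), and the bit layout is my reading of the Arm ARM. It shows that the register list contains only `lr`. As far as I understand the Arm ARM, the T2 encoding with fewer than two registers in the list is UNPREDICTABLE, which would be consistent with the report's conclusion that the instruction should not have been emitted in this form.

```python
def decode_thumb2_ldmia(hw1: int, hw2: int) -> dict:
    """Decode a 32-bit Thumb-2 LDM/LDMIA (encoding T2) from its two halfwords.

    Assumed layout (from the Arm ARM, not from QEMU source):
      hw1 = 1110 1000 10 W 1 Rn
      hw2 = P M 0 register_list[12:0]
    """
    # Compare only the fixed bits of the first halfword (W and Rn are masked out).
    if (hw1 & 0xFFD0) != 0xE890:
        raise ValueError("not an LDM/LDMIA T2 encoding")

    names = [f"r{i}" for i in range(13)] + ["sp", "lr", "pc"]
    base = names[hw1 & 0xF]
    writeback = bool(hw1 & 0x20)

    # Bit i of the second halfword selects register i (bit 15 = pc, bit 14 = lr).
    registers = [names[i] for i in range(16) if hw2 & (1 << i)]

    return {
        "base": base,
        "writeback": writeback,
        "registers": registers,
        # Fewer than two registers is the case flagged as UNPREDICTABLE for T2.
        "single_register_list": len(registers) < 2,
    }


if __name__ == "__main__":
    # Halfwords taken from the `x/16hx 0x6fc373da` dump in the report.
    print(decode_thumb2_ldmia(0xE8BD, 0x4000))
```

Running it on the halfwords from the report gives base `sp` with writeback and the single register `lr`, matching the `ldmia.w sp!, {lr}` disassembly.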
diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/1812 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1812
new file mode 100644
index 000000000..b80e492cc
--- /dev/null
+++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1812
@@ -0,0 +1,25 @@
+older programs running under qemu-aarch64 segfault
+Description of problem:
+Numerous aarch64 programs segfault when run under qemu-aarch64.
+Steps to reproduce:
+1. Install an arm64 chroot (with working qemu-aarch64 binfmt_misc setup):
+```
+debootstrap --variant=minbase --arch=arm64 jessie /tmp/jessie-arm64/ http://archive.debian.org/debian
+or
+debootstrap --variant=minbase --arch=arm64 xenial /tmp/xenial-arm64/ http://ports.ubuntu.com/
+```
+2. build qemu-aarch64; cp qemu-aarch64 /tmp/jessie-arm64/
+3. chroot /tmp/jessie-arm64/
+4. ./qemu-aarch64 /bin/ls
+```
+qemu: uncaught target signal 11 (Segmentation fault) - core dumped
+Segmentation fault
+```
+Additional information:
+Old userspace (e.g. Debian jessie, Ubuntu xenial) does not work within qemu 8.1-rc2 aarch64 linux-user emulation; the breakage started with commit 59b6b42cd3446862567637f3a7ab31d69c9bef51. My guess is that old userspace isn't prepared for recent CPU features, but it still smells strange.
+
+Not all programs segfault: dash works, ls and bash do not.
+
+A chroot is easier in this case, since many old programs don't run inside a current environment (for example, they assert while reading locale-specific information). To run debootstrap and to enter the resulting chroot, a working qemu-aarch64 binfmt_misc setup is needed.
+
+Reverting the mentioned commit makes everything work again.
diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/1833 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1833 new file mode 100644 index 000000000..a74f7bdb9 --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1833 @@ -0,0 +1,84 @@ +ARM64 SME ST1Q incorrectly stores 9 bytes (rather than 16) per 128-bit element +Description of problem: +QEMU incorrectly stores 9 bytes instead of 16 per 128-bit element in the ST1Q SME instruction (https://developer.arm.com/documentation/ddi0602/2022-06/SME-Instructions/ST1Q--Contiguous-store-of-quadwords-from-128-bit-element-ZA-tile-slice-). It copies the first byte of the upper 64-bits, then lower the 64-bits. + +This seems to be a simple issue; I tracked it down to: +https://gitlab.com/qemu-project/qemu/-/blob/master/target/arm/tcg/sme_helper.c?ref_type=heads#L382 + +Updating that `+ 1` to a `+ 8` fixes the problem. +Steps to reproduce: +```c +#include <stdio.h> +#include <stdint.h> +#include <string.h> + +void st1q_sme_copy_test(uint8_t* src, uint8_t* dest) { + asm volatile( + "smstart sm\n" + "smstart za\n" + "ptrue p0.b\n" + "mov x12, xzr\n" + "ld1q {za0h.q[w12, 0]}, p0/z, %0\n" + "st1q {za0h.q[w12, 0]}, p0, %1\n" + "smstop za\n" + "smstop sm\n" : : "m"(*src), "m"(*dest) : "w12", "p0"); +} + +void print_first_128(uint8_t* data) { + putchar('['); + for (int i = 0; i < 16; i++) { + printf("%02d", data[i]); + if (i != 15) + printf(", "); + } + printf("]\n"); +} + +int main() { + _Alignas(16) uint8_t dest[512] = { }; + _Alignas(16) uint8_t src[512] = { }; + for (int i = 0; i < sizeof(src); i++) + src[i] = i; + puts("Before"); + printf(" src: "); + print_first_128(src); + printf("dest: "); + print_first_128(dest); + st1q_sme_copy_test(src, dest); + puts("\nAfter "); + printf(" src: "); + print_first_128(src); + printf("dest: "); + print_first_128(dest); +} +``` + +Compile with (requires at least clang ~14, tested with clang 16):<br/> +`clang ./qemu_repro.c -march=armv9-a+sme+sve -o 
./qemu_repro` + +Run with:<br/> +`qemu-aarch64 -cpu max,sme=on ./qemu_repro` + +It's expected just to copy from `src` to `dest` and output: +``` +Before + src: [00, 01, 02, 03, 04, 05, 06, 07, 08, 09, 10, 11, 12, 13, 14, 15] +dest: [00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00] + +After + src: [00, 01, 02, 03, 04, 05, 06, 07, 08, 09, 10, 11, 12, 13, 14, 15] +dest: [00, 01, 02, 03, 04, 05, 06, 07, 08, 09, 10, 11, 12, 13, 14, 15] +``` + +But currently outputs: +``` +Before + src: [00, 01, 02, 03, 04, 05, 06, 07, 08, 09, 10, 11, 12, 13, 14, 15] +dest: [00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00] + +After + src: [00, 01, 02, 03, 04, 05, 06, 07, 08, 09, 10, 11, 12, 13, 14, 15] +dest: [00, 08, 09, 10, 11, 12, 13, 14, 15, 00, 00, 00, 00, 00, 00, 00] +``` +Additional information: +N/A diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/1953 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1953 new file mode 100644 index 000000000..d03aeccf4 --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1953 @@ -0,0 +1,146 @@ +Segmentation fault when compiling elixir app on qemu aarch64 on x86_64 host +Description of problem: +When I try to install an elixir escript using + +``` +mix escript.install github upmaru/pakman --force +``` + +I run into a segfault with the following output + +``` + + +Build and Deploy +failed Oct 22, 2023 in 1m 27s +2s +2s +22s +56s +remote: Compressing objects: 86% (144/167) +remote: Compressing objects: 87% (146/167) +remote: Compressing objects: 88% (147/167) +remote: Compressing objects: 89% (149/167) +remote: Compressing objects: 90% (151/167) +remote: Compressing objects: 91% (152/167) +remote: Compressing objects: 92% (154/167) +remote: Compressing objects: 93% (156/167) +remote: Compressing objects: 94% (157/167) +remote: Compressing objects: 95% (159/167) +remote: Compressing objects: 96% (161/167) +remote: Compressing objects: 97% (162/167) +remote: Compressing objects: 
98% (164/167) +remote: Compressing objects: 99% (166/167) +remote: Compressing objects: 100% (167/167) +remote: Compressing objects: 100% (167/167), done. +remote: Total 2568 (delta 86), reused 188 (delta 58), pack-reused 2341 +origin/HEAD set to develop +Resolving Hex dependencies... +Resolution completed in 0.872s +New: + castore 1.0.4 + finch 0.16.0 + hpax 0.1.2 + jason 1.4.1 + mime 2.0.5 + mint 1.5.1 + nimble_options 1.0.2 + nimble_pool 1.0.0 + slugger 0.3.0 + telemetry 1.2.1 + tesla 1.7.0 + yamerl 0.10.0 + yaml_elixir 2.8.0 +* Getting tesla (Hex package) +* Getting jason (Hex package) +* Getting yaml_elixir (Hex package) +* Getting slugger (Hex package) +* Getting finch (Hex package) +* Getting mint (Hex package) +* Getting castore (Hex package) +* Getting hpax (Hex package) +* Getting mime (Hex package) +* Getting nimble_options (Hex package) +* Getting nimble_pool (Hex package) +* Getting telemetry (Hex package) +* Getting yamerl (Hex package) +Resolving Hex dependencies... +Resolution completed in 0.413s +Unchanged: + castore 1.0.4 + finch 0.16.0 + hpax 0.1.2 + jason 1.4.1 + mime 2.0.5 + mint 1.5.1 + nimble_options 1.0.2 + nimble_pool 1.0.0 + slugger 0.3.0 + telemetry 1.2.1 + tesla 1.7.0 + yamerl 0.10.0 + yaml_elixir 2.8.0 +All dependencies are up to date +==> mime +Compiling 1 file (.ex) +Generated mime app +==> nimble_options +Compiling 3 files (.ex) +qemu: uncaught target signal 11 (Segmentation fault) - core dumped +Segmentation fault (core dumped) +``` +Steps to reproduce: +1. Create a repo using the github action zacksiri/setup-alpine +2. Install elixir +3. run `mix escript.install github upmaru/pakman --force` +Additional information: +You can use the following github action config as an example / starting point. 
+ + +```yml +name: 'Deployment' + +on: + push: + branches: + - main + - master + - develop + +jobs: + build_and_deploy: + name: Build and Deploy + runs-on: ubuntu-latest + steps: + - name: 'Checkout' + uses: actions/checkout@v3 + with: + ref: ${{ github.event.workflow_run.head_branch }} + fetch-depth: 0 + + - name: 'Setup Alpine' + uses: zacksiri/setup-alpine@master + with: + branch: v3.18 + arch: aarch64 + qemu-repo: edge + packages: | + zip + tar + sudo + alpine-sdk + coreutils + cmake + elixir + + - name: 'Setup PAKman' + run: | + export MIX_ENV=prod + + mix local.rebar --force + mix local.hex --force + mix escript.install github upmaru/pakman --force + shell: alpine.sh {0} +``` + +I'm using alpine 3.18 which has otp25 with jit enabled so I suspect this is something to do with https://gitlab.com/qemu-project/qemu/-/issues/1034 diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/1970 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1970 new file mode 100644 index 000000000..208e209e2 --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/1970 @@ -0,0 +1 @@ +A64 LDRA decode scales the immediate by wrong amount diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/2005 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/2005 new file mode 100644 index 000000000..6a80c698b --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/2005 @@ -0,0 +1,29 @@ +qemu-system-aarch64: ../target/arm/helper.c:6757: sve_vqm1_for_el_sm: Assertion `sm' failed. +Description of problem: +Qemu crashes when sve is completely disabled for CPU model "max" (`-cpu max,sve=off`). Using any CPU model which does not include SVE, or using only e.g. 
SVE128 (`-cpu max,sve128=on`) works fine.\ +\ +`#0 0x00007f94b8291dec in __pthread_kill_implementation () at /lib64/libc.so.6 `\ +`#1 0x00007f94b823f0c6 in raise () at /lib64/libc.so.6 `\ +`#2 0x00007f94b82268d7 in abort () at /lib64/libc.so.6 `\ +`#3 0x00007f94b82267eb in _nl_load_domain.cold () at /lib64/libc.so.6 `\ +`#4 0x00007f94b8237016 in () at /lib64/libc.so.6 `\ +`#5 0x000055d6794aa698 in sve_vqm1_for_el_sm (env=env@entry=0x55d67c6ff9b0, el=el@entry=1, sm=false) at ../target/arm/helper.c:6757 `\ +`#6 0x000055d6794afc29 in sve_vqm1_for_el (el=1, env=0x55d67c6ff9b0) at ../target/arm/helper.c:6763 `\ +`#7 smcr_write (env=0x55d67c6ff9b0, ri=0x55d67c78f600, value=<optimized out>) at ../target/arm/helper.c:6887 `\ +`#8 0x00007f9469bad101 in code_gen_buffer () `\ +`#9 0x000055d67977dc19 in cpu_tb_exec (cpu=cpu@entry=0x55d67c6fd1f0, itb=<optimized out>, tb_exit=tb_exit@entry=0x7f94acdcc4c4) at ../accel/tcg/cpu-exec.c:457 `\ +`#10 0x000055d67977e59f in cpu_loop_exec_tb (tb_exit=0x7f94acdcc4c4, last_tb=<synthetic pointer>, pc=<optimized out>, tb=<optimized out>, cpu=<optimized out>) at ../accel/tcg/cpu-exec.c:919 `\ +`#11 cpu_exec_loop (cpu=cpu@entry=0x55d67c6fd1f0, sc=sc@entry=0x7f94acdcc570) at ../accel/tcg/cpu-exec.c:1040 `\ +`#12 0x000055d67977ee7d in cpu_exec_setjmp (cpu=0x55d67c6fd1f0, sc=0x7f94acdcc570) at ../accel/tcg/cpu-exec.c:1057 `\ +`#13 0x000055d679787c3d in cpu_exec (cpu=0x55d67c6fd1f0) at ../accel/tcg/cpu-exec.c:1083 `\ +`#14 0x000055d6797a1d52 in tcg_cpus_exec (cpu=0x55d67c6fd1f0) at ../accel/tcg/tcg-accel-ops.c:75 `\ +`#15 mttcg_cpu_thread_fn (arg=arg@entry=0x55d67c6fd1f0) at ../accel/tcg/tcg-accel-ops-mttcg.c:95 `\ +`#16 0x000055d679938698 in qemu_thread_start (args=0x55d67c7a1500) at ../util/qemu-thread-posix.c:541 `\ +`#17 0x00007f94b828ff44 in start_thread () at /lib64/libc.so.6 `\ +`#18 0x00007f94b8318314 in clone () at /lib64/``libc.so``.6`\ + \ +This happens when the system is booting, i.e. 
grub has just finished, loaded kernel and initrd, and the kernel has just began to run, i.e. early in the kernel startup. +Steps to reproduce: +1. +2. +3. diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/2083 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/2083 new file mode 100644 index 000000000..0e6cc43a2 --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/2083 @@ -0,0 +1,111 @@ +AArch64 SME SMOPA (4-way) outer product instruction gives incorrect result +Description of problem: +The SME SMOPA (4-way) instruction ([spec](https://developer.arm.com/documentation/ddi0602/2023-09/SME-Instructions/SMOPA--4-way---Signed-integer-sum-of-outer-products-and-accumulate-?lang=en)) is giving incorrect result. Example below for 8-bit variant, which is equivalent to following Python example (128-bit VL) to make it clearer: + +``` +import numpy as np +vl = 128 +esize = 32 +dim = vl // esize + +A = range(16) +B = range(16, 32) +C = np.zeros((4, 4,), dtype=np.int32) + +for row in range(dim): + for col in range(dim): + for k in range(4): + C[row, col] += A[4*row + k] * B[4*col + k] + +print(C) + +[[ 110 134 158 182] + [ 390 478 566 654] + [ 670 822 974 1126] + [ 950 1166 1382 1598]] +``` + +main.c +``` +#include <stdio.h> +#include <stdint.h> + +void foo(int *dst); + +int main() { + int32_t dst[16]; + foo(dst); + + // This should print: + // >>> 110 134 158 182 + // >>> 390 478 566 654 + // >>> 670 822 974 1126 + // >>> 950 1166 1382 1598 + for (int i=0; i<4; ++i) { + printf(">>> "); + for (int j=0; j<4; ++j) { + printf("%d ", dst[i * 4 + j]); + } + printf("\n"); + } +} +``` + +foo.S + +``` +.global foo +foo: + stp x29, x30, [sp, -80]! 
+ mov x29, sp + stp d8, d9, [sp, 16] + stp d10, d11, [sp, 32] + stp d12, d13, [sp, 48] + stp d14, d15, [sp, 64] + + smstart + + ptrue p0.b + index z0.b, #0, #1 + mov z1.d, z0.d + add z1.b, z1.b, #16 + + zero {za} + smopa za0.s, p0/m, p0/m, z0.b, z1.b + + // Read the first 4x4 sub-matrix of elements from tile 0: + mov w12, #0 + mova z0.s, p0/m, za0h.s[w12, #0] + mova z1.s, p0/m, za0h.s[w12, #1] + mova z2.s, p0/m, za0h.s[w12, #2] + mova z3.s, p0/m, za0h.s[w12, #3] + + // And store them to the input pointer (dst in the C code): + st1w {z0.s}, p0, [x0] + add x0, x0, #16 + st1w {z1.s}, p0, [x0] + add x0, x0, #16 + st1w {z2.s}, p0, [x0] + add x0, x0, #16 + st1w {z3.s}, p0, [x0] + + smstop + + ldp d8, d9, [sp, 16] + ldp d10, d11, [sp, 32] + ldp d12, d13, [sp, 48] + ldp d14, d15, [sp, 64] + ldp x29, x30, [sp], 80 + ret +``` +Steps to reproduce: +``` +$ clang -target aarch64-linux-gnu -march=armv9-a+sme main.c foo.S +$ ~/qemu/build/qemu-aarch64 -cpu max,sme128=on a.out +>>> 110 478 158 654 +>>> 0 0 0 0 +>>> 670 1166 974 1598 +>>> 0 0 0 0 +``` +Additional information: + diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/2089 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/2089 new file mode 100644 index 000000000..d06599a62 --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/2089 @@ -0,0 +1,27 @@ +aarch64: incorrect emulation of sqshrn instruction +Description of problem: +`sqshrn` instruction test fails with qemu-aarch64, but passes on real aarch64 hardware. +Steps to reproduce: +1. Build [inline_asm_tests](https://cs.android.com/android/platform/superproject/main/+/main:frameworks/libs/binary_translation/tests/inline_asm_tests/) and run with qemu-aarch64 +2. 
Observe two failures + +``` +[ RUN ] Arm64InsnTest.SignedSaturatingShiftRightNarrowInt16x1 +frameworks/libs/binary_translation/tests/inline_asm_tests/main_arm64.cc:6697: Failure +Expected equality of these values: + res1 + Which is: 4294967188 + MakeUInt128(0x94U, 0U) + Which is: 148 +[ FAILED ] Arm64InsnTest.SignedSaturatingShiftRightNarrowInt16x1 (5 ms) +[ RUN ] Arm64InsnTest.SignedSaturatingRoundingShiftRightNarrowInt16x1 +frameworks/libs/binary_translation/tests/inline_asm_tests/main_arm64.cc:6793: Failure +Expected equality of these values: + res3 + Which is: 4294967168 + MakeUInt128(0x0000000000000080ULL, 0x0000000000000000ULL) + Which is: 128 +[ FAILED ] Arm64InsnTest.SignedSaturatingRoundingShiftRightNarrowInt16x1 (2 ms) +``` +Additional information: +[Direct link to SignedSaturatingShiftRightNarrowInt16x1 test source](https://cs.android.com/android/platform/superproject/main/+/main:frameworks/libs/binary_translation/tests/inline_asm_tests/main_arm64.cc;l=6692;drc=4ee2c3035fa5dc0b7a48b6c6dc498296be071861) diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/2098 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/2098 new file mode 100644 index 000000000..4c6ef6a36 --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/2098 @@ -0,0 +1 @@ +AArch32 Arm CPUs no longer support the 'vfp' property diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/2150 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/2150 new file mode 100644 index 000000000..0545ea316 --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/2150 @@ -0,0 +1,13 @@ +ERROR:tcg/optimize.c:580:do_constant_folding_2: code should not be reached +Description of problem: +After booting Windows 10 or 11 (ARM) QEMU suddenly quits with: + +ERROR:tcg/optimize.c:580:do_constant_folding_2: code should not be reached + +It seems like it is missing an OPCODE in that function? +Steps to reproduce: +1. Boot Windows +2. QEMU quits +3. 
+Additional information:
+
diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/2183 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/2183
new file mode 100644
index 000000000..937cfc0df
--- /dev/null
+++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/2183
@@ -0,0 +1,20 @@
+aarch64 emulation much slower since release 8.1.5 (issue also present on 8.2.1)
+Description of problem:
+Since QEMU 8.1.5 our aarch64-based emulation has become much slower. We use a Linux 5.4 kernel which we cross-compile with the ARM toolchain. Things that are noticeable:
+- Boot time got a lot longer
+- All memory accesses seem to take 3x longer (this can be verified by e.g. executing the script below; the address does not matter):
+```
+date
+for i in $(seq 0 1000); do
+    devmem 0x200000000 2>/dev/null
+done
+date
+```
+Steps to reproduce:
+Just boot an ARM-based kernel on the virt machine and execute the script above.
+Additional information:
+I've tried reproducing the issue on the master branch. There the issue is not present. It only seems to be present on releases 8.1.5 and 8.2.1.
+
+I've narrowed the problem down to the following commit on the 8.2 branch (@bonzini): ef74024b76bf285e247add8538c11cb3c7399a1a accel/tcg: Revert mapping of PCREL translation block to multiple virtual addresses.
+
+Let me know if any other information / tests are required.
diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/2224 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/2224 new file mode 100644 index 000000000..7e3771aa6 --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/2224 @@ -0,0 +1,205 @@ +OpenBSD 7.4+ does not boot on sbsa-ref with Neoverse-V1/N2 or max cpu core +Description of problem: +System boots and then hangs: + +``` +disks: sd0* +>> OpenBSD/arm64 BOOTAA64 1.18 +boot> +cannot open sd0a:/etc/random.seed: No such file or directory +booting sd0a:/bsd: 2861736+1091248+12711584+634544 [233295+91+666048+260913]=0x1 +3d5cf8 +FACP DBG2 MCFG SPCR IORT APIC SSDT PPTT GTDT BGRT +Copyright (c) 1982, 1986, 1989, 1991, 1993 + The Regents of the University of California. All rights reserved. +Copyright (c) 1995-2023 OpenBSD. All rights reserved. https://www.OpenBSD.org + +OpenBSD 7.4 (RAMDISK) #2131: Sun Oct 8 13:35:40 MDT 2023 + deraadt@arm64.openbsd.org:/usr/src/sys/arch/arm64/compile/RAMDISK +real mem = 1066156032 (1016MB) +avail mem = 996659200 (950MB) +random: boothowto does not indicate good seed +mainbus0 at root: ACPI +psci0 at mainbus0: PSCI 1.1, SMCCC 1.4 +efi0 at mainbus0: UEFI 2.7 +efi0: EFI Development Kit II / SbsaQemu rev 0x10000 +smbios0 at efi0: SMBIOS 3.4.0 +smbios0: vendor EFI Development Kit II / SbsaQemu version "1.0" date 03/13/2024 +smbios0: QEMU QEMU SBSA-REF Machine +cpu0 at mainbus0 mpidr 0: ARM Neoverse N2 r0p3 +cpu0: 0KB 64b/line 4-way L1 PIPT I-cache, 0KB 64b/line 4-way L1 D-cache +cpu0: 0KB 64b/line 8-way L2 cache +cpu0: RNDR,TLBIOS+IRANGE,TS+AXFLAG,FHM,DP,SM4,SM3,SHA3,RDM,Atomic,CRC32,SHA2+SHA512,SHA1,AES+PMULL,SPECRES,SB,FRINTTS,GPI,GPA,LRCPC+LDAPUR,FCMA,JSCVT,APA+PAC,DPB,ASID16,PAN+ATS1E1,LO,HPDS,VH,HAFDBS,CSV3,CSV2+SCXT,DIT,BT,SBSS+MSR +agintc0 at mainbus0 shift 4:3 nirq 288 nredist 4: "interrupt-controller" +agintcmsi0 at agintc0 +agtimer0 at mainbus0: 62500 kHz +acpi0 at mainbus0: ACPI 6.0 +acpi0: tables DSDT FACP DBG2 MCFG SPCR IORT APIC SSDT PPTT 
GTDT BGRT +acpimcfg0 at acpi0 +acpimcfg0: addr 0xf0000000, bus 0-255 +acpiiort0 at acpi0 +pluart0 at acpi0 COM0 addr 0x60000000/0x1000 irq 33 +pluart0: console +ahci0 at acpi0 AHC0 addr 0x60100000/0x10000 irq 42: AHCI 1.0 +ahci0: port 0: 1.5Gb/s +scsibus0 at ahci0: 32 targets +sd0 at scsibus0 targ 0 lun 0: <ATA, QEMU HARDDISK, 2.5+> t10.ATA_QEMU_HARDDISK_QM00001_ +sd0: 43MB, 512 bytes/sector, 88064 sectors, thin +xhci0 at acpi0 USB0 addr 0x60110000/0x10000 irq 43, xHCI 0.0 +usb0 at xhci0: USB revision 3.0 +uhub0 at usb0 configuration 1 interface 0 "Generic xHCI root hub" rev 3.00/1.00 addr 1 +acpipci0 at acpi0 PCI0 +pci0 at acpipci0 +0:1:0: rom address conflict 0xfffc0000/0x40000 +0:2:0: rom address conflict 0xffff8000/0x8000 +"Red Hat Host" rev 0x00 at pci0 dev 0 function 0 not configured +em0 at pci0 dev 1 function 0 "Intel 82574L" rev 0x00: msi, address 52:54:00:12:34:56 +"Bochs VGA" rev 0x02 at pci0 dev 2 function 0 not configured +"ACPI0007" at acpi0 not configured +"ACPI0007" at acpi0 not configured +"ACPI0007" at acpi0 not configured +"ACPI0007" at acpi0 not configured +simplefb0 at mainbus0: 1280x800, 32bpp +wsdisplay0 at simplefb0 mux 1 +wsdisplay0: screen 0 added (std, vt100 emulation) +``` + +If I use Neoverse-N1 (sbsa-ref default core type) then it boots into installer: + +``` +disks: sd0* +>> OpenBSD/arm64 BOOTAA64 1.18 +boot> +cannot open sd0a:/etc/random.seed: No such file or directory +booting sd0a:/bsd: 2861736+1091248+12711584+634544 [233295+91+666048+260913]=0x1 +3d5cf8 +FACP DBG2 MCFG SPCR IORT APIC SSDT PPTT GTDT BGRT +Copyright (c) 1982, 1986, 1989, 1991, 1993 + The Regents of the University of California. All rights reserved. +Copyright (c) 1995-2023 OpenBSD. All rights reserved. 
https://www.OpenBSD.org + +OpenBSD 7.4 (RAMDISK) #2131: Sun Oct 8 13:35:40 MDT 2023 + deraadt@arm64.openbsd.org:/usr/src/sys/arch/arm64/compile/RAMDISK +real mem = 1066156032 (1016MB) +avail mem = 996659200 (950MB) +random: boothowto does not indicate good seed +mainbus0 at root: ACPI +psci0 at mainbus0: PSCI 1.1, SMCCC 1.4 +efi0 at mainbus0: UEFI 2.7 +efi0: EFI Development Kit II / SbsaQemu rev 0x10000 +smbios0 at efi0: SMBIOS 3.4.0 +smbios0: vendor EFI Development Kit II / SbsaQemu version "1.0" date 03/13/2024 +smbios0: QEMU QEMU SBSA-REF Machine +cpu0 at mainbus0 mpidr 0: ARM Neoverse N1 r4p1 +cpu0: 64KB 64b/line 4-way L1 PIPT I-cache, 64KB 64b/line 4-way L1 D-cache +cpu0: 1024KB 64b/line 8-way L2 cache +cpu0: DP,RDM,Atomic,CRC32,SHA2,SHA1,AES+PMULL,LRCPC,DPB,ASID16,PAN+ATS1E1,LO,HPDS,VH,HAFDBS,CSV3,CSV2,SBSS+MSR +agintc0 at mainbus0 shift 4:3 nirq 288 nredist 4: "interrupt-controller" +agintcmsi0 at agintc0 +agtimer0 at mainbus0: 62500 kHz +acpi0 at mainbus0: ACPI 6.0 +acpi0: tables DSDT FACP DBG2 MCFG SPCR IORT APIC SSDT PPTT GTDT BGRT +acpimcfg0 at acpi0 +acpimcfg0: addr 0xf0000000, bus 0-255 +acpiiort0 at acpi0 +pluart0 at acpi0 COM0 addr 0x60000000/0x1000 irq 33 +pluart0: console +ahci0 at acpi0 AHC0 addr 0x60100000/0x10000 irq 42: AHCI 1.0 +ahci0: port 0: 1.5Gb/s +scsibus0 at ahci0: 32 targets +sd0 at scsibus0 targ 0 lun 0: <ATA, QEMU HARDDISK, 2.5+> t10.ATA_QEMU_HARDDISK_QM00001_ +sd0: 43MB, 512 bytes/sector, 88064 sectors, thin +xhci0 at acpi0 USB0 addr 0x60110000/0x10000 irq 43, xHCI 0.0 +usb0 at xhci0: USB revision 3.0 +uhub0 at usb0 configuration 1 interface 0 "Generic xHCI root hub" rev 3.00/1.00 addr 1 +acpipci0 at acpi0 PCI0 +pci0 at acpipci0 +0:1:0: rom address conflict 0xfffc0000/0x40000 +0:2:0: rom address conflict 0xffff8000/0x8000 +"Red Hat Host" rev 0x00 at pci0 dev 0 function 0 not configured +em0 at pci0 dev 1 function 0 "Intel 82574L" rev 0x00: msi, address 52:54:00:12:34:56 +"Bochs VGA" rev 0x02 at pci0 dev 2 function 0 not configured 
+"ACPI0007" at acpi0 not configured +"ACPI0007" at acpi0 not configured +"ACPI0007" at acpi0 not configured +"ACPI0007" at acpi0 not configured +simplefb0 at mainbus0: 1280x800, 32bpp +wsdisplay0 at simplefb0 mux 1 +wsdisplay0: screen 0 added (std, vt100 emulation) +softraid0 at root +scsibus1 at softraid0: 256 targets +root on rd0a swap on rd0b dump on rd0b +WARNING: CHECK AND RESET THE DATE! +erase ^?, werase ^W, kill ^U, intr ^C, status ^T + +Welcome to the OpenBSD/arm64 7.4 installation program. +(I)nstall, (U)pgrade, (A)utoinstall or (S)hell? +``` +Steps to reproduce: +1. download OpenBSD 7.4 image: https://cdn.openbsd.org/pub/OpenBSD/7.4/arm64/miniroot74.img +2. download sbsa-ref firmware files from https://artifacts.codelinaro.org/ui/native/linaro-419-sbsa-ref/20240313-116475/edk2/ and decompress them +3. start qemu-system-aarch64 as shown above (adapt paths if needed) +4. watch console serial output +Additional information: +I am going to discuss this on OpenBSD mailing list. Will point to this bug. + +OpenBSD 7.5-current snapshot works on Neoverse-N1 and fails on Neoverse-V1/N2/max: + +``` +disks: sd0* +>> OpenBSD/arm64 BOOTAA64 1.18 +boot> +cannot open sd0a:/etc/random.seed: No such file or directory +booting sd0a:/bsd: 3015576+1213504+12712936+634144 [269381+91+701664+287051]=0x1 +3edee0 +FACP DBG2 MCFG SPCR IORT APIC SSDT PPTT GTDT BGRT +Copyright (c) 1982, 1986, 1989, 1991, 1993 + The Regents of the University of California. All rights reserved. +Copyright (c) 1995-2024 OpenBSD. All rights reserved. 
https://www.OpenBSD.org + +OpenBSD 7.5 (RAMDISK) #121: Thu Mar 14 03:28:46 MDT 2024 + deraadt@arm64.openbsd.org:/usr/src/sys/arch/arm64/compile/RAMDISK +real mem = 1066147840 (1016MB) +avail mem = 992886784 (946MB) +random: boothowto does not indicate good seed +mainbus0 at root: ACPI +psci0 at mainbus0: PSCI 1.1, SMCCC 1.4 +efi0 at mainbus0: UEFI 2.7 +efi0: EFI Development Kit II / SbsaQemu rev 0x10000 +smbios0 at efi0: SMBIOS 3.4.0 +smbios0: vendor EFI Development Kit II / SbsaQemu version "1.0" date 03/13/2024 +smbios0: QEMU QEMU SBSA-REF Machine +cpu0 at mainbus0 mpidr 0: ARM Neoverse N2 r0p3 +cpu0: 0KB 64b/line 4-way L1 PIPT I-cache, 0KB 64b/line 4-way L1 D-cache +cpu0: 0KB 64b/line 8-way L2 cache +cpu0: RNDR,TLBIOS+IRANGE,TS+AXFLAG,FHM,DP,SM4,SM3,SHA3,RDM,Atomic,CRC32,SHA2+SHA512,SHA1,AES+PMULL,SPECRES,SB,FRINTTS,GPA,LRCPC+LDAPUR,FCMA,JSCVT,APA+PAC,DPB,ASID16,PAN+ATS1E1,LO,HPDS,VH,HAFDBS,CSV3,CSV2+SCXT,DIT,BT,SBSS+MSR,MTE +agintc0 at mainbus0 shift 4:3 nirq 288 nredist 4: "interrupt-controller" +agintcmsi0 at agintc0 +agtimer0 at mainbus0: 62500 kHz +acpi0 at mainbus0: ACPI 6.0 +acpi0: tables DSDT FACP DBG2 MCFG SPCR IORT APIC SSDT PPTT GTDT BGRT +acpimcfg0 at acpi0 +acpimcfg0: addr 0xf0000000, bus 0-255 +acpiiort0 at acpi0 +pluart0 at acpi0 COM0 addr 0x60000000/0x1000 irq 33 +pluart0: console +ahci0 at acpi0 AHC0 addr 0x60100000/0x10000 irq 42: AHCI 1.0 +ahci0: port 0: 1.5Gb/s +scsibus0 at ahci0: 32 targets +sd0 at scsibus0 targ 0 lun 0: <ATA, QEMU HARDDISK, 2.5+> t10.ATA_QEMU_HARDDISK_QM00001_ +sd0: 43MB, 512 bytes/sector, 88064 sectors, thin +xhci0 at acpi0 USB0 addr 0x60110000/0x10000 irq 43, xHCI 0.0 +usb0 at xhci0: USB revision 3.0 +uhub0 at usb0 configuration 1 interface 0 "Generic xHCI root hub" rev 3.00/1.00 addr 1 +acpipci0 at acpi0 PCI0 +pci0 at acpipci0 +0:1:0: rom address conflict 0xfffc0000/0x40000 +0:2:0: rom address conflict 0xffff8000/0x8000 +"Red Hat Host" rev 0x00 at pci0 dev 0 function 0 not configured +em0 at pci0 dev 1 function 0 "Intel 
82574L" rev 0x00: msi, address 52:54:00:12:34:56 +"Bochs VGA" rev 0x02 at pci0 dev 2 function 0 not configured +"ACPI0007" at acpi0 not configured +"ACPI0007" at acpi0 not configured +"ACPI0007" at acpi0 not configured +"ACPI0007" at acpi0 not configured +``` diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/2248 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/2248 new file mode 100644 index 000000000..368fe9374 --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/2248 @@ -0,0 +1,36 @@ +qemu-aarch64: wrong execution result when executing the code +Description of problem: +The following aarch64 code results in the wrong execution result `4611686018427387903`, which is `0x3fffffffffffffff`. (The correct result is `-1`) The bug seems to be introduced in between v8.1.5 and v8.2.1 since the results are correct in v8.1.5. + +```c +// foo.c +#include <stdio.h> +#include <stdint.h> + +int64_t callme(size_t _1, size_t _2, int64_t a, int64_t b, int64_t c); + +int main() { + int64_t ret = callme(0, 0, 0, 1, 2); + printf("%ld\n", ret); + return 0; +} +``` + +```s +// foo.S +.global callme +callme: + cmp x2, x3 + cset x12, lt + and w11, w12, #0xff + cmp w11, #0x0 + csetm x14, ne + lsr x13, x14, x4 + sxtb x0, w13 + ret +``` +Steps to reproduce: +1. Build the code with `aarch64-linux-gnu-gcc foo.c foo.S -o foo` (`aarch64-linux-gnu-gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0`) +2. Run the code with `qemu-aarch64 -L /usr/aarch64-linux-gnu -E LD_LIBRARY_PATH=/usr/aarch64-linux-gnu/lib foo` and see the result +Additional information: +- Original discussion is held in [this wasmtime issue](https://github.com/bytecodealliance/wasmtime/issues/8233). Thanks to Alex Crichton for clarifying this bug. 
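To make the expected result of the `callme` routine in the report above easier to verify by hand, here is a small Python model of the seven instructions. It is only a sketch: the helper names `to_signed` and `model_callme` are made up for illustration, and the instruction semantics are my reading of the A64 ISA (register-form `lsr` uses the shift amount modulo 64, `sxtb` sign-extends the low byte). The model returns both the intermediate `x13` and the final `x0`.

```python
MASK64 = (1 << 64) - 1


def to_signed(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of `value` as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def model_callme(a: int, b: int, c: int) -> dict:
    """Model of foo.S; a, b, c correspond to x2, x3, x4 (x0 and x1 are unused)."""
    x12 = 1 if to_signed(a, 64) < to_signed(b, 64) else 0  # cmp x2, x3 ; cset x12, lt
    w11 = x12 & 0xFF                                       # and w11, w12, #0xff
    x14 = MASK64 if w11 != 0 else 0                        # cmp w11, #0x0 ; csetm x14, ne
    x13 = x14 >> (c & 63)                                  # lsr x13, x14, x4
    x0 = to_signed(x13, 8)                                 # sxtb x0, w13
    return {"x13": x13, "x0": x0}


if __name__ == "__main__":
    # Same arguments as the C caller: callme(0, 0, 0, 1, 2)
    result = model_callme(0, 1, 2)
    print(hex(result["x13"]), result["x0"])
```

With the arguments from `foo.c`, the model yields `x0 = -1`, which agrees with the result the report calls correct. The intermediate `x13` is `0x3fffffffffffffff`, the same value as the wrong output, which hints (without proving it) that the final sign extension is what gets lost.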
diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/2250 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/2250 new file mode 100644 index 000000000..9e9a9051a --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/2250 @@ -0,0 +1,44 @@ +FEAT_RME: NS EL1/0 Address Translation from EL3 fails +Description of problem: +I'm playing around with the QEMU RME Stack (TF-A, TF-RMM, Linux/KVM) for a research project. +For this I want to access some virtual normal world memory address from within EL3. +To translate the address to the physical address I use the `AT` instructions (e.g., `ats1e2r`). +If the NW memory is initially mapped in the GPT as `GPT_GPI_ANY`, this works fine. If the NW memory is mapped as `GPT_GPI_NS`, the address translation fails with the error `0b100101`/GPT on PTW. +However, EL3/Root World should be able to access memory from all PAS, and therefore, if I understand the ARM documentation correctly, should also be able to execute a PTW for an address marked NS in the GPT. +Steps to reproduce: +1. Set up the GPT with some memory marked as `GPT_GPI_NS` +2. Forward some NW virtual address from the kernel to EL3 +3. Execute a PTW on this address via the `AT` instructions. +Additional information: +I also took a look into the QEMU source code and potentially found the issue. +When executing a PTW we execute `target/arm/ptw.c:granule_protection_check`. +The function extracts the target page's GPI (`ptw.c:440`): +```c + switch (gpi) { + case 0b0000: /* no access */ + break; + case 0b1111: /* all access */ + return true; + case 0b1000: + case 0b1001: + case 0b1010: + case 0b1011: + if (pspace == (gpi & 3)) { + return true; + } + break; + default: + goto fault_walk; /* reserved */ + } +``` +The if statement checks whether the current `pspace` (previously set to `ptw->in_space`) is the same security state as the one contained in the GPI. +If this is not the case, we generate a GPF. 
+However, I think the code misses the fact that EL3/Root world can access memory from each PAS, meaning that the if statement should be something like +```c +if (pspace == (gpi & 3) || (pspace == ARMSS_Root)) { + return true; +} +``` +Additionally, as both Secure and Realm World can also access Normal World memory, similar checks should also be added in such cases. + +I have a patch prepared for this. However, I first want to check whether I'm in line with the Arm ARM or whether I'm missing something and EL3 is indeed not supposed to execute PTWs for NS memory. diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/2326 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/2326 new file mode 100644 index 000000000..2e68b3be9 --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/2326 @@ -0,0 +1,24 @@ +qemu-system-arm regression with Qemu 9.0.0 +Description of problem: +Bootup of the userland crashes: +``` +[ 1.713693] Run /init as init process +[ 2.372470] Alignment trap: not handling instruction f8530b04 at [<0001225a>] +[ 2.391053] 8<--- cut here --- +[ 2.392942] Unhandled fault: alignment exception (0x001) at 0x00035335 +[ 2.397042] [00035335] *pgd=6066b831, *pte=6030734f, *ppte=6030783f +``` +Steps to reproduce: +wget https://debug.openadk.org/vexpress-v2p-ca9.dtb + +wget https://debug.openadk.org/qemu-arm-vexpress-a9-initramfspiggyback-kernel + +qemu-system-arm -M vexpress-a9 -nographic -cpu cortex-a9 -net user -net nic,model=lan9118 -dtb vexpress-v2p-ca9.dtb -kernel qemu-arm-vexpress-a9-initramfspiggyback-kernel -qmp tcp:127.0.0.1:4444,server,nowait -no-reboot +Additional information: +It works fine for the ARM instruction set, but not for Thumb2. 
+ +Git bisect showed the following commit as the problematic one:<br> +From 59754f85ed35cbd5f4bf2663ca2136c78d5b2413 Mon Sep 17 00:00:00 2001<br> +From: Richard Henderson <richard.henderson@linaro.org><br> +Date: Fri, 1 Mar 2024 10:41:09 -1000<br> +Subject: [PATCH] target/arm: Do memory type alignment check when translation disabled<br> diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/2372 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/2372 new file mode 100644 index 000000000..cbbaf878d --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/2372 @@ -0,0 +1,109 @@ +A bug in AArch64 UMOPA/UMOPS (4-way) instruction +Description of problem: +umopa computes the multiplication of two matrices in the source registers and accumulates the result to the destination register. A source register’s element size is 16 bits, while a destination register’s element size is 64 bits in the 4-way variant of this instruction. Before performing matrix multiplication, each element should be zero-extended to a 64-bit element. + +However, the current implementation of the helper function fails to convert the element type correctly. Below is the helper function implementation: +``` +// target/arm/tcg/sme_helper.c +#define DEF_IMOP_64(NAME, NTYPE, MTYPE) \ +static uint64_t NAME(uint64_t n, uint64_t m, uint64_t a, uint8_t p, bool neg) \ +{ \ + uint64_t sum = 0; \ + /* Apply P to N as a mask, making the inactive elements 0. */ \ + n &= expand_pred_h(p); \ + sum += (NTYPE)(n >> 0) * (MTYPE)(m >> 0); \ + sum += (NTYPE)(n >> 16) * (MTYPE)(m >> 16); \ + sum += (NTYPE)(n >> 32) * (MTYPE)(m >> 32); \ + sum += (NTYPE)(n >> 48) * (MTYPE)(m >> 48); \ + return neg ? a - sum : a + sum; \ +} + +DEF_IMOP_64(umopa_d, uint16_t, uint16_t) +``` +When the multiplication is performed, each `uint16_t` element, such as `(NTYPE)(n >> 0)`, is promoted to `int` (that is, `int32_t`) by the C integer promotion rules, so the product also has type `int32_t`. 
The result is then converted to `uint64_t`, and it is added to `sum`. It seems the elements should be casted to `uint64_t` **before** performing the multiplication. +Steps to reproduce: +1. Write `test.c`. +``` +#include <stdio.h> + +char i_P1[4] = { 0xff, 0xff, 0xff, 0xff }; +char i_P5[4] = { 0xff, 0xff, 0xff, 0xff }; +char i_Z0[32] = { // Set only the first element as non-zero + 0xff, 0xff, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, + 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, + 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, + 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, +}; +char i_Z20[32] = { // Set only the first element as non-zero + 0xff, 0xff, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, + 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, + 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, + 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, +}; +char i_ZA2H[128] = { 0x0, }; +char o_ZA2H[128]; + +void __attribute__ ((noinline)) show_state() { + for (int i = 0; i < 8; i++) { + for (int j = 0; j < 16; j++) { + printf("%02x ", o_ZA2H[16*i+j]); + } + printf("\n"); + } +} + +void __attribute__ ((noinline)) run() { + __asm__ ( + ".arch armv9.3-a+sme\n" + "smstart\n" + "adrp x29, i_P1\n" + "add x29, x29, :lo12:i_P1\n" + "ldr p1, [x29]\n" + "adrp x29, i_P5\n" + "add x29, x29, :lo12:i_P5\n" + "ldr p5, [x29]\n" + "adrp x29, i_Z0\n" + "add x29, x29, :lo12:i_Z0\n" + "ldr z0, [x29]\n" + "adrp x29, i_Z20\n" + "add x29, x29, :lo12:i_Z20\n" + "ldr z20, [x29]\n" + "adrp x29, i_ZA2H\n" + "add x29, x29, :lo12:i_ZA2H\n" + "mov x15, 0\n" + "ld1d {za2h.d[w15, 0]}, p1, [x29]\n" + "add x29, x29, 32\n" + "ld1d {za2h.d[w15, 1]}, p1, [x29]\n" + "add x29, x29, 32\n" + "mov x15, 2\n" + "ld1d {za2h.d[w15, 0]}, p1, [x29]\n" + "add x29, x29, 32\n" + "ld1d {za2h.d[w15, 1]}, p1, [x29]\n" + ".inst 0xa1f43402\n" // umopa za2.d, p5/m, p1/m, z0.h, z20.h + "adrp x29, o_ZA2H\n" + "add x29, x29, :lo12:o_ZA2H\n" + "mov x15, 0\n" + "st1d {za2h.d[w15, 0]}, p1, [x29]\n" + "add x29, x29, 32\n" + "st1d {za2h.d[w15, 1]}, p1, [x29]\n" + "add x29, x29, 32\n" + "mov x15, 2\n" + 
"st1d {za2h.d[w15, 0]}, p1, [x29]\n" + "add x29, x29, 32\n" + "st1d {za2h.d[w15, 1]}, p1, [x29]\n" + "smstop\n" + ".arch armv8-a\n" + ); +} + +int main(int argc, char **argv) { + run(); + show_state(); + return 0; +} +``` +2. Compile `test.bin` using this command: `aarch64-linux-gnu-gcc-12 -O2 -no-pie ./test.c -o ./test.bin`. +3. Run `QEMU` using this command: `qemu-aarch64 -L /usr/aarch64-linux-gnu/ -cpu max,sme256=on ./test.bin`. +4. The program, runs on top of the buggy QEMU, prints the first 8 bytes of `ZA2H` as `01 00 fe ff ff ff ff ff`. It should print `01 00 fe ff 00 00 00 00` after the bug is fixed. +Additional information: + diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/2373 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/2373 new file mode 100644 index 000000000..de6ede920 --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/2373 @@ -0,0 +1,95 @@ +A bug in AArch64 FMOPA/FMOPS (widening) instruction +Description of problem: +fmopa computes the multiplication of two matrices in the source registers and accumulates the result to the destination register. A source register’s element size is 16 bits, while a destination register’s element size is 64 bits in the case of widening variant of this instruction. Before the matrix multiplication is performed, each element should be converted to a 64-bit floating point. FPCR flags are considered when converting floating point values. Especially, when the FZ (or FZ16) flag is set, denormalized values are converted into zero. When the floating point size is 16 bits, FZ16 should be considered; otherwise, FZ flag should be used. + +However, the current implementation only considers FZ flag, not FZ16 flag, so it computes the wrong value. +Steps to reproduce: +1. Write `test.c`. 
+``` +#include <stdio.h> + +char i_P2[4] = { 0xff, 0xff, 0xff, 0xff }; +char i_P5[4] = { 0xff, 0xff, 0xff, 0xff }; +char i_Z0[32] = { // Set only the first element as non-zero + 0xff, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, + 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, + 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, + 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, +}; +char i_Z16[32] = { // Set only the first element as non-zero + 0xff, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, + 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, + 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, + 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, +}; +char i_ZA3H[128] = { 0x0, }; +uint64_t i_fpcr = 0x0001000000; // FZ = 1; +char o_ZA3H[128]; + +void __attribute__ ((noinline)) show_state() { + for (int i = 0; i < 8; i++) { + for (int j = 0; j < 16; j++) { + printf("%02x ", o_ZA3H[16*i+j]); + } + printf("\n"); + } +} + +void __attribute__ ((noinline)) run() { + __asm__ ( + ".arch armv9.3-a+sme\n" + "smstart\n" + "adrp x29, i_P2\n" + "add x29, x29, :lo12:i_P2\n" + "ldr p2, [x29]\n" + "adrp x29, i_P5\n" + "add x29, x29, :lo12:i_P5\n" + "ldr p5, [x29]\n" + "adrp x29, i_Z0\n" + "add x29, x29, :lo12:i_Z0\n" + "ldr z0, [x29]\n" + "adrp x29, i_Z16\n" + "add x29, x29, :lo12:i_Z16\n" + "ldr z16, [x29]\n" + "adrp x29, i_ZA3H\n" + "add x29, x29, :lo12:i_ZA3H\n" + "mov x15, 0\n" + "ld1w {za3h.s[w15, 0]}, p2, [x29]\n" + "add x29, x29, 32\n" + "ld1w {za3h.s[w15, 1]}, p2, [x29]\n" + "add x29, x29, 32\n" + "mov x15, 2\n" + "ld1w {za3h.s[w15, 0]}, p2, [x29]\n" + "add x29, x29, 32\n" + "ld1w {za3h.s[w15, 1]}, p2, [x29]\n" + "adrp x29, i_fpcr\n" + "add x29, x29, :lo12:i_fpcr\n" + "ldr x29, [x29]\n" + "msr fpcr, x29\n" + ".inst 0x81a0aa03\n" // fmopa za3.s, p2/m, p5/m, z16.h, z0.h + "adrp x29, o_ZA3H\n" + "add x29, x29, :lo12:o_ZA3H\n" + "mov x15, 0\n" + "st1w {za3h.s[w15, 0]}, p2, [x29]\n" + "add x29, x29, 32\n" + "st1w {za3h.s[w15, 1]}, p2, [x29]\n" + "add x29, x29, 32\n" + "mov x15, 2\n" + "st1w {za3h.s[w15, 0]}, p2, [x29]\n" + "add x29, x29, 32\n" + "st1w 
{za3h.s[w15, 1]}, p2, [x29]\n" + ".arch armv8-a\n" + ); +} + +int main(int argc, char **argv) { + run(); + show_state(); + return 0; +} +``` +2. Compile `test.bin` using this command: `aarch64-linux-gnu-gcc-12 -O2 -no-pie ./test.c -o ./test.bin`. +3. Run QEMU using this command: `qemu-aarch64 -L /usr/aarch64-linux-gnu/ -cpu max,sme256=on ./test.bin`. +4. The program, runs on top of the buggy QEMU, prints only zero bytes. It should print `00 01 7e 2f + 00 .. (rest of bytes) .. 00` after the bug is fixed. +Additional information: + diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/2374 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/2374 new file mode 100644 index 000000000..e96796272 --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/2374 @@ -0,0 +1,111 @@ +A bug in AArch64 FMOPA/FMOPS (non-widening) instruction +Description of problem: +fmopa computes the multiplication of two matrices in the source registers and accumulates the result to the destination register. Depending on the instruction encoding, the element size of operands is either 32 bits or 64 bits. When the computation produces a NaN as a result, the default NaN should be generated. + +However, the current implementation of 32-bit variant of this instruction does not generate default NaNs, because invalid float_status pointer is passed: +``` +// target/arm/tcg/sme_helper.c +void HELPER(sme_fmopa_s)(void *vza, void *vzn, void *vzm, void *vpn, + void *vpm, void *vst, uint32_t desc) +{ +... + float_status fpst; + + /* + * Make a copy of float_status because this operation does not + * update the cumulative fp exception status. It also produces + * default nans. + */ + fpst = *(float_status *)vst; + set_default_nan_mode(true, &fpst); + +... + *a = float32_muladd(n, *m, *a, 0, vst); // &fpst should be used +... +} +``` +Steps to reproduce: +1. Write `test.c`. 
+``` +#include <stdio.h> + +char i_P0[4] = { 0xff, 0xff, 0xff, 0xff }; +char i_P6[4] = { 0xff, 0xff, 0xff, 0xff }; +char i_Z9[32] = { // Set only the first element as NaN, but it is not default NaN. + 0xff, 0xff, 0xff, 0xff, 0x0, 0x0, 0x0, 0x0, + 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, + 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, + 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, +}; +char i_Z27[32] = { + 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, + 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, + 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, + 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, +}; +char i_ZA1H[128] = { 0x0, }; +char o_ZA1H[128]; + +void __attribute__ ((noinline)) show_state() { + for (int i = 0; i < 8; i++) { + for (int j = 0; j < 16; j++) { + printf("%02x ", o_ZA1H[16*i+j]); + } + printf("\n"); + } +} + +void __attribute__ ((noinline)) run() { + __asm__ ( + ".arch armv9.3-a+sme\n" + "smstart\n" + "adrp x29, i_P0\n" + "add x29, x29, :lo12:i_P0\n" + "ldr p0, [x29]\n" + "adrp x29, i_P6\n" + "add x29, x29, :lo12:i_P6\n" + "ldr p6, [x29]\n" + "adrp x29, i_Z9\n" + "add x29, x29, :lo12:i_Z9\n" + "ldr z9, [x29]\n" + "adrp x29, i_Z27\n" + "add x29, x29, :lo12:i_Z27\n" + "ldr z27, [x29]\n" + "adrp x29, i_ZA1H\n" + "add x29, x29, :lo12:i_ZA1H\n" + "mov x15, 0\n" + "ld1w {za1h.s[w15, 0]}, p0, [x29]\n" + "add x29, x29, 32\n" + "ld1w {za1h.s[w15, 1]}, p0, [x29]\n" + "add x29, x29, 32\n" + "mov x15, 2\n" + "ld1w {za1h.s[w15, 0]}, p0, [x29]\n" + "add x29, x29, 32\n" + "ld1w {za1h.s[w15, 1]}, p0, [x29]\n" + ".inst 0x809bc121\n" // fmopa za1.s, p0/m, p6/m, z9.s, z27.s + "adrp x29, o_ZA1H\n" + "add x29, x29, :lo12:o_ZA1H\n" + "mov x15, 0\n" + "st1w {za1h.s[w15, 0]}, p0, [x29]\n" + "add x29, x29, 32\n" + "st1w {za1h.s[w15, 1]}, p0, [x29]\n" + "add x29, x29, 32\n" + "mov x15, 2\n" + "st1w {za1h.s[w15, 0]}, p0, [x29]\n" + "add x29, x29, 32\n" + "st1w {za1h.s[w15, 1]}, p0, [x29]\n" + ".arch armv8-a\n" + ); +} + +int main(int argc, char **argv) { + run(); + show_state(); + return 0; +} +``` +2. 
Compile `test.bin` using this command: `aarch64-linux-gnu-gcc-12 -O2 -no-pie ./test.c -o ./test.bin`. +3. Run QEMU using this command: `qemu-aarch64 -L /usr/aarch64-linux-gnu/ -cpu max,sme256=on ./test.bin`. +4. The program, runs on top of the buggy QEMU, prints 8 non-default NaNs (ff ff ff ff). It should print 8 default NaNs (00 00 c0 7f) after the bug is fixed. +Additional information: + diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/2375 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/2375 new file mode 100644 index 000000000..07d51b341 --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/2375 @@ -0,0 +1,85 @@ +A bug in AArch64 FJCVTZS instruction +Description of problem: +fjcvtzs instruction converts a double-precision floating-point value in the source register into a 32-bit signed integer, and stores the result in the destination register. The contents of the FPCR register influence the exception result. Especially, when FPCR.FZ (Flushing denormalized numbers to Zero) is set and an input is a denormalized number, the PSTATE.Z flag should be cleared even if the conversion result is zero. + +However, because the helper function for this instruction does not properly check the denormalized case, the Z flag will have an incorrect value: +``` +// target/arm/vfp_helper.c +uint64_t HELPER(fjcvtzs)(float64 value, void *vstatus) +{ + float_status *status = vstatus; + uint32_t inexact, frac; + uint32_t e_old, e_new; + + e_old = get_float_exception_flags(status); + set_float_exception_flags(0, status); + frac = float64_to_int32_modulo(value, float_round_to_zero, status); + e_new = get_float_exception_flags(status); + set_float_exception_flags(e_old | e_new, status); + + if (value == float64_chs(float64_zero)) { + /* While not inexact for IEEE FP, -0.0 is inexact for JavaScript. 
*/ + inexact = 1; + } else { + /* Normal inexact or overflow or NaN */ + inexact = e_new & (float_flag_inexact | float_flag_invalid); // float_flag_input_denormal should also be checked. + } + + /* Pack the result and the env->ZF representation of Z together. */ + return deposit64(frac, 32, 32, inexact); +} +``` +Steps to reproduce: +1. Write `test.c`. +``` +#include <stdint.h> +#include <stdio.h> +#include <string.h> + +char i_D27[8] = { 0x0, 0xff, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0 }; +uint64_t i_fpcr = 0x01000000; // FZ = 1; +char o_X28[8]; +uint64_t o_nzcv; + +void __attribute__ ((noinline)) show_state() { + char Z = ((o_nzcv >> 30) & 1); + + printf("PSTATE.Z: %d\n", Z); + printf("X28: "); + for (int i = 0; i < 8; i++) { + printf("%02x ", o_X28[i]); + } + printf("\n"); +} + +void __attribute__ ((noinline)) run() { + __asm__ ( + "adrp x29, i_D27\n" + "add x29, x29, :lo12:i_D27\n" + "ldr d27, [x29]\n" + "adrp x29, i_fpcr\n" + "add x29, x29, :lo12:i_fpcr\n" + "ldr x29, [x29]\n" + "msr fpcr, x29\n" + ".inst 0x1e7e037c\n" // fjcvtzs w28, d27 + "mrs x26, nzcv\n" + "adrp x29, o_nzcv\n" + "add x29, x29, :lo12:o_nzcv\n" + "str x26, [x29]\n" + "adrp x29, o_X28\n" + "add x29, x29, :lo12:o_X28\n" + "str x28, [x29]\n" + ); +} + +int main(int argc, char **argv) { + run(); + show_state(); + return 0; +} +``` +2. Compile `test.bin` using this command: `aarch64-linux-gnu-gcc-12 -O2 -no-pie ./test.c -o ./test.bin`. +3. Run QEMU using this command: `qemu-aarch64 -L /usr/aarch64-linux-gnu/ ./test.bin`. +4. The program, runs on top of the buggy QEMU, prints the value of Z as `01`. It should print `00` after the bug is fixed. 
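The input used in step 1 can be checked independently of QEMU. The following is a minimal sketch, assuming the eight bytes of `i_D27` are loaded little-endian by `ldr d27` and interpreted as an IEEE 754 binary64 value; the helper names `u64_from_le_bytes` and `is_denormal_f64` are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* Assemble eight little-endian bytes into a 64-bit pattern. */
static uint64_t u64_from_le_bytes(const unsigned char b[8])
{
    uint64_t bits = 0;
    for (int i = 7; i >= 0; i--) {
        bits = (bits << 8) | b[i];
    }
    return bits;
}

/* A binary64 value is denormal when its 11-bit exponent field is zero
 * and its 52-bit fraction field is non-zero. */
static int is_denormal_f64(uint64_t bits)
{
    uint64_t exponent = (bits >> 52) & 0x7ff;
    uint64_t fraction = bits & ((1ULL << 52) - 1);
    return exponent == 0 && fraction != 0;
}
```

For the bytes `{ 0x0, 0xff, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0 }` the pattern is `0x0000000000fcff00`, which has a zero exponent and a non-zero fraction, so the test input is indeed a denormalized number.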
+Additional information: + diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/2376 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/2376 new file mode 100644 index 000000000..9428fc045 --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/2376 @@ -0,0 +1,114 @@ +A bug in ARM VCMLA.f16/VCMLA.f32 instructions +Description of problem: +The vcmla instruction performs complex-number operations on the vector registers. There is a bug in which this instruction modifies the contents of an irrelevant vector register. + +The reason is simple out-of-bound; the helper functions should correctly check the number of modified elements: +``` +// target/arm/tcg/vec_helper.c +void HELPER(gvec_fcmlah_idx)(void *vd, void *vn, void *vm, void *va, + void *vfpst, uint32_t desc) +{ + uintptr_t opr_sz = simd_oprsz(desc); + float16 *d = vd, *n = vn, *m = vm, *a = va; + float_status *fpst = vfpst; + intptr_t flip = extract32(desc, SIMD_DATA_SHIFT, 1); + uint32_t neg_imag = extract32(desc, SIMD_DATA_SHIFT + 1, 1); + intptr_t index = extract32(desc, SIMD_DATA_SHIFT + 2, 2); + uint32_t neg_real = flip ^ neg_imag; + intptr_t elements = opr_sz / sizeof(float16); + intptr_t eltspersegment = 16 / sizeof(float16); // This should be fixed; + intptr_t i, j; + + ... +} + +... + +void HELPER(gvec_fcmlas_idx)(void *vd, void *vn, void *vm, void *va, + void *vfpst, uint32_t desc) +{ + uintptr_t opr_sz = simd_oprsz(desc); + float32 *d = vd, *n = vn, *m = vm, *a = va; + float_status *fpst = vfpst; + intptr_t flip = extract32(desc, SIMD_DATA_SHIFT, 1); + uint32_t neg_imag = extract32(desc, SIMD_DATA_SHIFT + 1, 1); + intptr_t index = extract32(desc, SIMD_DATA_SHIFT + 2, 2); + uint32_t neg_real = flip ^ neg_imag; + intptr_t elements = opr_sz / sizeof(float32); + intptr_t eltspersegment = 16 / sizeof(float32); // This should be fixed; + intptr_t i, j; + + ... +} +``` +Steps to reproduce: +1. Write `test.c`. 
+``` +#include <stdint.h> +#include <stdio.h> +#include <string.h> + +// zero inputs should produce zero output +char i_D4[8] = { 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0 }; +char i_D8[8] = { 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0 }; +char i_D30[8] = { 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0 }; +char i_D31[8] = { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff }; // this should never be touched +char o_D30[8]; +char o_D31[8]; + +void __attribute__ ((noinline)) show_state() { + printf("D30: "); + for (int i = 0; i < 8; i++) { + printf("%02x ", o_D30[i]); + } + printf("\n"); + printf("D31: "); + for (int i = 0; i < 8; i++) { + printf("%02x ", o_D31[i]); + } + printf("\n"); +} + +void __attribute__ ((noinline)) run() { + __asm__ ( + "movw r7, #:lower16:i_D4\n" + "movt r7, #:upper16:i_D4\n" + "vldr d4, [r7]\n" + "movw r7, #:lower16:i_D8\n" + "movt r7, #:upper16:i_D8\n" + "vldr d8, [r7]\n" + "movw r7, #:lower16:i_D30\n" + "movt r7, #:upper16:i_D30\n" + "vldr d30, [r7]\n" + "movw r7, #:lower16:i_D31\n" + "movt r7, #:upper16:i_D31\n" + "vldr d31, [r7]\n" + "adr r7, Lbl_thumb + 1\n" + "bx r7\n" + ".thumb\n" + "Lbl_thumb:\n" + ".inst 0xfed8e804\n" // vcmla.f32 d30, d8, d4[0], #90 + "adr r7, Lbl_arm\n" + "bx r7\n" + ".arm\n" + "Lbl_arm:\n" + "movw r7, #:lower16:o_D30\n" + "movt r7, #:upper16:o_D30\n" + "vstr d30, [r7]\n" + "movw r7, #:lower16:o_D31\n" + "movt r7, #:upper16:o_D31\n" + "vstr d31, [r7]\n" + ); +} + +int main(int argc, char **argv) { + run(); + show_state(); + return 0; +} +``` +2. Compile `test.bin` using this command: `arm-linux-gnueabihf-gcc-12 -O2 -no-pie -marm -march=armv7-a+vfpv4 ./test.c -o ./test.bin`. +3. Run QEMU using this command: `qemu-arm -L /usr/arm-linux-gnueabihf/ ./test.bin`. +4. The program, runs on top of the buggy QEMU, prints the value of D31 as `00 00 c0 7f 00 00 c0 7f`. It should print `ff ff ff ff ff ff ff ff` after the bug is fixed. 
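Why a D-register operation can spill into the neighbouring register can be illustrated with a small model. This is a sketch under stated assumptions, not the QEMU code: it assumes the elided loop nest steps over the destination in segments of `eltspersegment` elements and writes two elements (real and imaginary) per inner iteration, and that a 64-bit D-register operation reaches the helper with `opr_sz == 8`. The function names are hypothetical.

```c
#include <stddef.h>

/* Simplified, hypothetical model of the helper's loop nest: returns the
 * number of destination elements that end up being written. */
static size_t elements_written(size_t elements, size_t eltspersegment)
{
    size_t highest = 0;
    for (size_t i = 0; i < elements; i += eltspersegment) {
        for (size_t j = i; j < i + eltspersegment; j += 2) {
            highest = j + 2;  /* elements j and j + 1 are written */
        }
    }
    return highest;
}

/* One possible correction: never let a segment exceed the operation size. */
static size_t clamped_segment(size_t elements, size_t eltspersegment)
{
    return eltspersegment < elements ? eltspersegment : elements;
}
```

Under these assumptions, a `float32` operation on a D register has `elements == 2` but `eltspersegment == 4`, so four elements (16 bytes) are written and the second half lands in the next register, which would be consistent with `D31` being overwritten in step 4. Clamping the segment length to `elements` limits the writes to two elements.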
+Additional information: + diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/2419 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/2419 new file mode 100644 index 000000000..56611baf7 --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/2419 @@ -0,0 +1,18 @@ +ldapr_stlr_i instructions doesn't consider signed offset +Description of problem: +The format ldapr_stlr_i models the load acquire / store release immediate instructions. \ +These instructions have a bug in the sign extension of the imm field. \ +imm should be defined as s9 instead of 9. + +@ldapr_stlr_i .. ...... .. . imm:9 .. rn:5 rt:5 &ldapr_stlr_i + +Should be changed to: + +@ldapr_stlr_i .. ...... .. . imm:s9 .. rn:5 rt:5 &ldapr_stlr_i +Steps to reproduce: +1. Run ARM target +2. Generate any ldapr_stlr_i instructions (for example: LDAPUR) +3. When the imm value is negative, the immediate is calculated incorrectly. If the calculation leads to an undefined location, QEMU will fail. +Additional information: +In trans_LDAPR_i (translate-a64.c), when the imm field encodes a negative offset -x, the value of a->imm will be 512-x instead of -x. \ +I already fixed the issue by adding the s9 to the imm field. 
This made a call to sextend32 for imm instead of extend32 in the generated file build/libqemu-aarch64-softmmu.fa.p/decode-a64.c.inc diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/2432 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/2432 new file mode 100644 index 000000000..2e74bc8e7 --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/2432 @@ -0,0 +1,70 @@ +Bug in bcm2835_thermal interface +Description of problem: +Stack traces, crash detail: +``` +#0 __pthread_kill_implementation (no_tid=0, signo=6, threadid=140737230841344) at ./nptl/pthread_kill.c:44 +#1 __pthread_kill_internal (signo=6, threadid=140737230841344) at ./nptl/pthread_kill.c:78 +#2 __GI___pthread_kill (threadid=140737230841344, signo=signo@entry=6) at ./nptl/pthread_kill.c:89 +#3 0x00007ffff5042476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26 +#4 0x00007ffff50287f3 in __GI_abort () at ./stdlib/abort.c:79 +#5 0x00007ffff6f0eb57 in () at /lib/x86_64-linux-gnu/libglib-2.0.so.0 +#6 0x00007ffff6f6870f in g_assertion_message_expr () at /lib/x86_64-linux-gnu/libglib-2.0.so.0 +#7 0x0000555555d642a6 in bcm2835_thermal_write (opaque=0x7ffff0a475b0, addr=2, value=0, size=2) + at ../hw/misc/bcm2835_thermal.c:76 +#8 0x0000555556c4a119 in memory_region_write_accessor + (mr=0x7ffff0a478e0, addr=2, value=0x7fffffffd250, size=2, shift=0, mask=65535, attrs=...) + at ../system/memory.c:497 +#9 0x0000555556c49da6 in access_with_adjusted_size + (addr=2, value=0x7fffffffd250, size=2, access_size_min=1, access_size_max=4, access_fn=0x555556c49ef0 <memory_region_write_accessor>, mr=0x7ffff0a478e0, attrs=...) at ../system/memory.c:573 +#10 0x0000555556c49395 in memory_region_dispatch_write (mr=0x7ffff0a478e0, addr=2, data=0, op=MO_16, attrs=...) 
+ at ../system/memory.c:1521 +#11 0x0000555556c84e88 in flatview_write_continue_step + (attrs=..., buf=0x7fffffffd470 "", len=512, mr_addr=2, l=0x7fffffffd360, mr=0x7ffff0a478e0) + at ../system/physmem.c:2757 +#12 0x0000555556c84c42 in flatview_write_continue + (fv=0x555559717490, addr=1059135490, attrs=..., ptr=0x7fffffffd470, len=512, mr_addr=2, l=2, mr=0x7ffff0a478e0) at ../system/physmem.c:2787 +#13 0x0000555556c73305 in flatview_write + (fv=0x555559717490, addr=1059135490, attrs=..., buf=0x7fffffffd470, len=512) at ../system/physmem.c:2818 +#14 0x0000555556c73179 in address_space_write +--Type <RET> for more, q to quit, c to continue without paging--c + (as=0x5555598056f0, addr=1059135490, attrs=..., buf=0x7fffffffd470, len=512) at ../system/physmem.c:2938 +#15 0x0000555556c735df in address_space_set (as=0x5555598056f0, addr=1059135490, c=0 '\000', len=2025625, attrs=...) at ../system/physmem.c:2965 +#16 0x0000555555a95b66 in rom_reset (unused=0x0) at ../hw/core/loader.c:1284 +#17 0x0000555555ab872d in legacy_reset_hold (obj=0x5555598069b0, type=RESET_TYPE_COLD) at ../hw/core/reset.c:76 +#18 0x0000555556d7dbf4 in resettable_phase_hold (obj=0x5555598069b0, opaque=0x0, type=RESET_TYPE_COLD) at ../hw/core/resettable.c:180 +#19 0x0000555556d7d19f in resettable_container_child_foreach (obj=0x5555595573d0, cb=0x555556d7d970 <resettable_phase_hold>, opaque=0x0, type=RESET_TYPE_COLD) at ../hw/core/resetcontainer.c:54 +#20 0x0000555556d7f4a4 in resettable_child_foreach (rc=0x555558b02f50, obj=0x5555595573d0, cb=0x555556d7d970 <resettable_phase_hold>, opaque=0x0, type=RESET_TYPE_COLD) at ../hw/core/resettable.c:92 +#21 0x0000555556d7da92 in resettable_phase_hold (obj=0x5555595573d0, opaque=0x0, type=RESET_TYPE_COLD) at ../hw/core/resettable.c:169 +#22 0x0000555556d7d47a in resettable_assert_reset (obj=0x5555595573d0, type=RESET_TYPE_COLD) at ../hw/core/resettable.c:58 +#23 0x0000555556d7d2f7 in resettable_reset (obj=0x5555595573d0, type=RESET_TYPE_COLD) at 
../hw/core/resettable.c:45 +#24 0x0000555555ab842e in qemu_devices_reset (reason=SHUTDOWN_CAUSE_NONE) at ../hw/core/reset.c:179 +#25 0x000055555633227d in qemu_system_reset (reason=SHUTDOWN_CAUSE_NONE) at ../system/runstate.c:493 +#26 0x0000555555aa6bd2 in qdev_machine_creation_done () at ../hw/core/machine.c:1643 +#27 0x000055555633679f in qemu_machine_creation_done (errp=0x555558587ee0 <error_fatal>) at ../system/vl.c:2685 +#28 0x0000555556335ffd in qmp_x_exit_preconfig (errp=0x555558587ee0 <error_fatal>) at ../system/vl.c:2715 +#29 0x000055555633bfe4 in qemu_init (argc=9, argv=0x7fffffffdc68) at ../system/vl.c:3759 +#30 0x0000555556d6eea2 in main (argc=9, argv=0x7fffffffdc68) at ../system/main.c:47 +``` +Description: +I encountered a part of the code during QEMU execution that shouldn't have been reached, which led to an error. + +Crash detail: +``` +ERROR:../hw/misc/bcm2835_thermal.c:76:bcm2835_thermal_write: code should not be reached +Bail out! ERROR:../hw/misc/bcm2835_thermal.c:76:bcm2835_thermal_write: code should not be reached +Aborted +``` + +Malicious inputs: +Malicious input is attached as tar.gz archive to this file, it contains file name id:000017,sig:06,src:000428,time:48261741,execs:1725363,op:havoc,rep:8 +[malicious_input.tar.gz](/uploads/fcf47faafb59308cfdb04b3e81e788f3/malicious_input.tar.gz) + +Affected code area/snippet: + +qemu/hw/misc/bcm2835_thermal.c:bcm2835_thermal_write + + +Acknowledge for reporting this issue: +Alisher Darmenov (darmenovalisher@gmail.com), +Mohamadreza Rostami (mohamadreza.rostami@trust.tu-darmstadt.de), +Ahmad-Reza Sadeghi (ahmad.sadeghi@trust.tu-darmstadt.de) diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/2542 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/2542 new file mode 100644 index 000000000..a8f8ec916 --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/2542 @@ -0,0 +1 @@ +qemu-system-arm failure with picolibc tests since 59754f85ed35cbd5f4bf2663ca2136c78d5b2413 
diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/2568 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/2568 new file mode 100644 index 000000000..df5f92fd7 --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/2568 @@ -0,0 +1 @@ +[AARCH64] HPFAR_EL2.NS not set for non secure read in S-EL1 diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/2585 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/2585 new file mode 100644 index 000000000..0b5a60265 --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/2585 @@ -0,0 +1,7 @@ +qemu-system-arm highmem support broken with TCG +Additional information: +I initially bisected this to commit 39a1fd25287f ("target/arm: Fix handling of LPAE block descriptors"), which introduced an identical bug by masking the wrong address bits due to a type mismatch, but this was in turn fixed by commit c2360eaa0262 ("target/arm: Fix qemu-system-arm handling of LPAE block descriptors for highmem"). The bug resurfaced between qemu-7.1.0 and qemu-7.2.0 after commit f3639a64f602 ("target/arm: Use softmmu tlbs for page table walking"), but may be caused by the preceding 4a35855682ce ("target/arm: Plumb debug into S1Translate") which fails to boot for an unrelated reason. + +I reproduced this on qemu-7.2 as shipped by Debian as well as on qemu-9.1 (built locally). + +Part of this problem appeared to be hidden by the 'highmem=on' argument not having the intended effect during parts of the bisection, which I worked around by overriding the 'pa_bits' variable in machvirt_init(). 
diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/2601 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/2601 new file mode 100644 index 000000000..07f88e196 --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/2601 @@ -0,0 +1,36 @@ +Executing LD1SB + MTE on Arm64 fails an assert +Description of problem: +I'm getting +``` +qemu-system-aarch64: ../tcg/tcg-op-gvec.c:91: simd_desc: Assertion `data == sextract32(data, 0, (32 - ((0 + 8) + 2)))' failed. +``` +This is caused by the upper bits of `data` being set to 1, which violates the condition. +Steps to reproduce: +1. build QEMU with assertions enabled (e.g., `configure --enable-debug-tcg`). +2. have a `LD1SB f{z25.d}, p3/z, [x14, x9]` (binary a5894dd9) instruction in the executed code. +3. enable mte +4. Let QEMU execute the ld1sb instruction. +Additional information: +{width=699 height=141} + +This issue happens because for ld1sb, nregs=0 in `sve.decode`: +``` +# SVE contiguous load (scalar plus scalar) +LD_zprr 1010010 .... ..... 010 ... ..... ..... @rprr_load_dt nreg=0 +``` +As a result, in do_mem_zpa is called with n_reg=0, which becomes mte_n inside do_mem_zpa. +Since mte_n==0, and mte_active, then +```c +desc = FIELD_DP32(desc, MTEDESC, SIZEM1, (mte_n << msz) - 1); +``` +sets (0) - 1 == -1 to the field, which also sets the upper bits of `desc`. 
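The arithmetic described so far, and the check quoted in the assertion message, can be sketched in isolation. This is a minimal model, not the QEMU code: `SIMD_DATA_BITS` is taken as 22 from the expression `32 - ((0 + 8) + 2)` in the assertion text, while the position of the size field (`HYP_SIZEM1_SHIFT`, `HYP_SIZEM1_LENGTH`, placed at the top of the 22 data bits) and all function names are assumptions made for illustration.

```c
#include <stdint.h>

/* 32 - ((0 + 8) + 2), taken from the assertion message. */
#define SIMD_DATA_BITS 22

/* Hypothetical placement of the size-minus-one field in the data bits. */
#define HYP_SIZEM1_SHIFT  17
#define HYP_SIZEM1_LENGTH 5

/* Model of a sign-extracting bitfield read. The right shift of a negative
 * value is arithmetic on all common compilers. */
static int32_t sextract32_model(uint32_t value, int start, int length)
{
    return (int32_t)(value << (32 - length - start)) >> (32 - length);
}

/* Model of a field deposit: the value is truncated to the field width. */
static uint32_t deposit_field(uint32_t desc, int shift, int length,
                              uint32_t value)
{
    uint32_t mask = ((1u << length) - 1) << shift;
    return (desc & ~mask) | ((value << shift) & mask);
}

/* The condition asserted on the data argument. */
static int data_fits(uint32_t data)
{
    return (int32_t)data == sextract32_model(data, 0, SIMD_DATA_BITS);
}

/* Build a descriptor whose size field holds (mte_n << msz) - 1. */
static uint32_t demo_desc(uint32_t mte_n, unsigned msz)
{
    uint32_t sizem1 = (mte_n << msz) - 1u;  /* wraps to all ones for 0 */
    return deposit_field(0, HYP_SIZEM1_SHIFT, HYP_SIZEM1_LENGTH, sizem1);
}
```

In this model, `demo_desc(0, 0)` fills the field with ones, which sets the top data bit, so `data_fits` returns 0; `demo_desc(1, 0)` stores 0 and passes the check.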
+The `desc` with upper bits set to 1 is used to call: +```c +desc = simd_desc(vsz, vsz, zt | desc); +``` +Inside `simd_desc`, the last parameter is named `data` and it fails the assertion: +```c +tcg_debug_assert(data == sextract32(data, 0, SIMD_DATA_BITS)) +``` + +# diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/271 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/271 new file mode 100644 index 000000000..ede2bc292 --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/271 @@ -0,0 +1 @@ +ARM cpu emulation regression on QEMU 4.2.0 diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/2823 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/2823 new file mode 100644 index 000000000..77541ec91 --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/2823 @@ -0,0 +1,43 @@ +func-aarch64-aarch64_rme_virt function test hangs especially when built with --enable-debug +Description of problem: + +Steps to reproduce: +1. Build with ../../configure --enable-debug +2. ./pyvenv/bin/meson test --setup thorough --suite func-thorough func-aarch64-aarch64_rme_virt +3. 
repeat until hang +Additional information: +Comparing a normal build to the hang: + +``` +2025-02-20 16:54:15,519: NOTICE: Booting Trusted Firmware | 2025-02-20 16:16:06,740: NOTICE: Booting Trusted Firmware +2025-02-20 16:54:15,519: NOTICE: BL1: v2.11.0(release):f2f94 | 2025-02-20 16:16:06,740: NOTICE: BL1: v2.11.0(release):f2f94 +2025-02-20 16:54:15,519: NOTICE: BL1: Built : 17:54:58, Dec | 2025-02-20 16:16:06,740: NOTICE: BL1: Built : 17:54:58, Dec +2025-02-20 16:54:15,520: NOTICE: BL1: Booting BL2 | 2025-02-20 16:16:06,741: NOTICE: BL1: Booting BL2 +2025-02-20 16:54:15,522: NOTICE: BL2: v2.11.0(release):f2f94 | 2025-02-20 16:16:06,743: NOTICE: BL2: v2.11.0(release):f2f94 +2025-02-20 16:54:15,522: NOTICE: BL2: Built : 17:55:12, Dec | 2025-02-20 16:16:06,743: NOTICE: BL2: Built : 17:55:12, Dec +2025-02-20 16:54:15,545: NOTICE: BL2: Booting BL31 | 2025-02-20 16:16:06,762: NOTICE: BL2: Booting BL31 +2025-02-20 16:54:15,550: NOTICE: BL31: v2.11.0(release):f2f9 | 2025-02-20 16:16:06,768: NOTICE: BL31: v2.11.0(release):f2f9 +2025-02-20 16:54:15,550: NOTICE: BL31: Built : 17:55:22, Dec | 2025-02-20 16:16:06,768: NOTICE: BL31: Built : 17:55:22, Dec +2025-02-20 16:54:15,555: Booting RMM v.0.5.0(release) 1b6bdf8 | 2025-02-20 16:16:06,774: Booting RMM v.0.5.0(release) 1b6bdf8 +2025-02-20 16:54:15,556: RMM-EL3 Interface v.0.4 | 2025-02-20 16:16:06,774: RMM-EL3 Interface v.0.4 +2025-02-20 16:54:15,556: Boot Manifest Interface v.0.3 | 2025-02-20 16:16:06,775: Boot Manifest Interface v.0.3 +2025-02-20 16:54:15,556: RMI/RSI ABI v.1.0/1.0 built: Dec 2 | 2025-02-20 16:16:06,775: RMI/RSI ABI v.1.0/1.0 built: Dec 2 +2025-02-20 16:54:15,587: UEFI firmware (version built at 17: | 2025-02-20 16:16:06,837: UEFI firmware (version built at 17: +2025-02-20 16:54:15,876: ESC[2JESC[01;01HESC[=3hESC[2JESC[01;01HESC[2JESC[01;01HESC[= | 2025-02-20 16:16:07,420: ESC[2JESC[01;01HESC[=3hESC[2JESC[01;01HESC[2JESC[01;01HESC[= +2025-02-20 16:54:15,898: EFI stub: Using DTB from configurati | 
2025-02-20 16:16:07,421: +2025-02-20 16:54:15,898: EFI stub: Exiting boot services... | 2025-02-20 16:16:07,421: +2025-02-20 16:54:16,170: [ 0.000000] Booting Linux on phys | 2025-02-20 16:16:07,421: Synchronous Exception at 0x00000000B +2025-02-20 16:54:16,171: [ 0.000000] Linux version 6.12.0- | 2025-02-20 16:16:07,421: +2025-02-20 16:54:16,171: [ 0.000000] KASLR enabled | 2025-02-20 16:16:07,421: +2025-02-20 16:54:16,171: [ 0.000000] random: crng init don | 2025-02-20 16:16:07,421: Synchronous Exception at 0x00000000B +2025-02-20 16:54:16,171: [ 0.000000] Machine model: linux, < +2025-02-20 16:54:16,171: [ 0.000000] efi: EFI v2.7 by EDK < +2025-02-20 16:54:16,171: [ 0.000000] efi: SMBIOS=0xbf3c000 < +2025-02-20 16:54:16,171: [ 0.000000] OF: reserved mem: 0x0 < +2025-02-20 16:54:16,171: [ 0.000000] NUMA: Faking a node a < +2025-02-20 16:54:16,171: [ 0.000000] NODE_DATA(0) allocate < +2025-02-20 16:54:16,171: [ 0.000000] Zone ranges: < +2025-02-20 16:54:16,171: [ 0.000000] DMA [mem 0x000 < +2025-02-20 16:54:16,171: [ 0.000000] DMA32 empty < +2025-02-20 16:54:16,171: [ 0.000000] Normal empty < +``` diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/2942 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/2942 new file mode 100644 index 000000000..536f0741b --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/2942 @@ -0,0 +1,65 @@ +arm: TCG debug assertion failure when handling an ISB or SB insn inside an IT block +Description of problem: +ARM thumb `IT` instruction triggers TCG debug asserts. + +``` +$ ./qemu-system-arm --version +QEMU emulator version 10.0.0 (v10.0.0) + +$ ./qemu-system-arm -M stm32vldiscovery -nographic -device loader,file=raw-it-bug.hex -d in_asm,exec +[...] 
+Trace 0: 0x72a584000800 [00800400/0000000000000164/00000110/ff020000] +---------------- +IN: +0x00000108: f000 f80a bl #0x120 + +Trace 0: 0x72a584000940 [00800400/0000000000000108/00000110/ff020000] +qemu-system-arm: ../tcg/tcg-op.c:3343: void tcg_gen_goto_tb(unsigned int): Assertion `(tcg_ctx->goto_tb_issue_mask & (1 << idx)) == 0' failed. +``` + +Expected behavior: +``` +$ qemu-system-arm -M stm32vldiscovery -device loader,file=raw-hardfault.hex -d in_asm,exec,int +[...] +Trace 0: 0x7df6dc000800 [00800400/0000000000000164/00000110/ff020000] +---------------- +IN: +0x00000108: f000 f80a bl #0x120 + +Trace 0: 0x7df6dc000940 [00800400/0000000000000108/00000110/ff020000] +---------------- +IN: +0x00000120: 2302 movs r3, #2 +0x00000122: bf00 nop +0x00000124: f04f 25e0 mov.w r5, #-0x1fff2000 +0x00000128: f8d5 1d10 ldr.w r1, [r5, #0xd10] +0x0000012c: f041 0014 orr r0, r1, #0x14 +0x00000130: f8c5 0d10 str.w r0, [r5, #0xd10] +0x00000134: f8d5 0200 ldr.w r0, [r5, #0x200] +0x00000138: f8d5 6100 ldr.w r6, [r5, #0x100] +0x0000013c: 4206 tst r6, r0 +0x0000013e: bf02 ittt eq +0x00000140: f3bf 8f4f dsbeq sy +0x00000144: bf20 wfeeq + +Linking TBs 0x7df6dc000940 index 0 -> 0x7df6dc000a80 +Trace 0: 0x7df6dc000a80 [00800400/0000000000000120/00000110/ff020000] +[...] +Trace 0: 0x7df6dc001fc0 [00800400/0000000000000170/00000110/ff020000] +Taking exception 3 [Prefetch Abort] on CPU 0 +...at fault address 0xdeadbeee +...with CFSR.IACCVIOL +...BusFault with BFSR.STKERR +...taking pending nonsecure exception 3 +...loading from element 3 of non-secure vector table at 0xc +...loaded new PC 0x111 +---------------- +IN: +0x00000110: e7fe b #0x110 +``` +Steps to reproduce: +1. Build QEMU with `CONFIG_DEBUG_TCG` enabled, e.g. with `./configure --enable-debug`. +1. Run Cortex-M firmware with `IT` instruction. 
(minimal example attached) +Additional information: +- Minimal Reproducer: [raw-it-bug.hex](/uploads/3ae30ab78f49bbc933e48c51f6bf2a2b/raw-it-bug.hex) +- Reproduced on `main`, `v10.0.0` and `v9.1.0`. diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/317 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/317 new file mode 100644 index 000000000..e19bf23f5 --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/317 @@ -0,0 +1 @@ +synchronous abort on accessing unused I/O ports on aarch64 diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/333 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/333 new file mode 100644 index 000000000..bf6093df3 --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/333 @@ -0,0 +1 @@ +random errors on aarch64 when executing __aarch64_cas8_acq_rel diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/364 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/364 new file mode 100644 index 000000000..814a33066 --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/364 @@ -0,0 +1 @@ +qemu-aarch64: incorrect signed comparison in ldsmax instructions diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/367 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/367 new file mode 100644 index 000000000..48db80475 --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/367 @@ -0,0 +1 @@ +qemu-system-aarch64 crash on qemu 6.0 - Windows 10 diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/381 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/381 new file mode 100644 index 000000000..b9ea3c9f2 --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/381 @@ -0,0 +1 @@ +ERROR:target/arm/translate-a64.c:13229:disas_simd_two_reg_misc_fp16: code should not be reached diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/385 
b/gitlab/issues_text/target_arm/host_missing/accel_TCG/385 new file mode 100644 index 000000000..870c77878 --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/385 @@ -0,0 +1 @@ +ARM user regression since 87b74e8b6edd287ea2160caa0ebea725fa8f1ca1 diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/403 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/403 new file mode 100644 index 000000000..15569dfb9 --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/403 @@ -0,0 +1 @@ +MTE false positives for unaligned accesses diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/503 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/503 new file mode 100644 index 000000000..0de93658c --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/503 @@ -0,0 +1 @@ +QEMU aarch64 Segmentation fault on Mac OSX, machine raspi3 diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/514 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/514 new file mode 100644 index 000000000..3c3fd8c4d --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/514 @@ -0,0 +1,25 @@ +MTE reports false positive for "str" instruction with the SP as the base register. +Description of problem: +When the PE executes an "sp"-based store instruction with an offset, I get a tag check fault exception. But according to the Arm spec, a load or store that uses the "sp" register should generate a Tag Unchecked access. +Steps to reproduce: +Clang version: clang version 12.0.1. +I compiled my code using "-target aarch64-linux -march=armv8+memtag -fsanitize=memtag" for Clang. Clang generates the following code: +``` +0000000000000c14 <test_func>: + c14: a9bc7bfd stp x29, x30, [sp, #-64]! + c18: f9000bf7 str x23, [sp, #16] + ... +``` +The whole stack was mapped in the translation tables as Tagged memory. The "SCTLR" register was configured to trigger a synchronous exception on tag mismatch. 
+When the CPU executes the first instruction "stp x29, x30, [sp, #-64]!" I get a tag check fault exception: "0b010001 When FEAT_MTE is implemented Synchronous Tag Check Fault": +ESR_EL1=0x96000051. + +According to the ARM specification, a load or store that uses the "sp" register should generate a Tag Unchecked access: +``` +A Tag Unchecked access will be generated for a load or store that uses either of the following: +• A base register only, with the SP as the base register. +• A base register plus immediate offset addressing form, with the SP as the base register. +``` +Looks like qemu erroneously generates tag mismatch exceptions for SP-based loads and stores with immediate offset. +Additional information: + diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/60 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/60 new file mode 100644 index 000000000..8a3371100 --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/60 @@ -0,0 +1 @@ +qemu-system-aarch64 (tcg): cval + voff overflow not handled, causes qemu to hang diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/734 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/734 new file mode 100644 index 000000000..6b891d938 --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/734 @@ -0,0 +1,28 @@ +aarch64 tlb range invalidate is not accurate +Description of problem: +In this (https://gitlab.com/qemu-project/qemu/-/commit/84940ed82552d3c7c7327c83076b02cee7978257) commit, tlb range invalidate support is added, and I think qemu's range calculation is wrong. + +In the `tlbi_aa64_range_get_length` function, `num`, `scale`, and `page_size_granule` are calculated as below. 
+ + +``` + num = extract64(value, 39, 4); + scale = extract64(value, 44, 2); + page_size_granule = extract64(value, 46, 2); + + page_shift = page_size_granule * 2 + 12; +``` + +According to the [Arm documentation](https://developer.arm.com/documentation/ddi0595/2021-06/AArch64-Instructions/TLBI-RVALE1--TLBI-RVALE1NXS--TLB-Range-Invalidate-by-VA--Last-level--EL1), the NUM field is 5 bits wide, but the code above extracts only 4 bits. + +And `page_shift` should also be calculated as `((page_size_granule - 1) << 1) + 12` rather than `page_size_granule * 2 + 12`. +Steps to reproduce: +1. +2. +3. +Additional information: +I found this issue while debugging a kernel panic that occurs randomly in my qemu fork. + +I'm pretty sure this is one of the causes, but even if I roughly correct it, my problem has not been solved. + +I think my problem is a TLB-invalidation-related issue, so if I find any more problems, I'll comment here. diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/735 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/735 new file mode 100644 index 000000000..ab7528401 --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/735 @@ -0,0 +1,26 @@ +softmmu 'at' not behaving +Description of problem: +This looks like a bug to me, please correct me if I'm wrong. The execution context is EL2 here and we run KVM VMs on top of the system emulation. Anyway, here we have stopped in EL2 and want to translate a virtual address '0' with 'at'. While the '0' itself is not mapped, something in the first gigabyte is, and the softmmu refuses to walk to it: + +0x0000000100004a3c <at_s12e1r+8>: 80 78 0c d5 at s12e1r, x0 +0x0000000100004a40 <at_s12e1r+12>: 01 74 38 d5 mrs x1, par_el1 + +(gdb) info registers x0 x1 +x0 0x0 0 +x1 0x809 2057 + +So that would be translation fault level 0, stage 1 if I'm not mistaken. 
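The reading of `par_el1 = 0x809` above can be reproduced with a small decoder. This is a sketch based on the PAR_EL1 layout for a failed translation (F in bit 0, FST in bits [6:1], PTW in bit 8, S in bit 9, bit 11 RES1); the struct and function names are hypothetical and exist only for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the PAR_EL1 fields that are valid when F == 1. */
struct par_fault {
    bool     failed;   /* F, bit 0: the address translation aborted */
    unsigned fst;      /* FST, bits [6:1]: fault status code */
    bool     s2_walk;  /* PTW, bit 8: stage 2 fault during a stage 1 table walk */
    bool     stage2;   /* S, bit 9: 0 = stage 1 fault, 1 = stage 2 fault */
};

static struct par_fault decode_par(uint64_t par)
{
    struct par_fault f;
    f.failed  = (par & 1) != 0;
    f.fst     = (unsigned)((par >> 1) & 0x3f);
    f.s2_walk = ((par >> 8) & 1) != 0;
    f.stage2  = ((par >> 9) & 1) != 0;
    return f;
}

/*
 * Translation faults for levels 0..3 are encoded as 0b0001LL.
 * Returns the level, or -1 if fst is not one of those codes.
 */
static int translation_fault_level(unsigned fst)
{
    return (fst & 0x3c) == 0x04 ? (int)(fst & 0x3) : -1;
}
```

For 0x809 this gives F = 1, FST = 0b000100, S = 0, which matches the "translation fault level 0, stage 1" interpretation; the remaining set bit is the RES1 bit 11.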
+ +(gdb) info all-registers TCR_EL1 VTCR_EL2 TTBR1_EL1 +TCR_EL1 0x400035b5503510 18014629184681232 +VTCR_EL2 0x623590 6436240 +TTBR1_EL1 0x304000041731001 217298683118686209 + +(gdb) p print_table(0x41731000) +000:0x000000ffff9803 256:0x000000fffff803 507:0x00000041fbc803 +508:0x000000ff9ef803 + +The first gigabyte is populated, yet the 'at' knows nothing about it. Did I miss something? This seems to be working fine on the hardware. +Steps to reproduce: +1. Stop in the EL2 while the linux is running (GDB) +2. Use something along the lines of this function to translate any kernel virtual address: https://github.com/jkrh/kvms/blob/4c26c786be9971613b3b7f56121c1a1aa3b9585a/core/helpers.h#L74 diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/788 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/788 new file mode 100644 index 000000000..1ce1ec885 --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/788 @@ -0,0 +1 @@ +FEAT_PAuth trapping behaviour incorrectly emulated on Secure-EL0/1 with Secure-EL2 disabled diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/790 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/790 new file mode 100644 index 000000000..24cddd94f --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/790 @@ -0,0 +1 @@ +Attribute bits in stage 1/stage 2 block descriptors are not fully masked during AArch64 page table walks diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/799 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/799 new file mode 100644 index 000000000..13c6c3083 --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/799 @@ -0,0 +1,47 @@ +TCG Optimizer crashes on AArch64 SVE2 instruction +Description of problem: +QEMU crashes due to an assertion in the TCG optimizer when optimizing an SVE2 instruction: +``` +Unrecognized operation 145 in do_constant_folding. +../tcg/optimize.c:458: tcg fatal error +``` +Steps to reproduce: +1. 
Compile the following minimized reproducer: (a pre-compiled image is provided for convenience - [reproducer.img](/uploads/0bddbfac55306a297fee59dd2f6923cf/reproducer.img)) +```asm +.org 0x0 +entry: + mrs x1, cptr_el3 + orr x9, x1, #0x100 + msr cptr_el3, x9 + + msr cptr_el2, xzr + + mov x1, #0x3 + mrs x9, cpacr_el1 + bfi x9, x1, #16, #2 + bfi x9, x1, #20, #2 + msr cpacr_el1, x9 + + mov x9, 512 + mov x0, x9 + asr x0, x0, 7 + sub x9, x0, #1 + msr zcr_el1, x9 + + mov x9, 512 + mov x0, x9 + asr x0, x0, 7 + sub x9, x0, #1 + msr zcr_el2, x9 + + mov x9, 512 + mov x0, x9 + asr x0, x0, 7 + sub x9, x0, #1 + msr zcr_el3, x9 + + uqxtnt z11.s, z22.d +``` +2. Execute it using the command line given above. +Additional information: +I tested latest master as well, and the problem persists. diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/826 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/826 new file mode 100644 index 000000000..b20309dc9 --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/826 @@ -0,0 +1,16 @@ +AArch64 SVE2 LDNT1SB (vector plus scalar) load address incorrectly calculated +Description of problem: +During execution of the following SVE2 instruction: +`ldnt1sb {z6.d}, p3/z, [z14.d, x9]` +with the following register state: +``` +(gdb) p $p3 +$1 = {0x7, 0x0, 0x74, 0x0, 0x43, 0x0, 0x29, 0x0, 0x47, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x47, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x66, 0xe4, 0x64, 0x0, 0x0, 0x0, 0x0, 0x0, 0x20, 0x11, 0x31, 0x1, 0x0, 0x0, 0x0, 0x0, 0x20, 0x11, 0x31, 0x1, 0x0, 0x0, 0x0, 0x0, 0xb0, 0x8b, 0x49, 0x34, 0xfc, 0x7f, 0x0, 0x0, 0xe0, 0x71, 0x30, 0x1, 0x0, 0x0, 0x0, 0x0} +(gdb) p $z14.d.u +$2 = {0x3bdeaa30, 0x3bdeaa33, 0x3bdeaa36, 0x3bdeaa39, 0x3bdeaa3c, 0x3bdeaa3f, 0x3bdeaa42, 0x3bdeaa45} +(gdb) p $x9 +$3 = 0x0 +``` +QEMU produces a data abort due to an address fault on address `0x5EE45E4E`, which it clearly should not have tried to load. 
+Additional information: +A quick look at the implementation of the LDNT1SB instruction in QEMU points to the following commit: https://gitlab.com/qemu-project/qemu/-/commit/cf327449816d5643106445420a0b06b0f38d4f01 which simply redirects to SVE's LD1SB handler. As these instructions use a new flavor of SVE scatter/gather loads (vector plus scalar) which SVE LD1SB does not support, I wonder if the LD1SB handler simply decodes it as the wrong instruction and treats it as a (scalar plus vector) instruction, which LD1SB does support, but whose address calculation is completely different. diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/840 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/840 new file mode 100644 index 000000000..bc9174c35 --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/840 @@ -0,0 +1,10 @@ +When O2 level is enabled raspi3b board crashes randomly when creating a buffer of a different size +Description of problem: +Sometimes, when running code that creates a framebuffer different from the default size (e.g. 1024x768), qemu hangs and crashes with a SIGV, leaving a weird screen that's painted with the original size and the background of the current window merged onto a large window. This happens when you resize a window without updating its contents, so qemu is crashing before the first frame after resizing the window. +Steps to reproduce: +1. Create a procedure similar to the one described below +2. Run qemu with O2 enabled (debugging disabled) +3. 
You may need to run it multiple times to see the bug (two or three times) +Additional information: +Here is the example procedure implemented in Rust; the mailbox interface has been tested, and I am confident the procedure is implemented correctly: +[code.rs](/uploads/a28fe33a856fb843d80ffeb078bc6729/code.rs) diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/876 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/876 new file mode 100644 index 000000000..8d819538e --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/876 @@ -0,0 +1,34 @@ +snek-arm fails on s390x with qemu >6.1 +Description of problem: +snek is a language inspired by python for embedded. The tests run snek code natively (in this case on s390x) as well as in python3 as well as emulated for arm. +The latter is what fails... + +The Ubuntu testing has spotted this in: + +- https://autopkgtest.ubuntu.com/results/autopkgtest-jammy/jammy/s390x/s/snek/20220211_065108_2144a@/log.gz +- https://autopkgtest.ubuntu.com/results/autopkgtest-jammy/jammy/s390x/s/snek/20220212_050524_3b7ee@/log.gz +- https://autopkgtest.ubuntu.com/results/autopkgtest-jammy/jammy/s390x/s/snek/20220214_080226_46968@/log.gz + +In there all tests work, but one fails reproducibly: `test/pass-slice.py` + +When eliminating the automation in makefiles and all that, it comes down to: +``` +$ qemu-system-arm -chardev stdio,mux=on,id=stdio0 -serial none -monitor none -semihosting-config enable=on,chardev=stdio0,arg='snek',arg=test/pass-slice.py -machine mps2-an385,accel=tcg -cpu cortex-m3 -kernel /usr/share/snek/snek-qemu-arm-1.7.elf -nographic -bios none +fail: [::-5] (model 'o' impl '') +``` + +To be clear: +- the test for python3 works on all platforms +- the test for snek-native works on all platforms +- the test for snek-arm works on all platforms except s390x +- with qemu 6.0 this worked, but the more recent qemu 6.2 makes it fail +- only some subtests of pass-slice.py fail (see below) + +I've gone into 
some details for the snek side of things in [the bug report there](https://github.com/keith-packard/snek/issues/58). +Steps to reproduce: +1. get an s390x system +2. get the snek elf file for arm +3. run qemu-system-arm as shown above + +P.S. I tried this on latest head (building qemu in an F35 container) and it fails there as well, hence I'm listing commit 2d88a3a595 as affected as well. +We know 6.0 was ok, so likely 6.0->6.1 brought the issue, I have not yet checked if a bisect is feasible for this. diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/890 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/890 new file mode 100644 index 000000000..cd278b408 --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/890 @@ -0,0 +1 @@ +Misinterpretation of arm neon invalid insn diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/910 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/910 new file mode 100644 index 000000000..0ca9796de --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/910 @@ -0,0 +1 @@ +Black screen in qemu 6.2 with wayland, weston, gtk, virgl, ivi shell, Aarch64 diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/925 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/925 new file mode 100644 index 000000000..cdb2e2b1e --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/925 @@ -0,0 +1,18 @@ +AArch64 SVE2 LD/ST instructions segfault on MMIO addresses +Description of problem: +During execution of the following SVE2 instruction: `ld1b {z9.s}, p2/z, [x17, z26.s, sxtw]` with the following register state: +``` +(gdb) p $x17 +$1 = 0xffffffe2 +(gdb) p $z26.s.u +$2 = {0x0 <repeats 16 times>} +(gdb) p $p2 +$3 = {0xc4, 0x0, 0x9d, 0x0, 0xe5, 0x0, 0x83, 0x0, 0x80, 0xce, 0x3f, 0x3, 0x0, 0x0, 0x0, 0x0, 0x46, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x56, 0x1a, 0x6e, 0x0, 0x0, 0x0, 0x0, 0x0, 0xf0, 0xd8, 0x96, 0xee, 0xfc, 0x7f, 0x0, 0x0, 0x50, 0xce, 0x94, 0x1, 
0x0, 0x0, 0x0, 0x0, 0xf0, 0xd8, 0x96, 0xee, 0xfc, 0x7f, 0x0, 0x0, 0x10, 0x38, 0x40, 0x3, 0x0, 0x0, 0x0, 0x0} +``` +QEMU segfaults due to a null pointer access. Note that after translation this address is an MMIO address that points to a UART device. +Additional information: +A quick look at the implementation of the SVE2 load/store host memory access functions I've noticed that the `TLB_MMIO` flag is ignored in `sve_probe_page`, which means that users use the (null) host address as if it was pointing to real memory. This function (or the ones above it) should (probably) throw the appropriate external data abort, otherwise this needs to be instrumented to support reading from MMIO mapped devices. + +<details><summary>Reproducer seed for my future self</summary> +S6008340160849309262|Q|cd4t|pq|w5|lK124 +</details> diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/953 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/953 new file mode 100644 index 000000000..afe3f8993 --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/953 @@ -0,0 +1 @@ +qemu-system-aarch64 asserts trying to execute STXP on hosts without HAVE_CMPXCHG128 diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/964 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/964 new file mode 100644 index 000000000..44a1767dd --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/964 @@ -0,0 +1,40 @@ +arm64 defconfig kernel (4.14.275) no longer boots after FEAT_LPA implementation in TCG +Description of problem: +I am not really sure if this is a bug or merely a scenario where this is not expected to work. After 7a928f43d8724bdf0777d7fc67a5ad973a0bf4bf, the attached `Image.gz` (`ARCH=arm64 defconfig`, based on the latest `linux-4.14.y`) just hangs with no output when using `-cpu max` (or `-cpu max,lpa2=off` due to 69b2265d5fe8e0f401d75e175e0a243a7d505e53). 
At 0af312b6edd231e1c8d0dec12494a80bc39ac761, `-cpu max` works just fine, as shown by the bisect log below. + +``` +$ git bisect log +# bad: [99eb313ddbbcf73c1adcdadceba1423b691c6d05] ui/cocoa: Use the standard about panel +# good: [44f28df24767cf9dca1ddc9b23157737c4cbb645] Update version for v6.2.0 release +git bisect start '99eb313ddbbcf73c1adcdadceba1423b691c6d05' 'v6.2.0' +# good: [2fc1b44dd0e7ea9ad5920352fd04179e4d6836d9] target/riscv: rvv-1.0: Allow Zve32f extension to be turned on +git bisect good 2fc1b44dd0e7ea9ad5920352fd04179e4d6836d9 +# good: [e64e27d5cb103b7764f1a05b6eda7e7fedd517c5] 9pfs: Fix segfault in do_readdir_many caused by struct dirent overread +git bisect good e64e27d5cb103b7764f1a05b6eda7e7fedd517c5 +# good: [747ffe28cad7129e1d326d943228fdcbe109530d] pnv/xive2: Add support XIVE2 P9-compat mode (or Gen1) +git bisect good 747ffe28cad7129e1d326d943228fdcbe109530d +# bad: [4377683df969e715e3cb2dbd258e44f9ff51f788] edid: Fix clock of Detailed Timing Descriptor +git bisect bad 4377683df969e715e3cb2dbd258e44f9ff51f788 +# good: [755e8d7cb6ce2ba62d282ffbb367de391fe0cc3d] migration: Move static var in ram_block_from_stream() into global +git bisect good 755e8d7cb6ce2ba62d282ffbb367de391fe0cc3d +# bad: [6629bf78aac7e53f83fd0bcbdbe322e2302dfd1f] Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20220302' into staging +git bisect bad 6629bf78aac7e53f83fd0bcbdbe322e2302dfd1f +# good: [0af312b6edd231e1c8d0dec12494a80bc39ac761] target/arm: Implement FEAT_LVA +git bisect good 0af312b6edd231e1c8d0dec12494a80bc39ac761 +# bad: [dc8bc9d6574aa563ed2fcc0ff495e77a2a2a8faa] target/arm: Report KVM's actual PSCI version to guest in dtb +git bisect bad dc8bc9d6574aa563ed2fcc0ff495e77a2a2a8faa +# bad: [d976de218c534735e307fc4a6c03e3ae764fd419] target/arm: Fix TLBIRange.base for 16k and 64k pages +git bisect bad d976de218c534735e307fc4a6c03e3ae764fd419 +# bad: [13e481c9335582fc7eed12e24e8d4d7068b24ff8] target/arm: Extend arm_fi_to_lfsc to level -1 +git 
bisect bad 13e481c9335582fc7eed12e24e8d4d7068b24ff8 +# bad: [7a928f43d8724bdf0777d7fc67a5ad973a0bf4bf] target/arm: Implement FEAT_LPA +git bisect bad 7a928f43d8724bdf0777d7fc67a5ad973a0bf4bf +# first bad commit: [7a928f43d8724bdf0777d7fc67a5ad973a0bf4bf] target/arm: Implement FEAT_LPA +``` + +A `4.19.237` kernel boots right up with `-cpu max`/`-cpu max,lpa2=off`. Is this expected behavior given the age of the kernel or is there something else going on here? If this is expected, should we be using something like `-cpu cortex-a72` for these older kernels? +Steps to reproduce: +Run the above command with the attached `Image.gz` and `rootfs.cpio`. +Additional information: +[Image.gz](/uploads/7b25b70f210354663b8e391290d3f39c/Image.gz) +[rootfs.cpio](/uploads/4793be1a500bdf615e212d3379c4c175/rootfs.cpio) diff --git a/gitlab/issues_text/target_arm/host_missing/accel_TCG/998 b/gitlab/issues_text/target_arm/host_missing/accel_TCG/998 new file mode 100644 index 000000000..be2ff866b --- /dev/null +++ b/gitlab/issues_text/target_arm/host_missing/accel_TCG/998 @@ -0,0 +1,60 @@ +AArch64: SCTLR_EL1.BT0 set incorrectly in user mode +Description of problem: +PACIASP normally acts as a BTI landing pad, but not in every situation. When SCTLR_EL1.BT is set, PACIASP checks that the indirect branch originates from X16 or X17 when the indirect branch is taken from a BTI guarded area. Linux sets this bit, ideally QEMU-user should too. This sample program should crash with a SIGILL if QEMU is working correctly, otherwise it will crash with a SIGSEGV. + + #include <stdint.h> + #include <stdlib.h> + #include <unistd.h> + #include <string.h> + #include <stdio.h> + #include <sys/mman.h> + + // PACIASP is a valid BTI landing pad, but there are some conditions + // under Linux which sets SCTLR_ELx.BT0 = 1. In this mode, a branch + // onto a PACIASP landing pad is only valid if it originates from + // x16 or x17 (i.e. br x17 is OK, br x3 is not). 
+ // More info on page D5-4851 of the Arm Architecture Reference Manual (ARM DDI 0487H.a). + + // Sample function which starts with a paciasp instruction + // (comes from -mbranch-protection=pac-ret+leaf) + void indirect_fn(int i) { + // paciasp instruction inserted here - should crash with SIGILL here if everything's operating OK. + i = i+1; + // Can't access this function from the copied location, so will segfault. + fprintf(stderr, "reachable\n"); + } + + int main(int argc, char **argv) { + // It's difficult to get a whole binary BTI compatible without the appropriate crtbegin etc + // so instead map a page and copy the sample function there. + void *e = mmap(0, getpagesize(), PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); + if (e == MAP_FAILED) { + return 1; + } + memcpy(e, (void*)indirect_fn, 64); + mprotect(e, getpagesize(), PROT_READ | PROT_EXEC | PROT_BTI); + + // paciasp is a valid landing pad if the branch comes from an unprotected area, + // so to ensure that we're protected - assemble an intermediate shim that's also PROT_BTI. + void *f = mmap(0, getpagesize(), PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); + if (f == MAP_FAILED) { + return 1; + } + uint32_t *x = (uint32_t*)f; + x[0] = 0xd503245fuL; // bti c + x[1] = 0xd61f0060uL; // br x3 - n.b. must be BR + mprotect(f, getpagesize(), PROT_READ | PROT_EXEC | PROT_BTI); + + // Jump to the shim + asm volatile ( + "mov x3, %0\n" + "mov x2, %1\n" + "blr x2\n" + : : "p"(e), "p"(f) : "x2", "x3"); + + // Execution should not reach here + return 1; + } +Steps to reproduce: +1. Compile with `clang-12 -g --sysroot=/work/home/fedora-rootfs/fedora_aarch64 -o sample --target=aarch64-linux-gnu -mbranch-protection=pac-ret+leaf -march=armv8-a -O1 -g sample.c` or similar. +2. Run with `../qemu/build/qemu-aarch64 --cpu max -L ~/fedora-rootfs/fedora_aarch64 sample` |
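The landing-pad rule that issue 998 relies on can be written down as a small predicate. This is a sketch of my reading of the architecture, not an excerpt from QEMU or the kernel; the enum and function names are hypothetical. `btype` is the PSTATE.BTYPE value left by the preceding indirect branch: 1 for BR via x16/x17 (or any BR from a non-guarded page), 2 for BLR, 3 for BR via any other register from a guarded page. `sctlr_bt` stands for the SCTLR_ELx.BT0/BT1 bit that applies to the current exception level.

```c
#include <stdbool.h>

/* Hypothetical names for the first instruction found at the branch target. */
enum landing_pad {
    PAD_PACIXSP,  /* PACIASP or PACIBSP */
    PAD_BTI,      /* BTI with no operand */
    PAD_BTI_C,
    PAD_BTI_J,
    PAD_BTI_JC
};

/* Returns true if the target instruction accepts a branch with the given BTYPE. */
static bool landing_pad_ok(enum landing_pad pad, int btype, bool sctlr_bt)
{
    if (btype == 0) {
        return true;              /* not reached by an indirect branch: no check */
    }
    switch (pad) {
    case PAD_PACIXSP:
        /* With the BT bit set, PACIxSP behaves like "bti c"; otherwise like "bti jc". */
        return !sctlr_bt || btype != 3;
    case PAD_BTI:
        return false;             /* compatible with no indirect branch */
    case PAD_BTI_C:
        return btype != 3;
    case PAD_BTI_J:
        return btype != 2;
    case PAD_BTI_JC:
        return true;
    }
    return false;
}
```

In the sample program the shim is entered with `blr x2` (BTYPE 2), which `bti c` accepts, and then branches with a BR through x3 from a guarded page (BTYPE 3). Under this rule that branch onto `paciasp` is accepted only while the BT bit is clear, which is why the expected outcome with the bit set is SIGILL rather than SIGSEGV.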