path: root/docs/sphinx/kerneldoc.py
Date | Commit message | Author | Files | Lines
2025-02-11 | target/arm: Handle FPCR.AH in FMLSL (by element and vector) | Richard Henderson | 1 | -25/+46
Handle FPCR.AH's requirement to not negate the sign of a NaN in FMLSL by element and vector, using the usual trick of negating by XOR when AH=0 and by muladd flags when AH=1.

Since we have the CPUARMState* in the helper anyway, we can look directly at env->vfp.fpcr and don't need to pass in the FPCR.AH value via the SIMD data word.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20250129013857.135256-31-richard.henderson@linaro.org
[PMM: commit message tweaked]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2025-02-11 | target/arm: Handle FPCR.AH in SVE FCMLA | Richard Henderson | 2 | -28/+43
The negation step in SVE FCMLA mustn't negate a NaN when FPCR.AH is set. Use the same approach as we did for A64 FCMLA of passing in FPCR.AH and using it to select whether to negate by XOR or by the muladd negate_product flag. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20250129013857.135256-28-richard.henderson@linaro.org Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2025-02-11 | target/arm: Handle FPCR.AH in FCMLA by index | Richard Henderson | 2 | -19/+27
The negation step in FCMLA by index mustn't negate a NaN when FPCR.AH is set. Use the same approach as vector FCMLA of passing in FPCR.AH and using it to select whether to negate by XOR or by the muladd negate_product flag. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20250129013857.135256-27-richard.henderson@linaro.org [PMM: Expanded commit message] Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2025-02-11 | target/arm: Handle FPCR.AH in vector FCMLA | Richard Henderson | 2 | -28/+40
The negation step in FCMLA mustn't negate a NaN when FPCR.AH is set. Handle this by passing FPCR.AH to the helper via the SIMD data field, and use this to select whether to do the negation via XOR or via the muladd negate_product flag. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20250129013857.135256-26-richard.henderson@linaro.org [PMM: Expanded commit message] Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2025-02-11 | target/arm: Handle FPCR.AH in SVE FTMAD | Peter Maydell | 2 | -10/+35
The negation step in the SVE FTMAD insn mustn't negate a NaN when FPCR.AH is set. Pass FPCR.AH to the helper via the SIMD data field, so we can select the correct behaviour. Because the operand is known to be negative, negating the operand is the same as taking the absolute value. Defer this to the muladd operation via flags, so that it happens after NaN detection, which is correct for FPCR.AH. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2025-02-11 | target/arm: Handle FPCR.AH in SVE FTSSEL | Peter Maydell | 2 | -5/+17
The negation step in the SVE FTSSEL insn mustn't negate a NaN when FPCR.AH is set. Pass FPCR.AH to the helper via the SIMD data field and use that to determine whether to do the negation. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2025-02-11 | target/arm: Handle FPCR.AH in negation step in SVE FMLS (vector) | Peter Maydell | 3 | -24/+114
Handle the FPCR.AH "don't negate the sign of a NaN" semantics for the SVE FMLS (vector) insns, by providing new helpers for the AH=1 case which end up passing fpcr_ah = true to the do_fmla_zpzzz_* functions that do the work.

The float*_muladd functions have a flags argument that can perform optional negation of various operands. We don't use that for "normal" arm fmla, because the muladd flags are not applied when an input is a NaN. But since FEAT_AFP does not negate NaNs, this behaviour is exactly what we need.

The non-AH helpers pass in a zero flags argument and control the negation via the neg1 and neg3 arguments; the AH helpers always pass in neg1 and neg3 as zero and control the negation via the flags argument. This allows us to avoid conditional branches within the inner loop.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
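The two negation strategies described above can be illustrated with a small sketch on float32 bit patterns. This is not QEMU code: `toy_muladd` is a hypothetical stand-in for the float*_muladd functions, and its NaN handling (return the first NaN operand unchanged) is a simplifying assumption that ignores the real NaN propagation rules.

```python
import struct

SIGN32 = 0x8000_0000


def f32(bits):
    """Reinterpret a 32-bit pattern as a float."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def bits32(value):
    """Reinterpret a float as its float32 bit pattern."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


def is_nan32(bits):
    return (bits & 0x7F80_0000) == 0x7F80_0000 and (bits & 0x007F_FFFF) != 0


def toy_muladd(a, b, c, negate_product=False):
    """Toy a*b + c; the negate flag is not applied when an input is a NaN."""
    for operand in (a, b, c):
        if is_nan32(operand):
            return operand  # assumption: first NaN wins, sign untouched
    product = f32(a) * f32(b)
    if negate_product:
        product = -product
    return bits32(product + f32(c))


def fmls(acc, n, m, ah):
    """acc - n*m, negating by XOR (AH=0) or by the muladd flag (AH=1)."""
    if ah:
        return toy_muladd(n, m, acc, negate_product=True)
    return toy_muladd(n ^ SIGN32, m, acc)
```

For ordinary numbers both paths agree; they differ only when `n` is a NaN, where the XOR path flips the NaN's sign bit before the multiply and the flag path leaves it alone.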
2025-02-11 | target/arm: Handle FPCR.AH in negation in FMLS (vector) | Peter Maydell | 3 | -1/+32
Handle the FPCR.AH "don't negate the sign of a NaN" semantics in FMLS (vector), by implementing a new set of helpers for the AH=1 case. The float_muladd_negate_product flag produces the same result as negating either of the multiplication operands, assuming neither of the operands are NaNs. But since FEAT_AFP does not negate NaNs, this behaviour is exactly what we need. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2025-02-11 | target/arm: Handle FPCR.AH in negation step in FMLS (indexed) | Peter Maydell | 4 | -29/+57
Handle the FPCR.AH "don't negate the sign of a NaN" semantics in FMLS (indexed). We do this by creating 6 new helpers, which allow us to do the negation either by XOR (for AH=0) or by muladd flags (for AH=1). Signed-off-by: Peter Maydell <peter.maydell@linaro.org> [PMM: Mostly from RTH's patch; error in index order into fns[][] fixed] Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2025-02-11 | target/arm: Handle FPCR.AH in FRECPS and FRSQRTS vector insns | Peter Maydell | 4 | -6/+44
Handle the FPCR.AH "don't negate the sign of a NaN" semantics in the vector versions of FRECPS and FRSQRTS, by implementing new vector wrappers that call the _ah_ scalar helpers. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2025-02-11 | target/arm: Handle FPCR.AH in FRECPS and FRSQRTS scalar insns | Peter Maydell | 4 | -82/+84
Handle the FPCR.AH semantics that we do not change the sign of an input NaN in the FRECPS and FRSQRTS scalar insns, by providing new helper functions that do the CHS part of the operation differently. Since the extra helper functions would be very repetitive if written out longhand, we condense them and the existing non-AH helpers into being emitted via macros. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2025-02-11 | target/arm: Handle FPCR.AH in negation steps in FCADD | Peter Maydell | 2 | -26/+38
The negation steps in FCADD must honour FPCR.AH's "don't change the sign of a NaN" semantics. Implement this by encoding FPCR.AH into the SIMD data field passed to the helper and using that to decide whether to negate the values. The construction of neg_imag and neg_real were done to make it easy to apply both in parallel with two simple logical operations. This changed with FPCR.AH, which is more complex than that. Switch to an approach closer to the pseudocode, where we extract the rot parameter from the SIMD data word and negate the appropriate input value. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
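A minimal sketch of the pseudocode-style approach, for one complex element pair: pick the input to negate from the rot parameter, and skip the negation for NaNs when AH is set. The function names are hypothetical, and Python floats stand in for the real element types.

```python
import math


def fpneg(x, ah):
    """Negate x; with ah set, a NaN is returned unchanged."""
    if ah and math.isnan(x):
        return x
    return -x


def fcadd_pair(a_re, a_im, b_re, b_im, rot, ah):
    """Complex add of a and b rotated by 90 (rot=0) or 270 (rot=1) degrees."""
    if rot == 0:
        b_im = fpneg(b_im, ah)
    else:
        b_re = fpneg(b_re, ah)
    # The real lane takes b's imaginary part, the imaginary lane b's real part.
    return a_re + b_im, a_im + b_re
```

With a = 1+2i and b = 3+4i, rot=0 gives (-3.0, 5.0), which is a + i*b, and rot=1 gives (5.0, -1.0), which is a - i*b.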
2025-02-11 | target/arm: Handle FPCR.AH in negation steps in SVE FCADD | Peter Maydell | 3 | -13/+48
The negation steps in FCADD must honour FPCR.AH's "don't change the sign of a NaN" semantics. Implement this in the same way we did for the base ASIMD FCADD, by encoding FPCR.AH into the SIMD data field passed to the helper and using that to decide whether to negate the values. The construction of neg_imag and neg_real were done to make it easy to apply both in parallel with two simple logical operations. This changed with FPCR.AH, which is more complex than that. Switch to an approach that follows the pseudocode more closely, by extracting the 'rot=1' parameter from the SIMD data field and changing the sign of the appropriate input value. Note that there was a naming issue with neg_imag and neg_real. They were named backward, with neg_imag being non-zero for rot=1, and vice versa. This was combined with reversed usage within the loop, so that the negation in the end turned out correct. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2025-02-11 | target/arm: Handle FPCR.AH in SVE FABD | Peter Maydell | 3 | -1/+30
Make the SVE FABD insn honour the FPCR.AH "don't negate the sign of a NaN" semantics. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2025-02-11 | target/arm: Handle FPCR.AH in SVE FABS | Peter Maydell | 3 | -1/+18
Make SVE FABS honour the FPCR.AH "don't negate the sign of a NaN" semantics. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2025-02-11 | target/arm: Handle FPCR.AH in SVE FNEG | Peter Maydell | 3 | -1/+18
Make SVE FNEG honour the FPCR.AH "don't negate the sign of a NaN" semantics. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2025-02-11 | target/arm: Handle FPCR.AH in vector FABD | Peter Maydell | 3 | -1/+33
Split the handling of vector FABD so that it calls a different set of helpers when FPCR.AH is 1, which implement the "no negation of the sign of a NaN" semantics. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2025-02-11 | target/arm: Implement FPCR.AH handling for scalar FABS and FABD | Peter Maydell | 1 | -2/+67
FPCR.AH == 1 mandates that taking the absolute value of a NaN should not change its sign bit. This means we can no longer use gen_vfp_abs*() everywhere but must instead generate slightly more complex code when FPCR.AH is set.

Implement these semantics for scalar FABS and FABD. This change also affects all other instructions whose pseudocode calls FPAbs(); we will extend the change to those instructions in following commits.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
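As an illustration of the semantics only (not of the TCG code that is actually generated), here is a sketch of absolute value on a float16 bit pattern; `fabs16` and `is_nan16` are hypothetical names.

```python
def is_nan16(bits):
    """True if the float16 pattern has an all-ones exponent and nonzero fraction."""
    return (bits & 0x7C00) == 0x7C00 and (bits & 0x03FF) != 0


def fabs16(bits, ah):
    """Clear the sign bit, except that AH=1 leaves a NaN untouched."""
    if ah and is_nan16(bits):
        return bits
    return bits & 0x7FFF
```

The AH=0 path is a single mask, which is why the AH=1 path needs "slightly more complex code": it has to test for a NaN first.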
2025-02-11 | target/arm: Implement FPCR.AH handling of negation of NaN | Peter Maydell | 1 | -11/+114
FPCR.AH == 1 mandates that negation of a NaN value should not flip its sign bit. This means we can no longer use gen_vfp_neg*() everywhere but must instead generate slightly more complex code when FPCR.AH is set. Make this change for the scalar FNEG and for those places in translate-a64.c which were previously directly calling gen_vfp_neg*(). This change in semantics also affects any other instruction whose pseudocode calls FPNeg(); in following commits we extend this change to the other affected instructions. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
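The required behaviour can be sketched on float32 bit patterns as follows. The helper names are hypothetical; the real change emits TCG ops rather than calling a function like this.

```python
SIGN32 = 0x8000_0000


def is_nan32(bits):
    """True if the float32 pattern has an all-ones exponent and nonzero fraction."""
    return (bits & 0x7F80_0000) == 0x7F80_0000 and (bits & 0x007F_FFFF) != 0


def fneg32(bits, ah):
    """Flip the sign bit, except that AH=1 leaves a NaN untouched."""
    if ah and is_nan32(bits):
        return bits
    return bits ^ SIGN32
```

Infinities are still negated under AH=1; only NaNs are exempt.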
2025-02-11 | target/arm: Implement FPCR.AH semantics for SVE FMIN/FMAX vector | Peter Maydell | 3 | -2/+37
Implement the FPCR.AH semantics for the SVE FMAX and FMIN operations that take two vector operands. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2025-02-11 | target/arm: Implement FPCR.AH semantics for SVE FMIN/FMAX immediate | Peter Maydell | 3 | -2/+45
Implement the FPCR.AH semantics for the SVE FMAX and FMIN operations that take an immediate as the second operand. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2025-02-11 | target/arm: Implement FPCR.AH semantics for SVE FMAXV and FMINV | Peter Maydell | 3 | -21/+58
Implement the FPCR.AH semantics for the SVE FMAXV and FMINV vector-reduction-to-scalar max/min operations. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2025-02-11 | target/arm: Implement FPCR.AH semantics for FMINP and FMAXP | Peter Maydell | 3 | -4/+45
Implement the FPCR.AH semantics for the pairwise floating point minimum/maximum insns FMINP and FMAXP. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2025-02-11 | target/arm: Implement FPCR.AH semantics for FMAXV and FMINV | Peter Maydell | 1 | -10/+18
Implement the FPCR.AH semantics for FMAXV and FMINV. These are the "recursively reduce all lanes of a vector to a scalar result" insns; we just need to use the _ah_ helper for the reduction step when FPCR.AH == 1. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
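A sketch of the "recursively reduce" shape, assuming the vector is split into halves at each step. The pairwise operation is passed in as a callable, so selecting the _ah_ variant is just a matter of choosing which callable to hand over; `reduce_lanes` and `reduce_fmaxv` are hypothetical names.

```python
def reduce_lanes(lanes, op):
    """Reduce each half of the vector recursively, then combine the two results."""
    if len(lanes) == 1:
        return lanes[0]
    half = len(lanes) // 2
    return op(reduce_lanes(lanes[:half], op), reduce_lanes(lanes[half:], op))


def reduce_fmaxv(lanes, ah, fmax, fmax_ah):
    """Use the AH-specific pairwise max for every reduction step when ah is set."""
    return reduce_lanes(lanes, fmax_ah if ah else fmax)
```

Because the AH=1 max is not symmetric in its arguments, the order in which lanes are paired matters, which is one reason to follow the architectural reduction order rather than a simple left-to-right fold.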
2025-02-11 | target/arm: Implement FPCR.AH semantics for vector FMIN/FMAX | Peter Maydell | 3 | -2/+41
Implement the FPCR.AH == 1 semantics for vector FMIN/FMAX, by creating new _ah_ versions of the gvec helpers which invoke the scalar fmin_ah and fmax_ah helpers on each element. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2025-02-11 | target/arm: Implement FPCR.AH semantics for scalar FMIN/FMAX | Peter Maydell | 3 | -2/+64
When FPCR.AH == 1, floating point FMIN and FMAX have some odd special cases:
 * comparing two zeroes (even of different sign) or comparing a NaN with anything always returns the second argument (possibly squashed to zero)
 * denormal outputs are not squashed to zero regardless of FZ or FZ16

Implement these semantics in new helper functions and select them at translate time if FPCR.AH is 1 for the scalar FMAX and FMIN insns. (We will convert the other FMAX and FMIN insns in subsequent commits.)

Note that FMINNM and FMAXNM are not affected.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
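The first special case can be sketched with Python floats. This deliberately leaves out the flushing of the returned value to zero and any exception flags, so it only shows the "second argument wins" rule; `fmax_ah` and `fmin_ah` are illustrative names, not the QEMU helpers.

```python
import math


def fmax_ah(a, b):
    """AH=1 style max: NaNs and zero-vs-zero comparisons return b."""
    if math.isnan(a) or math.isnan(b):
        return b
    if a == 0.0 and b == 0.0:
        return b
    return a if a > b else b


def fmin_ah(a, b):
    """AH=1 style min: NaNs and zero-vs-zero comparisons return b."""
    if math.isnan(a) or math.isnan(b):
        return b
    if a == 0.0 and b == 0.0:
        return b
    return a if a < b else b
```

Note the asymmetry: `fmax_ah(nan, 1.0)` is 1.0 while `fmax_ah(1.0, nan)` is a NaN, and `fmax_ah(0.0, -0.0)` is -0.0.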
2025-02-11 | target/arm: Handle FPCR.NEP for NEP for FMUL, FMULX scalar by element | Peter Maydell | 1 | -3/+3
do_fp3_scalar_idx() is used only for the FMUL and FMULX scalar by element instructions; these both need to merge the result with the Rn register when FPCR.NEP is set. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2025-02-11 | target/arm: Handle FPCR.NEP for FCVTXN (scalar) | Peter Maydell | 1 | -15/+28
Unlike the other users of do_2misc_narrow_scalar(), FCVTXN (scalar) is always double-to-single and must honour FPCR.NEP. Implement this directly in a trans function rather than using do_2misc_narrow_scalar(). We still need gen_fcvtxn_sd() and the f_scalar_fcvtxn[] array for the FCVTXN (vector) insn, so we move those down in the file to where they are used. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2025-02-11 | target/arm: Handle FPCR.NEP for scalar FABS and FNEG | Peter Maydell | 1 | -7/+20
Handle FPCR.NEP merging for scalar FABS and FNEG; this requires an extra parameter to do_fp1_scalar_int(), since FMOV scalar does not have the merging behaviour. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2025-02-11 | target/arm: Handle FPCR.NEP in do_cvtf_scalar() | Peter Maydell | 1 | -3/+3
Handle FPCR.NEP in the operations handled by do_cvtf_scalar(). Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2025-02-11 | target/arm: Handle FPCR.NEP for 1-input scalar operations | Peter Maydell | 1 | -12/+14
Handle FPCR.NEP for the 1-input scalar operations. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2025-02-11 | target/arm: Handle FPCR.NEP for BFCVT scalar | Peter Maydell | 1 | -4/+21
Currently we implement BFCVT scalar via do_fp1_scalar(). This works even though BFCVT is a narrowing operation from 32 to 16 bits, because we can use write_fp_sreg() for float16. However, FPCR.NEP support requires that we use write_fp_hreg_merging() for float16 outputs, so we can't continue to borrow the non-narrowing do_fp1_scalar() function for this. Split out trans_BFCVT_s() into its own implementation that honours FPCR.NEP. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2025-02-11 | target/arm: Handle FPCR.NEP for 3-input scalar operations | Peter Maydell | 1 | -6/+6
Handle FPCR.NEP for the 3-input scalar operations which use do_fmla_scalar_idx() and do_fmadd(), by making them call the appropriate write_fp_*reg_merging() functions. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2025-02-11 | target/arm: Define and use new write_fp_*reg_merging() functions | Peter Maydell | 1 | -26/+91
For FEAT_AFP's FPCR.NEP bit, we need to programmatically change the behaviour of the writeback of the result for most SIMD scalar operations, so that instead of zeroing the upper part of the result register it merges the upper elements from one of the input registers. Provide new functions write_fp_*reg_merging() which can be used instead of the existing write_fp_*reg() functions when we want this "merge the result with one of the input registers if FPCR.NEP is enabled" handling, and use them in do_fp3_scalar_with_fpsttype(). Note that (as documented in the description of the FPCR.NEP bit) which input register to use as the merge source varies by instruction: for these 2-input scalar operations, the comparison instructions take from Rm, not Rn. We'll extend this to also provide the merging behaviour for the remaining scalar insns in subsequent commits. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
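The difference between the zeroing and merging writebacks can be sketched by modelling a vector register as a list of lanes. The function names echo the commit but the signatures here are invented for illustration.

```python
def write_fp_reg(result, lanes):
    """Existing behaviour: lane 0 gets the result, upper lanes are zeroed."""
    return [result] + [0] * (lanes - 1)


def write_fp_reg_merging(result, merge_src, nep):
    """With NEP set, upper lanes are copied from the chosen source register."""
    if not nep:
        return write_fp_reg(result, len(merge_src))
    return [result] + list(merge_src[1:])
```

The caller decides which input register to pass as `merge_src`, since, as noted above, the merge source varies by instruction.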
2025-02-11 | target/arm: Add FPCR.NEP to TBFLAGS | Peter Maydell | 4 | -0/+13
For FEAT_AFP, we want to emit different code when FPCR.NEP is set, so that instead of zeroing the high elements of a vector register when we write the output of a scalar operation to it, we instead merge in those elements from one of the source registers. Since this affects the generated code, we need to put FPCR.NEP into the TBFLAGS. FPCR.NEP is treated as 0 when in streaming SVE mode and FEAT_SME_FA64 is not implemented or not enabled; we can implement this logic in rebuild_hflags_a64(). Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2025-02-11 | target/arm: Use FPST_FPCR_AH for BFMLAL*, BFMLSL* insns | Peter Maydell | 2 | -9/+17
When FPCR.AH is 1, use FPST_FPCR_AH for:
 * AdvSIMD BFMLALB, BFMLALT
 * SVE BFMLALB, BFMLALT, BFMLSLB, BFMLSLT
so that they get the required behaviour changes.

We do this by making gen_gvec_op4_fpst() take an ARMFPStatusFlavour rather than a bool is_fp16; existing callsites now select FPST_FPCR_F16_A64 vs FPST_FPCR_A64 themselves rather than passing in the boolean.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2025-02-11 | target/arm: Use FPST_FPCR_AH for BFCVT* insns | Peter Maydell | 2 | -8/+25
When FPCR.AH is 1, use FPST_FPCR_AH for:
 * AdvSIMD BFCVT, BFCVTN, BFCVTN2
 * SVE BFCVT, BFCVTNT
so that they get the required behaviour changes.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2025-02-11 | target/arm: Use FPST_FPCR_AH for FRECPE, FRECPS, FRECPX, FRSQRTE, FRSQRTS | Peter Maydell | 3 | -35/+127
For the instructions FRECPE, FRECPS, FRECPX, FRSQRTE, FRSQRTS, use FPST_FPCR_AH or FPST_FPCR_AH_F16 when FPCR.AH is 1, so that they get the required behaviour changes. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2025-02-11 | target/arm: Set up float_status to use for FPCR.AH=1 behaviour | Peter Maydell | 5 | -1/+47
When FPCR.AH is 1, the behaviour of some instructions changes:
 * AdvSIMD BFCVT, BFCVTN, BFCVTN2, BFMLALB, BFMLALT
 * SVE BFCVT, BFCVTNT, BFMLALB, BFMLALT, BFMLSLB, BFMLSLT
 * SME BFCVT, BFCVTN, BFMLAL, BFMLSL (these are all in SME2 which QEMU does not yet implement)
 * FRECPE, FRECPS, FRECPX, FRSQRTE, FRSQRTS

The behaviour change is:
 * the instructions do not update the FPSR cumulative exception flags
 * trapped floating point exceptions are disabled (a no-op for QEMU, which doesn't implement FPCR.{IDE,IXE,UFE,OFE,DZE,IOE})
 * rounding is always round-to-nearest-even regardless of FPCR.RMode
 * denormalized inputs and outputs are always flushed to zero, as if FPCR.{FZ,FIZ} is {1,1}
 * FPCR.FZ16 is still honoured for half-precision inputs

(See the Arm ARM DDI0487L.a section A1.5.9.)

We can provide all these behaviours with another pair of float_status fields which we use only for these insns, when FPCR.AH is 1. These float_status fields will always have:
 * flush_to_zero and flush_inputs_to_zero set for the non-F16 field
 * rounding mode set to round-to-nearest-even
and so the only FPCR fields they need to honour are DN and FZ16.

In this commit we only define the new fp_status fields and give them the required behaviour when FPSR is updated. In subsequent commits we will arrange to use this new fp_status field for the instructions that should be affected by FPCR.AH in this way.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2025-02-11 | target/arm: Add FPCR.AH to tbflags | Peter Maydell | 5 | -1/+9
We are going to need to generate different code in some cases when FPCR.AH is 1. For example:
 * Floating point neg and abs must not flip the sign bit of NaNs
 * some insns (FRECPE, FRECPS, FRECPX, FRSQRTE, FRSQRTS, and various BFCVT and BFM bfloat16 ops) need to use a different float_status to the usual one

Encode FPCR.AH into the A64 tbflags, so we can refer to it at translate time.

Because we now have a bit in FPCR that affects codegen, we can't mark the AArch64 FPCR register as being SUPPRESS_TB_END any more; writes to it will now end the TB and trigger a regeneration of hflags.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
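Why a codegen-affecting bit has to be part of the flags can be shown with a toy translation cache keyed on (pc, flags). Everything here is hypothetical apart from FPCR.AH being bit 1 of the FPCR; the flag layout and the "translated code" strings are invented for the sketch.

```python
FPCR_AH = 1 << 1  # AH bit position in the architectural FPCR


class ToyTranslator:
    """Caches one translation per (pc, tbflags) pair."""

    def __init__(self):
        self.cache = {}
        self.translations = 0

    def lookup(self, pc, fpcr):
        tbflags = 1 if fpcr & FPCR_AH else 0  # hypothetical one-bit layout
        key = (pc, tbflags)
        if key not in self.cache:
            self.translations += 1
            # Translate-time decision: which negation sequence to emit.
            self.cache[key] = "fneg_nan_safe" if tbflags else "fneg_xor"
        return self.cache[key]
```

If AH were left out of the key, code translated with AH=0 would be reused after the guest set AH=1, which is the situation the commit avoids by ending the TB on FPCR writes and rebuilding the flags.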
2025-02-11 | target/arm: Adjust exception flag handling for AH = 1 | Peter Maydell | 1 | -3/+14
When FPCR.AH = 1, some of the cumulative exception flags in the FPSR behave slightly differently for A64 operations:
 * IDC is set when a denormal input is used without flushing
 * IXC (Inexact) is set when an output denormal is flushed to zero

Update vfp_get_fpsr_from_host() to do this.

Note that because half-precision operations never set IDC, we now need to add float_flag_input_denormal_used to the set we mask out of fp_status_f16_a64.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
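The two rules above can be sketched as a mapping from softfloat-style flags to FPSR bits. The host flag values are invented for the sketch; the FPSR bit positions (UFC bit 3, IXC bit 4, IDC bit 7) are the architectural ones. The AH=0 branch reflects the usual Arm behaviour (a flushed input sets IDC, a flushed output sets UFC) and is included as an assumption for contrast.

```python
# Hypothetical host-side flag values, for illustration only.
INPUT_DENORMAL_FLUSHED = 1 << 0
INPUT_DENORMAL_USED = 1 << 1
OUTPUT_DENORMAL_FLUSHED = 1 << 2

FPSR_UFC = 1 << 3
FPSR_IXC = 1 << 4
FPSR_IDC = 1 << 7


def fpsr_bits(host_flags, ah):
    """Translate denormal-related host flags into FPSR cumulative bits."""
    bits = 0
    if ah:
        if host_flags & INPUT_DENORMAL_USED:
            bits |= FPSR_IDC
        if host_flags & OUTPUT_DENORMAL_FLUSHED:
            bits |= FPSR_IXC
    else:
        if host_flags & INPUT_DENORMAL_FLUSHED:
            bits |= FPSR_IDC
        if host_flags & OUTPUT_DENORMAL_FLUSHED:
            bits |= FPSR_UFC
    return bits
```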
2025-02-11 | target/arm: Adjust FP behaviour for FPCR.AH = 1 | Peter Maydell | 3 | -24/+61
When FPCR.AH is set, various behaviours of AArch64 floating point operations which are controlled by softfloat config settings change:
 * tininess and ftz detection before/after rounding
 * NaN propagation order
 * result of 0 * Inf + NaN
 * default NaN value

When the guest changes the value of the AH bit, switch these config settings on the fp_status_a64 and fp_status_f16_a64 float_status fields.

This requires us to make the arm_set_default_fp_behaviours() function global, since we now need to call it from cpu.c and vfp_helper.c; we move it to vfp_helper.c so it can be next to the new arm_set_ah_fp_behaviours().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2025-02-11 | target/arm: Implement FPCR.FIZ handling | Peter Maydell | 1 | -10/+50
Part of FEAT_AFP is the new control bit FPCR.FIZ. This bit affects flushing of single and double precision denormal inputs to zero for AArch64 floating point instructions. (For half-precision, the existing FPCR.FZ16 control remains the only one.)

FPCR.FIZ differs from FPCR.FZ in that if we flush an input denormal only because of FPCR.FIZ then we should *not* set the cumulative exception bit FPSR.IDC.

FEAT_AFP also defines that in AArch64 the existing FPCR.FZ only applies when FPCR.AH is 0.

We can implement this by setting the "flush inputs to zero" state appropriately when FPCR is written, and by not reflecting the float_flag_input_denormal status flag into FPSR reads when it is the result only of FPCR.FIZ.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
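One way to read the rules above, for single and double precision only, is as two small predicates. Both function names are hypothetical, and the IDC rule is an interpretation of "flushed only because of FPCR.FIZ": the flag is reported only when FPCR.FZ is itself in effect.

```python
def flush_inputs_to_zero(fz, fiz, ah):
    """Input denormals are flushed if FIZ is set, or if FZ is set and AH is 0."""
    return fiz or (fz and not ah)


def idc_reported(input_denormal_flushed, fz, ah):
    """A flushed input sets FPSR.IDC only when FZ (with AH=0) caused the flush."""
    return input_denormal_flushed and fz and not ah
```

So with FIZ=1 and FZ=0 inputs are flushed silently, while with FZ=1 and AH=1 the FZ bit no longer flushes inputs at all.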
2025-02-11 | target/arm: Define FPCR AH, FIZ, NEP bits | Peter Maydell | 3 | -3/+16
The Armv8.7 FEAT_AFP feature defines three new control bits in the FPCR:
 * FPCR.AH: "alternate floating point mode"; this changes floating point behaviour in a variety of ways, including:
   - the sign of a default NaN is 1, not 0
   - if FPCR.FZ is also 1, denormals detected after rounding with an unbounded exponent has been applied are flushed to zero
   - FPCR.FZ does not cause denormalized inputs to be flushed to zero
   - miscellaneous other corner-case behaviour changes
 * FPCR.FIZ: flush denormalized numbers to zero on input for most instructions
 * FPCR.NEP: makes scalar SIMD operations merge the result with higher vector elements in one of the source registers, instead of zeroing the higher elements of the destination

This commit defines the new bits in the FPCR, and allows them to be read or written when FEAT_AFP is implemented. Actual behaviour changes will be implemented in subsequent commits.

Note that these are the first FPCR bits which don't appear in the AArch32 FPSCR view of the register, and which share bit positions with FPSR bits.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2025-02-11 | fpu: allow flushing of output denormals to be after rounding | Peter Maydell | 14 | -6/+107
Currently we handle flushing of output denormals in uncanon_normal always before we deal with rounding. This works for architectures that detect tininess before rounding, but is usually not the right place when the architecture detects tininess after rounding. For example, for x86 the SDM states that the MXCSR FTZ control bit causes outputs to be flushed to zero "when it detects a floating-point underflow condition". This means that we mustn't flush to zero if the input is such that after rounding it is no longer tiny.

At least one of our guest architectures does underflow detection after rounding but flushing of denormals before rounding (MIPS MSA); this means we need to have a config knob for this that is separate from our existing tininess_before_rounding setting.

Add an ftz_detection flag. For consistency with tininess_before_rounding, we make it default to "detect ftz after rounding"; this means that we need to explicitly set the flag to "detect ftz before rounding" on every existing architecture that sets flush_to_zero, so that this commit has no behaviour change. (This means more code change here but for the long term a less confusing API.)

For several architectures the current behaviour is either definitely or possibly wrong; annotate those with TODO comments. These architectures are definitely wrong (and should detect ftz after rounding):
 * x86
 * Alpha

For these architectures the spec is unclear:
 * MIPS (for non-MSA)
 * RX
 * SH4

PA-RISC makes ftz detection IMPDEF, but we aren't setting the "tininess before rounding" setting that we ought to.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
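The case where the two detection points disagree is a value just below the smallest normal number that rounds up to it. The sketch below uses exact fractions and an invented toy format (4-bit significand, smallest normal 2**-6) rather than softfloat's internals, and handles positive values only.

```python
from fractions import Fraction

PRECISION = 4                  # toy significand width in bits
MIN_NORMAL = Fraction(1, 64)   # toy smallest normal number, 2**-6


def round_unbounded(x):
    """Round positive x to PRECISION bits, nearest-even, with no exponent limit."""
    e = x.numerator.bit_length() - x.denominator.bit_length()
    if Fraction(2) ** e > x:
        e -= 1                 # now 2**e <= x < 2**(e + 1)
    ulp = Fraction(2) ** (e - PRECISION + 1)
    return round(x / ulp) * ulp  # round() on a Fraction rounds half to even


def flush_before_rounding(x):
    """Flush if the unrounded value is below the smallest normal."""
    return Fraction(0) if x < MIN_NORMAL else round_unbounded(x)


def flush_after_rounding(x):
    """Flush only if the value is still below the smallest normal once rounded."""
    rounded = round_unbounded(x)
    return Fraction(0) if rounded < MIN_NORMAL else rounded
```

For `x = Fraction(63, 4096)`, which is slightly less than 2**-6, flushing before rounding yields zero, while flushing after rounding yields 2**-6 because the rounded value is no longer tiny.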
2025-02-11 | fpu: Implement float_flag_input_denormal_used | Peter Maydell | 3 | -6/+107
For the x86 and the Arm FEAT_AFP semantics, we need to be able to tell the target code that the FPU operation has used an input denormal. Implement this; when it happens we set the new float_flag_input_denormal_used.

Note that we only set this when an input denormal is actually used by the operation: if the operation results in Invalid Operation or Divide By Zero or the result is a NaN because some other input was a NaN then we never needed to look at the input denormal and do not set input_denormal_used.

We mostly do not need to adjust the hardfloat codepaths to deal with this flag, because almost all hardfloat operations are already gated on the input not being a denormal, and will fall back to softfloat for a denormal input. The only exception is the comparison operations, where we need to add the check for input denormals, which must now fall back to softfloat where they did not before.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
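A toy model of the "actually used" rule for a two-input operation, covering only the NaN case spelled out above. The `FloatClass` enum here is a simplified illustration, not softfloat's own enum, and invalid-operation and divide-by-zero cases are deliberately left out.

```python
from enum import Enum, auto


class FloatClass(Enum):
    ZERO = auto()
    DENORMAL = auto()   # an input denormal that was not squashed to zero
    NORMAL = auto()
    NAN = auto()


def input_denormal_used(a, b):
    """True if the operation had to look at an unsquashed denormal input."""
    if FloatClass.NAN in (a, b):
        return False    # the result is a NaN regardless of the other input
    return FloatClass.DENORMAL in (a, b)
```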
2025-02-11 | fpu: Add float_class_denormal | Peter Maydell | 2 | -18/+54
Currently in softfloat we canonicalize input denormals and so the code that implements floating point operations does not need to care whether the input value was originally normal or denormal. However, both x86 and Arm FEAT_AFP require that an exception flag is set if:
 * an input is denormal
 * that input is not squashed to zero
 * that input is actually used in the calculation (e.g. we did not find the other input was a NaN)

So we need to track that the input was a non-squashed denormal. To do this we add a new value to the FloatClass enum.

In this commit we add the value and adjust the code everywhere that looks at FloatClass values so that the new float_class_denormal behaves identically to float_class_normal. We will add the code that does the "raise a new float exception flag if an input was an unsquashed denormal and we used it" in a subsequent commit.

There should be no behavioural change in this commit.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2025-02-11 | target/alpha: Don't corrupt error_code with unknown softfloat flags | Peter Maydell | 1 | -0/+2
In do_cvttq() we set env->error_code with what is supposed to be a set of FPCR exception bit values. However, if the set of float exception flags we get back from softfloat for the conversion includes a flag which is not one of the three we expect here (invalid_cvti, invalid, inexact) then we will fall through the if-ladder and set env->error_code to the unconverted softfloat exception_flag value. This will then cause us to take a spurious exception. This is harmless now, but when we add new floating point exception flags to softfloat it will cause problems. Add an else clause to the if-ladder to make it ignore any float exception flags it doesn't care about. Specifically, without this fix, 'make check-tcg' will fail for Alpha when the commit adding float_flag_input_denormal_used lands. Fixes: aa3bad5b59e7 ("target/alpha: Use float64_to_int64_modulo for CVTTQ") Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
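The shape of the bug and of the fix can be sketched as follows. All flag and code values are placeholders invented for the example; they are not the real softfloat flags or Alpha FPCR bits, and the actual mapping in do_cvttq() is not reproduced here.

```python
# Placeholder values, for illustration only.
FLAG_INVALID_CVTI = 1 << 0
FLAG_INVALID = 1 << 1
FLAG_INEXACT = 1 << 2
FLAG_SOMETHING_NEW = 1 << 3   # stands in for a flag added to softfloat later

CODE_INVALID_CVTI = 0x100
CODE_INVALID = 0x200
CODE_INEXACT = 0x400


def error_code_before(flags):
    """If-ladder without an else: unknown flags leak through unconverted."""
    code = flags
    if flags & FLAG_INVALID_CVTI:
        code = CODE_INVALID_CVTI
    elif flags & FLAG_INVALID:
        code = CODE_INVALID
    elif flags & FLAG_INEXACT:
        code = CODE_INEXACT
    return code


def error_code_after(flags):
    """With the else clause, flags the ladder does not know about are ignored."""
    if flags & FLAG_INVALID_CVTI:
        return CODE_INVALID_CVTI
    elif flags & FLAG_INVALID:
        return CODE_INVALID
    elif flags & FLAG_INEXACT:
        return CODE_INEXACT
    else:
        return 0
```

With only `FLAG_SOMETHING_NEW` raised, the first version returns the raw flag value as if it were an error code, while the second returns 0.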
2025-02-10 | qapi: expose all schema features to code | Daniel P. Berrangé | 13 | -7/+110
This replaces use of the constants from the QapiSpecialFeatures enum, with constants from the auto-generated QapiFeatures enum in qapi-features.h.

The 'deprecated' and 'unstable' features still have a little bit of special handling, being force defined to be the 1st + 2nd features in the enum, regardless of whether they're used in the schema. This retains compatibility with common code that references the features via the QapiSpecialFeatures constants.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20250205123550.2754387-5-berrange@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[Imports tidied up with isort]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
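The "force defined to be the 1st + 2nd features" rule can be sketched as below. This is not the QAPI generator's code: `build_feature_enum` is a hypothetical helper, and ordering the remaining features alphabetically is an assumption made only to keep the example deterministic.

```python
def build_feature_enum(schema_features):
    """Assign enum values, pinning 'deprecated' and 'unstable' to 0 and 1."""
    names = ["deprecated", "unstable"]
    for name in sorted(schema_features):
        if name not in names:
            names.append(name)
    return {name: value for value, name in enumerate(names)}
```

Even for a schema that never uses 'deprecated', the special features keep their fixed values, so code written against the older special-feature constants sees the same numbers.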
2025-02-10 | qapi: rename 'special_features' to 'features' | Daniel P. Berrangé | 4 | -26/+18
This updates the QAPI code generation to refer to 'features' instead of 'special_features', in preparation for generalizing their exposure. Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Message-ID: <20250205123550.2754387-4-berrange@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> [Imports tidied up with isort] Signed-off-by: Markus Armbruster <armbru@redhat.com>