LDBFMAX, LDBFMAXA, LDBFMAXAL, LDBFMAXL

BFloat16 floating-point atomic maximum in memory

This instruction atomically loads a 16-bit value from memory, computes the BFloat16 maximum with the value held in a register, and stores the result back to memory. The value initially loaded from memory is returned in the destination register.

  • LDBFMAXA and LDBFMAXAL load from memory with acquire semantics.
  • LDBFMAXL and LDBFMAXAL store to memory with release semantics.
  • LDBFMAX has neither acquire nor release semantics.
This instruction:

  • Disables alternative floating-point behaviors, as if FPCR.AH is 0.
  • Generates only the default NaN, as if FPCR.DN is 1.
  • Does not modify the cumulative FPSR exception bits (IDC, IXC, UFC, OFC, DZC, and IOC).
  • Disables trapped floating-point exceptions, as if the FPCR trap enable bits (IDE, IXE, UFE, OFE, DZE, and IOE) are all zero.

For more information about memory ordering semantics, see Load-Acquire, Store-Release.

For information about addressing modes, see Load/Store addressing modes.
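The floating-point behavior listed above can be illustrated with a small software model of the BFloat16 maximum. This is a minimal sketch, not the architectural pseudocode: the names `bf16_max`, `bf16_is_nan`, `bf16_to_float`, and `DEFAULT_NAN` are hypothetical, the default NaN is assumed to be the bit pattern 0x7FC0, and flush-to-zero of denormal values is not modelled (as if FPCR.FZ is 0). Exception flags are never recorded, matching the statement that the FPSR cumulative bits are not modified.

```python
import struct

# Assumption: BFloat16 default NaN (sign 0, exponent all ones, top fraction bit set).
DEFAULT_NAN = 0x7FC0


def bf16_to_float(bits):
    """Widen a BFloat16 bit pattern to a float by appending 16 zero bits (exact)."""
    return struct.unpack("<f", struct.pack("<I", (bits & 0xFFFF) << 16))[0]


def bf16_is_nan(bits):
    """True when the exponent is all ones and the fraction is non-zero."""
    return (bits & 0x7F80) == 0x7F80 and (bits & 0x007F) != 0


def bf16_max(a, b):
    """Maximum of two BFloat16 bit patterns, as if FPCR.AH is 0 and FPCR.DN is 1."""
    a &= 0xFFFF
    b &= 0xFFFF
    # As if FPCR.DN is 1: any NaN input produces the default NaN.
    if bf16_is_nan(a) or bf16_is_nan(b):
        return DEFAULT_NAN
    fa, fb = bf16_to_float(a), bf16_to_float(b)
    if fa == fb:
        # Equal values differ at most in the sign of zero; ANDing the
        # patterns yields +0 unless both inputs are -0.
        return a & b
    return a if fa > fb else b
```

For example, `bf16_max(0x3F80, 0x4000)` compares 1.0 with 2.0 and returns 0x4000, while any NaN operand yields 0x7FC0 regardless of its payload.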

    Encoding: Floating-point

    Variants: FEAT_LSFE (ARMv9.6)

    Bits      Field   Value
    [31:30]   size    00
    [29:27]           111
    [26]      VR      1
    [25:24]           00
    [23]      A
    [22]      R
    [21]              1
    [20:16]   Rs
    [15]      o3      0
    [14:12]   opc     100
    [11:10]           00
    [9:5]     Rn
    [4:0]     Rt

    No memory ordering (A == 0 && R == 0)

    LDBFMAX <Hs>, <Ht>, [<Xn|SP>]

    Acquire (A == 1 && R == 0)

    LDBFMAXA <Hs>, <Ht>, [<Xn|SP>]

    Acquire-release (A == 1 && R == 1)

    LDBFMAXAL <Hs>, <Ht>, [<Xn|SP>]

    Release (A == 0 && R == 1)

    LDBFMAXL <Hs>, <Ht>, [<Xn|SP>]
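The four variants differ only in the A and R bits, so the instruction word can be assembled from a fixed base pattern plus the register fields. The following is a sketch under stated assumptions: the fixed bits of the encoding diagram are read as size=00 at [31:30], 111 at [29:27], 1 at [26], 00 at [25:24], 1 at [21], o3=0 at [15], opc=100 at [14:12], and 00 at [11:10], giving the base value 0x3C204000. The helper names `encode` and `mnemonic` are hypothetical and are not part of any assembler API.

```python
# Assumed fixed bits of the encoding (see the lead-in for how they are derived).
BASE = 0x3C204000
# Every bit that is not part of A, R, Rs, Rn, or Rt.
FIXED_MASK = 0xFF20FC00

MNEMONICS = {
    (0, 0): "LDBFMAX",    # no memory ordering
    (1, 0): "LDBFMAXA",   # acquire
    (1, 1): "LDBFMAXAL",  # acquire-release
    (0, 1): "LDBFMAXL",   # release
}


def encode(a, r, rs, rn, rt):
    """Build the 32-bit instruction word from the A and R bits and register numbers."""
    if a not in (0, 1) or r not in (0, 1):
        raise ValueError("A and R must be 0 or 1")
    for reg in (rs, rn, rt):
        if not 0 <= reg <= 31:
            raise ValueError("register numbers must be in 0..31")
    return BASE | (a << 23) | (r << 22) | (rs << 16) | (rn << 5) | rt


def mnemonic(word):
    """Return the variant name for an instruction word, or None if the fixed bits differ."""
    if (word & FIXED_MASK) != BASE:
        return None
    return MNEMONICS[((word >> 23) & 1, (word >> 22) & 1)]
```

Under these assumptions, `encode(1, 1, 1, 31, 2)` corresponds to `LDBFMAXAL H1, H2, [SP]`, because Rn == 31 selects the stack pointer.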

    Decoding algorithm

    if !IsFeatureImplemented(FEAT_LSFE) then EndOfDecode(Decode_UNDEF);
    
    constant integer t = UInt(Rt);
    constant integer n = UInt(Rn);
    constant integer s = UInt(Rs);
    
    constant integer datasize = 16;
    constant boolean acquire = A == '1';
    constant boolean release = R == '1';
    constant boolean tagchecked = n != 31;

    Operation

    CheckFPEnabled64();
    bits(64) address;
    bits(datasize) value;
    bits(datasize) data;
    constant AccessDescriptor accdesc = CreateAccDescFPAtomicOp(MemAtomicOp_BFMAX, acquire,
                                                                release, tagchecked);
    
    value = V[s, datasize];
    if n == 31 then
        CheckSPAlignment();
        address = SP[64];
    else
        address = X[n, 64];
    
    constant bits(datasize) comparevalue = bits(datasize) UNKNOWN; // Irrelevant when not executing CAS
    data = MemAtomic(address, comparevalue, value, accdesc);
    
    V[t, datasize] = data;
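The operation above can be followed step by step in a small software model. This is an illustrative sketch only: `AtomicHalfwordMemory` and `execute` are hypothetical names, a lock stands in for the hardware's single-copy atomicity, the SP alignment check is assumed to be enabled and to require 16-byte alignment, and acquire/release ordering is not modelled. The combining function is passed in as `op`; the demonstration uses Python's built-in `max`, which agrees with the BFloat16 maximum only for non-negative, non-NaN bit patterns.

```python
import threading


class AtomicHalfwordMemory:
    """Hypothetical memory of 16-bit cells; a lock stands in for hardware atomicity."""

    def __init__(self):
        self._cells = {}
        self._lock = threading.Lock()

    def store(self, address, value):
        with self._lock:
            self._cells[address] = value & 0xFFFF

    def load(self, address):
        with self._lock:
            return self._cells.get(address, 0)

    def mem_atomic(self, address, value, op):
        # Load, combine, and store as one indivisible step; return the old value.
        with self._lock:
            old = self._cells.get(address, 0)
            self._cells[address] = op(old, value) & 0xFFFF
            return old


def execute(mem, vregs, xregs, sp, s, n, t, op):
    """Mirror the Operation steps: read V[s], pick the address, update memory, write V[t]."""
    value = vregs[s] & 0xFFFF              # value = V[s, 16]
    if n == 31:
        if sp % 16 != 0:                   # stand-in for CheckSPAlignment()
            raise ValueError("SP alignment fault")
        address = sp
    else:
        address = xregs[n]
    data = mem.mem_atomic(address, value, op)
    vregs[t] = data                        # V[t, 16] = data, upper bits zero


# Demonstration: memory holds 1.0 (0x3F80), register H1 holds 2.0 (0x4000).
mem = AtomicHalfwordMemory()
mem.store(0x1000, 0x3F80)
vregs = [0] * 32
xregs = [0] * 31
vregs[1] = 0x4000
xregs[0] = 0x1000
execute(mem, vregs, xregs, sp=0, s=1, n=0, t=2, op=max)
# vregs[2] now holds the old memory value 0x3F80; memory holds 0x4000.
```

Note that the source register is read before the destination register is written, so using the same register for `<Hs>` and `<Ht>` still supplies the original operand to the memory update.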

    Explanations

    <Hs>: Is the 16-bit name of the SIMD&FP register holding the data value to be operated on with the contents of the memory location, encoded in the "Rs" field.
    <Ht>: Is the 16-bit name of the SIMD&FP register to be loaded, encoded in the "Rt" field.
    <Xn|SP>: Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field.