LDFF1SW (scalar plus scalar)

Contiguous load first-fault signed words to vector (scalar index)

Contiguous load with first-faulting behavior of signed words to elements of a vector register from the memory address generated by a 64-bit scalar base and scalar index which is multiplied by 4 and added to the base address. After each element access the index value is incremented, but the index register is not updated. Inactive elements will not cause a read from Device memory or signal a fault, and are set to zero in the destination vector.

This instruction is illegal when executed in Streaming SVE mode, unless FEAT_SME_FA64 is implemented and enabled.

Encoding: SVE

Variants: FEAT_SVE (PROFILE_A)

313029282726252423222120191817161514131211109876543210
10100100100011
dtypeRmPgRnZt

LDFF1SW { <Zt>.D }, <Pg>/Z, [<Xn|SP>{, <Xm>, LSL #2}]

Decoding algorithm

if !IsFeatureImplemented(FEAT_SVE) then EndOfDecode(Decode_UNDEF);
constant integer t = UInt(Zt);
constant integer n = UInt(Rn);
constant integer m = UInt(Rm);
constant integer g = UInt(Pg);
constant integer esize = 64;
constant integer msize = 32;
constant boolean unsigned = FALSE;

Operation

CheckNonStreamingSVEEnabled();
constant integer VL = CurrentVL;
constant integer PL = VL DIV 8;
constant integer elements = VL DIV esize;
bits(64) base;
constant bits(PL) mask = P[g, PL];
bits(VL) result;
constant bits(VL) orig = Z[t, VL];
bits(msize) data;
bits(64) offset;
bits(64) addr;
constant integer mbytes = msize DIV 8;
boolean fault = FALSE;
boolean faulted = FALSE;
boolean unknown = FALSE;
constant boolean contiguous = TRUE;
constant boolean tagchecked = TRUE;
AccessDescriptor accdesc = CreateAccDescSVEFF(contiguous, tagchecked);

if !AnyActiveElement(mask, esize) then
    if n == 31 && ConstrainUnpredictableBool(Unpredictable_CHECKSPNONEACTIVE) then
        CheckSPAlignment();
else
    if n == 31 then CheckSPAlignment();

base = if n == 31 then SP[64] else X[n, 64];
offset = X[m, 64];
addr = AddressAdd(base, UInt(offset) * mbytes, accdesc);

assert accdesc.first;

for e = 0 to elements-1
    if ActivePredicateElement(mask, e, esize) then
        if accdesc.first then
            // Mem[] will not return if a fault is detected for the first active element
            data = Mem[addr, mbytes, accdesc];
            accdesc.first = FALSE;
        else
            // MemNF[] will return fault=TRUE if access is not performed for any reason
            (data, fault) = MemNF[addr, mbytes, accdesc];
    else
        (data, fault) = (Zeros(msize), FALSE);
    addr = AddressIncrement(addr, mbytes, accdesc);

    // FFR elements set to FALSE following a suppressed access/fault
    faulted = faulted || fault;
    if faulted then
        ElemFFR[e, esize] = '0';

    // Value becomes CONSTRAINED UNPREDICTABLE after an FFR element is FALSE
    unknown = unknown || ElemFFR[e, esize] == '0';
    if unknown then
        if !fault && ConstrainUnpredictableBool(Unpredictable_SVELDNFDATA) then
            Elem[result, e, esize] = Extend(data, esize, unsigned);
        elsif ConstrainUnpredictableBool(Unpredictable_SVELDNFZERO) then
            Elem[result, e, esize] = Zeros(esize);
        else  // merge
            Elem[result, e, esize] = Elem[orig, e, esize];
    else
        Elem[result, e, esize] = Extend(data, esize, unsigned);

Z[t, VL] = result;

Explanations

<Zt>: Is the name of the scalable vector register to be transferred, encoded in the "Zt" field.
<Pg>: Is the name of the governing scalable predicate register P0-P7, encoded in the "Pg" field.
<Xn|SP>: Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field.
<Xm>: Is the optional 64-bit name of the general-purpose offset register, defaulting to XZR, encoded in the "Rm" field.