LDR (array vector)

Load ZA array vector

This instruction performs a contiguous load of bytes to a ZA array vector from the memory address generated by a 64-bit scalar base plus an optional immediate offset multiplied by the current vector length in bytes. The ZA array vector is selected by the sum of the vector select register and the same immediate offset, modulo the number of bytes in a Streaming SVE vector.
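The address and vector selection described above can be sketched in Python. This is an illustrative model, not part of the architecture definition: the function name `ldr_za_targets` and its parameters are hypothetical, the streaming vector length is passed in bits and assumed to be a power of two from 128 to 2048, and address arithmetic is simplified to a 64-bit wraparound.

```python
def ldr_za_targets(base, wv, offs, svl_bits):
    """Return (memory address, ZA vector index) for an LDR ZA access.

    base     -- value of the 64-bit base register (Xn or SP)
    wv       -- value of the 32-bit vector select register (W12-W15)
    offs     -- immediate offset, 0 to 15
    svl_bits -- streaming vector length in bits (assumed power of two)
    """
    if svl_bits < 128 or svl_bits > 2048 or svl_bits & (svl_bits - 1):
        raise ValueError("unsupported streaming vector length")
    if not 0 <= offs <= 15:
        raise ValueError("offs must be in the range 0 to 15")

    dim = svl_bits // 8                          # bytes per vector
    addr = (base + offs * dim) & (2**64 - 1)     # offset is scaled by the vector length
    vec = ((wv & 0xFFFFFFFF) + offs) % dim       # same offset selects the ZA vector
    return addr, vec
```

For example, with a 256-bit streaming vector length, `ldr_za_targets(0x1000, 30, 5, 256)` gives address `0x10A0` and ZA vector 3, because the vector index wraps at 32.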

This instruction is unpredicated.

The load is performed as contiguous byte accesses, with no endian conversion and no guarantee of single-copy atomicity larger than a byte. However, if alignment is checked, then the base register must be aligned to 16 bytes.

This instruction does not require the PE to be in Streaming SVE mode, and it is expected that this instruction will not experience a significant slowdown due to contention with other PEs that are executing in Streaming SVE mode.

Encoding: SME

Variants: FEAT_SME (PROFILE_A)

Bits    Field  Value
31-22          1110000100
21      op     0
20-15          000000
14-13   Rv
12-10          000
9-5     Rn
4              0
3-0     off4

LDR ZA[<Wv>, <offs>], [<Xn|SP>{, #<offs>, MUL VL}]

Decoding algorithm

if !IsFeatureImplemented(FEAT_SME) then EndOfDecode(Decode_UNDEF);
constant integer n = UInt(Rn);
constant integer v = UInt('011':Rv);
constant integer offset = UInt(off4);
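As a rough illustration of the decode step, the following Python sketch extracts `n`, `v`, and `offset` from a 32-bit instruction word. The function name `decode_ldr_za` is hypothetical, and the field positions (Rv in bits 14-13, Rn in bits 9-5, off4 in bits 3-0, all other bits fixed with the value 0xE1000000) are an assumption taken from the encoding diagram. The FEAT_SME check is omitted.

```python
RV_MASK = 0x3 << 13      # assumed position of Rv
RN_MASK = 0x1F << 5      # assumed position of Rn
OFF4_MASK = 0xF          # assumed position of off4
FIXED_MASK = 0xFFFFFFFF & ~(RV_MASK | RN_MASK | OFF4_MASK)
FIXED_VALUE = 0xE1000000  # fixed bits, including op = 0


def decode_ldr_za(word):
    """Return a dict with n, v and offset, or None if word is not LDR ZA."""
    if (word & FIXED_MASK) != FIXED_VALUE:
        return None
    rv = (word & RV_MASK) >> 13
    return {
        "n": (word & RN_MASK) >> 5,   # UInt(Rn); 31 selects SP
        "v": 0b01100 | rv,            # UInt('011':Rv), so W12 to W15
        "offset": word & OFF4_MASK,   # UInt(off4)
    }
```

Concatenating '011' with the 2-bit Rv field is what restricts the vector select register to W12-W15.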

Operation

CheckSMEAndZAEnabled();
constant integer SVL = CurrentSVL;
constant integer dim = SVL DIV 8;
bits(64) base;
constant integer moffs = offset * dim;
bits(SVL) result;
constant bits(32) vbase = X[v, 32];
constant integer vec = (UInt(vbase) + offset) MOD dim;
constant boolean contiguous = TRUE;
constant boolean nontemporal = FALSE;
constant boolean tagchecked = n != 31;
constant AccessDescriptor accdesc = CreateAccDescSME(MemOp_LOAD, nontemporal, contiguous,
                                                     tagchecked);

if IsFeatureImplemented(FEAT_TME) && TSTATE.depth > 0 then
    FailTransaction(TMFailure_ERR, FALSE);

if n == 31 then
    CheckSPAlignment();
    base = SP[64];
else
    base = X[n, 64];

bits(64) addr = AddressAdd(base, moffs, accdesc);

constant boolean aligned = IsAligned(addr, 16);

if !aligned && AlignmentEnforced() then
    constant FaultRecord fault = AlignmentFault(accdesc, addr);
    AArch64.Abort(fault);

for e = 0 to dim-1
    Elem[result, e, 8] = AArch64.MemSingle[addr, 1, accdesc, aligned];
    addr = AddressIncrement(addr, 1, accdesc);

ZAvector[vec, SVL] = result;
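The core of the operation above can be modelled in a few lines of Python. This is a simplified sketch under stated assumptions: memory is a flat `bytes` object indexed by address, ZA is a list of `dim` byte strings, and the names `ldr_za_vector` and `AlignmentFaultError` are hypothetical. The SME/ZA enable check, the TME transaction failure, the SP alignment check, and tag checking are all left out.

```python
class AlignmentFaultError(Exception):
    """Stand-in for the alignment fault raised by the pseudocode."""


def ldr_za_vector(za, memory, base, wv, offs, svl_bits, alignment_enforced=False):
    """Load one ZA array vector from memory, byte by byte."""
    dim = svl_bits // 8                  # bytes per vector, and number of ZA vectors
    addr = base + offs * dim             # moffs = offset * dim
    vec = (wv + offs) % dim              # ZA vector selected by Wv + offs

    # The address must be 16-byte aligned only when alignment is enforced.
    if alignment_enforced and addr % 16 != 0:
        raise AlignmentFaultError(hex(addr))

    # Contiguous single-byte accesses, with no endian conversion.
    result = bytes(memory[addr + e] for e in range(dim))
    za[vec] = result
    return vec
```

The whole vector is assembled in `result` before ZA is written, mirroring the pseudocode, so a fault partway through the byte loop leaves ZA unchanged in this model.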

Explanations

<Wv>: Is the 32-bit name of the vector select register W12-W15, encoded in the "Rv" field.
<offs>: Is the vector select offset and optional memory offset, in the range 0 to 15, defaulting to 0, encoded in the "off4" field.
<Xn|SP>: Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field.
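The operand constraints above can be illustrated with a small encoder sketch. The function name `encode_ldr_za` is hypothetical, and the field positions (Rv in bits 14-13, Rn in bits 9-5, off4 in bits 3-0, fixed bits 0xE1000000) are an assumption based on the encoding diagram rather than something this section states explicitly.

```python
def encode_ldr_za(wv, offs, rn):
    """Encode LDR ZA[W<wv>, offs], [X<rn>|SP, #offs, MUL VL].

    wv   -- number of the vector select register, 12 to 15
    offs -- vector select and memory offset, 0 to 15
    rn   -- number of the base register, 0 to 30, or 31 for SP
    """
    if wv not in range(12, 16):
        raise ValueError("vector select register must be W12-W15")
    if offs not in range(16):
        raise ValueError("offs must be in the range 0 to 15")
    if rn not in range(32):
        raise ValueError("base register number must be 0 to 31")

    rv = wv - 12                      # only the low two bits are encoded
    return 0xE1000000 | (rv << 13) | (rn << 5) | offs
```

Because a single off4 field supplies both the vector select offset and the memory offset, the two cannot be chosen independently in one instruction.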

Operational Notes

If PSTATE.DIT is 1, the timing of this instruction is insensitive to the value of the data being loaded or stored.