FADDQV

Floating-point add recursive reduction of quadword vector segments

Floating-point addition of the same element numbers from each 128-bit source vector segment using a recursive pairwise reduction, placing each result into the corresponding element number of the 128-bit SIMD&FP destination register. Inactive elements in the source vector are treated as +0.0.

Encoding: SVE2

Variants: FEAT_SVE2p1 || FEAT_SME2p1

Bits     Field   Value
31:24            01100100
23:22    size
21:19            010
18:16    opc     000
15:13            101
12:10    Pg
9:5      Zn
4:0      Vd

FADDQV <Vd>.<T>, <Pg>, <Zn>.<Tb>

Decoding algorithm

if !IsFeatureImplemented(FEAT_SVE2p1) && !IsFeatureImplemented(FEAT_SME2p1) then
    EndOfDecode(Decode_UNDEF);
if size == '00' then EndOfDecode(Decode_UNDEF);
constant integer esize = 8 << UInt(size);
constant integer g = UInt(Pg);
constant integer n = UInt(Zn);
constant integer d = UInt(Vd);
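
As a rough illustration of the decode step, the following Python sketch extracts the fields from a 32-bit instruction word. It assumes the field layout, from bit 31 down, is 01100100 | size | 010 | opc=000 | 101 | Pg | Zn | Vd. The function name decode_faddqv and the returned dictionary are hypothetical, and the feature check on FEAT_SVE2p1 / FEAT_SME2p1 is omitted.

```python
# Sketch only: field positions are an assumption based on the encoding layout
# 01100100 | size | 010 | opc=000 | 101 | Pg | Zn | Vd (bit 31 down to bit 0).
FIXED_MASK = 0xFF3FE000   # bits 31:24 and 21:13 are fixed for FADDQV
FIXED_VALUE = 0x6410A000  # 01100100 .. 010 000 101 in those positions


def decode_faddqv(word):
    """Return the decoded fields, or None if the word is not a valid FADDQV."""
    if (word & FIXED_MASK) != FIXED_VALUE:
        return None                      # some other instruction
    size = (word >> 22) & 0x3
    if size == 0:
        return None                      # size == '00' is UNDEFINED
    return {
        "esize": 8 << size,              # 16, 32 or 64 bits per element
        "g": (word >> 10) & 0x7,         # governing predicate P0-P7
        "n": (word >> 5) & 0x1F,         # source vector Zn
        "d": word & 0x1F,                # destination register Vd
    }
```

For example, under these assumptions decode_faddqv(0x6490A020) yields esize 32 with g = 0, n = 1 and d = 0.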

Operation

CheckSVEEnabled();
constant integer VL = CurrentVL;
constant integer PL = VL DIV 8;
constant integer segments = VL DIV 128;
constant integer elempersegment = 128 DIV esize;
constant integer segbits = segments*esize;
constant bits(PL) mask = P[g, PL];
constant bits(VL) operand = if AnyActiveElement(mask, esize) then Z[n, VL] else Zeros(VL);
constant bits(esize) identity = FPZero('0', esize);
bits(128) result = Zeros(128);

for e = 0 to elempersegment-1
    bits(segbits) stmp;
    for s = 0 to segments-1
        if ActivePredicateElement(mask, s * elempersegment + e, esize) then
            Elem[stmp, s, esize] = Elem[operand, s * elempersegment + e, esize];
        else
            Elem[stmp, s, esize] = identity;
    Elem[result, e, esize] = FPReduce(ReduceOp_FADD, stmp, esize, FPCR);
V[d, 128] = result;
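
The data flow of the operation can be sketched in Python as follows. This is an approximate model, not the architectural definition: it uses Python floats (double precision) instead of esize-wide IEEE values, ignores FPCR rounding and exception behaviour, and assumes the number of segments is a power of two. The names pairwise_sum and faddqv_model are hypothetical.

```python
def pairwise_sum(values):
    """Recursive pairwise reduction: reduce each half, then add the halves."""
    if len(values) == 1:
        return values[0]
    half = len(values) // 2
    return pairwise_sum(values[:half]) + pairwise_sum(values[half:])


def faddqv_model(elements, active, esize, vl):
    """Model FADDQV on a vector of vl bits given as a list of floats.

    elements[i] is element i of Zn; active[i] is True when the governing
    predicate marks element i as active. Returns the 128 // esize result
    elements of the 128-bit destination.
    """
    segments = vl // 128                 # assumed to be a power of two
    elems_per_segment = 128 // esize
    assert len(elements) == len(active) == segments * elems_per_segment

    result = []
    for e in range(elems_per_segment):
        # Gather element e of every segment; inactive elements become +0.0.
        lane = []
        for s in range(segments):
            idx = s * elems_per_segment + e
            lane.append(elements[idx] if active[idx] else 0.0)
        result.append(pairwise_sum(lane))
    return result
```

With vl = 256 and esize = 32 there are two segments of four elements, so each result element is the sum of one element from each segment. Because the reduction is pairwise rather than sequential, results on wider vectors can differ from a left-to-right sum in the low-order bits.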

Explanations

<Vd>: Is the name of the destination SIMD&FP register, encoded in the "Vd" field.
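<T>: Is an arrangement specifier, encoded in the "size" field: '01' = 8H, '10' = 4S, '11' = 2D ('00' is reserved).
<Pg>: Is the name of the governing scalable predicate register P0-P7, encoded in the "Pg" field.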
<Zn>: Is the name of the source scalable vector register, encoded in the "Zn" field.
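<Tb>: Is the size specifier, encoded in the "size" field: '01' = H, '10' = S, '11' = D ('00' is reserved).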