Floating-point add recursive reduction of quadword vector segments
Floating-point addition of the same element numbers from each 128-bit source vector segment using a recursive pairwise reduction, placing each result into the corresponding element number of the 128-bit SIMD&FP destination register. Inactive elements in the source vector are treated as +0.0.
Variants: FEAT_SVE2p1 || FEAT_SME2p1 (FEAT_SVE2p1 || FEAT_SME2p1)
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
0 | 1 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | |||||||||||||||
size | opc | Pg | Zn | Vd |
---|
FADDQV <Vd>.<T>, <Pg>, <Zn>.<Tb>
if !IsFeatureImplemented(FEAT_SVE2p1) && !IsFeatureImplemented(FEAT_SME2p1) then EndOfDecode(Decode_UNDEF); if size == '00' then EndOfDecode(Decode_UNDEF); constant integer esize = 8 << UInt(size); constant integer g = UInt(Pg); constant integer n = UInt(Zn); constant integer d = UInt(Vd);
CheckSVEEnabled(); constant integer VL = CurrentVL; constant integer PL = VL DIV 8; constant integer segments = VL DIV 128; constant integer elempersegment = 128 DIV esize; constant integer segbits = segments*esize; constant bits(PL) mask = P[g, PL]; constant bits(VL) operand = if AnyActiveElement(mask, esize) then Z[n, VL] else Zeros(VL); constant bits(esize) identity = FPZero('0', esize); bits(128) result = Zeros(128); for e = 0 to elempersegment-1 bits(segbits) stmp; for s = 0 to segments-1 if ActivePredicateElement(mask, s * elempersegment + e, esize) then Elem[stmp, s, esize] = Elem[operand, s * elempersegment + e, esize]; else Elem[stmp, s, esize] = identity; Elem[result, e, esize] = FPReduce(ReduceOp_FADD, stmp, esize, FPCR); V[d, 128] = result;