8-bit floating-point dot product to half-precision (vector)
This instruction computes the fused sum-of-products of a group of two 8-bit floating-point values held in each 16-bit element of the first and second source vectors. The half-precision sum-of-products are scaled by 2^(-UInt(FPMR.LSCALE[3:0])), before being destructively added without intermediate rounding to the corresponding half-precision elements of the destination vector.
The 8-bit floating-point encoding format for the elements of the first source vector is selected by FPMR.F8S1. The 8-bit floating-point encoding format for the elements of the second source vector is selected by FPMR.F8S2.
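As a rough numerical illustration of the per-element behaviour described above, the following Python sketch decodes two FP8 values from each 16-bit source element, sums their products, scales by 2^(-LSCALE) and adds the accumulator. It is a simplified model under stated assumptions, not the architectural definition: it assumes the two selectable formats are the OFP8 E5M2 and E4M3 encodings, takes the decoders as parameters because the mapping from FPMR.F8S1/F8S2 values to formats is not given here, uses a double-precision intermediate with one final rounding to half precision, and ignores FPCR controls and any saturation behaviour. The names `decode_e5m2`, `decode_e4m3`, `round_to_half` and `fdot_lane` are hypothetical.

```python
import math
import struct


def decode_e5m2(byte):
    """Decode an E5M2 byte (1 sign, 5 exponent, 2 mantissa bits, bias 15)."""
    sign = -1.0 if byte & 0x80 else 1.0
    exp = (byte >> 2) & 0x1F
    man = byte & 0x3
    if exp == 0x1F:  # IEEE-style infinities and NaNs
        return sign * math.inf if man == 0 else math.nan
    if exp == 0:  # zero and subnormals: (man / 4) * 2^(1 - 15)
        return sign * man * 2.0 ** -16
    return sign * (1 + man / 4) * 2.0 ** (exp - 15)


def decode_e4m3(byte):
    """Decode an OFP8 E4M3 byte (1 sign, 4 exponent, 3 mantissa bits, bias 7)."""
    sign = -1.0 if byte & 0x80 else 1.0
    exp = (byte >> 3) & 0xF
    man = byte & 0x7
    if exp == 0xF and man == 0x7:  # the only NaN pattern; E4M3 has no infinities
        return math.nan
    if exp == 0:  # zero and subnormals: (man / 8) * 2^(1 - 7)
        return sign * man * 2.0 ** -9
    return sign * (1 + man / 8) * 2.0 ** (exp - 7)


def round_to_half(x):
    """Round a Python float to the nearest IEEE half-precision value."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:  # finite value too large for half precision
        return math.copysign(math.inf, x)


def fdot_lane(acc, op1, op2, lscale, decode1=decode_e5m2, decode2=decode_e5m2):
    """Simplified model of one 16-bit lane: acc + (a0*b0 + a1*b1) * 2^-lscale.

    acc is a half-precision value held as a float; op1 and op2 are 16-bit
    integers that each pack two FP8 values; lscale stands in for
    FPMR.LSCALE[3:0].
    """
    total = 0.0
    for i in range(2):
        a = decode1((op1 >> (8 * i)) & 0xFF)
        b = decode2((op2 >> (8 * i)) & 0xFF)
        total += a * b  # products of FP8 values are exact in double precision
    scaled = total * 2.0 ** -(lscale & 0xF)
    return round_to_half(acc + scaled)  # single rounding at the end
```

For example, with E5M2 operands `op1 = 0x3C40` (1.0 and 2.0) and `op2 = 0x4242` (3.0 and 3.0), `fdot_lane(1.0, 0x3C40, 0x4242, 0)` gives 10.0, and the same call with `lscale = 1` gives 5.5.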
Variants: FEAT_FP8DOT2 (ARMv9.5)
| 31 | 30 | 29 | 28:24 | 23:22 | 21 | 20:16 | 15 | 14:11 | 10 | 9:5 | 4:0 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Q | 0 | 01110 | 01 | 0 | Rm | 1 | 1111 | 1 | Rn | Rd |
|  |  | U |  | size |  |  |  | opcode |  |  |  |
FDOT <Vd>.<Ta>, <Vn>.<Tb>, <Vm>.<Tb>
if !IsFeatureImplemented(FEAT_FP8DOT2) then EndOfDecode(Decode_UNDEF);
constant integer d = UInt(Rd);
constant integer n = UInt(Rn);
constant integer m = UInt(Rm);
constant integer datasize = if Q == '1' then 128 else 64;
constant integer esize = 16;
constant integer elements = datasize DIV esize;
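The decode step above can be mirrored in a few lines of Python. This is a minimal sketch, assuming the usual A64 Advanced SIMD field positions (Q at bit 30, Rm at bits 20:16, Rn at bits 9:5, Rd at bits 4:0). It does not check the fixed opcode bits or the FEAT_FP8DOT2 feature test, and the function name `decode_fdot_fields` is hypothetical.

```python
def decode_fdot_fields(insn):
    """Extract the operands the decode pseudocode derives from a 32-bit word.

    Field positions are assumptions based on the common A64 Advanced SIMD
    layout; no validation of the fixed bits is performed.
    """
    q = (insn >> 30) & 0x1
    rm = (insn >> 16) & 0x1F
    rn = (insn >> 5) & 0x1F
    rd = insn & 0x1F

    datasize = 128 if q == 1 else 64  # Q selects the full or half vector
    esize = 16                        # each element is one half-precision value
    return {
        "d": rd,
        "n": rn,
        "m": rm,
        "datasize": datasize,
        "esize": esize,
        "elements": datasize // esize,
    }
```

With an illustrative word such as `0x4E42FC20` (Q set, Rm = 2, Rn = 1, Rd = 0), the sketch reports a 128-bit datasize and 8 elements; clearing bit 30 halves both.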
CheckFPMREnabled();
CheckFPAdvSIMDEnabled64();
constant bits(datasize) operand1 = V[n, datasize];
constant bits(datasize) operand2 = V[m, datasize];
constant bits(datasize) operand3 = V[d, datasize];
bits(datasize) result;
for e = 0 to elements-1
    constant bits(esize) op1 = Elem[operand1, e, esize];
    constant bits(esize) op2 = Elem[operand2, e, esize];
    bits(esize) sum = Elem[operand3, e, esize];
    sum = FP8DotAddFP(sum, op1, op2, FPCR, FPMR);
    Elem[result, e, esize] = sum;
V[d, datasize] = result;
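The element loop in the operation pseudocode can be sketched in Python by treating each vector register as an integer and slicing out 16-bit lanes. This is an illustrative model only: `elem_get`, `elem_set` and `fdot_vector` are hypothetical names, the enable checks are omitted, and `lane_op` is a caller-supplied stand-in for FP8DotAddFP that works on raw 16-bit patterns.

```python
def elem_get(vec, e, esize):
    """Return element e (esize bits wide) of an integer-encoded vector."""
    return (vec >> (e * esize)) & ((1 << esize) - 1)


def elem_set(vec, e, esize, value):
    """Return vec with element e replaced by the low esize bits of value."""
    lane_mask = (1 << esize) - 1
    shift = e * esize
    return (vec & ~(lane_mask << shift)) | ((value & lane_mask) << shift)


def fdot_vector(vd, vn, vm, datasize, lane_op):
    """Apply lane_op(acc, op1, op2) to every 16-bit lane, as the loop does.

    vd, vn and vm are non-negative integers holding datasize-bit vectors.
    lane_op is a hypothetical stand-in for FP8DotAddFP.
    """
    esize = 16
    elements = datasize // esize
    result = 0
    for e in range(elements):
        op1 = elem_get(vn, e, esize)
        op2 = elem_get(vm, e, esize)
        acc = elem_get(vd, e, esize)  # destination is read as the accumulator
        result = elem_set(result, e, esize, lane_op(acc, op1, op2))
    return result
```

Passing a toy integer `lane_op` such as `lambda acc, a, b: (acc + a + b) & 0xFFFF` is enough to check the lane plumbing; a floating-point model of the per-lane arithmetic can be substituted without changing the loop.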