BFCVTNT

Single-precision down convert and narrow to BFloat16 (top, predicated)

Convert each active single-precision floating-point element of the source vector to BFloat16, and place the results in the odd-numbered 16-bit elements of the destination vector, leaving the even-numbered elements unchanged. Inactive elements in the destination vector register remain unmodified or are set to zero, depending on whether merging or zeroing predication is selected.

ID_AA64ZFR0_EL1.BF16 indicates whether this instruction is implemented.
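
As an illustration of the ID register check, the following Python sketch extracts the BF16 field from a raw ID_AA64ZFR0_EL1 value. It assumes the field occupies bits [23:20] and that any nonzero value means the SVE BFloat16 instructions are implemented; confirm both against the register description. The helper names are hypothetical, and the raw register value must be obtained by other means, since the register is not readable from Python.

```python
# Assumption: BF16 is the 4-bit field at bits [23:20] of ID_AA64ZFR0_EL1.
BF16_SHIFT = 20
BF16_MASK = 0xF


def bf16_field(id_aa64zfr0_el1: int) -> int:
    """Return the raw BF16 field of a 64-bit ID_AA64ZFR0_EL1 value."""
    return (id_aa64zfr0_el1 >> BF16_SHIFT) & BF16_MASK


def has_sve_bf16(id_aa64zfr0_el1: int) -> bool:
    """Assumption: a nonzero BF16 field means BFCVTNT is implemented."""
    return bf16_field(id_aa64zfr0_el1) != 0
```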

Encoding: Merging

Variants: (FEAT_SVE || FEAT_SME) && FEAT_BF16

Bits:   31:24     23:22  21:18  17:16  15:13  12:10  9:5  4:0
Value:  01100100  10     0010   10     101    Pg     Zn   Zd
Field:            opc           opc2

BFCVTNT <Zd>.H, <Pg>/M, <Zn>.S

Decoding algorithm

if ((!IsFeatureImplemented(FEAT_SVE) && !IsFeatureImplemented(FEAT_SME)) ||
    !IsFeatureImplemented(FEAT_BF16)) then EndOfDecode(Decode_UNDEF);
constant integer g = UInt(Pg);
constant integer n = UInt(Zn);
constant integer d = UInt(Zd);
constant boolean merging = TRUE;

Encoding: Zeroing

Variants: FEAT_SVE2p2 || FEAT_SME2p2

Bits:   31:24     23:22  21:18  17:16  15:13  12:10  9:5  4:0
Value:  01100100  10     0000   10     101    Pg     Zn   Zd
Field:            opc           opc2

BFCVTNT <Zd>.H, <Pg>/Z, <Zn>.S

Decoding algorithm

if !IsFeatureImplemented(FEAT_SVE2p2) && !IsFeatureImplemented(FEAT_SME2p2) then
    EndOfDecode(Decode_UNDEF);
constant integer g = UInt(Pg);
constant integer n = UInt(Zn);
constant integer d = UInt(Zd);
constant boolean merging = FALSE;
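
The two encodings differ only in their fixed bits, so assembling and decoding an instruction word can be sketched in a few lines of Python. This is a minimal illustration, not an assembler: the function names are hypothetical, the fixed bit strings are copied from the two encoding diagrams (bits 31 down to 13), and the operand field widths (Pg 3 bits, Zn and Zd 5 bits each) are inferred from the 13 bits that remain. The feature checks performed by the decoding algorithms are not modelled.

```python
# Fixed bits 31..13, copied from the Merging and Zeroing encoding diagrams.
MERGING_FIXED = int("0110010010001010101", 2)
ZEROING_FIXED = int("0110010010000010101", 2)


def encode_bfcvtnt(zd: int, pg: int, zn: int, zeroing: bool = False) -> int:
    """Build the 32-bit word for BFCVTNT <Zd>.H, <Pg>/M or /Z, <Zn>.S."""
    if not (0 <= zd <= 31 and 0 <= zn <= 31):
        raise ValueError("Zd and Zn must be in the range 0..31")
    if not 0 <= pg <= 7:
        raise ValueError("Pg must be one of P0-P7")
    fixed = ZEROING_FIXED if zeroing else MERGING_FIXED
    return (fixed << 13) | (pg << 10) | (zn << 5) | zd


def decode_bfcvtnt(word: int):
    """Return g, n, d and merging as in the decode pseudocode, or None."""
    fixed = (word >> 13) & 0x7FFFF
    if fixed == MERGING_FIXED:
        merging = True
    elif fixed == ZEROING_FIXED:
        merging = False
    else:
        return None  # not a BFCVTNT encoding
    return {
        "g": (word >> 10) & 0x7,
        "n": (word >> 5) & 0x1F,
        "d": word & 0x1F,
        "merging": merging,
    }


print(hex(encode_bfcvtnt(zd=0, pg=0, zn=0)))  # 0x648aa000
```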

Operation

CheckSVEEnabled();
constant integer VL = CurrentVL;
constant integer PL = VL DIV 8;
constant integer elements = VL DIV 32;
constant bits(PL) mask = P[g, PL];
constant bits(VL) operand = if AnyActiveElement(mask, 32) then Z[n, VL] else Zeros(VL);
bits(VL) result = Z[d, VL];

for e = 0 to elements-1
    if ActivePredicateElement(mask, e, 32) then
        constant bits(32) element = Elem[operand, e, 32];
        Elem[result, 2*e+1, 16] = FPConvertBF(element, FPCR);
    elsif !merging then
        Elem[result, 2*e+1, 16] = Zeros(16);

Z[d, VL] = result;
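
The element placement in the Operation pseudocode can be illustrated with a small Python reference model. This is a sketch under stated assumptions, not the architectural behaviour in full: vectors are modelled as lists of raw 16-bit or 32-bit values, the predicate is modelled as one boolean per 32-bit element, and the hypothetical helper bf16_from_f32_bits stands in for FPConvertBF using round-to-nearest-even only. FPCR rounding-mode selection, exception flags, and the exact NaN handling are not modelled.

```python
import struct


def f32_bits(x: float) -> int:
    """Raw IEEE 754 single-precision bit pattern of x."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bf16_from_f32_bits(bits: int) -> int:
    """Simplified stand-in for FPConvertBF (round-to-nearest-even only)."""
    is_nan = (bits & 0x7F800000) == 0x7F800000 and (bits & 0x007FFFFF) != 0
    if is_nan:
        # Assumption: keep the top half and force a quiet NaN.
        return ((bits >> 16) | 0x0040) & 0xFFFF
    rounding = 0x7FFF + ((bits >> 16) & 1)  # ties go to the even result
    return ((bits + rounding) >> 16) & 0xFFFF


def bfcvtnt(zd_h, pg, zn_s, merging=True):
    """zd_h: 16-bit elements of Zd; zn_s: 32-bit elements of Zn;
    pg: one boolean per 32-bit element (simplified predicate)."""
    result = list(zd_h)  # start from the old destination value
    for e, active in enumerate(pg):
        if active:
            result[2 * e + 1] = bf16_from_f32_bits(zn_s[e])
        elif not merging:
            result[2 * e + 1] = 0
        # Even-numbered 16-bit elements are never written.
    return result


# 128-bit vector: four 32-bit source elements, eight 16-bit destination elements.
zd = [0x1111] * 8
zn = [f32_bits(v) for v in (1.0, 2.0, -1.0, 3.0)]
pg = [True, False, True, False]
merged = bfcvtnt(zd, pg, zn, merging=True)
zeroed = bfcvtnt(zd, pg, zn, merging=False)
```

With these inputs, elements 1 and 5 of both results hold the converted values 0x3F80 and 0xBF80. Elements 3 and 7 keep 0x1111 in the merging form and become zero in the zeroing form, and the even-numbered elements keep 0x1111 in both.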

Explanations

<Zd>: Is the name of the destination scalable vector register, encoded in the "Zd" field.
<Pg>: Is the name of the governing scalable predicate register P0-P7, encoded in the "Pg" field.
<Zn>: Is the name of the source scalable vector register, encoded in the "Zn" field.