ZIP (four registers)

Interleave elements from four vectors

This instruction places the four-way interleaved elements from the four source vectors in the corresponding elements of the four destination vectors.

This instruction is unpredicated.

Encoding: 8-bit to 64-bit elements

Variants: FEAT_SME2 (ARMv9.3)

31:24     23:22  21:16   15:10   9:7  6:5  4:2  1:0
11000001  size   110110  111000  Zn   00   Zd   00 (includes op)

ZIP { <Zd1>.<T>-<Zd4>.<T> }, { <Zn1>.<T>-<Zn4>.<T> }

Decoding algorithm

if !IsFeatureImplemented(FEAT_SME2) then EndOfDecode(Decode_UNDEF);
if size == '11' && MaxImplementedSVL() < 256 then EndOfDecode(Decode_UNDEF);
constant integer esize = 8 << UInt(size);
constant integer n = UInt(Zn:'00');
constant integer d = UInt(Zd:'00');
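
The decode step above can be sketched in Python as a minimal model, not a definitive implementation. The function name `decode_zip4` and the `max_svl` parameter are hypothetical; the sketch takes the already-extracted field values as integers and assumes FEAT_SME2 is implemented, so that check is omitted.

```python
def decode_zip4(size: int, zn: int, zd: int, max_svl: int = 512):
    """Model the decode of the 8-bit to 64-bit encoding.

    size is the 2-bit "size" field, zn and zd are the 3-bit "Zn" and "Zd"
    fields. max_svl stands in for MaxImplementedSVL() (an assumption of
    this sketch). Returns (esize, n, d).
    """
    if not (0 <= size <= 3 and 0 <= zn <= 7 and 0 <= zd <= 7):
        raise ValueError("field value out of range")
    # 64-bit elements need room for four of them in one vector.
    if size == 0b11 and max_svl < 256:
        raise ValueError("UNDEFINED: 64-bit elements require SVL >= 256")
    esize = 8 << size   # 8, 16, 32 or 64 bits
    n = zn << 2         # UInt(Zn:'00'): first source register, multiple of 4
    d = zd << 2         # UInt(Zd:'00'): first destination register
    return esize, n, d


print(decode_zip4(size=2, zn=3, zd=1))
```

Appending '00' to a 3-bit field is the same as multiplying it by 4, which is why both register groups always start at Z0, Z4, ..., Z28.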

Encoding: 128-bit element

Variants: FEAT_SME2 (ARMv9.3)

31:24     23:22  21:16   15:10   9:7  6:5  4:2  1:0
11000001  00     110111  111000  Zn   00   Zd   00 (includes op)

ZIP { <Zd1>.Q-<Zd4>.Q }, { <Zn1>.Q-<Zn4>.Q }

Decoding algorithm

if !IsFeatureImplemented(FEAT_SME2) then EndOfDecode(Decode_UNDEF);
if MaxImplementedSVL() < 512 then EndOfDecode(Decode_UNDEF);
constant integer esize = 128;
constant integer n = UInt(Zn:'00');
constant integer d = UInt(Zd:'00');

Operation

CheckStreamingSVEEnabled();
constant integer VL = CurrentVL;
if VL < esize * 4 then EndOfDecode(Decode_UNDEF);
constant integer quads = VL DIV (esize * 4);
constant bits(VL) operand0 = Z[n, VL];
constant bits(VL) operand1 = Z[n+1, VL];
constant bits(VL) operand2 = Z[n+2, VL];
constant bits(VL) operand3 = Z[n+3, VL];
bits(VL) result;

for r = 0 to 3
    constant integer base = r * quads;
    for q = 0 to quads-1
        Elem[result, 4*q+0, esize] = Elem[operand0, base+q, esize];
        Elem[result, 4*q+1, esize] = Elem[operand1, base+q, esize];
        Elem[result, 4*q+2, esize] = Elem[operand2, base+q, esize];
        Elem[result, 4*q+3, esize] = Elem[operand3, base+q, esize];
    Z[d+r, VL] = result;
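
As an illustration of the element ordering, the loop above can be modelled in Python with each vector held as a list of elements, where index 0 is the lowest-numbered element. This is a sketch under stated assumptions: the name `zip4` is hypothetical, registers are passed in and returned as lists instead of being read from and written to a register file, and the streaming-mode check is not modelled.

```python
def zip4(sources, esize, vl):
    """Four-way interleave of four source vectors.

    sources: list of four vectors, each a list of vl // esize elements.
    Returns four destination vectors as new lists, so the result is the
    same whether or not source and destination registers overlap.
    """
    if vl < esize * 4:
        raise ValueError("UNDEFINED: VL too small for this element size")
    elems = vl // esize
    quads = vl // (esize * 4)
    if len(sources) != 4 or any(len(s) != elems for s in sources):
        raise ValueError("expected four vectors of vl // esize elements")

    results = []
    for r in range(4):
        base = r * quads            # destination r consumes one quarter of each source
        result = [None] * elems
        for q in range(quads):
            for i in range(4):      # one element from each source, in order
                result[4 * q + i] = sources[i][base + q]
        results.append(result)
    return results


# VL = 256 bits with 32-bit elements: 8 elements per vector, 2 quads.
srcs = [[f"{name}{i}" for i in range(8)] for name in "abcd"]
for vec in zip4(srcs, esize=32, vl=256):
    print(vec)
# First line printed: ['a0', 'b0', 'c0', 'd0', 'a1', 'b1', 'c1', 'd1']
```

Destination register r therefore holds the interleave of elements r*quads to (r+1)*quads-1 of every source, so reading the four destinations in order gives the complete four-way interleave.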

Explanations

<Zd1>: Is the name of the first scalable vector register of the destination multi-vector group, encoded as "Zd" times 4.
<T>: Is the size specifier, encoded in "size": 00 = B, 01 = H, 10 = S, 11 = D.
<Zd4>: Is the name of the fourth scalable vector register of the destination multi-vector group, encoded as "Zd" times 4 plus 3.
<Zn1>: Is the name of the first scalable vector register of the source multi-vector group, encoded as "Zn" times 4.
<Zn4>: Is the name of the fourth scalable vector register of the source multi-vector group, encoded as "Zn" times 4 plus 3.
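
To make the register-name rules concrete, the following Python sketch renders the operand text of the 8-bit to 64-bit form from its field values. The helper name `zip4_syntax` is hypothetical, and the B/H/S/D suffix mapping is assumed to follow the usual element-size convention implied by esize = 8 << UInt(size).

```python
# Assumed mapping from the "size" field to the <T> suffix.
SUFFIX = {0: "B", 1: "H", 2: "S", 3: "D"}


def zip4_syntax(size: int, zn: int, zd: int) -> str:
    """Render the assembler operands for given size, Zn and Zd field values."""
    t = SUFFIX[size]
    n = zn * 4   # <Zn1> is "Zn" times 4, <Zn4> is that plus 3
    d = zd * 4   # <Zd1> is "Zd" times 4, <Zd4> is that plus 3
    return f"ZIP {{ Z{d}.{t}-Z{d + 3}.{t} }}, {{ Z{n}.{t}-Z{n + 3}.{t} }}"


print(zip4_syntax(size=1, zn=2, zd=0))
# prints: ZIP { Z0.H-Z3.H }, { Z8.H-Z11.H }
```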

Operational Notes

If PSTATE.DIT is 1: