SETGPT, SETGMT, SETGET

Memory set with tag setting, unprivileged

These instructions set a requested number of bytes in memory to the value in the least significant byte of the source data register and store an Allocation Tag to memory for each Tag Granule written. The Allocation Tag is calculated from the Logical Address Tag in the register that holds the first address to be set. The prologue, main, and epilogue instructions are expected to be run in succession and to appear consecutively in memory: SETGPT, then SETGMT, and then SETGET.
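
As a rough illustration of where the Allocation Tag comes from, the sketch below assumes that the Logical Address Tag occupies bits [59:56] of the 64-bit address held in the destination register, as in FEAT_MTE, and that a Tag Granule is 16 bytes. The function names are hypothetical and the code models only the bit extraction, not tag checking.

```python
TAG_GRANULE = 16  # bytes covered by one Allocation Tag (assumed granule size)


def allocation_tag_from_address(address: int) -> int:
    """Return the 4-bit Logical Address Tag held in bits [59:56]."""
    return (address >> 56) & 0xF


def granules_written(set_size: int) -> int:
    """Number of Tag Granules, and so Allocation Tags, covered by set_size bytes."""
    assert set_size >= 0 and set_size % TAG_GRANULE == 0
    return set_size // TAG_GRANULE
```

Every granule written by one SETGPT, SETGMT, SETGET sequence receives the same tag, because the tag is taken from the first address to be set.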

SETGPT performs some preconditioning of the arguments suitable for using the SETGMT instruction, and sets an IMPLEMENTATION DEFINED portion of the requested number of bytes. SETGMT sets a further IMPLEMENTATION DEFINED portion of the remaining bytes. SETGET sets any final remaining bytes.

The ability to set an IMPLEMENTATION DEFINED number of bytes allows an implementation to optimize how the bytes being set are divided between the different instructions.
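
The division of work between the three instructions can be pictured with the following sketch. The parameters prologue_granules and main_granules are hypothetical stand-ins for the IMPLEMENTATION DEFINED choices; a real implementation may pick them in any way it likes, and the only property the sketch tries to show is that the epilogue always finishes whatever is left.

```python
TAG_GRANULE = 16  # assumed Tag Granule size in bytes


def split_stages(total: int, prologue_granules: int, main_granules: int):
    """Split total bytes into (prologue, main, epilogue) stage sizes.

    prologue_granules and main_granules model the IMPLEMENTATION DEFINED
    portions; each is clamped to what is still left to set.
    """
    assert total >= 0 and total % TAG_GRANULE == 0
    prologue = min(total, prologue_granules * TAG_GRANULE)
    main = min(total - prologue, main_granules * TAG_GRANULE)
    epilogue = total - prologue - main  # SETGET sets any final remaining bytes
    return prologue, main, epilogue
```

Whatever values are chosen, the three sizes are multiples of 16 and sum to the requested size, which is why software cannot rely on any particular split.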

For more information on exceptions specific to memory set instructions, see Memory Copy and Memory Set exceptions.

The architecture supports two algorithms for the memory set: option A and option B. Which algorithm is used is IMPLEMENTATION DEFINED.

Portable software should not assume that the choice of algorithm is constant.

For SETGPT:

  • If Xn<63> == 1, the set size is saturated to 0x7FFFFFFFFFFFFFF0.
  • On completion of SETGPT, option A:
      • Xn holds -1 times the number of bytes in the saturated set size remaining to be set.
      • Xd holds the original Xd + saturated set size.
      • PSTATE.{N,Z,C,V} are set to {0,0,0,0}.
  • On completion of SETGPT, option B:
      • Xn holds the number of bytes in the saturated set size remaining to be set.
      • Xd holds the lowest address that has not been set.
      • PSTATE.{N,Z,C,V} are set to {0,0,1,0}.
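
The two SETGPT register conventions can be compared with a small model. This is a sketch only: bytes_set is a hypothetical parameter standing in for the IMPLEMENTATION DEFINED prologue portion, registers are treated as 64-bit unsigned bit patterns, and alignment faults are not modelled.

```python
MASK64 = (1 << 64) - 1
MAX_SETG_SIZE = 0x7FFFFFFFFFFFFFF0  # saturation value quoted for SETGPT


def setgpt_result(xd: int, xn: int, bytes_set: int, option_a: bool):
    """Return (Xd, Xn, NZCV) after SETGPT has set bytes_set bytes."""
    size = min(xn, MAX_SETG_SIZE)  # also covers the Xn<63> == 1 case
    assert 0 <= bytes_set <= size
    remaining = size - bytes_set
    if option_a:
        # Xd points just past the region; Xn is the negated remaining count.
        return (xd + size) & MASK64, (-remaining) & MASK64, 0b0000
    # Option B: Xd is the lowest address not yet set and PSTATE.C is 1.
    return (xd + bytes_set) & MASK64, remaining, 0b0010


def lowest_unset_address(xd: int, xn: int, nzcv: int) -> int:
    """Recover the next address to set from either register convention."""
    if nzcv & 0b0010:          # PSTATE.C == 1 selects option B
        return xd
    return (xd + xn) & MASK64  # option A: Xn is negative, so this steps back
```

Both conventions describe the same pending region; only the encoding in Xd and Xn differs, and PSTATE.C tells the later instructions which encoding is in use.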

For SETGMT, option A, when PSTATE.C = 0:

  • Xn holds a signed 64-bit integer.
  • Xn holds -1 times the number of bytes remaining to be set.
  • Xd holds the lowest address to be set - Xn.
  • On completion of the instruction, Xn holds -1 times the number of bytes remaining to be set.
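
For SETGMT, option B, when PSTATE.C = 1:

  • Xn holds the number of bytes remaining to be set.
  • Xd holds the lowest address to be set.
  • On completion of the instruction:
      • Xn holds the number of bytes remaining to be set.
      • Xd holds the lowest address that has not been set.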

For SETGET, option A, when PSTATE.C = 0:

  • Xn holds a signed 64-bit integer.
  • Xn holds -1 times the number of bytes remaining to be set.
  • Xd holds the lowest address to be set - Xn.
  • On completion of the instruction, Xn holds 0.

For SETGET, option B, when PSTATE.C = 1:

  • Xn holds the number of bytes remaining to be set.
  • Xd holds the lowest address to be set.
  • On completion of the instruction:
      • Xn holds 0.
      • Xd holds the lowest address that has not been set.

Explicit Memory Write effects produced by the instruction behave as if the instruction was executed at EL0 if the Effective value of PSTATE.UAO is 0 and either:

  • The instruction is executed at EL1.
  • The instruction is executed at EL2 when the Effective value of HCR_EL2.{E2H, TGE} is {1, 1}.

Otherwise, the Explicit Memory Write effects operate with the restrictions determined by the Exception level at which the instruction is executed.
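
The rule for unprivileged writes can be written as a single predicate. The function name and its integer parameters are hypothetical; they stand for the current Exception level and the Effective values of PSTATE.UAO, HCR_EL2.E2H, and HCR_EL2.TGE, and the sketch ignores every other control that might affect permission checks.

```python
def write_acts_as_el0(el: int, uao: int, e2h: int, tge: int) -> bool:
    """True if the writes are treated as if executed at EL0.

    el is the current Exception level (0 to 3). At EL0 the result is
    False, but the writes are then restricted by EL0 itself anyway.
    """
    if uao != 0:
        return False
    return el == 1 or (el == 2 and e2h == 1 and tge == 1)
```

When the predicate is False, the writes are checked against the Exception level at which the instruction actually executes.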

    Encoding: Integer

    Variants: FEAT_MOPS && FEAT_MTE

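    Bits     Field   Value
    31:30    sz      variable (values other than 00 are UNDEFINED, see Decoding algorithm)
    29:27            011
    26       o0      1
    25:24            01
    23:22    op1     11
    21               0
    20:16    Rs      variable
    15:12    op2     xx01 (xx selects the Prologue, Main, or Epilogue variant)
    11:10            01
    9:5      Rn      variable
    4:0      Rd      variable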

    Prologue (op2 == 0001)

    SETGPT [<Xd>]!, <Xn>!, <Xs>

    Main (op2 == 0101)

    SETGMT [<Xd>]!, <Xn>!, <Xs>

    Epilogue (op2 == 1001)

    SETGET [<Xd>]!, <Xn>!, <Xs>
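
For readers who want to check an assembler or disassembler, the following sketch builds the 32-bit instruction word for the three variants. It assumes the usual A64 field positions (Rd in bits [4:0], Rn in [9:5], op2 in [15:12], Rs in [20:16]) and the fixed bits shown in the encoding above; the function name is hypothetical, and the CONSTRAINED UNPREDICTABLE register checks are not modelled.

```python
# op2 values for the three variants, as listed for each assembler form.
STAGE_OP2 = {"SETGPT": 0b0001, "SETGMT": 0b0101, "SETGET": 0b1001}


def encode_setg_t(mnemonic: str, rd: int, rn: int, rs: int) -> int:
    """Return the instruction word for `<mnemonic> [Xd]!, Xn!, Xs`."""
    assert all(0 <= r <= 31 for r in (rd, rn, rs))
    word = 0b00 << 30                   # sz
    word |= 0b011 << 27                 # fixed bits [29:27]
    word |= 0b1 << 26                   # o0
    word |= 0b01 << 24                  # fixed bits [25:24]
    word |= 0b11 << 22                  # op1
    word |= rs << 16                    # Rs (bit 21 stays 0)
    word |= STAGE_OP2[mnemonic] << 12   # op2
    word |= 0b01 << 10                  # fixed bits [11:10]
    word |= rn << 5                     # Rn
    word |= rd                          # Rd
    return word
```

For example, `encode_setg_t("SETGPT", 0, 1, 2)` corresponds to `SETGPT [X0]!, X1!, X2` under these assumptions.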

    Decoding algorithm

    if !IsFeatureImplemented(FEAT_MOPS) || !IsFeatureImplemented(FEAT_MTE) || sz != '00' then
        EndOfDecode(Decode_UNDEF);
    
    SETParams memset;
    memset.d = UInt(Rd);
    memset.s = UInt(Rs);
    memset.n = UInt(Rn);
    constant bits(2) options = op2<1:0>;
    constant boolean nontemporal = options<1> == '1';
    
    case op2<3:2> of
        when '00' memset.stage = MOPSStage_Prologue;
        when '01' memset.stage = MOPSStage_Main;
        when '10' memset.stage = MOPSStage_Epilogue;
        otherwise EndOfDecode(Decode_UNDEF);
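
The decode steps above can be mirrored in a few lines. This is a simplified sketch: it assumes the word has already been matched against the fixed bits of this encoding, it does not model the FEAT_MOPS and FEAT_MTE checks, and the function and key names are hypothetical. The "unprivileged" key reports options<0>, which the Operation pseudocode uses to select the unprivileged access rule.

```python
STAGES = {0b00: "Prologue", 0b01: "Main", 0b10: "Epilogue"}


def decode_setg_t(word: int):
    """Return the decoded fields as a dict, or None where decode is UNDEFINED."""
    sz = (word >> 30) & 0b11
    op2 = (word >> 12) & 0b1111
    stage = STAGES.get(op2 >> 2)  # op2<3:2> selects the stage
    if sz != 0b00 or stage is None:
        return None  # corresponds to EndOfDecode(Decode_UNDEF)
    options = op2 & 0b11          # op2<1:0>
    return {
        "stage": stage,
        "nontemporal": bool(options & 0b10),
        "unprivileged": bool(options & 0b01),
        "d": word & 0x1F,
        "n": (word >> 5) & 0x1F,
        "s": (word >> 16) & 0x1F,
    }
```

For the three instructions on this page, op2<1:0> is 01, so the sketch reports a temporal, unprivileged access.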

    Operation

    CheckMOPSEnabled();
    
    CheckSETConstrainedUnpredictable(memset.n, memset.d, memset.s);
    
    constant bits(8) data = X[memset.s, 8];
    MOPSBlockSize B;
    
    memset.is_setg = TRUE;
    memset.nzcv = PSTATE.<N,Z,C,V>;
    memset.toaddress = X[memset.d, 64];
    if memset.stage == MOPSStage_Prologue then
        memset.setsize = UInt(X[memset.n, 64]);
    else
        memset.setsize = SInt(X[memset.n, 64]);
    memset.implements_option_a = SETGOptionA();
    
    constant boolean privileged = (if options<0> == '1' then AArch64.IsUnprivAccessPriv()
                                   else PSTATE.EL != EL0);
    
    constant AccessDescriptor accdesc = CreateAccDescSTGMOPS(privileged, nontemporal);
    
    if memset.stage == MOPSStage_Prologue then
        if memset.setsize > ArchMaxMOPSSETGSize then
            memset.setsize = ArchMaxMOPSSETGSize;
    
        if ((memset.setsize != 0 && !IsAligned(memset.toaddress, TAG_GRANULE)) ||
                !IsAligned(memset.setsize<63:0>, TAG_GRANULE)) then
            constant FaultRecord fault = AlignmentFault(accdesc, memset.toaddress);
            AArch64.Abort(fault);
    
        if memset.implements_option_a then
            memset.nzcv = '0000';
            memset.toaddress = memset.toaddress + memset.setsize;
            memset.setsize   = 0 - memset.setsize;
        else
            memset.nzcv = '0010';
    
    memset.stagesetsize = MemSetStageSize(memset);
    
    if memset.stage != MOPSStage_Prologue then
        CheckMemSetParams(memset, options);
    
        bits(64) fault_address;
        if memset.implements_option_a then
            fault_address = memset.toaddress + memset.setsize;
        else
            fault_address = memset.toaddress;
    
        if (memset.setsize != 0 && (memset.stagesetsize != 0 || MemStageSetZeroSizeCheck()) &&
              !IsAligned(memset.toaddress, TAG_GRANULE)) then
            constant FaultRecord fault = AlignmentFault(accdesc, fault_address);
            AArch64.Abort(fault);
        if ((memset.stagesetsize != 0 || MemStageSetZeroSizeCheck()) &&
               !IsAligned(memset.setsize<63:0>, TAG_GRANULE)) then
            constant FaultRecord fault = AlignmentFault(accdesc, fault_address);
            AArch64.Abort(fault);
    
    integer tagstep;
    bits(4) tag;
    bits(64) tagaddr;
    AddressDescriptor memaddrdesc;
    PhysMemRetStatus  memstatus;
    integer memory_set;
    boolean fault = FALSE;
    
    if memset.implements_option_a then
        while memset.stagesetsize < 0 && !fault do
            // IMP DEF selection of the block size that is worked on. While many
            // implementations might make this constant, that is not assumed.
            B = SETSizeChoice(memset, TAG_GRANULE);
            assert B <= -1 * memset.stagesetsize && B<3:0> == '0000';
    
            (memory_set, memaddrdesc, memstatus) = MemSetBytes(memset.toaddress + memset.setsize,
                                                               data, B, accdesc);
    
            if memory_set != B then
                fault = TRUE;
            else
                tagstep = B DIV TAG_GRANULE;
                tag = AArch64.AllocationTagFromAddress(memset.toaddress + memset.setsize);
    
                while tagstep > 0 do
                    tagaddr = memset.toaddress + memset.setsize + (tagstep - 1) * TAG_GRANULE;
                    AArch64.MemTag[tagaddr, accdesc] = tag;
                    tagstep = tagstep - 1;
    
                memset.setsize      = memset.setsize      + B;
                memset.stagesetsize = memset.stagesetsize + B;
    
    else
        while memset.stagesetsize > 0 && !fault do
            // IMP DEF selection of the block size that is worked on. While many
            // implementations might make this constant, that is not assumed.
            B = SETSizeChoice(memset, TAG_GRANULE);
            assert B <= memset.stagesetsize && B<3:0> == '0000';
    
            (memory_set, memaddrdesc, memstatus) = MemSetBytes(memset.toaddress, data, B, accdesc);
    
            if memory_set != B then
                fault = TRUE;
            else
                tagstep = B DIV TAG_GRANULE;
                tag = AArch64.AllocationTagFromAddress(memset.toaddress);
                while tagstep > 0 do
                    tagaddr = memset.toaddress + (tagstep - 1) * TAG_GRANULE;
                    AArch64.MemTag[tagaddr, accdesc] = tag;
                    tagstep = tagstep - 1;
    
                memset.toaddress    = memset.toaddress    + B;
                memset.setsize      = memset.setsize      - B;
                memset.stagesetsize = memset.stagesetsize - B;
    
    UpdateSetRegisters(memset, fault, memory_set);
    
    if fault then
        if IsFault(memaddrdesc) then
            AArch64.Abort(memaddrdesc.fault);
        else
            constant boolean iswrite = TRUE;
            HandleExternalAbort(memstatus, iswrite, memaddrdesc, B, accdesc);
    
    if memset.stage == MOPSStage_Prologue then
        PSTATE.<N,Z,C,V> = memset.nzcv;
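
The two loops in the Operation pseudocode can be exercised with a toy model. The sketch below is not the architectural behavior: memory is a Python bytearray indexed by the low 56 bits of the address (an assumption made so that a tagged pointer can index it), tags are kept in a dict keyed by granule index, faults and alignment checks are omitted, and stage_bytes and block are hypothetical stand-ins for MemSetStageSize and SETSizeChoice.

```python
TAG_GRANULE = 16
MASK64 = (1 << 64) - 1
ADDR_MASK = (1 << 56) - 1  # assumption: ignore the top byte when indexing memory


def to_signed(x: int) -> int:
    """Interpret a 64-bit pattern as a signed integer."""
    return x - (1 << 64) if x >> 63 else x


def run_stage(mem, tags, xd, xn, data, stage_bytes, option_a, block=TAG_GRANULE):
    """Set stage_bytes bytes and their tags; return the updated (Xd, Xn)."""
    assert stage_bytes % TAG_GRANULE == 0 and block % TAG_GRANULE == 0
    setsize = to_signed(xn) if option_a else xn
    assert stage_bytes <= abs(setsize)
    toaddress, left = xd, stage_bytes
    while left > 0:
        b = min(block, left)
        # Option A addresses relative to the end; option B uses Xd directly.
        start = (toaddress + setsize) & MASK64 if option_a else toaddress
        tag = (start >> 56) & 0xF  # Logical Address Tag, bits [59:56]
        lo = start & ADDR_MASK
        mem[lo:lo + b] = bytes([data & 0xFF]) * b
        for granule in range(lo // TAG_GRANULE, (lo + b) // TAG_GRANULE):
            tags[granule] = tag
        if option_a:
            setsize += b           # Xd stays fixed, Xn climbs towards 0
        else:
            toaddress = (toaddress + b) & MASK64
            setsize -= b           # Xd advances, Xn counts down to 0
        left -= b
    return toaddress, setsize & MASK64
```

Running three stages back to back, in the order prologue, main, epilogue, leaves Xn at 0 under either option, with the same bytes and tags written.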

    Explanations

    <Xd>: For the "Prologue" variant: is the 64-bit name of the general-purpose register that holds an encoding of the destination address (an integer multiple of 16) and is updated by the instruction, encoded in the "Rd" field.
    <Xd>: For the "Epilogue" and "Main" variants: is the 64-bit name of the general-purpose register that holds an encoding of the destination address (an integer multiple of 16) and for option B is updated by the instruction, encoded in the "Rd" field.
    <Xn>: For the "Prologue" variant: is the 64-bit name of the general-purpose register that holds the number of bytes to be set (an integer multiple of 16) and is updated by the instruction, encoded in the "Rn" field.
    <Xn>: For the "Main" variant: is the 64-bit name of the general-purpose register that holds an encoding of the number of bytes to be set (an integer multiple of 16) and is updated by the instruction, encoded in the "Rn" field.
    <Xn>: For the "Epilogue" variant: is the 64-bit name of the general-purpose register that holds an encoding of the number of bytes to be set (an integer multiple of 16) and is set to zero on completion of the instruction, encoded in the "Rn" field.
    <Xs>: For the "Main" and "Prologue" variants: is the 64-bit name of the general-purpose register that holds the source data in bits<7:0>, encoded in the "Rs" field.
    <Xs>: For the "Epilogue" variant: is the 64-bit name of the general-purpose register that holds the source data, encoded in the "Rs" field.