List of x86 cryptographic instructions
This is a list of instructions that have been added to the x86 instruction set to assist efficient calculation of cryptographic primitives, such as AES encryption, SHA hash calculation and random number generation.
Intel AES instructions
6 new instructions.
| Instruction | Encoding | Description | Added in | 
|---|---|---|---|
| AESENC xmm1,xmm2/m128 | 66 0F 38 DC /r | Perform one round of an AES encryption flow. Performs the SubBytes, ShiftRows, MixColumns and AddRoundKey steps of an AES encryption round, in that order.[a] The first source argument provides a 128-bit data-block to perform an encryption round on, the second source argument provides a round key for the AddRoundKey stage. | |
| AESENCLAST xmm1,xmm2/m128 | 66 0F 38 DD /r | Perform the last round of an AES encryption flow. Performs the SubBytes, ShiftRows and AddRoundKey steps of an AES encryption round, in that order.[a] | |
| AESDEC xmm1,xmm2/m128 | 66 0F 38 DE /r | Perform one round of an AES decryption flow. Performs the InvShiftRows, InvSubBytes, InvMixColumns and AddRoundKey steps of an AES decryption round, in that order.[a][b] | |
| AESDECLAST xmm1,xmm2/m128 | 66 0F 38 DF /r | Perform the last round of an AES decryption flow. Performs the InvShiftRows, InvSubBytes and AddRoundKey steps of an AES decryption round, in that order.[a] | |
| AESKEYGENASSIST xmm1,xmm2/m128,imm8 | 66 0F 3A DF /r ib | Assist in AES round key generation. The operation performed is: temp[127:0] := SubBytes( src[127:0] ); dest[31:0] := temp[63:32]; dest[63:32] := rotate_right( temp[63:32], 8 ) XOR RCON; dest[95:64] := temp[127:96]; dest[127:96] := rotate_right( temp[127:96], 8 ) XOR RCON; where SubBytes is the AES SubBytes step and RCON is the instruction's imm8 argument zero-extended to 32 bits. | |
| AESIMC xmm1,xmm2/m128 | 66 0F 38 DB /r | Perform the InvMixColumns step of an AES decryption round on one 128-bit block. Mainly used to help prepare an AES key for use with the AESDEC instruction.[b] | |
- ^ a b c d The SubBytes and ShiftRows steps of an AES encryption round may be performed in either order - the result of the instruction is the same either way.[1] (Intel documentation describes the ShiftRows step as being performed first, while AMD documentation describes SubBytes as being performed first.) This also applies to the InvShiftRows/InvSubBytes steps of an AES decryption round.
- ^ a b For the intended AES decode flow under AES-NI (a series of AESDEC instructions followed by an AESDECLAST), the AESDEC instruction performs the InvMixColumns and AddRoundKey steps in the opposite order of what the AES specification (FIPS 197) indicates.
 As a result of this, the AES round key provided as the second source argument to AESDEC cannot just be taken from the Rijndael key schedule directly, but instead has to be postprocessed by performing an InvMixColumns on the round key after the key schedule and before it's used with AESDEC[1] (this can be done with the AESIMC instruction or by doing an AESENCLAST + AESDEC sequence with the round key set to 0).
 This issue is specific to (V)AESDEC and does not apply to round keys used with the AESENC, AESENCLAST or AESDECLAST instructions.
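As a rough software model of the AESKEYGENASSIST operation listed in the table above, the following Python sketch computes the same four output dwords. It is not taken from Intel or AMD documentation: the helper names (`gf_mul`, `build_sbox`, `sub_word`, `ror32`, `aeskeygenassist`) are made up for illustration, and the 128-bit register is modelled as a Python integer with byte 0 in the lowest bits. Under that byte order, the FIPS 197 RotWord step amounts to rotating each dword right by 8 bits, which is how the sketch writes it.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo the AES polynomial x^8+x^4+x^3+x+1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def build_sbox():
    """Derive the AES S-box: multiplicative inverse followed by the affine map."""
    sbox = []
    for x in range(256):
        inv = 1
        for _ in range(254):          # x^254 is the inverse of x (and 0 for x = 0)
            inv = gf_mul(inv, x)
        y = inv
        for shift in (1, 2, 3, 4):    # XOR with four left-rotations of the inverse
            y ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        sbox.append(y ^ 0x63)
    return sbox


SBOX = build_sbox()


def sub_word(w):
    """Apply the S-box to each of the four bytes of a 32-bit value."""
    return sum(SBOX[(w >> s) & 0xFF] << s for s in (0, 8, 16, 24))


def ror32(w, n):
    """Rotate a 32-bit value right by n bits."""
    return ((w >> n) | (w << (32 - n))) & 0xFFFFFFFF


def aeskeygenassist(src, imm8):
    """Software model (assumption-laden sketch) of AESKEYGENASSIST on a 128-bit int."""
    x1 = sub_word((src >> 32) & 0xFFFFFFFF)
    x3 = sub_word((src >> 96) & 0xFFFFFFFF)
    rcon = imm8 & 0xFF                # imm8 zero-extended to 32 bits
    dwords = [x1, ror32(x1, 8) ^ rcon, x3, ror32(x3, 8) ^ rcon]
    return sum(d << (32 * i) for i, d in enumerate(dwords))
```

With the FIPS 197 example key 2b7e1516 28aed2a6 abf71588 09cf4f3c loaded via `int.from_bytes(key, "little")` and an RCON of 01h, dword 3 of the result holds the bytes 8b 84 eb 01, which is the value the AES-128 key expansion XORs into the first word of the next round key.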
CLMUL instructions
| Instruction | Opcode | Description | 
|---|---|---|
| PCLMULQDQ xmm1,xmm2/m128,imm8 | 66 0F 3A 44 /r ib | Perform a carry-less multiplication of two 64-bit polynomials over the finite field GF(2^k). | 
| PCLMULLQLQDQ xmm1,xmm2/m128 | 66 0F 3A 44 /r 00 | Multiply the low halves of the two 128-bit operands. | 
| PCLMULHQLQDQ xmm1,xmm2/m128 | 66 0F 3A 44 /r 01 | Multiply the high half of the destination register by the low half of the source operand. | 
| PCLMULLQHQDQ xmm1,xmm2/m128 | 66 0F 3A 44 /r 10 | Multiply the low half of the destination register by the high half of the source operand. | 
| PCLMULHQHQDQ xmm1,xmm2/m128 | 66 0F 3A 44 /r 11 | Multiply the high halves of the two 128-bit operands. | 
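To make the table above concrete, here is a minimal Python sketch of carry-less multiplication and of the operand-half selection implied by the four pseudo-ops. The function names `clmul64` and `pclmulqdq` are illustrative only, 128-bit registers are modelled as Python integers, and the reading of imm8 (bit 0 picks the half of the first operand, bit 4 the half of the second) is inferred from the 00/01/10/11 rows of the table.

```python
MASK64 = (1 << 64) - 1


def clmul64(a, b):
    """Carry-less (XOR-based) product of two 64-bit values; result is up to 127 bits."""
    result = 0
    for i in range(64):
        if (b >> i) & 1:
            result ^= a << i      # add a shifted copy without carries
    return result


def pclmulqdq(xmm1, xmm2, imm8):
    """Sketch of PCLMULQDQ: select one 64-bit half of each operand, then multiply."""
    a = (xmm1 >> 64) & MASK64 if imm8 & 0x01 else xmm1 & MASK64
    b = (xmm2 >> 64) & MASK64 if imm8 & 0x10 else xmm2 & MASK64
    return clmul64(a, b)
```

For example, `clmul64(0b11, 0b11)` gives `0b101`: (x+1)·(x+1) = x²+1 when coefficients are reduced modulo 2, whereas ordinary integer multiplication would give 9.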
RDRAND and RDSEED
| Instruction | Encoding | Description | Added in | 
|---|---|---|---|
| RDRAND r16, RDRAND r32 | NFx 0F C7 /6 | Return a random number that has been generated with a CSPRNG (Cryptographically Secure Pseudo-Random Number Generator) compliant with NIST SP 800-90A.[a] | Ivy Bridge, Silvermont, Excavator, Puma, ZhangJiang, Knights Landing |
| RDRAND r64 | NFx REX.W 0F C7 /6 | | |
| RDSEED r16, RDSEED r32 | NFx 0F C7 /7 | Return a random number that has been generated with a HRNG/TRNG (Hardware/"True" Random Number Generator) compliant with NIST SP 800-90B and C.[a] | Broadwell, ZhangJiang, Knights Landing, Zen 1, Gracemont |
| RDSEED r64 | NFx REX.W 0F C7 /7 | | |
- ^ a b The RDRAND and RDSEED instructions may fail to obtain and return a random number if the CPU's random number generators cannot keep up with the issuing of these instructions – if this happens, then software may retry the instructions (although the number of retries should be limited, in order to ensure forward progress[2]). The instructions set EFLAGS.CF to 1 if a random number was successfully obtained and 0 otherwise. For RDSEED, failure to obtain a random number will also set the instruction's destination register to 0.
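The bounded-retry pattern described in the footnote above can be sketched as follows. Python cannot issue RDRAND directly, so `try_rdrand` is a hypothetical callable standing in for one execution of the instruction and returning a `(cf, value)` pair; `make_flaky_source` is a made-up test double, and the default limit of 10 retries is an illustrative choice rather than a documented requirement.

```python
def rdrand_with_retry(try_rdrand, max_retries=10):
    """Call try_rdrand() until it reports success or the retry budget runs out.

    try_rdrand is a hypothetical stand-in for one RDRAND/RDSEED execution;
    it returns (cf, value), where cf mirrors EFLAGS.CF (1 = success).
    """
    for _ in range(max_retries):
        cf, value = try_rdrand()
        if cf == 1:
            return value
    return None  # no random number obtained; the caller must handle this


def make_flaky_source(failures, value):
    """Test double: report failure (CF=0) `failures` times, then succeed."""
    remaining = {"count": failures}

    def step():
        if remaining["count"] > 0:
            remaining["count"] -= 1
            return 0, 0
        return 1, value

    return step
```

Returning `None` rather than looping forever keeps forward progress guaranteed; a caller might then fall back to another entropy source or report an error.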
Intel SHA and SM3 instructions
These instructions provide support for cryptographic hash functions such as SHA-1, SHA-256, SHA-512 and SM3. Each of these hash functions works on fixed-size data blocks, where the processing of each data-block mostly consists of two major phases:[3]
- First expand the data-block using a message schedule (that is specific to each hash function)
- Then perform a series of rounds of a compression function to combine the expanded data into a hash state.
For each of the supported hash functions, separate instructions are provided to help compute the message schedule (instructions with "MSG" in their names) and to help perform the compression function rounds (instructions with "RND" in their names).
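The two phases can be illustrated with a plain software version of SHA-1, the simplest of the listed hash functions. The sketch below is ordinary FIPS 180 SHA-1 written in Python, not a model of the instructions themselves: `sha1_message_schedule` corresponds to the work that the "MSG" instructions help with, and the 80-round loop in `sha1_compress` to the work that SHA1RNDS4 performs four rounds at a time. The function names are illustrative.

```python
import struct


def rotl32(x, n):
    """Rotate a 32-bit value left by n bits."""
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF


def sha1_message_schedule(block):
    """Phase 1: expand a 64-byte block into 80 message dwords."""
    w = list(struct.unpack(">16I", block))
    for t in range(16, 80):
        w.append(rotl32(w[t - 3] ^ w[t - 8] ^ w[t - 14] ^ w[t - 16], 1))
    return w


def sha1_compress(state, block):
    """Phase 2: run 80 compression rounds and fold the result into the hash state."""
    w = sha1_message_schedule(block)
    a, b, c, d, e = state
    for t in range(80):
        if t < 20:
            f, k = (b & c) | (~b & d), 0x5A827999
        elif t < 40:
            f, k = b ^ c ^ d, 0x6ED9EBA1
        elif t < 60:
            f, k = (b & c) | (b & d) | (c & d), 0x8F1BBCDC
        else:
            f, k = b ^ c ^ d, 0xCA62C1D6
        tmp = (rotl32(a, 5) + f + e + k + w[t]) & 0xFFFFFFFF
        a, b, c, d, e = tmp, a, rotl32(b, 30), c, d
    return tuple((s + v) & 0xFFFFFFFF for s, v in zip(state, (a, b, c, d, e)))


SHA1_IV = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0)


def sha1_single_block(message):
    """Hash a message short enough (at most 55 bytes) to fit one padded block."""
    padded = message + b"\x80" + b"\x00" * (55 - len(message))
    padded += struct.pack(">Q", 8 * len(message))
    return struct.pack(">5I", *sha1_compress(SHA1_IV, padded))
```

`sha1_single_block(b"abc")` can be checked against `hashlib.sha1(b"abc").digest()` from the Python standard library.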
| Hash function extension | Instructions | Encoding[a] | Description | Added in | 
|---|---|---|---|---|
| | SHA1RNDS4 xmm1,xmm2/m128,imm8 | NP 0F 3A CC /r ib | Perform Four Rounds of SHA-1 Operation | Goldmont, Zen 1, Cannon Lake, LuJiaZui, Rocket Lake |
| | SHA1NEXTE xmm1,xmm2/m128 | NP 0F 38 C8 /r | Calculate SHA-1 State Variable E after Four Rounds | |
| | SHA1MSG1 xmm1,xmm2/m128 | NP 0F 38 C9 /r | Perform an Intermediate Calculation for the Next Four SHA-1 Message Dwords | |
| | SHA1MSG2 xmm1,xmm2/m128 | NP 0F 38 CA /r | Perform a Final Calculation for the Next Four SHA-1 Message Dwords | |
| | SHA256RNDS2 xmm1,xmm2/m128; SHA256RNDS2 xmm1,xmm2/m128,XMM0[b] | NP 0F 38 CB /r | Perform Two Rounds of SHA256 Operation | |
| | SHA256MSG1 xmm1,xmm2/m128 | NP 0F 38 CC /r | Perform an Intermediate Calculation for the Next Four SHA-256 Message Dwords | |
| | SHA256MSG2 xmm1,xmm2/m128 | NP 0F 38 CD /r | Perform a Final Calculation for the Next Four SHA-256 Message Dwords | |
| | VSHA512RNDS2 ymm1,ymm2,xmm3 | VEX.256.F2.0F38.W0 CB /r | Perform Two Rounds of SHA-512 operation | Lunar Lake, Arrow Lake |
| | VSHA512MSG1 ymm1,xmm2 | VEX.256.F2.0F38.W0 CC /r | Perform an Intermediate Calculation for the Next Four SHA-512 Message Qwords | |
| | VSHA512MSG2 ymm1,ymm2 | VEX.256.F2.0F38.W0 CD /r | Perform a Final Calculation for the Next Four SHA-512 Message Qwords | |
| | VSM3RNDS2 xmm1,xmm2,xmm3/m128,imm8 | VEX.128.66.0F3A.W0 DE /r ib | Perform Two Rounds of SM3 Operation | Lunar Lake, Arrow Lake |
| | VSM3MSG1 xmm1,xmm2,xmm3/m128 | VEX.128.NP.0F38.W0 DA /r | Perform Initial Calculation for the Next Four SM3 Message Words | |
| | VSM3MSG2 xmm1,xmm2,xmm3/m128 | VEX.128.66.0F38.W0 DA /r | Perform Final Calculation for the Next Four SM3 Message Words | |
Intel Key Locker instructions
These instructions, available in Tiger Lake and later Intel processors, are designed to enable encryption/decryption with an AES key without having access to any unencrypted copies of the key during the actual encryption/decryption process.
| Key Locker subset | Instruction | Encoding[a] | Description | |
|---|---|---|---|---|
| | LOADIWKEY xmm1,xmm2 | F3 0F 38 DC /r | Load internal wrapping key ("IWKey") from xmm1, xmm2 and XMM0. The two explicit operands (which must be register operands) specify a 256-bit encryption key. The implicit operand in | |
| | ENCODEKEY128 r32,r32 | F3 0F 38 FA /r | Wrap a 128-bit AES key from XMM0 into a 384-bit key handle - and output this handle to XMM0-2. | Source operand specifies handle restrictions to build into the handle.[c] Destination operand is initialized with information about the source and attributes of the key (this matches the value that was provided in EAX for the most recent invocation of LOADIWKEY). These instructions may also modify |
| | ENCODEKEY256 r32,r32 | F3 0F 38 FB /r | Wrap a 256-bit AES key from XMM1:XMM0 into a 512-bit key handle - and output this handle to XMM0-3. | |
| | AESENC128KL xmm,m384 | F3 0F 38 DC /r | Encrypt xmm using 128-bit AES key indicated by handle at m384 and store result in xmm.[d] | |
| | AESDEC128KL xmm,m384 | F3 0F 38 DD /r | Decrypt xmm using 128-bit AES key indicated by handle at m384 and store result in xmm.[d] | |
| | AESENC256KL xmm,m512 | F3 0F 38 DE /r | Encrypt xmm using 256-bit AES key indicated by handle at m512 and store result in xmm.[d] | |
| | AESDEC256KL xmm,m512 | F3 0F 38 DF /r | Decrypt xmm using 256-bit AES key indicated by handle at m512 and store result in xmm.[d] | |
| | AESENCWIDE128KL m384 | F3 0F 38 D8 /0 | Encrypt XMM0-7 using 128-bit AES key indicated by handle at m384 and store each resultant block back to its corresponding register.[d] | |
| | AESDECWIDE128KL m384 | F3 0F 38 D8 /1 | Decrypt XMM0-7 using 128-bit AES key indicated by handle at m384 and store each resultant block back to its corresponding register.[d] | |
| | AESENCWIDE256KL m512 | F3 0F 38 D8 /2 | Encrypt XMM0-7 using 256-bit AES key indicated by handle at m512 and store each resultant block back to its corresponding register.[d] | |
| | AESDECWIDE256KL m512 | F3 0F 38 D8 /3 | Decrypt XMM0-7 using 256-bit AES key indicated by handle at m512 and store each resultant block back to its corresponding register.[d] | |
- ^ Under Intel APX, none of the Key Locker instructions can be encoded with the EVEX prefix - this prevents the use of the r16-r31 and xmm16-xmm31 registers with these instructions.
- ^ The flags available for the LOADIWKEY instruction in the EAX register are:

  | Bits | Flags |
  |---|---|
  | 0 | 1=Do not permit the wrapping key to be backed up to platform-scoped storage |
  | 4:1 | KeySource field. The following values are supported: 0: use key input operands directly; 1: XOR the key input operands with 384 bits from hardware RNG |
  | 31:5 | Reserved, must be set to 0 |
- ^ The handle restrictions available for the explicit source argument to ENCODEKEY128 and ENCODEKEY256 are:

  | Bits | Flags |
  |---|---|
  | 0 | CPL0-only restriction |
  | 1 | No-encrypt restriction |
  | 2 | No-decrypt restriction |
  | 31:3 | Reserved, must be set to 0 |
- ^ a b c d e f g h All of the AES Key Locker encode/decode instructions will check whether the handle is valid for the current IWKey and encode/decode data only if the handle is valid. These instructions will set the ZF flag to indicate whether the provided handle was valid (ZF=0) or not (ZF=1).
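As a small illustration of the bit layouts given in the footnotes above, the following Python sketch packs the LOADIWKEY flags and the ENCODEKEY128/ENCODEKEY256 handle restrictions into 32-bit values, and interprets the ZF result of the encode/decode instructions. The function and parameter names are hypothetical; only the bit positions and the ZF convention come from the text.

```python
def loadiwkey_flags(no_backup=False, key_source=0):
    """Build the EAX value for LOADIWKEY: bit 0 = no-backup, bits 4:1 = KeySource."""
    if key_source not in (0, 1):
        raise ValueError("only KeySource values 0 and 1 are listed as supported")
    return int(no_backup) | (key_source << 1)   # bits 31:5 stay 0 (reserved)


def handle_restrictions(cpl0_only=False, no_encrypt=False, no_decrypt=False):
    """Build the source operand for ENCODEKEY128/256 (bits 31:3 stay 0)."""
    return int(cpl0_only) | (int(no_encrypt) << 1) | (int(no_decrypt) << 2)


def handle_was_valid(zf):
    """Interpret ZF after an AES Key Locker encode/decode instruction (ZF=0 means valid)."""
    return zf == 0
```

For instance, a handle limited to ring-0 decryption only would use `handle_restrictions(cpl0_only=True, no_encrypt=True)`, which sets bits 0 and 1.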
VIA/Zhaoxin PadLock instructions
The VIA/Zhaoxin PadLock instructions are instructions designed to apply cryptographic primitives in bulk, similar to the 8086 repeated string instructions. As such, unless otherwise specified, they take, as applicable, pointers to source data in ES:rSI and destination data in ES:rDI, and a data-size or count in rCX. Like the old string instructions, they are all designed to be interruptible.[4][5]
| PadLock subset | Instruction mnemonics[a] | Encoding | Description | Added in | 
|---|---|---|---|---|
| | XSTORE, XSTORE-RNG | NFx 0F A7 C0 | Store random bytes to ES:[rDI], and increment ES:rDI accordingly. XSTORE will store currently-available bytes, which may be from 0 to 8 bytes. REP XSTORE and REP XRNG2 will write the number of random bytes specified by rCX, waiting for the random number generator when needed.[b] EDX specifies a "quality factor".[c] | Nehemiah (stepping 3) |
| | REP XSTORE, REP XSTORE-RNG | F3 0F A7 C0 | | |
| | REP XRNG2 | F3 0F A7 F8 | | ZhangJiang[d] |
| | REP XCRYPT-ECB | F3 0F A7 C8 | Encrypt/Decrypt data, using the AES cipher in various block modes (ECB, CBC, CFB, OFB and CTR, respectively). rCX contains the number of 16-byte blocks to encrypt/decrypt, rBX contains a pointer to an encryption key, ES:rAX a pointer to an initialization vector for block modes that need it, and ES:rDX a pointer to a control word.[e] | Nehemiah (stepping 8) |
| | REP XCRYPT-CBC | F3 0F A7 D0 | | |
| | REP XCRYPT-CFB | F3 0F A7 E0 | | |
| | REP XCRYPT-OFB | F3 0F A7 E8 | | |
| | REP XCRYPT-CTR | F3 0F A7 D8 | | C7 "Esther"[9] |
| | REP XSHA1 | F3 0F A6 C8 | Compute a cryptographic hash (using the SHA-1 and SHA-256 functions, respectively). ES:rSI points to data to compute a hash for, ES:rDI points to a message digest and rCX specifies the number of bytes. rAX should be set to 0 at the start of a calculation.[g] | Esther |
| | REP XSHA256 | F3 0F A6 D0 | | |
| | REP XSHA384 | F3 0F A6 D8 | Perform computation of a SHA-384/SHA-512 cryptographic hash. ES:rSI points to a series of 128-byte data chunks to perform hash computation for, ES:rDI points to a 64-byte digest to update, and ECX specifies the number of chunks to process.[h] | ZhangJiang[d] |
| | REP XSHA512 | F3 0F A6 E0 | | |
| | REP MONTMUL | F3 0F A6 C0[i] | Perform Montgomery Multiplication. Takes an operand width in ECX (given as a number of bits – must be in range 256..32768 and divisible by 128) and pointer to a data structure in ES:ESI.[j] When starting a new Montgomery Multiplication, EAX and the result buffer in memory must be filled with all-0s before executing the REP MONTMUL instruction. | Esther |
| | REP MONTMUL2 | F3 0F A6 F0 | Perform modular multiplication/exponentiation. Takes pointers (all using the ES: segment) to four bignum integers in registers rAX, rBX, rDX and rDI, respectively: the first two are input numbers, the third is a modulus,[k] and the fourth will be overwritten with the result. ECX provides the size of the bignums, in number of bits (256..32768, must be divisible by 128), and ES:rSI provides a pointer to a scratchpad area to use during the calculation.[l] | ZhangJiang[d] |
| | REP XMODEXP | F3 0F A6 F8 | | |
| | CCS_HASH, CCS_SM3[m] | F3 0F A6 E8 | Compute SM3 hash, similar to the REP XSHA* instructions. The rBX register is used to specify hash function (20h for SM3 being the only documented value). | ZhangJiang |
| | CCS_ENCRYPT, CCS_SM4[m] | F3 0F A7 F0 | Encrypt/Decrypt data, using the SM4 cipher in various block modes. rCX contains the number of 16-byte blocks to encrypt/decrypt, rBX contains a pointer to an encryption key, rDX a pointer to an initialization vector for block modes that need it, and rAX contains a control word.[n] | |
| | SM2[14] | F2 0F A6 C0 | Perform SM2 (public key cryptographic algorithm) function. The function to perform is specified in bits 5:0 of EDX[o] - depending on function, rAX/rBX/rCX/rSI/rDI may provide additional input arguments. The instruction returns a status bit in EDX bit 6 (0=success, 1=failure) - depending on function, rAX, rCX and rDI may be modified as well. | KX-6000G |
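The REP MONTMUL entry in the table above refers to a data structure whose first element is a negated modular inverse of the bottom 32 bits of the modulus. The Python sketch below shows one way such a value can be computed and how it is used in a textbook word-serial Montgomery product. The function names are illustrative, `pow(x, -1, m)` requires Python 3.8 or later, and it is an assumption (not stated in the source) that the hardware result equals the textbook value a·b·2^(-bits) mod m.

```python
WORD = 1 << 32


def negated_inverse32(modulus):
    """Return -(modulus^-1) mod 2^32, computed from the bottom 32 bits of the modulus."""
    if modulus % 2 == 0:
        raise ValueError("the modulus must be odd to be invertible modulo 2^32")
    return (-pow(modulus & (WORD - 1), -1, WORD)) % WORD


def montgomery_multiply(a, b, modulus, bits):
    """Textbook Montgomery product a*b*2^(-bits) mod modulus, 32 bits per step."""
    n0 = negated_inverse32(modulus)
    t = a * b
    for _ in range(bits // 32):
        m = ((t & (WORD - 1)) * n0) & (WORD - 1)
        t = (t + m * modulus) >> 32     # low 32 bits are zero by construction
    return t - modulus if t >= modulus else t
```

Each loop step adds a multiple of the modulus chosen so that the low 32 bits of `t` cancel, which is why only the inverse of the lowest 32-bit word of the modulus is needed.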
Footnotes
- ^ For instruction mnemonics that are listed with a hyphen, different VIA PadLock documents differ with respect to whether the instruction names have a hyphen or not (e.g. version 1.0 of the ACE programming guide uses the hyphens,[6] while v1.66 does not[4]) and assemblers may accept instruction mnemonics with or without the hyphen - e.g. GNU Binutils rev 2.17 and later accepts both.
 Some assemblers may also consider the REP prefix optional for instructions other than XSTORE - with such assemblers, the PadLock instructions will be assembled with one F3 (REP) prefix byte regardless of whether the assembly instruction is written with REP or not. (The F3 prefix is mandatory for all PadLock instructions except XSTORE.)
- ^ On some processors that support PadLock, the REP XSTORE instruction (but not REP XRNG2) may write not just the number of bytes specified in ECX, but up to 7 additional bytes as well.[7]
- ^ For the REP XRNG2 instruction, bits 1:0 of EDX are used to indicate whether the instruction should return hardware random numbers directly (EDX[1:0]==0) or return postprocessed numbers (EDX[1:0] ≠ 0).
- ^ a b c As of 2024, the REP XRNG2, REP XSHA384, REP XSHA512, REP MONTMUL2 and REP XMODEXP instructions exist as documented instructions only on Zhaoxin processors.[5]
 A VIA-provided OpenSSL patch from 2011[8] indicates that these instructions were present on the VIA Nano, however VIA has not published documentation for these instructions.
- ^ The control word for REP XCRYPT* is a 16-byte (128-bit) data structure with the following layout:

  | Bits | Usage |
  |---|---|
  | 3:0 | AES round count |
  | 4 | Digest mode enable (ACE2 only) |
  | 5 | 1=allow data that are not 16-byte aligned (ACE2 only) |
  | 6 | Cipher: 0=AES, 1=undefined |
  | 7 | Key schedule: 0=compute (128-bit key only), 1=load from memory |
  | 8 | 0=normal, 1=intermediate-result |
  | 9 | 0=encrypt, 1=decrypt |
  | 11:10 | Key size: 00=128-bit, 01=192-bit, 10=256-bit, 11=reserved |
  | 127:12 | Reserved, must be set to 0 |

 If bit 5 is set in order to allow unaligned data, then the REP XCRYPT* instructions will use the 112 bytes directly after the control word as a scratchpad memory area for data realignment.
- ^ In addition to the new REP XCRYPT-CTR instruction, ACE2 also adds extra features to the other REP XCRYPT instructions: a digest mode for the CBC and CFB instructions, and the ability to use input/output data that are not 16-byte aligned for the non-ECB instructions.
- ^ On VIA Nano and later processors, setting rAX to an all-1s value for the REP XSHA* instructions will enable an alternate operation mode, where rCX specifies the number of 64-byte blocks, and where the standard FIPS-180-2 length extension procedure at the end of the hash calculation is omitted. This makes for a variant more suitable for data streaming than the original EAX=0 variant.[10] This functionality also exists for CCS_HASH.
- ^ The per-chunk calculation is identical for SHA-384 and SHA-512 - as a result of this, the REP XSHA384 and REP XSHA512 instructions perform identical operations.
- ^ The REP MONTMUL instruction is only supported with an AddressSize of 32 bits - for this reason, the address-size override prefix (67h) is required in 16-bit and 64-bit modes, but disallowed in 32-bit mode.
- ^ The data structure to REP MONTMUL contains six 32-bit elements, where the first one is a negated modular inverse of the bottom 32 bits of the modulus and the remaining 5 are pointers to various memory buffers (each of which uses the ES segment and must be 16-byte aligned):

  | Offset | Data item |
  |---|---|
  | 0 | Negated modular inverse |
  | 4 | Pointer to first multiplicand |
  | 8 | Pointer to second multiplicand |
  | 12 | Pointer to result buffer |
  | 16 | Pointer to modulus |
  | 20 | Pointer to 32-byte scratchpad |
- ^ For REP MONTMUL2 and REP XMODEXP, the modulus is required to be greater than both input numbers, and is also required to be odd. The instructions will produce a #GP exception if this is not the case.
- ^ Given a bignum size of N bits, the scratchpad memory area pointed to by ES:rSI for the REP MONTMUL2 and REP XMODEXP instructions must meet a minimum size that depends on N (e.g. for a 2048-bit bignum size, the scratchpad must be at least 808 bytes). Also, before starting either of these instructions, the first 8 bytes of this scratchpad must be zeroed out and the bignum size given in ECX must also be written as a 64-bit integer to the next 8 bytes.
- ^ a b The CCS instructions are listed with different mnemonics in different Zhaoxin sources - e.g. the CCS_SM3/CCS_SM4 mnemonics are used in a 2019 article,[13] while CCS_HASH/CCS_ENCRYPT are used in a 2020 article.[11]
- ^ The CCS_ENCRYPT control word in rAX has the following format:

  | Bits | Usage |
  |---|---|
  | 0 | 0=Encrypt, 1=Decrypt |
  | 5:1 | Must be 10000b for SM4. |
  | 6 | ECB block mode |
  | 7 | CBC block mode |
  | 8 | CFB block mode |
  | 9 | OFB block mode |
  | 10 | CTR block mode |
  | 11 | Digest enable |

 Remaining bits in rAX must be set to all-0s. Of bits 10:6 in rAX (block mode selection), exactly one bit must be set, or else behavior is undefined.
- ^ The supported functions in bits 5:0 of EDX for the SM2 instruction are:

  | Value | Meaning |
  |---|---|
  | 0x01 | Encryption |
  | 0x02 | Decryption |
  | 0x04 | Signature |
  | 0x08 | Verify signature |
  | 0x10 | Key exchange 1 |
  | 0x11 | Key exchange 2 without hash |
  | 0x12 | Key exchange 3 without hash |
  | 0x15 | Key exchange 2 with hash |
  | 0x16 | Key exchange 3 with hash |
  | 0x20 | Preprocess1 to calculate hash value Z of user’s identification |
  | 0x21 | Preprocess2 to calculate hash value e of hash value Z and message M |
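The REP XCRYPT* control word layout given in the footnotes above can be packed as in the following Python sketch. The function and parameter names are hypothetical, and two points are assumptions rather than statements from the source: that the round-count field holds the standard AES round counts (10, 12 or 14), and that the 16-byte structure is stored little-endian like other x86 data.

```python
AES_ROUNDS = {128: 10, 192: 12, 256: 14}        # standard AES round counts (assumed field value)
KEY_SIZE_FIELD = {128: 0b00, 192: 0b01, 256: 0b10}


def xcrypt_control_word(key_bits=128, decrypt=False, load_key_schedule=False,
                        intermediate=False, digest=False, unaligned=False):
    """Pack a 16-byte REP XCRYPT* control word from the documented bit fields."""
    if key_bits not in AES_ROUNDS:
        raise ValueError("key_bits must be 128, 192 or 256")
    if key_bits != 128 and not load_key_schedule:
        raise ValueError("key schedule computation is listed for 128-bit keys only")
    word = AES_ROUNDS[key_bits]                 # bits 3:0
    word |= int(digest) << 4                    # bit 4 (ACE2 only)
    word |= int(unaligned) << 5                 # bit 5 (ACE2 only)
    # bit 6 stays 0: cipher = AES
    word |= int(load_key_schedule) << 7         # bit 7
    word |= int(intermediate) << 8              # bit 8
    word |= int(decrypt) << 9                   # bit 9
    word |= KEY_SIZE_FIELD[key_bits] << 10      # bits 11:10
    return word.to_bytes(16, "little")          # bits 127:12 stay 0 (reserved)
```

A caller that sets `unaligned=True` would additionally need to reserve the 112 bytes following the control word as scratchpad space, as the footnote describes.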
References
- ^ a b Intel, Advanced Encryption Standard (AES) New Instructions Set, order no. 323641-001, rev 3.01, Sep 2012, pages 16-17. Archived on 19 Jan 2022.
- ^ Intel, Digital Random Number Generator (DRNG) Software Implementation Guide rev 2.1, Oct 17, 2018, sections 5.2 and 5.3. Archived on Nov 19, 2021.
- ^ Intel, Intel SHA Extensions: New Instructions Supporting the Secure Hash Algorithm on Intel Architecture Processors, order no. 402097, July 2013. Archived from the original on 19 Mar 2025.
- ^ a b VIA, PadLock Programming Guide, rev 1.66, 4 Aug 2005. Archived from the original on 26 May 2010.
- ^ a b Binutils mailing list, (PATCH v1) x86: Support ZHAOXIN padlock instructions, 13 Dec 2024, see "padlock instruction set reference.pdf" attachment for Zhaoxin-provided documentation of the PadLock instructions. Archived on 19 Dec 2024; attachment archived on 19 Dec 2024.
- ^ VIA, Nehemiah Advanced Cryptography Engine Programming Guide, v1.0, 2004. Archived from the original on 17 Sep 2004.
- ^ VIA, Nehemiah Random Number Generator Programming Guide, v1.0, 2003, page 9. Archived from the original on 17 Sep 2004.
- ^ openssl-dev mailing list, (PATCH) Update PadLock engine for VIA C7 and Nano CPUs, 10 Jun 2011. Archived on 30 Jan 2022.
- ^ Michal Ludvig, VIA PadLock—Wicked Fast Encryption, Linux Journal, Apr 6, 2005. Archived on Jun 20, 2005.
- ^ Stack Overflow, Streaming SHA calculation using VIA's Padlock Hashing Engine?, Aug 11, 2014. Archived on Jun 14, 2019.
 The PadLock SDK (v3.1) referenced in the Stack Overflow answer can be downloaded from the Crypto++ wiki (accessed on Aug 11, 2023) or the Wayback Machine.
- ^ a b Zhaoxin, Core Technology | Instructions for the use of accelerated instructions for national encryption algorithm based on Zhaoxin processor (in Chinese), 8 Aug 2020. Archived on Jan 5, 2022.
- ^ Zhaoxin, GMI User Manual v1.0 (in Chinese), 23 May 2016. Archived on Feb 28, 2022.
- ^ a b Zhaoxin, Research on hardware acceleration and application of national cryptographic algorithms based on Zhaoxin CPU (in Chinese), 3 Sep 2019. Archived on 11 Aug 2020.
- ^ Binutils mailing list, (PATCH v1) x86: Support ZHAOXIN GMI instructions, 14 Oct 2024, see "ZX_GMI_Reference.docx" attachment for Zhaoxin-provided documentation of the SM2 instruction. Archived on 9 Nov 2024; attachment archived on 9 Nov 2024.