AAA AAS AAM AAD AAA ; 37 [8086]
AAS ; 3F [8086]
AAD ; D5 0A [8086]
AAD imm ; D5 ib [8086]
AAM ; D4 0A [8086]
AAM imm ; D4 ib [8086]
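These instructions are used in conjunction with the add, subtract,
multiply and divide instructions to perform binary-coded decimal
arithmetic in unpacked (one BCD digit per byte - easy to
translate to and from ASCII, hence the instruction names) form.
AAA should be used after a one-byte ADD instruction whose destination was the AL register: by examining the low nibble of AL and the auxiliary carry flag AF, it determines whether the addition has overflowed, and adjusts it (and sets the carry flag) if so. Long BCD strings can be added by doing ADD/AAA on the low digits, then ADC/AAA on each subsequent digit.
AAS works similarly to AAA, but is for use after SUB instructions rather than ADD.
AAM is for use after two decimal digits have been multiplied together and the result left in AL: it divides AL by ten and stores the quotient in AH, leaving the remainder in AL. The divisor 10 can be changed by specifying an operand to the instruction: a particularly handy use of this is AAM 16, which separates the two nibbles in AL into AH and AL.
AAD performs the inverse operation to AAM: it multiplies AH by ten, adds it to AL, and sets AH to zero. Again, the multiplier 10 can be changed.
ADC ADC r/m8,reg8 ; 10 /r [8086]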
ADC r/m16,reg16 ; o16 11 /r [8086]
ADC r/m32,reg32 ; o32 11 /r [386]
ADC reg8,r/m8 ; 12 /r [8086]
ADC reg16,r/m16 ; o16 13 /r [8086]
ADC reg32,r/m32 ; o32 13 /r [386]
ADC r/m8,imm8 ; 80 /2 ib [8086]
ADC r/m16,imm16 ; o16 81 /2 iw [8086]
ADC r/m32,imm32 ; o32 81 /2 id [386]
ADC r/m16,imm8 ; o16 83 /2 ib [8086]
ADC r/m32,imm8 ; o32 83 /2 ib [386]
ADC AL,imm8 ; 14 ib [8086]
ADC AX,imm16 ; o16 15 iw [8086]
ADC EAX,imm32 ; o32 15 id [386]
The flags are set according to the result of the operation: in
particular, the carry flag is affected and can be used by a subsequent ADC instruction.
In the forms with an 8-bit immediate second operand and a longer
first operand, the second operand is considered to be signed, and is
sign-extended to the length of the first operand. In these cases, the BYTE qualifier is necessary to force NASM to generate this form of the instruction.
To add two numbers without also adding the contents of the carry
flag, use ADD.
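As an illustration of how the carry flag links an ADD to a subsequent ADC, the following Python sketch models a 64-bit addition carried out as two 32-bit steps. The helper names add32 and add64_via_adc are hypothetical and exist only for this example, and the model tracks no flag other than carry.

```python
MASK32 = 0xFFFFFFFF

def add32(a, b, carry_in=0):
    """Model a 32-bit ADD (carry_in=0) or ADC (carry_in=CF).

    Returns (result, carry_out)."""
    total = (a & MASK32) + (b & MASK32) + carry_in
    return total & MASK32, total >> 32

def add64_via_adc(a_lo, a_hi, b_lo, b_hi):
    """Add two 64-bit values held as (low, high) 32-bit halves."""
    lo, cf = add32(a_lo, b_lo)        # ADD: the incoming carry is ignored
    hi, cf = add32(a_hi, b_hi, cf)    # ADC: adds the carry out of the low half
    return lo, hi, cf
```

For example, add64_via_adc(0xFFFFFFFF, 0, 1, 0) returns (0, 1, 0): the carry out of the low half is propagated into the high half.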
ADD ADD r/m8,reg8 ; 00 /r [8086]
ADD r/m16,reg16 ; o16 01 /r [8086]
ADD r/m32,reg32 ; o32 01 /r [386]
ADD reg8,r/m8 ; 02 /r [8086]
ADD reg16,r/m16 ; o16 03 /r [8086]
ADD reg32,r/m32 ; o32 03 /r [386]
ADD r/m8,imm8 ; 80 /0 ib [8086]
ADD r/m16,imm16 ; o16 81 /0 iw [8086]
ADD r/m32,imm32 ; o32 81 /0 id [386]
ADD r/m16,imm8 ; o16 83 /0 ib [8086]
ADD r/m32,imm8 ; o32 83 /0 ib [386]
ADD AL,imm8 ; 04 ib [8086]
ADD AX,imm16 ; o16 05 iw [8086]
ADD EAX,imm32 ; o32 05 id [386]
The flags are set according to the
result of the operation: in particular, the carry flag is affected and
can be used by a subsequent ADC instruction.
In the forms with an 8-bit
immediate second operand and a longer first operand, the second operand
is considered to be signed, and is sign-extended to the length of the
first operand. In these cases, the BYTE qualifier is necessary to force NASM to generate this form of the instruction.
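The sign-extension rule for the short immediate forms determines which constants can be encoded in a single byte. The Python sketch below is a model of that rule only; the function names are hypothetical, and it does not describe how NASM itself chooses an encoding.

```python
def sign_extend_imm8(imm8, width):
    """Sign-extend an 8-bit immediate to `width` bits (16 or 32)."""
    imm8 &= 0xFF
    if imm8 & 0x80:          # top bit set: the value is negative
        imm8 -= 0x100
    return imm8 & ((1 << width) - 1)

def fits_in_imm8(value, width):
    """True if the width-bit constant equals some sign-extended imm8."""
    value &= (1 << width) - 1
    return sign_extend_imm8(value & 0xFF, width) == value
```

Under this model 0xFFFF fits the 16-bit short form (it is -1), whereas 0x0080 does not, because the byte 0x80 would be extended to 0xFF80.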
ADDPD ADDPD xmm1,xmm2/mem128 ; 66 0F 58 /r [WILLAMETTE,SSE2]
dst[0-63] := dst[0-63] + src[0-63],
dst[64-127] := dst[64-127] + src[64-127].
The destination is an XMM register. The source operand can be either an XMM register or a 128-bit memory location.
ADDPS ADDPS xmm1,xmm2/mem128 ; 0F 58 /r [KATMAI,SSE]
dst[0-31] := dst[0-31] + src[0-31],
dst[32-63] := dst[32-63] + src[32-63],
dst[64-95] := dst[64-95] + src[64-95],
dst[96-127] := dst[96-127] + src[96-127].
The destination is an XMM register. The source operand can be either an XMM register or a 128-bit memory location.
ADDSD ADDSD xmm1,xmm2/mem64 ; F2 0F 58 /r [WILLAMETTE,SSE2]
dst[0-63] := dst[0-63] + src[0-63],
dst[64-127] remains unchanged.
The destination is an XMM register. The source operand can be either an XMM register or a 64-bit memory location.
ADDSS ADDSS xmm1,xmm2/mem32 ; F3 0F 58 /r [KATMAI,SSE]
dst[0-31] := dst[0-31] + src[0-31],
dst[32-127] remains unchanged.
The destination is an XMM register. The source operand can be either an XMM register or a 32-bit memory location.
AND AND r/m8,reg8 ; 20 /r [8086]
AND r/m16,reg16 ; o16 21 /r [8086]
AND r/m32,reg32 ; o32 21 /r [386]
AND reg8,r/m8 ; 22 /r [8086]
AND reg16,r/m16 ; o16 23 /r [8086]
AND reg32,r/m32 ; o32 23 /r [386]
AND r/m8,imm8 ; 80 /4 ib [8086]
AND r/m16,imm16 ; o16 81 /4 iw [8086]
AND r/m32,imm32 ; o32 81 /4 id [386]
AND r/m16,imm8 ; o16 83 /4 ib [8086]
AND r/m32,imm8 ; o32 83 /4 ib [386]
AND AL,imm8 ; 24 ib [8086]
AND AX,imm16 ; o16 25 iw [8086]
AND EAX,imm32 ; o32 25 id [386]
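In the forms with an 8-bit immediate second operand and a longer
first operand, the second operand is considered to be signed, and is
sign-extended to the length of the first operand. In these cases, the BYTE qualifier is necessary to force NASM to generate this form of the instruction.
The MMX instruction PAND performs the same operation on the 64-bit MMX registers.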
ANDNPD ANDNPD xmm1,xmm2/mem128 ; 66 0F 55 /r [WILLAMETTE,SSE2]
dst[0-63] := src[0-63] AND NOT dst[0-63],
dst[64-127] := src[64-127] AND NOT dst[64-127].
The destination is an XMM register. The source operand can be either an XMM register or a 128-bit memory location.
ANDNPS ANDNPS xmm1,xmm2/mem128 ; 0F 55 /r [KATMAI,SSE]
dst[0-31] := src[0-31] AND NOT dst[0-31],
dst[32-63] := src[32-63] AND NOT dst[32-63],
dst[64-95] := src[64-95] AND NOT dst[64-95],
dst[96-127] := src[96-127] AND NOT dst[96-127].
The destination is an XMM register. The source operand can be either an XMM register or a 128-bit memory location.
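Because the complement is applied to the destination rather than the source, a common use of ANDNPS is to clear bits selected by a mask already held in the destination. The Python sketch below models the four lanes on their IEEE single-precision bit patterns; the function name and the list-of-floats representation are assumptions made for this example.

```python
import struct

def andnps(dst, src):
    """Model ANDNPS on two lists of four floats.

    Each 32-bit lane of the result is src AND NOT dst."""
    out = []
    for d, s in zip(dst, src):
        d_bits = struct.unpack("<I", struct.pack("<f", d))[0]
        s_bits = struct.unpack("<I", struct.pack("<f", s))[0]
        r_bits = s_bits & ~d_bits & 0xFFFFFFFF
        out.append(struct.unpack("<f", struct.pack("<I", r_bits))[0])
    return out
```

With the destination holding -0.0 in every lane (only the sign bit set), the result is the absolute value of each source lane: andnps([-0.0] * 4, [-1.5, 2.0, -3.25, 0.0]) gives [1.5, 2.0, 3.25, 0.0].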
ANDPD ANDPD xmm1,xmm2/mem128 ; 66 0F 54 /r [WILLAMETTE,SSE2]
dst[0-63] := src[0-63] AND dst[0-63],
dst[64-127] := src[64-127] AND dst[64-127].
The destination is an XMM register. The source operand can be either an XMM register or a 128-bit memory location.
ANDPS ANDPS xmm1,xmm2/mem128 ; 0F 54 /r [KATMAI,SSE]
dst[0-31] := src[0-31] AND dst[0-31],
dst[32-63] := src[32-63] AND dst[32-63],
dst[64-95] := src[64-95] AND dst[64-95],
dst[96-127] := src[96-127] AND dst[96-127].
The destination is an XMM register. The source operand can be either an XMM register or a 128-bit memory location.
ARPL ARPL r/m16,reg16 ; 63 /r [286,PRIV]
BOUND BOUND reg16,mem ; o16 62 /r [186]
BOUND reg32,mem ; o32 62 /r [386]
BSF BSR BSF reg16,r/m16 ; o16 0F BC /r [386]
BSF reg32,r/m32 ; o32 0F BC /r [386]
BSR reg16,r/m16 ; o16 0F BD /r [386]
BSR reg32,r/m32 ; o32 0F BD /r [386]
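BSF searches for the least significant set bit in its source (second) operand and, if it finds one, stores the index in its destination (first) operand. If no set bit is found, the contents of the destination operand are undefined and the zero flag is set.
BSR performs the same function, but searches from the top instead, so it finds the most significant set bit.
Bit indices are from 0 (least significant) to 15 or 31 (most significant). The destination operand can only be a register. The source operand can be a register or a memory location.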
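The two bit scans can be modelled in a few lines of Python. This is only a sketch: the function names are hypothetical, and returning None stands in for the case of a zero source, where the zero flag is set and the destination is undefined.

```python
def bsf(src):
    """Index of the least significant set bit, or None if src is zero."""
    if src == 0:
        return None
    return (src & -src).bit_length() - 1   # isolate the lowest set bit

def bsr(src):
    """Index of the most significant set bit, or None if src is zero."""
    if src == 0:
        return None
    return src.bit_length() - 1
```

For the value 0b101000, bsf returns 3 and bsr returns 5.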
BSWAP BSWAP reg32 ; o32 0F C8+r [486]
BT BTC BTR BTS BT r/m16,reg16 ; o16 0F A3 /r [386]
BT r/m32,reg32 ; o32 0F A3 /r [386]
BT r/m16,imm8 ; o16 0F BA /4 ib [386]
BT r/m32,imm8 ; o32 0F BA /4 ib [386]
BTC r/m16,reg16 ; o16 0F BB /r [386]
BTC r/m32,reg32 ; o32 0F BB /r [386]
BTC r/m16,imm8 ; o16 0F BA /7 ib [386]
BTC r/m32,imm8 ; o32 0F BA /7 ib [386]
BTR r/m16,reg16 ; o16 0F B3 /r [386]
BTR r/m32,reg32 ; o32 0F B3 /r [386]
BTR r/m16,imm8 ; o16 0F BA /6 ib [386]
BTR r/m32,imm8 ; o32 0F BA /6 ib [386]
BTS r/m16,reg16 ; o16 0F AB /r [386]
BTS r/m32,reg32 ; o32 0F AB /r [386]
BTS r/m16,imm8 ; o16 0F BA /5 ib [386]
BTS r/m32,imm8 ; o32 0F BA /5 ib [386]
These instructions all test one bit of their first operand, whose index is given by the second operand, and store the value of that bit into the carry flag. Bit indices are from 0 (least significant) to 15 or 31 (most significant).
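In addition to storing the original value of the bit into the carry
flag, BTR also resets (clears) the bit in the operand itself, BTS sets the bit, and BTC complements the bit. BT does not modify its operands.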
The destination can be a register or a memory location. The source can be a register or an immediate value.
If the destination operand is a register, the bit offset should be in the range 0-15 (for 16-bit operands) or 0-31 (for 32-bit operands). An immediate value outside these ranges will be taken modulo 16/32 by the processor.
If the destination operand is a memory location, then an immediate bit offset follows the same rules as for a register. If the bit offset is in a register, then it can be anything within the signed range of the register used (i.e., for a 32-bit operand, it can be (-2^31) to (2^31 - 1)).
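The behaviour of the four bit-test instructions on a register operand, including the modulo-16/32 treatment of the bit offset, can be summarised by the Python sketch below. The function name and the string-based opcode selector are assumptions for illustration, and the memory-operand case with a register offset is deliberately not modelled.

```python
def bit_test(op, value, index, width=32):
    """Model BT, BTS, BTR or BTC on a register operand.

    Returns (new_value, carry_flag)."""
    index %= width                  # offset is taken modulo 16 or 32
    mask = 1 << index
    cf = (value >> index) & 1       # original bit goes to the carry flag
    if op == "BTS":
        value |= mask               # set the bit
    elif op == "BTR":
        value &= ~mask              # clear the bit
    elif op == "BTC":
        value ^= mask               # complement the bit
    elif op != "BT":
        raise ValueError("unknown instruction: " + op)
    return value & ((1 << width) - 1), cf
```

For example, bit_test("BTS", 0, 35) returns (8, 0), since an offset of 35 on a 32-bit register selects bit 3.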
CALL CALL imm ; E8 rw/rd [8086]
CALL imm:imm16 ; o16 9A iw iw [8086]
CALL imm:imm32 ; o32 9A id iw [386]
CALL FAR mem16 ; o16 FF /3 [8086]
CALL FAR mem32 ; o32 FF /3 [386]
CALL r/m16 ; o16 FF /2 [8086]
CALL r/m32 ; o32 FF /2 [386]
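The immediate near call takes one of two forms (CALL imm16 or CALL imm32), determined by the current operand size. The value encoded in the instruction is an offset relative to the next instruction.
You can choose between the two immediate far call forms (CALL imm:imm) by the use of the WORD and DWORD keywords: CALL WORD 0x1234:0x5678 or CALL DWORD 0x1234:0x56789abc.
The CALL FAR mem forms execute a far call by loading the destination address out of memory. The address loaded consists of 16 or 32 bits of offset (depending on the operand size), and 16 bits of segment.
The CALL r/m forms execute a near call (within the same segment), loading the destination address out of memory or out of a register.
As a convenience, NASM does not require you to call a far procedure
symbol by coding the cumbersome CALL SEG routine:routine, but instead allows the easier synonym CALL FAR routine.
The CALL r/m forms given above are near calls; NASM will accept the NEAR keyword (e.g. CALL NEAR [address]), even though it is not strictly necessary.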
CBW CWD CDQ CWDE CBW ; o16 98 [8086]
CWDE ; o32 98 [386]
CWD ; o16 99 [8086]
CDQ ; o32 99 [386]
All these instructions sign-extend a short value into a longer one, by replicating the top bit of the original value to fill the extended one.
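The replication of the top bit can be made concrete with a small Python model. The function names mirror the mnemonics but are otherwise hypothetical; CWD and CDQ return their result as a (high, low) pair to stand for DX:AX and EDX:EAX.

```python
def sign_extend(value, from_bits, to_bits):
    """Replicate the top bit of a from_bits value up to to_bits."""
    value &= (1 << from_bits) - 1
    if value >> (from_bits - 1):    # top bit set: fill the upper bits with ones
        value |= ((1 << to_bits) - 1) ^ ((1 << from_bits) - 1)
    return value

def cbw(al):
    return sign_extend(al, 8, 16)           # AL -> AX

def cwde(ax):
    return sign_extend(ax, 16, 32)          # AX -> EAX

def cwd(ax):
    full = sign_extend(ax, 16, 32)
    return full >> 16, full & 0xFFFF        # (DX, AX)

def cdq(eax):
    full = sign_extend(eax, 32, 64)
    return full >> 32, full & 0xFFFFFFFF    # (EDX, EAX)
```

So cbw(0x80) yields 0xFF80, while cwd(0x8000) yields (0xFFFF, 0x8000).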
CLC CLD CLI CLTS CLC ; F8 [8086]
CLD ; FC [8086]
CLI ; FA [8086]
CLTS ; 0F 06 [286,PRIV]
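These instructions clear various flags. CLC clears the carry flag; CLD clears the direction flag; CLI clears the interrupt flag (thus disabling interrupts); and CLTS clears the task-switched (TS) flag in CR0.
To set the carry, direction, or interrupt flags, use the STC, STD and STI instructions. To invert the carry flag, use CMC.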
CLFLUSH CLFLUSH mem ; 0F AE /7 [WILLAMETTE,SSE2]
Although 
CMC CMC ; F5 [8086]
CMOVcc CMOVcc reg16,r/m16 ; o16 0F 40+cc /r [P6]
CMOVcc reg32,r/m32 ; o32 0F 40+cc /r [P6]
For a list of condition codes, see section B.2.2.
Although the 
CMP CMP r/m8,reg8 ; 38 /r [8086]
CMP r/m16,reg16 ; o16 39 /r [8086]
CMP r/m32,reg32 ; o32 39 /r [386]
CMP reg8,r/m8 ; 3A /r [8086]
CMP reg16,r/m16 ; o16 3B /r [8086]
CMP reg32,r/m32 ; o32 3B /r [386]
CMP r/m8,imm8 ; 80 /7 ib [8086]
CMP r/m16,imm16 ; o16 81 /7 iw [8086]
CMP r/m32,imm32 ; o32 81 /7 id [386]
CMP r/m16,imm8 ; o16 83 /7 ib [8086]
CMP r/m32,imm8 ; o32 83 /7 ib [386]
CMP AL,imm8 ; 3C ib [8086]
CMP AX,imm16 ; o16 3D iw [8086]
CMP EAX,imm32 ; o32 3D id [386]
In the forms with an 8-bit immediate second operand and a longer
first operand, the second operand is considered to be signed, and is
sign-extended to the length of the first operand. In these cases, the BYTE qualifier is necessary to force NASM to generate this form of the instruction.
The destination operand can be a register or a memory location. The source can be a register, memory location or an immediate value of the same size as the destination.
CMPccPD CMPPD xmm1,xmm2/mem128,imm8 ; 66 0F C2 /r ib [WILLAMETTE,SSE2]
CMPEQPD xmm1,xmm2/mem128 ; 66 0F C2 /r 00 [WILLAMETTE,SSE2]
CMPLTPD xmm1,xmm2/mem128 ; 66 0F C2 /r 01 [WILLAMETTE,SSE2]
CMPLEPD xmm1,xmm2/mem128 ; 66 0F C2 /r 02 [WILLAMETTE,SSE2]
CMPUNORDPD xmm1,xmm2/mem128 ; 66 0F C2 /r 03 [WILLAMETTE,SSE2]
CMPNEQPD xmm1,xmm2/mem128 ; 66 0F C2 /r 04 [WILLAMETTE,SSE2]
CMPNLTPD xmm1,xmm2/mem128 ; 66 0F C2 /r 05 [WILLAMETTE,SSE2]
CMPNLEPD xmm1,xmm2/mem128 ; 66 0F C2 /r 06 [WILLAMETTE,SSE2]
CMPORDPD xmm1,xmm2/mem128 ; 66 0F C2 /r 07 [WILLAMETTE,SSE2]
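The CMPccPD instructions compare the two packed double-precision FP values in the source and destination operands, and return the result of the comparison in the destination register. The result of each comparison is a quadword mask of all 1s (comparison true) or all 0s (comparison false).
The destination is an XMM register. The source can be either an XMM register or a 128-bit memory location.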
The third operand is an 8-bit immediate value, of which the low 3
bits define the type of comparison. For ease of programming, the 8
two-operand pseudo-instructions are provided, with the third operand
already filled in. The condition codes are:
Predicate imm8 Description
EQ 0 Equal
LT 1 Less-than
LE 2 Less-than-or-equal
UNORD 3 Unordered
NEQ 4 Not-equal
NLT 5 Not-less-than
NLE 6 Not-less-than-or-equal
ORD 7 Ordered
For more details of the comparison predicates, and details of how to emulate the "greater-than" equivalents, see section B.2.3.
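The eight predicates differ mainly in how they treat unordered operands (those involving a NaN). The Python sketch below models the packed double-precision comparison and relies on the fact that Python's own ordered comparisons are false when either side is a NaN; the names PREDICATES, ALL_ONES_64 and cmppd are hypothetical and chosen for this example.

```python
import math

ALL_ONES_64 = 0xFFFFFFFFFFFFFFFF

def _unordered(a, b):
    return math.isnan(a) or math.isnan(b)

# Keyed by the low 3 bits of the immediate operand.
PREDICATES = {
    0: lambda a, b: a == b,                 # EQ
    1: lambda a, b: a < b,                  # LT
    2: lambda a, b: a <= b,                 # LE
    3: lambda a, b: _unordered(a, b),       # UNORD
    4: lambda a, b: not (a == b),           # NEQ (not-equal)
    5: lambda a, b: not (a < b),            # NLT
    6: lambda a, b: not (a <= b),           # NLE
    7: lambda a, b: not _unordered(a, b),   # ORD
}

def cmppd(dst, src, imm8):
    """Return one 64-bit mask per lane: all 1s if true, all 0s if false."""
    pred = PREDICATES[imm8 & 7]             # only the low 3 bits are used
    return [ALL_ONES_64 if pred(d, s) else 0 for d, s in zip(dst, src)]
```

In this model a NaN operand makes EQ, LT and LE false and makes NEQ, NLT and NLE true.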
CMPccPS CMPPS xmm1,xmm2/mem128,imm8 ; 0F C2 /r ib [KATMAI,SSE]
CMPEQPS xmm1,xmm2/mem128 ; 0F C2 /r 00 [KATMAI,SSE]
CMPLTPS xmm1,xmm2/mem128 ; 0F C2 /r 01 [KATMAI,SSE]
CMPLEPS xmm1,xmm2/mem128 ; 0F C2 /r 02 [KATMAI,SSE]
CMPUNORDPS xmm1,xmm2/mem128 ; 0F C2 /r 03 [KATMAI,SSE]
CMPNEQPS xmm1,xmm2/mem128 ; 0F C2 /r 04 [KATMAI,SSE]
CMPNLTPS xmm1,xmm2/mem128 ; 0F C2 /r 05 [KATMAI,SSE]
CMPNLEPS xmm1,xmm2/mem128 ; 0F C2 /r 06 [KATMAI,SSE]
CMPORDPS xmm1,xmm2/mem128 ; 0F C2 /r 07 [KATMAI,SSE]
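The CMPccPS instructions compare the four packed single-precision FP values in the source and destination operands, and return the result of the comparison in the destination register. The result of each comparison is a doubleword mask of all 1s (comparison true) or all 0s (comparison false).
The destination is an XMM register. The source can be either an XMM register or a 128-bit memory location.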
The third operand is an 8-bit immediate value, of which the low 3
bits define the type of comparison. For ease of programming, the 8
two-operand pseudo-instructions are provided, with the third operand
already filled in. The condition codes are:
Predicate imm8 Description
EQ 0 Equal
LT 1 Less-than
LE 2 Less-than-or-equal
UNORD 3 Unordered
NEQ 4 Not-equal
NLT 5 Not-less-than
NLE 6 Not-less-than-or-equal
ORD 7 Ordered
For more details of the comparison predicates, and details of how to emulate the "greater-than" equivalents, see section B.2.3.
CMPSB CMPSW CMPSD CMPSB ; A6 [8086]
CMPSW ; o16 A7 [8086]
CMPSD ; o32 A7 [386]
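The registers used are SI and DI if the address size is 16 bits, and ESI and EDI if it is 32 bits.
The segment register used to load from [SI] or [ESI] can be overridden by using a segment register name as a prefix (for example, ES CMPSB). The use of ES for the load from [DI] or [EDI] cannot be overridden.
The REPE and REPNE prefixes (equivalently, REPZ and REPNZ) may be used to repeat the instruction up to CX (or ECX, depending on the address size) times, until the first unequal or equal byte is found.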
CMPccSD CMPSD xmm1,xmm2/mem64,imm8 ; F2 0F C2 /r ib [WILLAMETTE,SSE2]
CMPEQSD xmm1,xmm2/mem64 ; F2 0F C2 /r 00 [WILLAMETTE,SSE2]
CMPLTSD xmm1,xmm2/mem64 ; F2 0F C2 /r 01 [WILLAMETTE,SSE2]
CMPLESD xmm1,xmm2/mem64 ; F2 0F C2 /r 02 [WILLAMETTE,SSE2]
CMPUNORDSD xmm1,xmm2/mem64 ; F2 0F C2 /r 03 [WILLAMETTE,SSE2]
CMPNEQSD xmm1,xmm2/mem64 ; F2 0F C2 /r 04 [WILLAMETTE,SSE2]
CMPNLTSD xmm1,xmm2/mem64 ; F2 0F C2 /r 05 [WILLAMETTE,SSE2]
CMPNLESD xmm1,xmm2/mem64 ; F2 0F C2 /r 06 [WILLAMETTE,SSE2]
CMPORDSD xmm1,xmm2/mem64 ; F2 0F C2 /r 07 [WILLAMETTE,SSE2]
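The CMPccSD instructions compare the low-order double-precision FP values in the source and destination operands, and return the result of the comparison in the destination register. The result of the comparison is a quadword mask of all 1s (comparison true) or all 0s (comparison false).
The destination is an XMM register. The source can be either an XMM register or a 64-bit memory location.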
The third operand is an 8-bit immediate value, of which the low 3
bits define the type of comparison. For ease of programming, the 8
two-operand pseudo-instructions are provided, with the third operand
already filled in. The condition codes are:
Predicate imm8 Description
EQ 0 Equal
LT 1 Less-than
LE 2 Less-than-or-equal
UNORD 3 Unordered
NEQ 4 Not-equal
NLT 5 Not-less-than
NLE 6 Not-less-than-or-equal
ORD 7 Ordered
For more details of the comparison predicates, and details of how to emulate the "greater-than" equivalents, see section B.2.3.
CMPccSS CMPSS xmm1,xmm2/mem32,imm8 ; F3 0F C2 /r ib [KATMAI,SSE]
CMPEQSS xmm1,xmm2/mem32 ; F3 0F C2 /r 00 [KATMAI,SSE]
CMPLTSS xmm1,xmm2/mem32 ; F3 0F C2 /r 01 [KATMAI,SSE]
CMPLESS xmm1,xmm2/mem32 ; F3 0F C2 /r 02 [KATMAI,SSE]
CMPUNORDSS xmm1,xmm2/mem32 ; F3 0F C2 /r 03 [KATMAI,SSE]
CMPNEQSS xmm1,xmm2/mem32 ; F3 0F C2 /r 04 [KATMAI,SSE]
CMPNLTSS xmm1,xmm2/mem32 ; F3 0F C2 /r 05 [KATMAI,SSE]
CMPNLESS xmm1,xmm2/mem32 ; F3 0F C2 /r 06 [KATMAI,SSE]
CMPORDSS xmm1,xmm2/mem32 ; F3 0F C2 /r 07 [KATMAI,SSE]
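The CMPccSS instructions compare the low-order single-precision FP values in the source and destination operands, and return the result of the comparison in the destination register. The result of the comparison is a doubleword mask of all 1s (comparison true) or all 0s (comparison false).
The destination is an XMM register. The source can be either an XMM register or a 32-bit memory location.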
The third operand is an 8-bit immediate value, of which the low 3
bits define the type of comparison. For ease of programming, the 8
two-operand pseudo-instructions are provided, with the third operand
already filled in. The condition codes are:
Predicate imm8 Description
EQ 0 Equal
LT 1 Less-than
LE 2 Less-than-or-equal
UNORD 3 Unordered
NEQ 4 Not-equal
NLT 5 Not-less-than
NLE 6 Not-less-than-or-equal
ORD 7 Ordered
For more details of the comparison predicates, and details of how to emulate the "greater-than" equivalents, see section B.2.3.
CMPXCHG CMPXCHG486 CMPXCHG r/m8,reg8 ; 0F B0 /r [PENT]
CMPXCHG r/m16,reg16 ; o16 0F B1 /r [PENT]
CMPXCHG r/m32,reg32 ; o32 0F B1 /r [PENT]
CMPXCHG486 r/m8,reg8 ; 0F A6 /r [486,UNDOC]
CMPXCHG486 r/m16,reg16 ; o16 0F A7 /r [486,UNDOC]
CMPXCHG486 r/m32,reg32 ; o32 0F A7 /r [486,UNDOC]
These two instructions perform exactly the same operation; however,
apparently some (not all) 486 processors support it under a non-standard
opcode, so NASM provides the undocumented CMPXCHG486 form to generate the non-standard opcode.
The destination can be either a register or a memory location. The source is a register.
CMPXCHG8B CMPXCHG8B mem ; 0F C7 /1 [PENT]
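This is a larger and more unwieldy version of CMPXCHG: it compares the 64-bit (eight-byte) value stored at [mem] with the value in EDX:EAX. If they are equal, it sets the zero flag and stores ECX:EBX into the memory area. If they are unequal, it clears the zero flag and stores the memory contents into EDX:EAX.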
COMISD COMISD xmm1,xmm2/mem64 ; 66 0F 2F /r [WILLAMETTE,SSE2]
The destination operand is an XMM register. The source can be either an XMM register or a 64-bit memory location.
The flags are set according to the following rules:
Result Flags Values
UNORDERED: ZF,PF,CF <-- 111;
GREATER_THAN: ZF,PF,CF <-- 000;
LESS_THAN: ZF,PF,CF <-- 001;
EQUAL: ZF,PF,CF <-- 100;
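The flag table above translates directly into a small Python model. The function name comisd_flags is hypothetical; the result is returned as a (ZF, PF, CF) tuple and an unordered result is detected by testing either operand for NaN.

```python
import math

def comisd_flags(dst, src):
    """Return (ZF, PF, CF) as set by comparing dst with src."""
    if math.isnan(dst) or math.isnan(src):
        return (1, 1, 1)    # UNORDERED
    if dst > src:
        return (0, 0, 0)    # GREATER_THAN
    if dst < src:
        return (0, 0, 1)    # LESS_THAN
    return (1, 0, 0)        # EQUAL
```

Since CF and ZF carry the ordering, the result can be consumed with the same conditions as an unsigned integer comparison, with PF distinguishing the unordered case.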
COMISS COMISS xmm1,xmm2/mem32 ; 0F 2F /r [KATMAI,SSE]
The destination operand is an XMM register. The source can be either an XMM register or a 32-bit memory location.
The flags are set according to the following rules:
Result Flags Values
UNORDERED: ZF,PF,CF <-- 111;
GREATER_THAN: ZF,PF,CF <-- 000;
LESS_THAN: ZF,PF,CF <-- 001;
EQUAL: ZF,PF,CF <-- 100;
CPUID CPUID ; 0F A2 [PENT]
The information returned is as follows:
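If EAX is zero on input, EAX on output holds the maximum acceptable input value of EAX, and EBX:EDX:ECX contain the string "GenuineIntel" (or not, if you have a clone processor). That is to say, EBX contains "Genu", EDX contains "ineI" and ECX contains "ntel".
If EAX is one on input, EAX on output contains version information about the processor, and EDX contains a set of feature flags, showing the presence and absence of various features. For example, one bit indicates whether the CMPXCHG8B instruction is supported, and another whether MMX instructions are supported.
If EAX is two on input, EAX, EBX, ECX and EDX all contain information about caches and TLBs.
For more information on the data returned from CPUID, see the documentation from Intel and other processor manufacturers.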
CVTDQ2PD CVTDQ2PD xmm1,xmm2/mem64 ; F3 0F E6 /r [WILLAMETTE,SSE2]
The destination operand is an XMM register. The source can be either an XMM register or a 64-bit memory location.
CVTDQ2PS CVTDQ2PS xmm1,xmm2/mem128 ; 0F 5B /r [WILLAMETTE,SSE2]
The destination operand is an XMM register. The source can be either an XMM register or a 128-bit memory location.
For more details of this instruction, see the Intel Processor manuals.
CVTPD2DQ CVTPD2DQ xmm1,xmm2/mem128 ; F2 0F E6 /r [WILLAMETTE,SSE2]
The destination operand is an XMM register. The source can be either an XMM register or a 128-bit memory location.
For more details of this instruction, see the Intel Processor manuals.
CVTPD2PI CVTPD2PI mm,xmm/mem128 ; 66 0F 2D /r [WILLAMETTE,SSE2]
The destination operand is an MMX register. The source can be either an XMM register or a 128-bit memory location.
For more details of this instruction, see the Intel Processor manuals.
CVTPD2PS CVTPD2PS xmm1,xmm2/mem128 ; 66 0F 5A /r [WILLAMETTE,SSE2]
The destination operand is an XMM register. The source can be either an XMM register or a 128-bit memory location.
For more details of this instruction, see the Intel Processor manuals.
CVTPI2PD CVTPI2PD xmm,mm/mem64 ; 66 0F 2A /r [WILLAMETTE,SSE2]
The destination operand is an XMM register. The source can be either an MMX register or a 64-bit memory location.
For more details of this instruction, see the Intel Processor manuals.
CVTPI2PS CVTPI2PS xmm,mm/mem64 ; 0F 2A /r [KATMAI,SSE]
The destination operand is an XMM register. The source can be either an MMX register or a 64-bit memory location.
For more details of this instruction, see the Intel Processor manuals.
CVTPS2DQ CVTPS2DQ xmm1,xmm2/mem128 ; 66 0F 5B /r [WILLAMETTE,SSE2]
The destination operand is an XMM register. The source can be either an XMM register or a 128-bit memory location.
For more details of this instruction, see the Intel Processor manuals.
CVTPS2PD CVTPS2PD xmm1,xmm2/mem64 ; 0F 5A /r [WILLAMETTE,SSE2]
The destination operand is an XMM register. The source can be either an XMM register or a 64-bit memory location.
For more details of this instruction, see the Intel Processor manuals.
CVTPS2PI CVTPS2PI mm,xmm/mem64 ; 0F 2D /r [KATMAI,SSE]
The destination operand is an MMX register. The source can be either an XMM register or a 64-bit memory location.
For more details of this instruction, see the Intel Processor manuals.
CVTSD2SI CVTSD2SI reg32,xmm/mem64 ; F2 0F 2D /r [WILLAMETTE,SSE2]
The destination operand is a general purpose register. The source
can be either an XMM register or a 64-bit memory location.
For more details of this instruction, see the Intel Processor manuals.
CVTSD2SS CVTSD2SS xmm1,xmm2/mem64 ; F2 0F 5A /r [WILLAMETTE,SSE2]
The destination operand is an XMM register. The source can be either an XMM register or a 64-bit memory location.
For more details of this instruction, see the Intel Processor manuals.
CVTSI2SD CVTSI2SD xmm,r/m32 ; F2 0F 2A /r [WILLAMETTE,SSE2]
The destination operand is an XMM register. The source can be either a general purpose register or a 32-bit memory location.
For more details of this instruction, see the Intel Processor manuals.
CVTSI2SS CVTSI2SS xmm,r/m32 ; F3 0F 2A /r [KATMAI,SSE]
The destination operand is an XMM register. The source can be either a general purpose register or a 32-bit memory location.
For more details of this instruction, see the Intel Processor manuals.
CVTSS2SD CVTSS2SD xmm1,xmm2/mem32 ; F3 0F 5A /r [WILLAMETTE,SSE2]
The destination operand is an XMM register. The source can be either an XMM register or a 32-bit memory location.
For more details of this instruction, see the Intel Processor manuals.
CVTSS2SI CVTSS2SI reg32,xmm/mem32 ; F3 0F 2D /r [KATMAI,SSE]
The destination operand is a general purpose register. The source
can be either an XMM register or a 32-bit memory location.
For more details of this instruction, see the Intel Processor manuals.
CVTTPD2DQ CVTTPD2DQ xmm1,xmm2/mem128 ; 66 0F E6 /r [WILLAMETTE,SSE2]
The destination operand is an XMM register. The source can be either an XMM register or a 128-bit memory location.
For more details of this instruction, see the Intel Processor manuals.
CVTTPD2PI CVTTPD2PI mm,xmm/mem128 ; 66 0F 2C /r [WILLAMETTE,SSE2]
The destination operand is an MMX register. The source can be either an XMM register or a 128-bit memory location.
For more details of this instruction, see the Intel Processor manuals.
CVTTPS2DQ CVTTPS2DQ xmm1,xmm2/mem128 ; F3 0F 5B /r [WILLAMETTE,SSE2]
The destination operand is an XMM register. The source can be either an XMM register or a 128-bit memory location.
For more details of this instruction, see the Intel Processor manuals.
CVTTPS2PI CVTTPS2PI mm,xmm/mem64 ; 0F 2C /r [KATMAI,SSE]
The destination operand is an MMX register. The source can be either an XMM register or a 64-bit memory location.
For more details of this instruction, see the Intel Processor manuals.
CVTTSD2SI CVTTSD2SI reg32,xmm/mem64 ; F2 0F 2C /r [WILLAMETTE,SSE2]
The destination operand is a general purpose register. The source
can be either an XMM register or a 64-bit memory location.
For more details of this instruction, see the Intel Processor manuals.
CVTTSS2SI CVTTSS2SI reg32,xmm/mem32 ; F3 0F 2C /r [KATMAI,SSE]
The destination operand is a general purpose register. The source
can be either an XMM register or a 32-bit memory location.
For more details of this instruction, see the Intel Processor manuals.
DAA DAS DAA ; 27 [8086]
DAS ; 2F [8086]
These instructions are used in conjunction with the add and subtract instructions to perform binary-coded decimal arithmetic in packed (one BCD digit per nibble) form. For the unpacked equivalents, see section B.4.1.
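To show what the packed adjustment does in practice, the Python sketch below models an 8-bit ADD followed by DAA. The adjustment steps follow the commonly documented behaviour of DAA (add 6 if the low digit overflowed, add 0x60 if the high digit overflowed); the function names are hypothetical and only AL, CF and AF are tracked.

```python
def daa(al, cf=0, af=0):
    """Model DAA: adjust AL after adding two packed-BCD bytes.

    Returns (al, cf, af)."""
    old_al, old_cf = al, cf
    if (al & 0x0F) > 9 or af:
        al = (al + 6) & 0xFF        # fix up the low decimal digit
        af = 1
    else:
        af = 0
    if old_al > 0x99 or old_cf:
        al = (al + 0x60) & 0xFF     # fix up the high decimal digit
        cf = 1
    else:
        cf = 0
    return al, cf, af

def packed_bcd_add(a, b):
    """Add two packed-BCD bytes the way ADD followed by DAA would."""
    raw = a + b
    cf = raw >> 8
    af = 1 if (a & 0x0F) + (b & 0x0F) > 0x0F else 0
    return daa(raw & 0xFF, cf, af)
```

For example, packed_bcd_add(0x38, 0x45) returns (0x83, 0, 1), matching the decimal sum 38 + 45 = 83, and packed_bcd_add(0x99, 0x01) returns (0x00, 1, 1), with the carry flag standing for the hundreds digit.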
DEC DEC reg16 ; o16 48+r [8086]
DEC reg32 ; o32 48+r [386]
DEC r/m8 ; FE /1 [8086]
DEC r/m16 ; o16 FF /1 [8086]
DEC r/m32 ; o32 FF /1 [386]
This instruction can be used with a LOCK prefix to allow atomic execution.
See also INC.
DIV DIV r/m8 ; F6 /6 [8086]
DIV r/m16 ; o16 F7 /6 [8086]
DIV r/m32 ; o32 F7 /6 [386]
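For DIV r/m8, AX is divided by the given operand; the quotient is stored in AL and the remainder in AH.
For DIV r/m16, DX:AX is divided by the given operand; the quotient is stored in AX and the remainder in DX.
For DIV r/m32, EDX:EAX is divided by the given operand; the quotient is stored in EAX and the remainder in EDX.
Signed integer division is performed by the IDIV instruction.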
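The relationship between the double-width dividend and the two result registers can be sketched in Python as follows. The function name is hypothetical, and the two Python exceptions stand in for the processor's divide error, which is raised both for a zero divisor and for a quotient too large for the destination.

```python
def div(dividend, divisor, width):
    """Model unsigned DIV with a width-bit operand (8, 16 or 32).

    `dividend` is the double-width value (AX, DX:AX or EDX:EAX).
    Returns (quotient, remainder)."""
    if divisor == 0:
        raise ZeroDivisionError("divide error: divisor is zero")
    quotient, remainder = divmod(dividend, divisor)
    if quotient >> width:
        raise OverflowError("divide error: quotient does not fit")
    return quotient, remainder
```

For DIV r/m8 with AX = 100 and an operand of 7, div(100, 7, 8) returns (14, 2), i.e. AL = 14 and AH = 2.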
DIVPD DIVPD xmm1,xmm2/mem128 ; 66 0F 5E /r [WILLAMETTE,SSE2]
The destination is an XMM register. The source operand can be either an XMM register or a 128-bit memory location.
dst[0-63] := dst[0-63] / src[0-63],
dst[64-127] := dst[64-127] / src[64-127].
DIVPS DIVPS xmm1,xmm2/mem128 ; 0F 5E /r [KATMAI,SSE]
The destination is an XMM register. The source operand can be either an XMM register or a 128-bit memory location.
dst[0-31] := dst[0-31] / src[0-31],
dst[32-63] := dst[32-63] / src[32-63],
dst[64-95] := dst[64-95] / src[64-95],
dst[96-127] := dst[96-127] / src[96-127].
DIVSD DIVSD xmm1,xmm2/mem64 ; F2 0F 5E /r [WILLAMETTE,SSE2]
The destination is an XMM register. The source operand can be either an XMM register or a 64-bit memory location.
dst[0-63] := dst[0-63] / src[0-63],
dst[64-127] remains unchanged.
DIVSS DIVSS xmm1,xmm2/mem32 ; F3 0F 5E /r [KATMAI,SSE]
The destination is an XMM register. The source operand can be either an XMM register or a 32-bit memory location.
dst[0-31] := dst[0-31] / src[0-31],
dst[32-127] remains unchanged.
EMMS EMMS ; 0F 77 [PENT,MMX]
ENTER ENTER imm,imm ; C8 iw ib [186]
The function of ENTER, with a nesting level of zero, is equivalent to:
PUSH EBP ; or PUSH BP in 16 bits
MOV EBP,ESP ; or MOV BP,SP in 16 bits
SUB ESP,operand1 ; or SUB SP,operand1 in 16 bits
This creates a stack frame with the procedure parameters accessible
upwards from EBP, and local variables accessible downwards from EBP.
With a nesting level of one, the stack frame created is 4 (or 2)
bytes bigger, and the value of the final frame pointer 
This allows 
Stack frames created by ENTER can be destroyed by the LEAVE instruction.
F2XM1 F2XM1 ; D9 F0 [8086,FPU]
FABS FABS ; D9 E1 [8086,FPU]
FADD FADDP FADD mem32 ; D8 /0 [8086,FPU]
FADD mem64 ; DC /0 [8086,FPU]
FADD fpureg ; D8 C0+r [8086,FPU]
FADD ST0,fpureg ; D8 C0+r [8086,FPU]
FADD TO fpureg ; DC C0+r [8086,FPU]
FADD fpureg,ST0 ; DC C0+r [8086,FPU]
FADDP fpureg ; DE C0+r [8086,FPU]
FADDP fpureg,ST0 ; DE C0+r [8086,FPU]
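FADD, given one operand, adds the operand to ST0 and stores the result back in ST0. If the operand has the TO modifier, the result is stored in the register given rather than in ST0.
FADDP performs the same function as FADD TO, but pops the register stack after storing the result.
The given two-operand forms are synonyms for the one-operand forms.
To add an integer value to ST0, use the FIADD instruction.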
FBLD FBSTP FBLD mem80 ; DF /4 [8086,FPU]
FBSTP mem80 ; DF /6 [8086,FPU]
FCHS FCHS ; D9 E0 [8086,FPU]
FCLEX FNCLEX FCLEX ; 9B DB E2 [8086,FPU]
FNCLEX ; DB E2 [8086,FPU]
FCMOVcc FCMOVB fpureg ; DA C0+r [P6,FPU]
FCMOVB ST0,fpureg ; DA C0+r [P6,FPU]
FCMOVE fpureg ; DA C8+r [P6,FPU]
FCMOVE ST0,fpureg ; DA C8+r [P6,FPU]
FCMOVBE fpureg ; DA D0+r [P6,FPU]
FCMOVBE ST0,fpureg ; DA D0+r [P6,FPU]
FCMOVU fpureg ; DA D8+r [P6,FPU]
FCMOVU ST0,fpureg ; DA D8+r [P6,FPU]
FCMOVNB fpureg ; DB C0+r [P6,FPU]
FCMOVNB ST0,fpureg ; DB C0+r [P6,FPU]
FCMOVNE fpureg ; DB C8+r [P6,FPU]
FCMOVNE ST0,fpureg ; DB C8+r [P6,FPU]
FCMOVNBE fpureg ; DB D0+r [P6,FPU]
FCMOVNBE ST0,fpureg ; DB D0+r [P6,FPU]
FCMOVNU fpureg ; DB D8+r [P6,FPU]
FCMOVNU ST0,fpureg ; DB D8+r [P6,FPU]
The 
The conditions are not the same as the standard condition codes used
with conditional jump instructions. The conditions 
The 
Although the 
FCOM FCOMP FCOMPP FCOMI FCOMIP FCOM mem32 ; D8 /2 [8086,FPU]
FCOM mem64 ; DC /2 [8086,FPU]
FCOM fpureg ; D8 D0+r [8086,FPU]
FCOM ST0,fpureg ; D8 D0+r [8086,FPU]
FCOMP mem32 ; D8 /3 [8086,FPU]
FCOMP mem64 ; DC /3 [8086,FPU]
FCOMP fpureg ; D8 D8+r [8086,FPU]
FCOMP ST0,fpureg ; D8 D8+r [8086,FPU]
FCOMPP ; DE D9 [8086,FPU]
FCOMI fpureg ; DB F0+r [P6,FPU]
FCOMI ST0,fpureg ; DB F0+r [P6,FPU]
FCOMIP fpureg ; DF F0+r [P6,FPU]
FCOMIP ST0,fpureg ; DF F0+r [P6,FPU]
The 
FCOS FCOS ; D9 FF [386,FPU]
See also 
FDECSTP FDECSTP ; D9 F6 [8086,FPU]
FxDISI FxENI FDISI ; 9B DB E1 [8086,FPU]
FNDISI ; DB E1 [8086,FPU]
FENI ; 9B DB E0 [8086,FPU]
FNENI ; DB E0 [8086,FPU]
FDIV FDIVP FDIVR FDIVRP FDIV mem32 ; D8 /6 [8086,FPU]
FDIV mem64 ; DC /6 [8086,FPU]
FDIV fpureg ; D8 F0+r [8086,FPU]
FDIV ST0,fpureg ; D8 F0+r [8086,FPU]
FDIV TO fpureg ; DC F8+r [8086,FPU]
FDIV fpureg,ST0 ; DC F8+r [8086,FPU]
FDIVR mem32 ; D8 /7 [8086,FPU]
FDIVR mem64 ; DC /7 [8086,FPU]
FDIVR fpureg ; D8 F8+r [8086,FPU]
FDIVR ST0,fpureg ; D8 F8+r [8086,FPU]
FDIVR TO fpureg ; DC F0+r [8086,FPU]
FDIVR fpureg,ST0 ; DC F0+r [8086,FPU]
FDIVP fpureg ; DE F8+r [8086,FPU]
FDIVP fpureg,ST0 ; DE F8+r [8086,FPU]
FDIVRP fpureg ; DE F0+r [8086,FPU]
FDIVRP fpureg,ST0 ; DE F0+r [8086,FPU]
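FDIV divides ST0 by the given operand, and stores the result back in ST0, unless the TO qualifier is given, in which case it divides the given operand by ST0 and stores the result in the operand.
FDIVR does the same thing, but does the division the other way up: so if TO is not given, it divides the given operand by ST0 and stores the result in ST0, whereas if TO is given it divides ST0 by its operand and stores the result in the operand.
FDIVP operates like FDIV TO, but pops the register stack once it has finished.
FDIVRP operates like FDIVR TO, but pops the register stack once it has finished.
For FP/Integer divisions, see FIDIV.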
FEMMS FEMMS ; 0F 0E [PENT,3DNOW]
FFREE FFREE fpureg ; DD C0+r [8086,FPU]
FFREEP fpureg ; DF C0+r [286,FPU,UNDOC]
FIADD FIADD mem16 ; DE /0 [8086,FPU]
FIADD mem32 ; DA /0 [8086,FPU]
FICOM FICOMP FICOM mem16 ; DE /2 [8086,FPU]
FICOM mem32 ; DA /2 [8086,FPU]
FICOMP mem16 ; DE /3 [8086,FPU]
FICOMP mem32 ; DA /3 [8086,FPU]
FIDIV FIDIVR FIDIV mem16 ; DE /6 [8086,FPU]
FIDIV mem32 ; DA /6 [8086,FPU]
FIDIVR mem16 ; DE /7 [8086,FPU]
FIDIVR mem32 ; DA /7 [8086,FPU]
FILD FIST FISTP FILD mem16 ; DF /0 [8086,FPU]
FILD mem32 ; DB /0 [8086,FPU]
FILD mem64 ; DF /5 [8086,FPU]
FIST mem16 ; DF /2 [8086,FPU]
FIST mem32 ; DB /2 [8086,FPU]
FISTP mem16 ; DF /3 [8086,FPU]
FISTP mem32 ; DB /3 [8086,FPU]
FISTP mem64 ; DF /7 [8086,FPU]
FIMUL FIMUL mem16 ; DE /1 [8086,FPU]
FIMUL mem32 ; DA /1 [8086,FPU]
FINCSTP FINCSTP ; D9 F7 [8086,FPU]
FINIT FNINIT FINIT ; 9B DB E3 [8086,FPU]
FNINIT ; DB E3 [8086,FPU]
FISUB FISUB mem16 ; DE /4 [8086,FPU]
FISUB mem32 ; DA /4 [8086,FPU]
FISUBR mem16 ; DE /5 [8086,FPU]
FISUBR mem32 ; DA /5 [8086,FPU]
FLD FLD mem32 ; D9 /0 [8086,FPU]
FLD mem64 ; DD /0 [8086,FPU]
FLD mem80 ; DB /5 [8086,FPU]
FLD fpureg ; D9 C0+r [8086,FPU]
FLDxx FLD1 ; D9 E8 [8086,FPU]
FLDL2E ; D9 EA [8086,FPU]
FLDL2T ; D9 E9 [8086,FPU]
FLDLG2 ; D9 EC [8086,FPU]
FLDLN2 ; D9 ED [8086,FPU]
FLDPI ; D9 EB [8086,FPU]
FLDZ ; D9 EE [8086,FPU]
These instructions push specific standard constants on the FPU register stack.
Instruction Constant pushed
FLD1 1
FLDL2E base-2 logarithm of e
FLDL2T base-2 log of 10
FLDLG2 base-10 log of 2
FLDLN2 base-e log of 2
FLDPI pi
FLDZ zero
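For reference, the constants in the table can be reproduced with Python's math module, as sketched below. The dictionary name is hypothetical, and the values are double-precision approximations, whereas the FPU pushes them in its own extended-precision format.

```python
import math

# Double-precision approximations of the constants pushed by FLDxx.
FPU_CONSTANTS = {
    "FLD1": 1.0,
    "FLDL2E": math.log2(math.e),    # base-2 logarithm of e
    "FLDL2T": math.log2(10.0),      # base-2 logarithm of 10
    "FLDLG2": math.log10(2.0),      # base-10 logarithm of 2
    "FLDLN2": math.log(2.0),        # natural logarithm of 2
    "FLDPI": math.pi,
    "FLDZ": 0.0,
}
```

The logarithmic constants come in reciprocal pairs: the FLDL2E and FLDLN2 values multiply to 1, as do the FLDL2T and FLDLG2 values, which is what makes them useful as scale factors when converting between logarithm bases.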
FLDCW FLDCW mem16 ; D9 /5 [8086,FPU]
FLDENV FLDENV mem ; D9 /4 [8086,FPU]
FMUL FMULP FMUL mem32 ; D8 /1 [8086,FPU]
FMUL mem64 ; DC /1 [8086,FPU]
FMUL fpureg ; D8 C8+r [8086,FPU]
FMUL ST0,fpureg ; D8 C8+r [8086,FPU]
FMUL TO fpureg ; DC C8+r [8086,FPU]
FMUL fpureg,ST0 ; DC C8+r [8086,FPU]
FMULP fpureg ; DE C8+r [8086,FPU]
FMULP fpureg,ST0 ; DE C8+r [8086,FPU]
FNOP FNOP ; D9 D0 [8086,FPU]
FPATAN FPTAN FPATAN ; D9 F3 [8086,FPU]
FPTAN ; D9 F2 [8086,FPU]
The absolute value of ST0 must be less than 2^63.
FPREM FPREM1 FPREM ; D9 F8 [8086,FPU]
FPREM1 ; D9 F5 [386,FPU]
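These instructions both produce the remainder obtained by dividing ST0 by ST1. This is calculated, notionally, by dividing ST0 by ST1, rounding the result to an integer, multiplying by ST1 again, and computing the value which would need to be added back on to the result to get back to the original value in ST0.
The two instructions differ in the way the notional round-to-integer
operation is performed. FPREM does it by rounding towards zero, so that the remainder it returns always has the same sign as the original value in ST0; FPREM1 does it by rounding to the nearest integer, so that the remainder always has at most half the magnitude of ST1.
Both instructions calculate partial remainders, meaning
that they may not manage to provide the final result, but might leave
intermediate results in ST0 instead. If this happens, they will set the C2 flag in the FPU status word; therefore, to calculate a remainder, you should repeatedly execute FPREM or FPREM1 until C2 becomes clear.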
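The difference between the two rounding rules is easiest to see on concrete values. The Python sketch below models only the final result of each instruction, not the partial-remainder iteration, using math.fmod (quotient rounded towards zero) and math.remainder (quotient rounded to nearest, ties to even; available from Python 3.7). The function names are hypothetical.

```python
import math

def fprem(st0, st1):
    """Final FPREM result: remainder with the quotient rounded towards zero."""
    return math.fmod(st0, st1)

def fprem1(st0, st1):
    """Final FPREM1 result: remainder with the quotient rounded to nearest."""
    return math.remainder(st0, st1)
```

With ST0 = 7.0 and ST1 = 4.0, fprem gives 3.0 (quotient 1) while fprem1 gives -1.0 (quotient 2).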
FRNDINT FRNDINT ; D9 FC [8086,FPU]
FSAVE FRSTOR FSAVE mem ; 9B DD /6 [8086,FPU]
FNSAVE mem ; DD /6 [8086,FPU]
FRSTOR mem ; DD /4 [8086,FPU]
FSCALE FSCALE ; D9 FD [8086,FPU]
FSETPM FSETPM ; DB E4 [286,FPU]
This instruction initialises protected mode on the 287 floating-point coprocessor. It is only meaningful on that processor: the 387 and above treat the instruction as a no-operation.
FSIN FSINCOS FSIN ; D9 FE [386,FPU]
FSINCOS ; D9 FB [386,FPU]
The absolute value of ST0 must be less than 2^63.
FSQRT FSQRT ; D9 FA [8086,FPU]
FST FSTP FST mem32 ; D9 /2 [8086,FPU]
FST mem64 ; DD /2 [8086,FPU]
FST fpureg ; DD D0+r [8086,FPU]
FSTP mem32 ; D9 /3 [8086,FPU]
FSTP mem64 ; DD /3 [8086,FPU]
FSTP mem80 ; DB /7 [8086,FPU]
FSTP fpureg ; DD D8+r [8086,FPU]
FSTCW FSTCW mem16 ; 9B D9 /7 [8086,FPU]
FNSTCW mem16 ; D9 /7 [8086,FPU]
FSTENV FSTENV mem ; 9B D9 /6 [8086,FPU]
FNSTENV mem ; D9 /6 [8086,FPU]
FSTSW FSTSW mem16 ; 9B DD /7 [8086,FPU]
FSTSW AX ; 9B DF E0 [286,FPU]
FNSTSW mem16 ; DD /7 [8086,FPU]
FNSTSW AX ; DF E0 [286,FPU]
FSUB FSUBP FSUBR FSUBRP FSUB mem32 ; D8 /4 [8086,FPU]
FSUB mem64 ; DC /4 [8086,FPU]
FSUB fpureg ; D8 E0+r [8086,FPU]
FSUB ST0,fpureg ; D8 E0+r [8086,FPU]
FSUB TO fpureg ; DC E8+r [8086,FPU]
FSUB fpureg,ST0 ; DC E8+r [8086,FPU]
FSUBR mem32 ; D8 /5 [8086,FPU]
FSUBR mem64 ; DC /5 [8086,FPU]
FSUBR fpureg ; D8 E8+r [8086,FPU]
FSUBR ST0,fpureg ; D8 E8+r [8086,FPU]
FSUBR TO fpureg ; DC E0+r [8086,FPU]
FSUBR fpureg,ST0 ; DC E0+r [8086,FPU]
FSUBP fpureg ; DE E8+r [8086,FPU]
FSUBP fpureg,ST0 ; DE E8+r [8086,FPU]
FSUBRP fpureg ; DE E0+r [8086,FPU]
FSUBRP fpureg,ST0 ; DE E0+r [8086,FPU]
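FSUB subtracts the given operand from ST0 and stores the result back in ST0, unless the TO qualifier is given, in which case it subtracts ST0 from the given operand and stores the result in the operand.
FSUBR does the same thing, but does the subtraction the other way up: so if TO is not given, it subtracts ST0 from the given operand and stores the result in ST0, whereas if TO is given it subtracts its operand from ST0 and stores the result in the operand.
FSUBP operates like FSUB TO, but pops the register stack once it has finished.
FSUBRP operates like FSUBR TO, but pops the register stack once it has finished.
FTST FTST ; D9 E4 [8086,FPU]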
FUCOMxx FUCOM fpureg ; DD E0+r [386,FPU]
FUCOM ST0,fpureg ; DD E0+r [386,FPU]
FUCOMP fpureg ; DD E8+r [386,FPU]
FUCOMP ST0,fpureg ; DD E8+r [386,FPU]
FUCOMPP ; DA E9 [386,FPU]
FUCOMI fpureg ; DB E8+r [P6,FPU]
FUCOMI ST0,fpureg ; DB E8+r [P6,FPU]
FUCOMIP fpureg ; DF E8+r [P6,FPU]
FUCOMIP ST0,fpureg ; DF E8+r [P6,FPU]
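FUCOM compares ST0 with the given operand, and sets the FPU flags accordingly. ST0 is treated as the left-hand side of the comparison, so that the carry flag is set (for a "less-than" result) if ST0 is less than the given operand.
FUCOMP does the same as FUCOM, but pops the register stack afterwards. FUCOMPP compares ST0 with ST1 and then pops the register stack twice.
FUCOMI and FUCOMIP work like the corresponding forms of FUCOM and FUCOMP, but write their results directly to the CPU flags register rather than the FPU status word, so they can be immediately followed by conditional jump or conditional move instructions.
The FUCOM instructions differ from the FCOM instructions only in the way they handle quiet NaNs: FUCOM will handle them silently and set the condition code flags to an "unordered" result, whereas FCOM will generate an exception.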
FXAM ST0 FXAM ; D9 E5 [8086,FPU]
Register contents Flags
Unsupported format 000
NaN 001
Finite number 010
Infinity 011
Zero 100
Empty register 101
Denormal 110
Additionally, the C1 flag is set to the sign of the number.
FXCH FXCH ; D9 C9 [8086,FPU]
FXCH fpureg ; D9 C8+r [8086,FPU]
FXCH fpureg,ST0 ; D9 C8+r [8086,FPU]
FXCH ST0,fpureg ; D9 C8+r [8086,FPU]
FXRSTOR FP MMX SSE FXRSTOR memory ; 0F AE /1 [P6,SSE,FPU]
The 
FXSAVE FP MMX SSE FXSAVE memory ; 0F AE /0 [P6,SSE,FPU]
Unlike the 
FXTRACT FXTRACT ; D9 F4 [8086,FPU]
FYL2X FYL2XP1 FYL2X ; D9 F1 [8086,FPU]
FYL2XP1 ; D9 F9 [8086,FPU]
HLT HLT ; F4 [8086,PRIV]
On the 286 and later processors, this is a privileged instruction.
IBTS IBTS r/m16,reg16 ; o16 0F A7 /r [386,UNDOC]
IBTS r/m32,reg32 ; o32 0F A7 /r [386,UNDOC]
The implied operation of this instruction is:
IBTS r/m16,AX,CL,reg16
IBTS r/m32,EAX,CL,reg32
Writes a bit string from the source operand to the destination. 
IDIV IDIV r/m8 ; F6 /7 [8086]
IDIV r/m16 ; o16 F7 /7 [8086]
IDIV r/m32 ; o32 F7 /7 [386]
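For IDIV r/m8, AX is divided by the given operand; the quotient is stored in AL and the remainder in AH.
For IDIV r/m16, DX:AX is divided by the given operand; the quotient is stored in AX and the remainder in DX.
For IDIV r/m32, EDX:EAX is divided by the given operand; the quotient is stored in EAX and the remainder in EDX.
Unsigned integer division is performed by the DIV instruction.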
IMUL IMUL r/m8 ; F6 /5 [8086]
IMUL r/m16 ; o16 F7 /5 [8086]
IMUL r/m32 ; o32 F7 /5 [386]
IMUL reg16,r/m16 ; o16 0F AF /r [386]
IMUL reg32,r/m32 ; o32 0F AF /r [386]
IMUL reg16,imm8 ; o16 6B /r ib [186]
IMUL reg16,imm16 ; o16 69 /r iw [186]
IMUL reg32,imm8 ; o32 6B /r ib [386]
IMUL reg32,imm32 ; o32 69 /r id [386]
IMUL reg16,r/m16,imm8 ; o16 6B /r ib [186]
IMUL reg16,r/m16,imm16 ; o16 69 /r iw [186]
IMUL reg32,r/m32,imm8 ; o32 6B /r ib [386]
IMUL reg32,r/m32,imm32 ; o32 69 /r id [386]
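For IMUL r/m8, AL is multiplied by the given operand; the product is stored in AX.
For IMUL r/m16, AX is multiplied by the given operand; the product is stored in DX:AX.
For IMUL r/m32, EAX is multiplied by the given operand; the product is stored in EDX:EAX.
The two-operand form multiplies its two operands and stores the result in the destination (first) operand. The three-operand form multiplies its last two operands and stores the result in the first operand.
The two-operand form with an immediate second operand is in fact a
shorthand for the three-operand form, as can be seen by examining the
opcode descriptions: in the two-operand form, the code /r takes both its register and r/m parts from the same operand.
In the forms with an 8-bit immediate operand and another longer
source operand, the immediate operand is considered to be signed, and is
sign-extended to the length of the other source operand. In these
cases, the BYTE qualifier is necessary to force NASM to generate this form of the instruction.
Unsigned integer multiplication is performed by the MUL instruction.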
IN IN AL,imm8 ; E4 ib [8086]
IN AX,imm8 ; o16 E5 ib [8086]
IN EAX,imm8 ; o32 E5 ib [386]
IN AL,DX ; EC [8086]
IN AX,DX ; o16 ED [8086]
IN EAX,DX ; o32 ED [386]
INC INC reg16 ; o16 40+r [8086]
INC reg32 ; o32 40+r [386]
INC r/m8 ; FE /0 [8086]
INC r/m16 ; o16 FF /0 [8086]
INC r/m32 ; o32 FF /0 [386]
This instruction can be used with a LOCK prefix to allow atomic execution.
See also DEC.
INSB INSW INSD INSB ; 6C [186]
INSW ; o16 6D [186]
INSD ; o32 6D [386]
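The register used is DI if the address size is 16 bits, and EDI if it is 32 bits.
Segment override prefixes have no effect for this instruction: the
use of ES for the destination [DI] or [EDI] cannot be overridden.
The REP prefix may be used to repeat the instruction CX (or ECX, depending on the address size) times.
See also OUTSB, OUTSW and OUTSD.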
INT INT imm8 ; CD ib [8086]
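The code generated by the INT instruction is always two bytes long: although there are short forms for some INT instructions, NASM does not generate them when it sees the INT mnemonic. In order to generate single-byte breakpoint instructions, use the INT3 or INT1 instructions instead.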
INT3 INT1 ICEBP INT01 INT1 ; F1 [P6]
ICEBP ; F1 [P6]
INT01 ; F1 [P6]
INT3 ; CC [8086]
INT03 ; CC [8086]
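INT1, and its alternative synonyms INT01 and ICEBP, is an instruction used by in-circuit emulators (ICEs). INT3 is the instruction normally used as a breakpoint by debuggers.
INT3, and its synonym INT03, is not precisely equivalent to INT 3: the short form, since it is designed to be used as a breakpoint, bypasses the normal IOPL checks in virtual-8086 mode, and also does not go through interrupt redirection.
INTO INTO ; CE [8086]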
INVD INVD ; 0F 08 [486]
INVLPG INVLPG mem ; 0F 01 /7 [486]
IRET IRETW IRETD IRET ; CF [8086]
IRETW ; o16 CF [8086]
IRETD ; o32 CF [386]
Jcc Jcc imm ; 70+cc rb [8086]
Jcc NEAR imm ; 0F 80+cc rw/rd [386]
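The conditional jump instructions execute a near (same segment) jump
if and only if their conditions are satisfied. For example, JNZ jumps only if the zero flag is not set.
The ordinary form of the instructions has only a 128-byte range; the NEAR form is a 386 extension to the instruction set, and can span the full size of a segment. NASM will not override your choice of jump instruction: if you want Jcc NEAR, you have to use the NEAR keyword.
The SHORT keyword is allowed on the first form of the instruction, for clarity, but is not necessary.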
For details of the condition codes, see section B.2.2.
JCXZ JECXZ JCXZ imm ; a16 E3 rb [8086]
JECXZ imm ; a32 E3 rb [386]
JMP JMP imm ; E9 rw/rd [8086]
JMP SHORT imm ; EB rb [8086]
JMP imm:imm16 ; o16 EA iw iw [8086]
JMP imm:imm32 ; o32 EA id iw [386]
JMP FAR mem ; o16 FF /5 [8086]
JMP FAR mem32 ; o32 FF /5 [386]
JMP r/m16 ; o16 FF /4 [8086]
JMP r/m32 ; o32 FF /4 [386]
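You can choose between the two immediate far jump forms (JMP imm:imm) by the use of the WORD and DWORD keywords: JMP WORD 0x1234:0x5678 or JMP DWORD 0x1234:0x56789abc.
The JMP FAR mem forms execute a far jump by loading the destination address out of memory. The address loaded consists of 16 or 32 bits of offset (depending on the operand size), and 16 bits of segment.
The JMP r/m forms execute a near jump (within the same segment), loading the destination address out of memory or out of a register.
As a convenience, NASM does not require you to jump to a far symbol
by coding the cumbersome JMP SEG routine:routine, but instead allows the easier synonym JMP FAR routine.
The JMP r/m forms given above are near jumps; NASM will accept the NEAR keyword (e.g. JMP NEAR [address]), even though it is not strictly necessary.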
LAHF LAHF ; 9F [8086]
The operation of LAHF is:
AH <-- SF:ZF:0:AF:0:PF:1:CF
See also SAHF.
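The bit layout given above for AH can be expressed as a short Python function. The name lahf and its argument order are assumptions for this sketch; each argument is a flag value of 0 or 1.

```python
def lahf(sf, zf, af, pf, cf):
    """Pack the flags into AH as SF:ZF:0:AF:0:PF:1:CF (bit 7 down to bit 0)."""
    return (sf << 7) | (zf << 6) | (af << 4) | (pf << 2) | (1 << 1) | cf
```

With all five flags clear the result is 0x02, because bit 1 always reads as one; with all five set it is 0xD7.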
LAR LAR reg16,r/m16 ; o16 0F 02 /r [286,PRIV]
LAR reg32,r/m32 ; o32 0F 02 /r [286,PRIV]
LDMXCSR LDMXCSR mem32 ; 0F AE /2 [KATMAI,SSE]
For details of the MXCSR register, see the Intel processor documentation.
See also STMXCSR.
LDS LES LFS LGS LSS LDS reg16,mem ; o16 C5 /r [8086]
LDS reg32,mem ; o32 C5 /r [386]
LES reg16,mem ; o16 C4 /r [8086]
LES reg32,mem ; o32 C4 /r [386]
LFS reg16,mem ; o16 0F B4 /r [386]
LFS reg32,mem ; o32 0F B4 /r [386]
LGS reg16,mem ; o16 0F B5 /r [386]
LGS reg32,mem ; o32 0F B5 /r [386]
LSS reg16,mem ; o16 0F B2 /r [386]
LSS reg32,mem ; o32 0F B2 /r [386]
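These instructions load an entire far pointer (16 or 32 bits of
offset, plus 16 bits of segment) out of memory in one go. LDS, for example, loads 16 or 32 bits from the given memory address into the given register (depending on the size of the register), then loads the next 16 bits from memory into DS. LES, LFS, LGS and LSS work in the same way but use the other segment registers.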
LEA LEA reg16,mem ; o16 8D /r [8086]
LEA reg32,mem ; o32 8D /r [386]
The size of the calculation is the current address size, and the size that the result is stored as is the current operand size. If the address and operand size are not the same, then if the addressing mode was 32-bits, the low 16-bits are stored, and if the address was 16-bits, it is zero-extended to 32-bits before storing.
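The interaction between address size and operand size described above can be modelled as two successive truncations. The Python sketch below assumes the effective address has already been computed as an integer; the function name and its parameters are hypothetical.

```python
def lea_store(effective_address, address_bits, operand_bits):
    """Value LEA stores, given the address size and operand size in bits."""
    # The calculation wraps at the address size.
    ea = effective_address & ((1 << address_bits) - 1)
    # A 32-bit address stored into a 16-bit register keeps the low 16 bits;
    # a 16-bit address stored into a 32-bit register is zero-extended.
    return ea & ((1 << operand_bits) - 1)
```

For example, lea_store(0x12345678, 32, 16) yields 0x5678, and lea_store(0x1FFFF, 16, 32) yields 0xFFFF, since the 16-bit calculation wraps before the result is zero-extended.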
LEAVE LEAVE ; C9 [186]
LFENCE LFENCE ; 0F AE /5 [WILLAMETTE,SSE2]
Weakly ordered memory types can be used to achieve higher processor
performance through such techniques as out-of-order issue and
speculative reads. The degree to which a consumer of data recognizes or
knows that the data is weakly ordered varies among applications and may
be unknown to the producer of this data. The 
Mod (7:6) = 11B
Reg/Opcode (5:3) = 101B
R/M (2:0) = 000B
All other ModRM encodings are defined to be reserved, and use of these encodings risks incompatibility with future processors.
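As a check on the three fields listed above, the following Python sketch assembles them into a single ModRM byte. The helper name modrm and the constant LFENCE_MODRM are hypothetical names used only for this illustration.

```python
def modrm(mod, reg, rm):
    """Assemble a ModRM byte from Mod (7:6), Reg/Opcode (5:3) and R/M (2:0)."""
    if not (0 <= mod <= 3 and 0 <= reg <= 7 and 0 <= rm <= 7):
        raise ValueError("field out of range")
    return (mod << 6) | (reg << 3) | rm

# Mod = 11B, Reg/Opcode = 101B, R/M = 000B, as given for this instruction.
LFENCE_MODRM = modrm(0b11, 0b101, 0b000)
```

In this model the fields combine to the byte 0xE8, which follows the 0F AE opcode bytes.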
See also 
LGDT LIDT LLDT LGDT mem ; 0F 01 /2 [286,PRIV]
LIDT mem ; 0F 01 /3 [286,PRIV]
LLDT r/m16 ; 0F 00 /2 [286,PRIV]
See also 
LMSW LMSW r/m16 ; 0F 01 /6 [286,PRIV]
LOADALL LOADALL286 LOADALL ; 0F 07 [386,UNDOC]
LOADALL286 ; 0F 05 [286,UNDOC]
This instruction, in its two different-opcode forms, is apparently supported on most 286 processors, some 386 and possibly some 486. The opcode differs between the 286 and the 386.
The function of the instruction is to load all information relating
to the state of the processor out of a block of memory: on the 286, this
block is located implicitly at absolute address 
LODSB LODSW LODSD LODSB ; AC [8086]
LODSW ; o16 AD [8086]
LODSD ; o32 AD [386]
The register used is 
The segment register used to load from 
LOOP LOOPE LOOPZ LOOPNE LOOPNZ LOOP imm ; E2 rb [8086]
LOOP imm,CX ; a16 E2 rb [8086]
LOOP imm,ECX ; a32 E2 rb [386]
LOOPE imm ; E1 rb [8086]
LOOPE imm,CX ; a16 E1 rb [8086]
LOOPE imm,ECX ; a32 E1 rb [386]
LOOPZ imm ; E1 rb [8086]
LOOPZ imm,CX ; a16 E1 rb [8086]
LOOPZ imm,ECX ; a32 E1 rb [386]
LOOPNE imm ; E0 rb [8086]
LOOPNE imm,CX ; a16 E0 rb [8086]
LOOPNE imm,ECX ; a32 E0 rb [386]
LOOPNZ imm ; E0 rb [8086]
LOOPNZ imm,CX ; a16 E0 rb [8086]
LOOPNZ imm,ECX ; a32 E0 rb [386]
LSL LSL reg16,r/m16 ; o16 0F 03 /r [286,PRIV]
LSL reg32,r/m32 ; o32 0F 03 /r [286,PRIV]
LTR LTR r/m16 ; 0F 00 /3 [286,PRIV]
MASKMOVDQU MASKMOVDQU xmm1,xmm2 ; 66 0F F7 /r [WILLAMETTE,SSE2]
MASKMOVQ MASKMOVQ mm1,mm2 ; 0F F7 /r [KATMAI,MMX]
MAXPD MAXPD xmm1,xmm2/m128 ; 66 0F 5F /r [WILLAMETTE,SSE2]
MAXPS MAXPS xmm1,xmm2/m128 ; 0F 5F /r [KATMAI,SSE]
MAXSD MAXSD xmm1,xmm2/m64 ; F2 0F 5F /r [WILLAMETTE,SSE2]
MAXSS MAXSS xmm1,xmm2/m32 ; F3 0F 5F /r [KATMAI,SSE]
MFENCE MFENCE ; 0F AE /6 [WILLAMETTE,SSE2]
Weakly ordered memory types can be used to achieve higher processor
performance through such techniques as out-of-order issue, speculative
reads, write-combining, and write-collapsing. The degree to which a
consumer of data recognizes or knows that the data is weakly ordered
varies among applications and may be unknown to the producer of this
data. The 
Mod (7:6) = 11B
Reg/Opcode (5:3) = 110B
R/M (2:0) = 000B
All other ModRM encodings are defined to be reserved, and use of these encodings risks incompatibility with future processors.
See also 
MINPD MINPD xmm1,xmm2/m128 ; 66 0F 5D /r [WILLAMETTE,SSE2]
MINPS MINPS xmm1,xmm2/m128 ; 0F 5D /r [KATMAI,SSE]
MINSD MINSD xmm1,xmm2/m64 ; F2 0F 5D /r [WILLAMETTE,SSE2]
MINSS MINSS xmm1,xmm2/m32 ; F3 0F 5D /r [KATMAI,SSE]
MOV MOV r/m8,reg8 ; 88 /r [8086]
MOV r/m16,reg16 ; o16 89 /r [8086]
MOV r/m32,reg32 ; o32 89 /r [386]
MOV reg8,r/m8 ; 8A /r [8086]
MOV reg16,r/m16 ; o16 8B /r [8086]
MOV reg32,r/m32 ; o32 8B /r [386]
MOV reg8,imm8 ; B0+r ib [8086]
MOV reg16,imm16 ; o16 B8+r iw [8086]
MOV reg32,imm32 ; o32 B8+r id [386]
MOV r/m8,imm8 ; C6 /0 ib [8086]
MOV r/m16,imm16 ; o16 C7 /0 iw [8086]
MOV r/m32,imm32 ; o32 C7 /0 id [386]
MOV AL,memoffs8 ; A0 ow/od [8086]
MOV AX,memoffs16 ; o16 A1 ow/od [8086]
MOV EAX,memoffs32 ; o32 A1 ow/od [386]
MOV memoffs8,AL ; A2 ow/od [8086]
MOV memoffs16,AX ; o16 A3 ow/od [8086]
MOV memoffs32,EAX ; o32 A3 ow/od [386]
MOV r/m16,segreg ; o16 8C /r [8086]
MOV r/m32,segreg ; o32 8C /r [386]
MOV segreg,r/m16 ; o16 8E /r [8086]
MOV segreg,r/m32 ; o32 8E /r [386]
MOV reg32,CR0/2/3/4 ; 0F 20 /r [386]
MOV reg32,DR0/1/2/3/6/7 ; 0F 21 /r [386]
MOV reg32,TR3/4/5/6/7 ; 0F 24 /r [386]
MOV CR0/2/3/4,reg32 ; 0F 22 /r [386]
MOV DR0/1/2/3/6/7,reg32 ; 0F 23 /r [386]
MOV TR3/4/5/6/7,reg32 ; 0F 26 /r [386]
In all forms of the 
Test registers are supported on 386/486 processors and on some non-Intel Pentium class processors.
MOVAPD MOVAPD xmm1,xmm2/mem128 ; 66 0F 28 /r [WILLAMETTE,SSE2]
MOVAPD xmm1/mem128,xmm2 ; 66 0F 29 /r [WILLAMETTE,SSE2]
To move data in and out of memory locations that are not known to be
on 16-byte boundaries, use the 
MOVAPS MOVAPS xmm1,xmm2/mem128 ; 0F 28 /r [KATMAI,SSE]
MOVAPS xmm1/mem128,xmm2 ; 0F 29 /r [KATMAI,SSE]
To move data in and out of memory locations that are not known to be
on 16-byte boundaries, use the 
MOVD MOVD mm,r/m32 ; 0F 6E /r [PENT,MMX]
MOVD r/m32,mm ; 0F 7E /r [PENT,MMX]
MOVD xmm,r/m32 ; 66 0F 6E /r [WILLAMETTE,SSE2]
MOVD r/m32,xmm ; 66 0F 7E /r [WILLAMETTE,SSE2]
MOVDQ2Q MOVDQ2Q mm,xmm ; F2 0F D6 /r [WILLAMETTE,SSE2]
MOVDQA MOVDQA xmm1,xmm2/m128 ; 66 0F 6F /r [WILLAMETTE,SSE2]
MOVDQA xmm1/m128,xmm2 ; 66 0F 7F /r [WILLAMETTE,SSE2]
To move a double quadword to or from unaligned memory locations, use
the 
MOVDQU MOVDQU xmm1,xmm2/m128 ; F3 0F 6F /r [WILLAMETTE,SSE2]
MOVDQU xmm1/m128,xmm2 ; F3 0F 7F /r [WILLAMETTE,SSE2]
To move a double quadword to or from known aligned memory locations,
use the 
MOVHLPS MOVHLPS xmm1,xmm2 ; 0F 12 /r [KATMAI,SSE]
The operation of this instruction is:
dst[0-63] := src[64-127],
dst[64-127] remains unchanged.
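The operation can be modelled in Python by holding each 128-bit register as an integer. The names movhlps and MASK64 are hypothetical and exist only for this sketch.

```python
MASK64 = (1 << 64) - 1

def movhlps(dst, src):
    """Model MOVHLPS on 128-bit integers: dst[0-63] := src[64-127]."""
    high_of_src = (src >> 64) & MASK64
    # Keep dst[64-127] unchanged and replace only the low quadword.
    return (dst & (MASK64 << 64)) | high_of_src
```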
MOVHPD MOVHPD xmm,m64 ; 66 0F 16 /r [WILLAMETTE,SSE2]
MOVHPD m64,xmm ; 66 0F 17 /r [WILLAMETTE,SSE2]
The operation of this instruction is:
mem[0-63] := xmm[64-127];
or
xmm[0-63] remains unchanged;
xmm[64-127] := mem[0-63].
MOVHPS MOVHPS xmm,m64 ; 0F 16 /r [KATMAI,SSE]
MOVHPS m64,xmm ; 0F 17 /r [KATMAI,SSE]
The operation of this instruction is:
mem[0-63] := xmm[64-127];
or
xmm[0-63] remains unchanged;
xmm[64-127] := mem[0-63].
MOVLHPS MOVLHPS xmm1,xmm2 ; 0F 16 /r [KATMAI,SSE]
The operation of this instruction is:
dst[0-63] remains unchanged;
dst[64-127] := src[0-63].
MOVLPD MOVLPD xmm,m64 ; 66 0F 12 /r [WILLAMETTE,SSE2]
MOVLPD m64,xmm ; 66 0F 13 /r [WILLAMETTE,SSE2]
The operation of this instruction is:
mem(0-63) := xmm(0-63);
or
xmm(0-63) := mem(0-63);
xmm(64-127) remains unchanged.
MOVLPS MOVLPS xmm,m64 ; 0F 12 /r [KATMAI,SSE]
MOVLPS m64,xmm ; 0F 13 /r [KATMAI,SSE]
The operation of this instruction is:
mem(0-63) := xmm(0-63);
or
xmm(0-63) := mem(0-63);
xmm(64-127) remains unchanged.
MOVMSKPD MOVMSKPD reg32,xmm ; 66 0F 50 /r [WILLAMETTE,SSE2]
MOVMSKPS MOVMSKPS reg32,xmm ; 0F 50 /r [KATMAI,SSE]
MOVNTDQ MOVNTDQ m128,xmm ; 66 0F E7 /r [WILLAMETTE,SSE2]
MOVNTI MOVNTI m32,reg32 ; 0F C3 /r [WILLAMETTE,SSE2]
MOVNTPD MOVNTPD m128,xmm ; 66 0F 2B /r [WILLAMETTE,SSE2]
MOVNTPS MOVNTPS m128,xmm ; 0F 2B /r [KATMAI,SSE]
MOVNTQ MOVNTQ m64,mm ; 0F E7 /r [KATMAI,MMX]
MOVQ MOVQ mm1,mm2/m64 ; 0F 6F /r [PENT,MMX]
MOVQ mm1/m64,mm2 ; 0F 7F /r [PENT,MMX]
MOVQ xmm1,xmm2/m64 ; F3 0F 7E /r [WILLAMETTE,SSE2]
MOVQ xmm1/m64,xmm2 ; 66 0F D6 /r [WILLAMETTE,SSE2]
MOVQ2DQ MOVQ2DQ xmm,mm ; F3 0F D6 /r [WILLAMETTE,SSE2]
MOVSB MOVSW MOVSD MOVSB ; A4 [8086]
MOVSW ; o16 A5 [8086]
MOVSD ; o32 A5 [386]
The registers used are 
The segment register used to load from 
The 
MOVSD MOVSD xmm1,xmm2/m64 ; F2 0F 10 /r [WILLAMETTE,SSE2]
MOVSD xmm1/m64,xmm2 ; F2 0F 11 /r [WILLAMETTE,SSE2]
MOVSS MOVSS xmm1,xmm2/m32 ; F3 0F 10 /r [KATMAI,SSE]
MOVSS xmm1/m32,xmm2 ; F3 0F 11 /r [KATMAI,SSE]
MOVSX MOVZX MOVSX reg16,r/m8 ; o16 0F BE /r [386]
MOVSX reg32,r/m8 ; o32 0F BE /r [386]
MOVSX reg32,r/m16 ; o32 0F BF /r [386]
MOVZX reg16,r/m8 ; o16 0F B6 /r [386]
MOVZX reg32,r/m8 ; o32 0F B6 /r [386]
MOVZX reg32,r/m16 ; o32 0F B7 /r [386]
MOVUPD MOVUPD xmm1,xmm2/mem128 ; 66 0F 10 /r [WILLAMETTE,SSE2]
MOVUPD xmm1/mem128,xmm2 ; 66 0F 11 /r [WILLAMETTE,SSE2]
To move data in and out of memory locations that are known to be on
16-byte boundaries, use the 
MOVUPS MOVUPS xmm1,xmm2/mem128 ; 0F 10 /r [KATMAI,SSE]
MOVUPS xmm1/mem128,xmm2 ; 0F 11 /r [KATMAI,SSE]
To move data in and out of memory locations that are known to be on
16-byte boundaries, use the 
MUL MUL r/m8 ; F6 /4 [8086]
MUL r/m16 ; o16 F7 /4 [8086]
MUL r/m32 ; o32 F7 /4 [386]
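MUL r/m8: AL is multiplied by the operand, and the product is stored in AX.
MUL r/m16: AX is multiplied by the operand, and the product is stored in DX:AX.
MUL r/m32: EAX is multiplied by the operand, and the product is stored in EDX:EAX.
Signed integer multiplication is performed by the 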
MULPD MULPD xmm1,xmm2/mem128 ; 66 0F 59 /r [WILLAMETTE,SSE2]
MULPS MULPS xmm1,xmm2/mem128 ; 0F 59 /r [KATMAI,SSE]
MULSD MULSD xmm1,xmm2/mem64 ; F2 0F 59 /r [WILLAMETTE,SSE2]
MULSS MULSS xmm1,xmm2/mem32 ; F3 0F 59 /r [KATMAI,SSE]
NEG NOT NEG r/m8 ; F6 /3 [8086]
NEG r/m16 ; o16 F7 /3 [8086]
NEG r/m32 ; o32 F7 /3 [386]
NOT r/m8 ; F6 /2 [8086]
NOT r/m16 ; o16 F7 /2 [8086]
NOT r/m32 ; o32 F7 /2 [386]
NOP NOP ; 90 [8086]
OR OR r/m8,reg8 ; 08 /r [8086]
OR r/m16,reg16 ; o16 09 /r [8086]
OR r/m32,reg32 ; o32 09 /r [386]
OR reg8,r/m8 ; 0A /r [8086]
OR reg16,r/m16 ; o16 0B /r [8086]
OR reg32,r/m32 ; o32 0B /r [386]
OR r/m8,imm8 ; 80 /1 ib [8086]
OR r/m16,imm16 ; o16 81 /1 iw [8086]
OR r/m32,imm32 ; o32 81 /1 id [386]
OR r/m16,imm8 ; o16 83 /1 ib [8086]
OR r/m32,imm8 ; o32 83 /1 ib [386]
OR AL,imm8 ; 0C ib [8086]
OR AX,imm16 ; o16 0D iw [8086]
OR EAX,imm32 ; o32 0D id [386]
In the forms with an 8-bit immediate second operand and a longer
first operand, the second operand is considered to be signed, and is
sign-extended to the length of the first operand. In these cases, the 
The MMX instruction 
ORPD ORPD xmm1,xmm2/m128 ; 66 0F 56 /r [WILLAMETTE,SSE2]
ORPS ORPS xmm1,xmm2/m128 ; 0F 56 /r [KATMAI,SSE]
OUT OUT imm8,AL ; E6 ib [8086]
OUT imm8,AX ; o16 E7 ib [8086]
OUT imm8,EAX ; o32 E7 ib [386]
OUT DX,AL ; EE [8086]
OUT DX,AX ; o16 EF [8086]
OUT DX,EAX ; o32 EF [386]
OUTSB OUTSW OUTSD OUTSB ; 6E [186]
OUTSW ; o16 6F [186]
OUTSD ; o32 6F [386]
The register used is 
The segment register used to load from 
The 
PACKSSDW PACKSSWB PACKUSWB PACKSSDW mm1,mm2/m64 ; 0F 6B /r [PENT,MMX]
PACKSSWB mm1,mm2/m64 ; 0F 63 /r [PENT,MMX]
PACKUSWB mm1,mm2/m64 ; 0F 67 /r [PENT,MMX]
PACKSSDW xmm1,xmm2/m128 ; 66 0F 6B /r [WILLAMETTE,SSE2]
PACKSSWB xmm1,xmm2/m128 ; 66 0F 63 /r [WILLAMETTE,SSE2]
PACKUSWB xmm1,xmm2/m128 ; 66 0F 67 /r [WILLAMETTE,SSE2]
All these instructions start by combining the source and destination
operands, then split the result into smaller sections, which they
then pack into the destination register. The 
PACKSSWB PACKSSDW PACKSSWB PACKUSWB PACKSSWB To perform signed saturation on a number, it is replaced by the
largest signed number (
PADDB PADDW PADDD PADDB mm1,mm2/m64 ; 0F FC /r [PENT,MMX]
PADDW mm1,mm2/m64 ; 0F FD /r [PENT,MMX]
PADDD mm1,mm2/m64 ; 0F FE /r [PENT,MMX]
PADDB xmm1,xmm2/m128 ; 66 0F FC /r [WILLAMETTE,SSE2]
PADDW xmm1,xmm2/m128 ; 66 0F FD /r [WILLAMETTE,SSE2]
PADDD xmm1,xmm2/m128 ; 66 0F FE /r [WILLAMETTE,SSE2]
PADDB PADDW PADDD When an individual result is too large to fit in its destination, it is wrapped around and the low bits are stored, with the carry bit discarded.
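The wrap-around behaviour can be sketched in Python, with each packed register held as an integer. The function name padd and its parameters are hypothetical; lane_bits would be 8, 16 or 32 for the byte, word and doubleword forms, and total_bits would be 64 or 128 depending on the register type.

```python
def padd(dst, src, lane_bits, total_bits=64):
    """Lane-wise wrapping add of two packed integers held as Python ints."""
    lane_mask = (1 << lane_bits) - 1
    result = 0
    for shift in range(0, total_bits, lane_bits):
        a = (dst >> shift) & lane_mask
        b = (src >> shift) & lane_mask
        # Masking the sum discards the carry out of each lane.
        result |= ((a + b) & lane_mask) << shift
    return result
```

Adding 0x01FF and 0x0101 as bytes gives 0x0200 in this model, because the low byte wraps and its carry is not propagated; adding them as words gives 0x0300.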
PADDQ PADDQ mm1,mm2/m64 ; 0F D4 /r [WILLAMETTE,SSE2]
PADDQ xmm1,xmm2/m128 ; 66 0F D4 /r [WILLAMETTE,SSE2]
When an individual result is too large to fit in its destination, it is wrapped around and the low bits are stored, with the carry bit discarded.
PADDSB PADDSW PADDSB mm1,mm2/m64 ; 0F EC /r [PENT,MMX]
PADDSW mm1,mm2/m64 ; 0F ED /r [PENT,MMX]
PADDSB xmm1,xmm2/m128 ; 66 0F EC /r [WILLAMETTE,SSE2]
PADDSW xmm1,xmm2/m128 ; 66 0F ED /r [WILLAMETTE,SSE2]
When an individual result is too large to fit in its destination, a saturated value is stored. The resulting value is the value with the largest magnitude of the same sign as the result which will fit in the available space.
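Signed saturation can be sketched in Python as a clamp applied to each lane. The names to_signed and padds are hypothetical; lane_bits would be 8 for the byte form and 16 for the word form.

```python
def to_signed(value, bits):
    """Interpret an unsigned lane value as a two's complement number."""
    return value - (1 << bits) if value & (1 << (bits - 1)) else value

def padds(dst, src, lane_bits, total_bits=64):
    """Lane-wise signed saturating add of two packed integers."""
    lowest = -(1 << (lane_bits - 1))
    highest = (1 << (lane_bits - 1)) - 1
    lane_mask = (1 << lane_bits) - 1
    result = 0
    for shift in range(0, total_bits, lane_bits):
        a = to_signed((dst >> shift) & lane_mask, lane_bits)
        b = to_signed((src >> shift) & lane_mask, lane_bits)
        # Clamp to the largest-magnitude value of the same sign that fits.
        total = max(lowest, min(highest, a + b))
        result |= (total & lane_mask) << shift
    return result
```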
PADDSIW PADDSIW mmxreg,r/m64 ; 0F 51 /r [CYRIX,MMX]
To work out the implied register, invert the lowest bit in the
register number. So 
PADDUSB PADDUSW PADDUSB mm1,mm2/m64 ; 0F DC /r [PENT,MMX]
PADDUSW mm1,mm2/m64 ; 0F DD /r [PENT,MMX]
PADDUSB xmm1,xmm2/m128 ; 66 0F DC /r [WILLAMETTE,SSE2]
PADDUSW xmm1,xmm2/m128 ; 66 0F DD /r [WILLAMETTE,SSE2]
When an individual result is too large to fit in its destination, a saturated value is stored. The resulting value is the maximum value that will fit in the available space.
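For the unsigned case, a minimal Python sketch only needs to cap each lane at its maximum value. The function name paddus is hypothetical, and registers are again modelled as plain integers.

```python
def paddus(dst, src, lane_bits, total_bits=64):
    """Lane-wise unsigned saturating add of two packed integers."""
    lane_max = (1 << lane_bits) - 1
    result = 0
    for shift in range(0, total_bits, lane_bits):
        a = (dst >> shift) & lane_max
        b = (src >> shift) & lane_max
        # A sum that does not fit is replaced by the largest lane value.
        result |= min(a + b, lane_max) << shift
    return result
```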
PAND PANDN PAND mm1,mm2/m64 ; 0F DB /r [PENT,MMX]
PANDN mm1,mm2/m64 ; 0F DF /r [PENT,MMX]
PAND xmm1,xmm2/m128 ; 66 0F DB /r [WILLAMETTE,SSE2]
PANDN xmm1,xmm2/m128 ; 66 0F DF /r [WILLAMETTE,SSE2]
PAUSE PAUSE ; F3 90 [WILLAMETTE,SSE2]
PAVEB PAVEB mmxreg,r/m64 ; 0F 50 /r [CYRIX,MMX]
This opcode maps to 
PAVGB PAVGW PAVGB mm1,mm2/m64 ; 0F E0 /r [KATMAI,MMX]
PAVGW mm1,mm2/m64 ; 0F E3 /r [KATMAI,MMX,SM]
PAVGB xmm1,xmm2/m128 ; 66 0F E0 /r [WILLAMETTE,SSE2]
PAVGW xmm1,xmm2/m128 ; 66 0F E3 /r [WILLAMETTE,SSE2]
PAVGB PAVGW
PAVGUSB PAVGUSB mm1,mm2/m64 ; 0F 0F /r BF [PENT,3DNOW]
This instruction performs exactly the same operations as the 
PCMPxx PCMPEQB mm1,mm2/m64 ; 0F 74 /r [PENT,MMX]
PCMPEQW mm1,mm2/m64 ; 0F 75 /r [PENT,MMX]
PCMPEQD mm1,mm2/m64 ; 0F 76 /r [PENT,MMX]
PCMPGTB mm1,mm2/m64 ; 0F 64 /r [PENT,MMX]
PCMPGTW mm1,mm2/m64 ; 0F 65 /r [PENT,MMX]
PCMPGTD mm1,mm2/m64 ; 0F 66 /r [PENT,MMX]
PCMPEQB xmm1,xmm2/m128 ; 66 0F 74 /r [WILLAMETTE,SSE2]
PCMPEQW xmm1,xmm2/m128 ; 66 0F 75 /r [WILLAMETTE,SSE2]
PCMPEQD xmm1,xmm2/m128 ; 66 0F 76 /r [WILLAMETTE,SSE2]
PCMPGTB xmm1,xmm2/m128 ; 66 0F 64 /r [WILLAMETTE,SSE2]
PCMPGTW xmm1,xmm2/m128 ; 66 0F 65 /r [WILLAMETTE,SSE2]
PCMPGTD xmm1,xmm2/m128 ; 66 0F 66 /r [WILLAMETTE,SSE2]
The 
PCMPxxB PCMPxxW PCMPxxD PCMPEQx PCMPGTx
PDISTIB PDISTIB mm,m64 ; 0F 54 /r [CYRIX,MMX]
To work out the implied register, invert the lowest bit in the
register number. So 
Note that 
Operation:
dstI[0-7] := dstI[0-7] + ABS(src0[0-7] - src1[0-7]),
dstI[8-15] := dstI[8-15] + ABS(src0[8-15] - src1[8-15]),
.......
.......
dstI[56-63] := dstI[56-63] + ABS(src0[56-63] - src1[56-63]).
PEXTRW PEXTRW reg32,mm,imm8 ; 0F C5 /r ib [KATMAI,MMX]
PEXTRW reg32,xmm,imm8 ; 66 0F C5 /r ib [WILLAMETTE,SSE2]
When the source operand is an 
PF2ID PF2ID mm1,mm2/m64 ; 0F 0F /r 1D [PENT,3DNOW]
PF2IW PF2IW mm1,mm2/m64 ; 0F 0F /r 1C [PENT,3DNOW]
PFACC PFACC mm1,mm2/m64 ; 0F 0F /r AE [PENT,3DNOW]
The operation is:
dst[0-31] := dst[0-31] + dst[32-63],
dst[32-63] := src[0-31] + src[32-63].
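The accumulate pattern can be sketched in Python by treating each operand as a (low, high) pair of single-precision values. The names f32 and pfacc are hypothetical; rounding through the struct module is just one way to approximate single precision here.

```python
import struct

def f32(x):
    """Round a Python float to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def pfacc(dst, src):
    """dst and src are (low, high) pairs; returns the new (low, high) of dst."""
    low = f32(dst[0] + dst[1])   # dst[0-31]  := dst[0-31] + dst[32-63]
    high = f32(src[0] + src[1])  # dst[32-63] := src[0-31] + src[32-63]
    return (low, high)
```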
PFADD PFADD mm1,mm2/m64 ; 0F 0F /r 9E [PENT,3DNOW]
dst[0-31] := dst[0-31] + src[0-31],
dst[32-63] := dst[32-63] + src[32-63].
PFCMPxx PFCMPEQ mm1,mm2/m64 ; 0F 0F /r B0 [PENT,3DNOW]
PFCMPGE mm1,mm2/m64 ; 0F 0F /r 90 [PENT,3DNOW]
PFCMPGT mm1,mm2/m64 ; 0F 0F /r A0 [PENT,3DNOW]
The 
PFCMPEQ PFCMPGE PFCMPGT
PFMAX PFMAX mm1,mm2/m64 ; 0F 0F /r A4 [PENT,3DNOW]
PFMIN PFMIN mm1,mm2/m64 ; 0F 0F /r 94 [PENT,3DNOW]
PFMUL PFMUL mm1,mm2/m64 ; 0F 0F /r B4 [PENT,3DNOW]
dst[0-31] := dst[0-31] * src[0-31],
dst[32-63] := dst[32-63] * src[32-63].
PFNACC PFNACC mm1,mm2/m64 ; 0F 0F /r 8A [PENT,3DNOW]
The operation is:
dst[0-31] := dst[0-31] - dst[32-63],
dst[32-63] := src[0-31] - src[32-63].
PFPNACC PFPNACC mm1,mm2/m64 ; 0F 0F /r 8E [PENT,3DNOW]
The operation is:
dst[0-31] := dst[0-31] - dst[32-63],
dst[32-63] := src[0-31] + src[32-63].
PFRCP PFRCP mm1,mm2/m64 ; 0F 0F /r 96 [PENT,3DNOW]
For higher precision reciprocals, this instruction should be
followed by two more instructions: 
PFRCPIT1 PFRCPIT1 mm1,mm2/m64 ; 0F 0F /r A6 [PENT,3DNOW]
For the final step in a reciprocal, returning the full 24-bit
accuracy of a single-precision FP value, see 
PFRCPIT2 PFRCPIT2 mm1,mm2/m64 ; 0F 0F /r B6 [PENT,3DNOW]
The first source value (
PFRSQIT1 PFRSQIT1 mm1,mm2/m64 ; 0F 0F /r A7 [PENT,3DNOW]
For the final step in a calculation, returning the full 24-bit
accuracy of a single-precision FP value, see 
PFRSQRT PFRSQRT mm1,mm2/m64 ; 0F 0F /r 97 [PENT,3DNOW]
For higher precision reciprocals, this instruction should be
followed by two more instructions: 
PFSUB PFSUB mm1,mm2/m64 ; 0F 0F /r 9A [PENT,3DNOW]
dst[0-31] := dst[0-31] - src[0-31],
dst[32-63] := dst[32-63] - src[32-63].
PFSUBR PFSUBR mm1,mm2/m64 ; 0F 0F /r AA [PENT,3DNOW]
dst[0-31] := src[0-31] - dst[0-31],
dst[32-63] := src[32-63] - dst[32-63].
PI2FD PI2FD mm1,mm2/m64 ; 0F 0F /r 0D [PENT,3DNOW]
PI2FW PI2FW mm1,mm2/m64 ; 0F 0F /r 0C [PENT,3DNOW]
PINSRW PINSRW mm,r16/r32/m16,imm8 ; 0F C4 /r ib [KATMAI,MMX]
PINSRW xmm,r16/r32/m16,imm8 ; 66 0F C4 /r ib [WILLAMETTE,SSE2]
PMACHRIW PMACHRIW mm,m64 ; 0F 5E /r [CYRIX,MMX]
The operation of this instruction is:
dstI[0-15] := dstI[0-15] + (mm[0-15] *m64[0-15]
+ 0x00004000)[15-30],
dstI[16-31] := dstI[16-31] + (mm[16-31]*m64[16-31]
+ 0x00004000)[15-30],
dstI[32-47] := dstI[32-47] + (mm[32-47]*m64[32-47]
+ 0x00004000)[15-30],
dstI[48-63] := dstI[48-63] + (mm[48-63]*m64[48-63]
+ 0x00004000)[15-30].
Note that 
PMADDWD PMADDWD mm1,mm2/m64 ; 0F F5 /r [PENT,MMX]
PMADDWD xmm1,xmm2/m128 ; 66 0F F5 /r [WILLAMETTE,SSE2]
The operation of this instruction is:
dst[0-31] := (dst[0-15] * src[0-15])
+ (dst[16-31] * src[16-31]);
dst[32-63] := (dst[32-47] * src[32-47])
+ (dst[48-63] * src[48-63]);
The following apply to the 
dst[64-95] := (dst[64-79] * src[64-79])
+ (dst[80-95] * src[80-95]);
dst[96-127] := (dst[96-111] * src[96-111])
+ (dst[112-127] * src[112-127]).
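The multiply-and-add pattern can be sketched in Python. The names s16 and pmaddwd are hypothetical, registers are modelled as integers, and the words are treated as signed 16-bit values, which the listing above does not state explicitly and is therefore an assumption of this sketch.

```python
def s16(value):
    """Interpret the low 16 bits of value as a signed word."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def pmaddwd(dst, src, total_bits=64):
    """Multiply word pairs and add adjacent products into doublewords."""
    result = 0
    for shift in range(0, total_bits, 32):
        low = s16(dst >> shift) * s16(src >> shift)
        high = s16(dst >> (shift + 16)) * s16(src >> (shift + 16))
        # Each doubleword of the result holds the sum of two products.
        result |= ((low + high) & 0xFFFFFFFF) << shift
    return result
```

Pass total_bits=128 to model the form that operates on 128-bit registers.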
PMAGW PMAGW mm1,mm2/m64 ; 0F 52 /r [CYRIX,MMX]
PMAXSW PMAXSW mm1,mm2/m64 ; 0F EE /r [KATMAI,MMX]
PMAXSW xmm1,xmm2/m128 ; 66 0F EE /r [WILLAMETTE,SSE2]
PMAXUB PMAXUB mm1,mm2/m64 ; 0F DE /r [KATMAI,MMX]
PMAXUB xmm1,xmm2/m128 ; 66 0F DE /r [WILLAMETTE,SSE2]
PMINSW PMINSW mm1,mm2/m64 ; 0F EA /r [KATMAI,MMX]
PMINSW xmm1,xmm2/m128 ; 66 0F EA /r [WILLAMETTE,SSE2]
PMINUB PMINUB mm1,mm2/m64 ; 0F DA /r [KATMAI,MMX]
PMINUB xmm1,xmm2/m128 ; 66 0F DA /r [WILLAMETTE,SSE2]
PMOVMSKB PMOVMSKB reg32,mm ; 0F D7 /r [KATMAI,MMX]
PMOVMSKB reg32,xmm ; 66 0F D7 /r [WILLAMETTE,SSE2]
PMULHRWC PMULHRIW PMULHRWC mm1,mm2/m64 ; 0F 59 /r [CYRIX,MMX]
PMULHRIW mm1,mm2/m64 ; 0F 5D /r [CYRIX,MMX]
These instructions take two packed 16-bit integer inputs, multiply the values in the inputs, round on bit 15 of each result, then store bits 15-30 of each result to the corresponding position of the destination register.
PMULHRWC PMULHRIW PADDSIW The operation of this instruction is:
dst[0-15] := (src1[0-15] *src2[0-15] + 0x00004000)[15-30]
dst[16-31] := (src1[16-31]*src2[16-31] + 0x00004000)[15-30]
dst[32-47] := (src1[32-47]*src2[32-47] + 0x00004000)[15-30]
dst[48-63] := (src1[48-63]*src2[48-63] + 0x00004000)[15-30]
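The rounding step can be sketched in Python. The names s16 and pmulhrw_cyrix are hypothetical, and the inputs are treated as signed 16-bit words, which is an assumption of this sketch.

```python
def s16(value):
    """Interpret the low 16 bits of value as a signed word."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def pmulhrw_cyrix(src1, src2):
    """Model (src1 * src2 + 0x00004000)[15-30] for each of four word lanes."""
    result = 0
    for shift in range(0, 64, 16):
        product = s16(src1 >> shift) * s16(src2 >> shift)
        rounded = (product + 0x4000) >> 15      # add the rounding constant, drop bits 0-14
        result |= (rounded & 0xFFFF) << shift   # keep bits 15-30 of the rounded product
    return result
```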
See also 
PMULHRWA PMULHRWA mm1,mm2/m64 ; 0F 0F /r B7 [PENT,3DNOW]
The operation of this instruction is:
dst[0-15] := (src1[0-15] *src2[0-15] + 0x00008000)[16-31];
dst[16-31] := (src1[16-31]*src2[16-31] + 0x00008000)[16-31];
dst[32-47] := (src1[32-47]*src2[32-47] + 0x00008000)[16-31];
dst[48-63] := (src1[48-63]*src2[48-63] + 0x00008000)[16-31].
See also 
PMULHUW PMULHUW mm1,mm2/m64 ; 0F E4 /r [KATMAI,MMX]
PMULHUW xmm1,xmm2/m128 ; 66 0F E4 /r [WILLAMETTE,SSE2]
PMULHW PMULLW PMULHW mm1,mm2/m64 ; 0F E5 /r [PENT,MMX]
PMULLW mm1,mm2/m64 ; 0F D5 /r [PENT,MMX]
PMULHW xmm1,xmm2/m128 ; 66 0F E5 /r [WILLAMETTE,SSE2]
PMULLW xmm1,xmm2/m128 ; 66 0F D5 /r [WILLAMETTE,SSE2]
PMULHW PMULLW
PMULUDQ PMULUDQ mm1,mm2/m64 ; 0F F4 /r [WILLAMETTE,SSE2]
PMULUDQ xmm1,xmm2/m128 ; 66 0F F4 /r [WILLAMETTE,SSE2]
The operation is:
dst[0-63] := dst[0-31] * src[0-31];
dst[64-127] := dst[64-95] * src[64-95].
PMVccZB PMVZB mmxreg,mem64 ; 0F 58 /r [CYRIX,MMX]
PMVNZB mmxreg,mem64 ; 0F 5A /r [CYRIX,MMX]
PMVLZB mmxreg,mem64 ; 0F 5B /r [CYRIX,MMX]
PMVGEZB mmxreg,mem64 ; 0F 5C /r [CYRIX,MMX]
These instructions, specific to the Cyrix MMX extensions, perform
parallel conditional moves. The two input operands are treated as
vectors of eight bytes. Each byte of the destination (first) operand is
either written from the corresponding byte of the source (second)
operand, or left alone, depending on the value of the byte in the implied
operand (specified in the same way as 
PMVZB PMVNZB PMVLZB PMVGEZB Note that these instructions cannot take a register as their second source operand.
POP POP reg16 ; o16 58+r [8086]
POP reg32 ; o32 58+r [386]
POP r/m16 ; o16 8F /0 [8086]
POP r/m32 ; o32 8F /0 [386]
POP CS ; 0F [8086,UNDOC]
POP DS ; 1F [8086]
POP ES ; 07 [8086]
POP SS ; 17 [8086]
POP FS ; 0F A1 [386]
POP GS ; 0F A9 [386]
The address-size attribute of the instruction determines whether 
The operand-size attribute of the instruction determines whether the
stack pointer is incremented by 2 or 4: this means that segment register
pops in 
The above opcode listings give two forms for general-purpose
register pop instructions: for example, 
POPAx POPA ; 61 [186]
POPAW ; o16 61 [186]
POPAD ; o32 61 [386]
POPAW DI SI BP SP BX DX CX AX PUSHAW SP PUSHAW POPAD EDI ESI EBP ESP EBX EDX ECX EAX PUSHAD 
Note that the registers are popped in reverse order of their numeric values in opcodes (see section B.2.1).
POPFx POPF ; 9D [8086]
POPFW ; o16 9D [8086]
POPFD ; o32 9D [386]
POPFW POPFD 
See also 
POR POR mm1,mm2/m64 ; 0F EB /r [PENT,MMX]
POR xmm1,xmm2/m128 ; 66 0F EB /r [WILLAMETTE,SSE2]
PREFETCH PREFETCH mem8 ; 0F 0D /0 [PENT,3DNOW]
PREFETCHW mem8 ; 0F 0D /1 [PENT,3DNOW]
For more details, see the 3DNow! Technology Manual.
PREFETCHh PREFETCHNTA m8 ; 0F 18 /0 [KATMAI]
PREFETCHT0 m8 ; 0F 18 /1 [KATMAI]
PREFETCHT1 m8 ; 0F 18 /2 [KATMAI]
PREFETCHT2 m8 ; 0F 18 /3 [KATMAI]
The 
The hints are:
T0 T1 T2 NTA
Note that this group of instructions doesn't provide a guarantee that the data will be in the cache when it is needed. For more details, see the Intel IA32 Software Developer Manual, Volume 2.
PSADBW PSADBW mm1,mm2/m64 ; 0F F6 /r [KATMAI,MMX]
PSADBW xmm1,xmm2/m128 ; 66 0F F6 /r [WILLAMETTE,SSE2]
PSHUFD PSHUFD xmm1,xmm2/m128,imm8 ; 66 0F 70 /r ib [WILLAMETTE,SSE2]
Bits 0 and 1 of imm8 encode the source position of the doubleword to be copied to position 0 in the destination operand. Bits 2 and 3 encode for position 1, bits 4 and 5 encode for position 2, and bits 6 and 7 encode for position 3. For example, an encoding of 10 in bits 0 and 1 of imm8 indicates that the doubleword at bits 64-95 of the source operand will be copied to bits 0-31 of the destination.
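The decoding of imm8 can be sketched in Python, with the 128-bit operands held as integers. The function name pshufd is used here only as a label for this hypothetical model.

```python
def pshufd(src, imm8):
    """Select four doublewords from src according to the 2-bit fields of imm8."""
    result = 0
    for position in range(4):
        selector = (imm8 >> (2 * position)) & 0b11     # two bits per destination slot
        dword = (src >> (32 * selector)) & 0xFFFFFFFF  # chosen source doubleword
        result |= dword << (32 * position)
    return result
```

In this model an imm8 of 0xE4 (fields 3, 2, 1, 0) leaves the value unchanged, and 0x1B reverses the order of the four doublewords.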
PSHUFHW PSHUFHW xmm1,xmm2/m128,imm8 ; F3 0F 70 /r ib [WILLAMETTE,SSE2]
The operation of this instruction is similar to the 
PSHUFLW PSHUFLW xmm1,xmm2/m128,imm8 ; F2 0F 70 /r ib [WILLAMETTE,SSE2]
The operation of this instruction is similar to the 
PSHUFW PSHUFW mm1,mm2/m64,imm8 ; 0F 70 /r ib [KATMAI,MMX]
Bits 0 and 1 of imm8 encode the source position of the word to be copied to position 0 in the destination operand. Bits 2 and 3 encode for position 1, bits 4 and 5 encode for position 2, and bits 6 and 7 encode for position 3. For example, an encoding of 10 in bits 0 and 1 of imm8 indicates that the word at bits 32-47 of the source operand will be copied to bits 0-15 of the destination.
PSLLx PSLLW mm1,mm2/m64 ; 0F F1 /r [PENT,MMX]
PSLLW mm,imm8 ; 0F 71 /6 ib [PENT,MMX]
PSLLW xmm1,xmm2/m128 ; 66 0F F1 /r [WILLAMETTE,SSE2]
PSLLW xmm,imm8 ; 66 0F 71 /6 ib [WILLAMETTE,SSE2]
PSLLD mm1,mm2/m64 ; 0F F2 /r [PENT,MMX]
PSLLD mm,imm8 ; 0F 72 /6 ib [PENT,MMX]
PSLLD xmm1,xmm2/m128 ; 66 0F F2 /r [WILLAMETTE,SSE2]
PSLLD xmm,imm8 ; 66 0F 72 /6 ib [WILLAMETTE,SSE2]
PSLLQ mm1,mm2/m64 ; 0F F3 /r [PENT,MMX]
PSLLQ mm,imm8 ; 0F 73 /6 ib [PENT,MMX]
PSLLQ xmm1,xmm2/m128 ; 66 0F F3 /r [WILLAMETTE,SSE2]
PSLLQ xmm,imm8 ; 66 0F 73 /6 ib [WILLAMETTE,SSE2]
PSLLDQ xmm1,imm8 ; 66 0F 73 /7 ib [WILLAMETTE,SSE2]
PSLLW PSLLD PSLLQ PSLLDQ
PSRAx PSRAW mm1,mm2/m64 ; 0F E1 /r [PENT,MMX]
PSRAW mm,imm8 ; 0F 71 /4 ib [PENT,MMX]
PSRAW xmm1,xmm2/m128 ; 66 0F E1 /r [WILLAMETTE,SSE2]
PSRAW xmm,imm8 ; 66 0F 71 /4 ib [WILLAMETTE,SSE2]
PSRAD mm1,mm2/m64 ; 0F E2 /r [PENT,MMX]
PSRAD mm,imm8 ; 0F 72 /4 ib [PENT,MMX]
PSRAD xmm1,xmm2/m128 ; 66 0F E2 /r [WILLAMETTE,SSE2]
PSRAD xmm,imm8 ; 66 0F 72 /4 ib [WILLAMETTE,SSE2]
PSRAW PSRAD
PSRLx PSRLW mm1,mm2/m64 ; 0F D1 /r [PENT,MMX]
PSRLW mm,imm8 ; 0F 71 /2 ib [PENT,MMX]
PSRLW xmm1,xmm2/m128 ; 66 0F D1 /r [WILLAMETTE,SSE2]
PSRLW xmm,imm8 ; 66 0F 71 /2 ib [WILLAMETTE,SSE2]
PSRLD mm1,mm2/m64 ; 0F D2 /r [PENT,MMX]
PSRLD mm,imm8 ; 0F 72 /2 ib [PENT,MMX]
PSRLD xmm1,xmm2/m128 ; 66 0F D2 /r [WILLAMETTE,SSE2]
PSRLD xmm,imm8 ; 66 0F 72 /2 ib [WILLAMETTE,SSE2]
PSRLQ mm1,mm2/m64 ; 0F D3 /r [PENT,MMX]
PSRLQ mm,imm8 ; 0F 73 /2 ib [PENT,MMX]
PSRLQ xmm1,xmm2/m128 ; 66 0F D3 /r [WILLAMETTE,SSE2]
PSRLQ xmm,imm8 ; 66 0F 73 /2 ib [WILLAMETTE,SSE2]
PSRLDQ xmm1,imm8 ; 66 0F 73 /3 ib [WILLAMETTE,SSE2]
PSRLW PSRLD PSRLQ PSRLDQ
PSUBx PSUBB mm1,mm2/m64 ; 0F F8 /r [PENT,MMX]
PSUBW mm1,mm2/m64 ; 0F F9 /r [PENT,MMX]
PSUBD mm1,mm2/m64 ; 0F FA /r [PENT,MMX]
PSUBQ mm1,mm2/m64 ; 0F FB /r [WILLAMETTE,SSE2]
PSUBB xmm1,xmm2/m128 ; 66 0F F8 /r [WILLAMETTE,SSE2]
PSUBW xmm1,xmm2/m128 ; 66 0F F9 /r [WILLAMETTE,SSE2]
PSUBD xmm1,xmm2/m128 ; 66 0F FA /r [WILLAMETTE,SSE2]
PSUBQ xmm1,xmm2/m128 ; 66 0F FB /r [WILLAMETTE,SSE2]
PSUBB PSUBW PSUBD PSUBQ
PSUBSxx PSUBUSx PSUBSB mm1,mm2/m64 ; 0F E8 /r [PENT,MMX]
PSUBSW mm1,mm2/m64 ; 0F E9 /r [PENT,MMX]
PSUBSB xmm1,xmm2/m128 ; 66 0F E8 /r [WILLAMETTE,SSE2]
PSUBSW xmm1,xmm2/m128 ; 66 0F E9 /r [WILLAMETTE,SSE2]
PSUBUSB mm1,mm2/m64 ; 0F D8 /r [PENT,MMX]
PSUBUSW mm1,mm2/m64 ; 0F D9 /r [PENT,MMX]
PSUBUSB xmm1,xmm2/m128 ; 66 0F D8 /r [WILLAMETTE,SSE2]
PSUBUSW xmm1,xmm2/m128 ; 66 0F D9 /r [WILLAMETTE,SSE2]
PSUBSB PSUBSW PSUBUSB PSUBUSW
PSUBSIW PSUBSIW mm1,mm2/m64 ; 0F 55 /r [CYRIX,MMX]
PSWAPD PSWAPD mm1,mm2/m64 ; 0F 0F /r BB [PENT,3DNOW]
In the 
The operation in the 
dst[0-15] = src[48-63];
dst[16-31] = src[32-47];
dst[32-47] = src[16-31];
dst[48-63] = src[0-15].
The operation in the 
dst[0-31] = src[32-63];
dst[32-63] = src[0-31].
PUNPCKxxx PUNPCKHBW mm1,mm2/m64 ; 0F 68 /r [PENT,MMX]
PUNPCKHWD mm1,mm2/m64 ; 0F 69 /r [PENT,MMX]
PUNPCKHDQ mm1,mm2/m64 ; 0F 6A /r [PENT,MMX]
PUNPCKHBW xmm1,xmm2/m128 ; 66 0F 68 /r [WILLAMETTE,SSE2]
PUNPCKHWD xmm1,xmm2/m128 ; 66 0F 69 /r [WILLAMETTE,SSE2]
PUNPCKHDQ xmm1,xmm2/m128 ; 66 0F 6A /r [WILLAMETTE,SSE2]
PUNPCKHQDQ xmm1,xmm2/m128 ; 66 0F 6D /r [WILLAMETTE,SSE2]
PUNPCKLBW mm1,mm2/m32 ; 0F 60 /r [PENT,MMX]
PUNPCKLWD mm1,mm2/m32 ; 0F 61 /r [PENT,MMX]
PUNPCKLDQ mm1,mm2/m32 ; 0F 62 /r [PENT,MMX]
PUNPCKLBW xmm1,xmm2/m128 ; 66 0F 60 /r [WILLAMETTE,SSE2]
PUNPCKLWD xmm1,xmm2/m128 ; 66 0F 61 /r [WILLAMETTE,SSE2]
PUNPCKLDQ xmm1,xmm2/m128 ; 66 0F 62 /r [WILLAMETTE,SSE2]
PUNPCKLQDQ xmm1,xmm2/m128 ; 66 0F 6C /r [WILLAMETTE,SSE2]
The remaining elements are then interleaved into the destination, alternating elements from the second (source) operand and the first (destination) operand: so the leftmost part of each element in the result always comes from the second operand, and the rightmost from the destination.
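The interleaving can be sketched in Python for the 64-bit forms. The function name punpck and its parameters are hypothetical. The operand values in the usage note are assumptions too: they were chosen so that the outputs reproduce the result values quoted in this section.

```python
def punpck(dst, src, elem_bits, high, total_bits=64):
    """Interleave elements from the high or low half of dst and src."""
    half = total_bits // 2
    base = half if high else 0          # which half of each operand is kept
    mask = (1 << elem_bits) - 1
    result = 0
    out = 0
    for shift in range(base, base + half, elem_bits):
        result |= ((dst >> shift) & mask) << out                # destination element goes on the right
        result |= ((src >> shift) & mask) << (out + elem_bits)  # source element goes to its left
        out += 2 * elem_bits
    return result
```

With an assumed destination of 0x7A6A5A4A3A2A1A0A and an assumed source of 0x7B6B5B4B3B2B1B0B, punpck(dst, src, 8, True) gives 0x7B7A6B6A5B5A4B4A and punpck(dst, src, 16, False) gives 0x3B2B3A2A1B0B1A0A.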
PUNPCKxBW PUNPCKxWD PUNPCKxDQ PUNPCKxQDQ So, for example, for 
PUNPCKHBW 0x7B7A6B6A5B5A4B4A
PUNPCKHWD 0x7B6B7A6A5B4B5A4A
PUNPCKHDQ 0x7B6B5B4B7A6A5A4A
PUNPCKLBW 0x3B3A2B2A1B1A0B0A
PUNPCKLWD 0x3B2B3A2A1B0B1A0A
PUNPCKLDQ 0x3B2B1B0B3A2A1A0A
PUSH PUSH reg16 ; o16 50+r [8086]
PUSH reg32 ; o32 50+r [386]
PUSH r/m16 ; o16 FF /6 [8086]
PUSH r/m32 ; o32 FF /6 [386]
PUSH CS ; 0E [8086]
PUSH DS ; 1E [8086]
PUSH ES ; 06 [8086]
PUSH SS ; 16 [8086]
PUSH FS ; 0F A0 [386]
PUSH GS ; 0F A8 [386]
PUSH imm8 ; 6A ib [186]
PUSH imm16 ; o16 68 iw [186]
PUSH imm32 ; o32 68 id [386]
The address-size attribute of the instruction determines whether 
The operand-size attribute of the instruction determines whether the
stack pointer is decremented by 2 or 4: this means that segment register
pushes in 
The above opcode listings give two forms for general-purpose
register push instructions: for example, 
Unlike the undocumented and barely supported 
The instruction 
PUSHAx PUSHA ; 60 [186]
PUSHAD ; o32 60 [386]
PUSHAW ; o16 60 [186]
In both cases, the value of 
Note that the registers are pushed in order of their numeric values in opcodes (see section B.2.1).
See also 
PUSHFx PUSHF ; 9C [8086]
PUSHFD ; o32 9C [386]
PUSHFW ; o16 9C [8086]
PUSHFW PUSHFD 
See also 
PXOR PXOR mm1,mm2/m64 ; 0F EF /r [PENT,MMX]
PXOR xmm1,xmm2/m128 ; 66 0F EF /r [WILLAMETTE,SSE2]
RCL RCR RCL r/m8,1 ; D0 /2 [8086]
RCL r/m8,CL ; D2 /2 [8086]
RCL r/m8,imm8 ; C0 /2 ib [186]
RCL r/m16,1 ; o16 D1 /2 [8086]
RCL r/m16,CL ; o16 D3 /2 [8086]
RCL r/m16,imm8 ; o16 C1 /2 ib [186]
RCL r/m32,1 ; o32 D1 /2 [386]
RCL r/m32,CL ; o32 D3 /2 [386]
RCL r/m32,imm8 ; o32 C1 /2 ib [386]
RCR r/m8,1 ; D0 /3 [8086]
RCR r/m8,CL ; D2 /3 [8086]
RCR r/m8,imm8 ; C0 /3 ib [186]
RCR r/m16,1 ; o16 D1 /3 [8086]
RCR r/m16,CL ; o16 D3 /3 [8086]
RCR r/m16,imm8 ; o16 C1 /3 ib [186]
RCR r/m32,1 ; o32 D1 /3 [386]
RCR r/m32,CL ; o32 D3 /3 [386]
RCR r/m32,imm8 ; o32 C1 /3 ib [386]
The number of bits to rotate by is given by the second operand. Only the bottom five bits of the rotation count are considered by processors above the 8086.
You can force the longer (286 and upwards, beginning with a 
RCPPS RCPPS xmm1,xmm2/m128 ; 0F 53 /r [KATMAI,SSE]
RCPSS RCPSS xmm1,xmm2/m32 ; F3 0F 53 /r [KATMAI,SSE]
RDMSR RDMSR ; 0F 32 [PENT,PRIV]
RDPMC RDPMC ; 0F 33 [P6]
This instruction is available on P6 and later processors and on MMX class processors.
RDSHR RDSHR r/m32 ; 0F 36 /0 [386,CYRIX,SMM]
See also 
RDTSC RDTSC ; 0F 31 [PENT]
RET RETF RETN RET ; C3 [8086]
RET imm16 ; C2 iw [8086]
RETF ; CB [8086]
RETF imm16 ; CA iw [8086]
RETN ; C3 [8086]
RETN imm16 ; C2 iw [8086]
RET RETN IP EIP imm16 RETF IP EIP CS
ROL ROR ROL r/m8,1 ; D0 /0 [8086]
ROL r/m8,CL ; D2 /0 [8086]
ROL r/m8,imm8 ; C0 /0 ib [186]
ROL r/m16,1 ; o16 D1 /0 [8086]
ROL r/m16,CL ; o16 D3 /0 [8086]
ROL r/m16,imm8 ; o16 C1 /0 ib [186]
ROL r/m32,1 ; o32 D1 /0 [386]
ROL r/m32,CL ; o32 D3 /0 [386]
ROL r/m32,imm8 ; o32 C1 /0 ib [386]
ROR r/m8,1 ; D0 /1 [8086]
ROR r/m8,CL ; D2 /1 [8086]
ROR r/m8,imm8 ; C0 /1 ib [186]
ROR r/m16,1 ; o16 D1 /1 [8086]
ROR r/m16,CL ; o16 D3 /1 [8086]
ROR r/m16,imm8 ; o16 C1 /1 ib [186]
ROR r/m32,1 ; o32 D1 /1 [386]
ROR r/m32,CL ; o32 D3 /1 [386]
ROR r/m32,imm8 ; o32 C1 /1 ib [386]
The number of bits to rotate by is given by the second operand. Only the bottom five bits of the rotation count are considered by processors above the 8086.
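The count masking can be sketched in Python for a left rotate. The function name rol and its parameters are hypothetical. Reducing the masked count modulo the operand width is a modelling choice of this sketch, based on the fact that rotating by the full width leaves the value unchanged.

```python
def rol(value, count, bits=32):
    """Rotate value left within a bits-wide operand, masking the count."""
    count &= 0x1F   # only the bottom five bits of the count are considered
    count %= bits   # a rotation by the operand width is a no-op
    mask = (1 << bits) - 1
    value &= mask
    return ((value << count) | (value >> (bits - count))) & mask
```

A count of 33 therefore behaves like a count of 1 in this model.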
You can force the longer (286 and upwards, beginning with a 
RSDC RSDC segreg,m80 ; 0F 79 /r [486,CYRIX,SMM]
RSLDT RSLDT m80 ; 0F 7B /0 [486,CYRIX,SMM]
RSM RSM ; 0F AA [PENT]
RSQRTPS RSQRTPS xmm1,xmm2/m128 ; 0F 52 /r [KATMAI,SSE]
RSQRTSS RSQRTSS xmm1,xmm2/m32 ; F3 0F 52 /r [KATMAI,SSE]
RSTS RSTS m80 ; 0F 7D /0 [486,CYRIX,SMM]
SAHF SAHF ; 9E [8086]
The operation of 
AH --> SF:ZF:0:AF:0:PF:1:CF
See also 
SAL SAR SAL r/m8,1 ; D0 /4 [8086]
SAL r/m8,CL ; D2 /4 [8086]
SAL r/m8,imm8 ; C0 /4 ib [186]
SAL r/m16,1 ; o16 D1 /4 [8086]
SAL r/m16,CL ; o16 D3 /4 [8086]
SAL r/m16,imm8 ; o16 C1 /4 ib [186]
SAL r/m32,1 ; o32 D1 /4 [386]
SAL r/m32,CL ; o32 D3 /4 [386]
SAL r/m32,imm8 ; o32 C1 /4 ib [386]
SAR r/m8,1 ; D0 /7 [8086]
SAR r/m8,CL ; D2 /7 [8086]
SAR r/m8,imm8 ; C0 /7 ib [186]
SAR r/m16,1 ; o16 D1 /7 [8086]
SAR r/m16,CL ; o16 D3 /7 [8086]
SAR r/m16,imm8 ; o16 C1 /7 ib [186]
SAR r/m32,1 ; o32 D1 /7 [386]
SAR r/m32,CL ; o32 D3 /7 [386]
SAR r/m32,imm8 ; o32 C1 /7 ib [386]
The number of bits to shift by is given by the second operand. Only the bottom five bits of the shift count are considered by processors above the 8086.
You can force the longer (286 and upwards, beginning with a 
SALC SALC ; D6 [8086,UNDOC]
SBB SBB r/m8,reg8 ; 18 /r [8086]
SBB r/m16,reg16 ; o16 19 /r [8086]
SBB r/m32,reg32 ; o32 19 /r [386]
SBB reg8,r/m8 ; 1A /r [8086]
SBB reg16,r/m16 ; o16 1B /r [8086]
SBB reg32,r/m32 ; o32 1B /r [386]
SBB r/m8,imm8 ; 80 /3 ib [8086]
SBB r/m16,imm16 ; o16 81 /3 iw [8086]
SBB r/m32,imm32 ; o32 81 /3 id [386]
SBB r/m16,imm8 ; o16 83 /3 ib [8086]
SBB r/m32,imm8 ; o32 83 /3 ib [386]
SBB AL,imm8 ; 1C ib [8086]
SBB AX,imm16 ; o16 1D iw [8086]
SBB EAX,imm32 ; o32 1D id [386]
In the forms with an 8-bit immediate second operand and a longer
first operand, the second operand is considered to be signed, and is
sign-extended to the length of the first operand. In these cases, the 
To subtract one number from another without also subtracting the
contents of the carry flag, use 
SCASB SCASW SCASD SCASB ; AE [8086]
SCASW ; o16 AF [8086]
SCASD ; o32 AF [386]
The register used is 
Segment override prefixes have no effect for this instruction: the
use of 
The 
SETcc SETcc r/m8 ; 0F 90+cc /2 [386]
SFENCE SFENCE ; 0F AE /7 [KATMAI]
Weakly ordered memory types can be used to achieve higher processor
performance through such techniques as out-of-order issue,
write-combining, and write-collapsing. The degree to which a consumer of
data recognizes or knows that the data is weakly ordered varies among
applications and may be unknown to the producer of this data. The 
Mod (7:6) = 11B
Reg/Opcode (5:3) = 111B
R/M (2:0) = 000B
All other ModRM encodings are defined to be reserved, and use of these encodings risks incompatibility with future processors.
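The three ModRM fields listed above pack into a single byte. The following Python sketch (the helper name `modrm` is hypothetical) shows how they combine and what the complete instruction encoding looks like when appended to the 0F AE opcode.

```python
def modrm(mod: int, reg: int, rm: int) -> int:
    """Pack Mod (bits 7:6), Reg/Opcode (bits 5:3) and R/M (bits 2:0)."""
    if not (0 <= mod <= 3 and 0 <= reg <= 7 and 0 <= rm <= 7):
        raise ValueError("field out of range")
    return (mod << 6) | (reg << 3) | rm


# Mod = 11B, Reg/Opcode = 111B, R/M = 000B gives the byte F8.
SFENCE_BYTES = bytes([0x0F, 0xAE, modrm(0b11, 0b111, 0b000)])
```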
See also 
SGDT SIDT SLDT SGDT mem ; 0F 01 /0 [286,PRIV]
SIDT mem ; 0F 01 /1 [286,PRIV]
SLDT r/m16 ; 0F 00 /0 [286,PRIV]
See also 
SHL SHR SHL r/m8,1 ; D0 /4 [8086]
SHL r/m8,CL ; D2 /4 [8086]
SHL r/m8,imm8 ; C0 /4 ib [186]
SHL r/m16,1 ; o16 D1 /4 [8086]
SHL r/m16,CL ; o16 D3 /4 [8086]
SHL r/m16,imm8 ; o16 C1 /4 ib [186]
SHL r/m32,1 ; o32 D1 /4 [386]
SHL r/m32,CL ; o32 D3 /4 [386]
SHL r/m32,imm8 ; o32 C1 /4 ib [386]
SHR r/m8,1 ; D0 /5 [8086]
SHR r/m8,CL ; D2 /5 [8086]
SHR r/m8,imm8 ; C0 /5 ib [186]
SHR r/m16,1 ; o16 D1 /5 [8086]
SHR r/m16,CL ; o16 D3 /5 [8086]
SHR r/m16,imm8 ; o16 C1 /5 ib [186]
SHR r/m32,1 ; o32 D1 /5 [386]
SHR r/m32,CL ; o32 D3 /5 [386]
SHR r/m32,imm8 ; o32 C1 /5 ib [386]
A synonym for 
The number of bits to shift by is given by the second operand. Only the bottom five bits of the shift count are considered by processors above the 8086.
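A rough Python model of the logical shifts, assuming the five-bit count masking just described. The names `shl` and `shr` and the `bits` parameter are illustrative only, and the flags are not modelled.

```python
def shl(value: int, count: int, bits: int = 32) -> int:
    """Logical shift left; vacated low bits are filled with zeros."""
    mask = (1 << bits) - 1
    return ((value & mask) << (count & 0x1F)) & mask


def shr(value: int, count: int, bits: int = 32) -> int:
    """Logical shift right; vacated high bits are filled with zeros."""
    mask = (1 << bits) - 1
    return (value & mask) >> (count & 0x1F)
```

In this model a count of 32 leaves a 32-bit operand unchanged, since the masked count is zero.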
You can force the longer (286 and upwards, beginning with a 
SHLD SHRD SHLD r/m16,reg16,imm8 ; o16 0F A4 /r ib [386]
SHLD r/m32,reg32,imm8 ; o32 0F A4 /r ib [386]
SHLD r/m16,reg16,CL ; o16 0F A5 /r [386]
SHLD r/m32,reg32,CL ; o32 0F A5 /r [386]
SHRD r/m16,reg16,imm8 ; o16 0F AC /r ib [386]
SHRD r/m32,reg32,imm8 ; o32 0F AC /r ib [386]
SHRD r/m16,reg16,CL ; o16 0F AD /r [386]
SHRD r/m32,reg32,CL ; o32 0F AD /r [386]
SHLD SHRD For example, if 
The number of bits to shift by is given by the third operand. Only the bottom five bits of the shift count are considered.
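The double-precision shifts can be pictured as shifting a 64-bit value formed from the two operands and keeping 32 bits of it. The sketch below models only the 32-bit forms in Python; the names `shld32` and `shrd32` are hypothetical, and flags are ignored.

```python
MASK32 = 0xFFFFFFFF


def shld32(dest: int, src: int, count: int) -> int:
    """Shift dest left; the vacated low bits come from the top of src."""
    count &= 0x1F  # only the bottom five bits of the count are considered
    wide = ((dest & MASK32) << 32) | (src & MASK32)
    return ((wide << count) >> 32) & MASK32


def shrd32(dest: int, src: int, count: int) -> int:
    """Shift dest right; the vacated high bits come from the bottom of src."""
    count &= 0x1F
    wide = ((src & MASK32) << 32) | (dest & MASK32)
    return (wide >> count) & MASK32
```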
SHUFPD SHUFPD xmm1,xmm2/m128,imm8 ; 66 0F C6 /r ib [WILLAMETTE,SSE2]
The select operand is an 8-bit immediate: bit 0 selects which value is moved from the destination operand to the result (where 0 selects the low quadword and 1 selects the high quadword) and bit 1 selects which value is moved from the source operand to the result. Bits 2 through 7 of the select operand are reserved.
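The selection rule can be sketched in Python by treating each 128-bit operand as a (low, high) pair of quadwords. The function name `shufpd` and the tuple representation are assumptions made for illustration.

```python
def shufpd(dst, src, imm8: int):
    """dst and src are (low, high) pairs; returns the new (low, high) pair."""
    low = dst[imm8 & 1]           # bit 0 picks a quadword from the destination
    high = src[(imm8 >> 1) & 1]   # bit 1 picks a quadword from the source
    return (low, high)
```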
SHUFPS SHUFPS xmm1,xmm2/m128,imm8 ; 0F C6 /r ib [KATMAI,SSE]
The select operand is an 8-bit immediate: bits 0 and 1 select the value to be moved from the destination operand to the low doubleword of the result, bits 2 and 3 select the value to be moved from the destination operand to the second doubleword of the result, bits 4 and 5 select the value to be moved from the source operand to the third doubleword of the result, and bits 6 and 7 select the value to be moved from the source operand to the high doubleword of the result.
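A minimal Python sketch of this selection, with each operand represented as a tuple of four doublewords ordered from low to high. The name `shufps` and the tuple layout are illustrative assumptions.

```python
def shufps(dst, src, imm8: int):
    """dst and src are 4-tuples, index 0 being the low doubleword."""
    # Split the immediate into four 2-bit selectors, lowest bits first.
    sel = [(imm8 >> (2 * i)) & 3 for i in range(4)]
    # The two low result lanes come from dst, the two high lanes from src.
    return (dst[sel[0]], dst[sel[1]], src[sel[2]], src[sel[3]])
```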
SMI SMI ; F1 [386,UNDOC]
SMINT SMINTOLD SMINT ; 0F 38 [PENT,CYRIX]
SMINTOLD ; 0F 7E [486,CYRIX]
This pair of opcodes is specific to the Cyrix and compatible range of processors (Cyrix, IBM, VIA).
SMSW SMSW r/m16 ; 0F 01 /4 [286,PRIV]
In 32-bit code, this uses the low 16 bits of the specified register (or a 16-bit memory location), without needing an operand-size override byte.
SQRTPD SQRTPD xmm1,xmm2/m128 ; 66 0F 51 /r [WILLAMETTE,SSE2]
SQRTPS SQRTPS xmm1,xmm2/m128 ; 0F 51 /r [KATMAI,SSE]
SQRTSD SQRTSD xmm1,xmm2/m64 ; F2 0F 51 /r [WILLAMETTE,SSE2]
SQRTSS SQRTSS xmm1,xmm2/m32 ; F3 0F 51 /r [KATMAI,SSE]
STC STD STI STC ; F9 [8086]
STD ; FD [8086]
STI ; FB [8086]
These instructions set various flags. 
To clear the carry, direction, or interrupt flags, use the 
STMXCSR STMXCSR m32 ; 0F AE /3 [KATMAI,SSE]
For details of the 
See also 
STOSB STOSW STOSD STOSB ; AA [8086]
STOSW ; o16 AB [8086]
STOSD ; o32 AB [386]
The register used is 
Segment override prefixes have no effect for this instruction: the
use of 
The 
STR STR r/m16 ; 0F 00 /1 [286,PRIV]
SUB SUB r/m8,reg8 ; 28 /r [8086]
SUB r/m16,reg16 ; o16 29 /r [8086]
SUB r/m32,reg32 ; o32 29 /r [386]
SUB reg8,r/m8 ; 2A /r [8086]
SUB reg16,r/m16 ; o16 2B /r [8086]
SUB reg32,r/m32 ; o32 2B /r [386]
SUB r/m8,imm8 ; 80 /5 ib [8086]
SUB r/m16,imm16 ; o16 81 /5 iw [8086]
SUB r/m32,imm32 ; o32 81 /5 id [386]
SUB r/m16,imm8 ; o16 83 /5 ib [8086]
SUB r/m32,imm8 ; o32 83 /5 ib [386]
SUB AL,imm8 ; 2C ib [8086]
SUB AX,imm16 ; o16 2D iw [8086]
SUB EAX,imm32 ; o32 2D id [386]
In the forms with an 8-bit immediate second operand and a longer
first operand, the second operand is considered to be signed, and is
sign-extended to the length of the first operand. In these cases, the 
SUBPD SUBPD xmm1,xmm2/m128 ; 66 0F 5C /r [WILLAMETTE,SSE2]
SUBPS SUBPS xmm1,xmm2/m128 ; 0F 5C /r [KATMAI,SSE]
SUBSD SUBSD xmm1,xmm2/m64 ; F2 0F 5C /r [WILLAMETTE,SSE2]
SUBSS SUBSS xmm1,xmm2/m32 ; F3 0F 5C /r [KATMAI,SSE]
SVDC SVDC m80,segreg ; 0F 78 /r [486,CYRIX,SMM]
SVLDT SVLDT m80 ; 0F 7A /0 [486,CYRIX,SMM]
SVTS SVTS m80 ; 0F 7C /0 [486,CYRIX,SMM]
SYSCALL SYSCALL ; 0F 05 [P6,AMD]
EIP ECX STAR EIP STAR CS STAR The 
For more information, see the 
SYSENTER SYSENTER ; 0F 34 [P6]
SYSENTER_CS_MSR SYSENTER_EIP_MSR SYSENTER_ESP_MSR 
SYSENTER_CS_MSR CS SYSENTER_EIP_MSR EIP SYSENTER_CS_MSR SS SYSENTER_ESP_MSR ESP VM EFLAGS In particular, note that this instruction does not save the values of 
For more information, see the Intel Architecture Software Developer's Manual, Volume 2.
SYSEXIT SYSEXIT ; 0F 35 [P6,PRIV]
SYSENTER_CS_MSR EDX ECX 
SYSENTER_CS_MSR CS EDX EIP SYSENTER_CS_MSR SS ECX ESP EIP For more information on the use of the 
SYSRET SYSRET ; 0F 07 [P6,AMD,PRIV]
ECX SYSCALL EIP STAR CS STAR SS SS STAR The 
For more information, see the 
TEST TEST r/m8,reg8 ; 84 /r [8086]
TEST r/m16,reg16 ; o16 85 /r [8086]
TEST r/m32,reg32 ; o32 85 /r [386]
TEST r/m8,imm8 ; F6 /0 ib [8086]
TEST r/m16,imm16 ; o16 F7 /0 iw [8086]
TEST r/m32,imm32 ; o32 F7 /0 id [386]
TEST AL,imm8 ; A8 ib [8086]
TEST AX,imm16 ; o16 A9 iw [8086]
TEST EAX,imm32 ; o32 A9 id [386]
UCOMISD UCOMISD xmm1,xmm2/m64 ; 66 0F 2E /r [WILLAMETTE,SSE2]
UCOMISS UCOMISS xmm1,xmm2/m32 ; 0F 2E /r [KATMAI,SSE]
UD0 UD1 UD2 UD0 ; 0F FF [186,UNDOC]
UD1 ; 0F B9 [186,UNDOC]
UD2 ; 0F 0B [186]
All these opcodes can be used to generate invalid opcode exceptions on all currently available processors.
UMOV UMOV r/m8,reg8 ; 0F 10 /r [386,UNDOC]
UMOV r/m16,reg16 ; o16 0F 11 /r [386,UNDOC]
UMOV r/m32,reg32 ; o32 0F 11 /r [386,UNDOC]
UMOV reg8,r/m8 ; 0F 12 /r [386,UNDOC]
UMOV reg16,r/m16 ; o16 0F 13 /r [386,UNDOC]
UMOV reg32,r/m32 ; o32 0F 13 /r [386,UNDOC]
This undocumented instruction is used by in-circuit emulators to
access user memory (as opposed to host memory). It is used just like an
ordinary memory/register or register/register 
This instruction is only available on some AMD and IBM 386 and 486 processors.
UNPCKHPD UNPCKHPD xmm1,xmm2/m128 ; 66 0F 15 /r [WILLAMETTE,SSE2]
The operation of this instruction is:
dst[63-0] := dst[127-64];
dst[127-64] := src[127-64].
UNPCKHPS UNPCKHPS xmm1,xmm2/m128 ; 0F 15 /r [KATMAI,SSE]
The operation of this instruction is:
dst[31-0] := dst[95-64];
dst[63-32] := src[95-64];
dst[95-64] := dst[127-96];
dst[127-96] := src[127-96].
UNPCKLPD UNPCKLPD xmm1,xmm2/m128 ; 66 0F 14 /r [WILLAMETTE,SSE2]
The operation of this instruction is:
dst[63-0] := dst[63-0];
dst[127-64] := src[63-0].
UNPCKLPS UNPCKLPS xmm1,xmm2/m128 ; 0F 14 /r [KATMAI,SSE]
The operation of this instruction is:
dst[31-0] := dst[31-0];
dst[63-32] := src[31-0];
dst[95-64] := dst[63-32];
dst[127-96] := src[63-32].
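The four unpack operations listed above share one pattern: elements from one half of each operand are interleaved, destination first. The Python sketch below models this with tuples ordered from low to high (two elements for the PD forms, four for the PS forms); the names `unpack_low` and `unpack_high` are hypothetical.

```python
def unpack_low(dst, src):
    """Interleave the low halves of dst and src (UNPCKLPD, UNPCKLPS)."""
    out = []
    for i in range(len(dst) // 2):
        out += [dst[i], src[i]]
    return tuple(out)


def unpack_high(dst, src):
    """Interleave the high halves of dst and src (UNPCKHPD, UNPCKHPS)."""
    out = []
    for i in range(len(dst) // 2, len(dst)):
        out += [dst[i], src[i]]
    return tuple(out)
```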
VERR VERW VERR r/m16 ; 0F 00 /4 [286,PRIV]
VERW r/m16 ; 0F 00 /5 [286,PRIV]
VERR VERW WAIT WAIT ; 9B [8086]
FWAIT ; 9B [8086]
On higher processors, 
WBINVD WBINVD ; 0F 09 [486]
WRMSR WRMSR ; 0F 30 [PENT]
WRSHR WRSHR r/m32 ; 0F 37 /0 [386,CYRIX,SMM]
See also 
XADD XADD r/m8,reg8 ; 0F C0 /r [486]
XADD r/m16,reg16 ; o16 0F C1 /r [486]
XADD r/m32,reg32 ; o32 0F C1 /r [486]
XBTS XBTS reg16,r/m16 ; o16 0F A6 /r [386,UNDOC]
XBTS reg32,r/m32 ; o32 0F A6 /r [386,UNDOC]
The implied operation of this instruction is:
XBTS r/m16,reg16,AX,CL
XBTS r/m32,reg32,EAX,CL
Writes a bit string from the source operand to the destination. 
XCHG XCHG reg8,r/m8 ; 86 /r [8086]
XCHG reg16,r/m16 ; o16 87 /r [8086]
XCHG reg32,r/m32 ; o32 87 /r [386]
XCHG r/m8,reg8 ; 86 /r [8086]
XCHG r/m16,reg16 ; o16 87 /r [8086]
XCHG r/m32,reg32 ; o32 87 /r [386]
XCHG AX,reg16 ; o16 90+r [8086]
XCHG EAX,reg32 ; o32 90+r [386]
XCHG reg16,AX ; o16 90+r [8086]
XCHG reg32,EAX ; o32 90+r [386]
XLATB XLAT ; D7 [8086]
XLATB ; D7 [8086]
The base register used is 
The segment register used to load from 
XOR XOR r/m8,reg8 ; 30 /r [8086]
XOR r/m16,reg16 ; o16 31 /r [8086]
XOR r/m32,reg32 ; o32 31 /r [386]
XOR reg8,r/m8 ; 32 /r [8086]
XOR reg16,r/m16 ; o16 33 /r [8086]
XOR reg32,r/m32 ; o32 33 /r [386]
XOR r/m8,imm8 ; 80 /6 ib [8086]
XOR r/m16,imm16 ; o16 81 /6 iw [8086]
XOR r/m32,imm32 ; o32 81 /6 id [386]
XOR r/m16,imm8 ; o16 83 /6 ib [8086]
XOR r/m32,imm8 ; o32 83 /6 ib [386]
XOR AL,imm8 ; 34 ib [8086]
XOR AX,imm16 ; o16 35 iw [8086]
XOR EAX,imm32 ; o32 35 id [386]
In the forms with an 8-bit immediate second operand and a longer
first operand, the second operand is considered to be signed, and is
sign-extended to the length of the first operand. In these cases, the 
The 
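One consequence of the sign extension described above is that an immediate byte with its top bit set affects every upper bit of a longer first operand. A small Python model, with the hypothetical name `xor_imm8`, makes this visible; it covers only the result, not the flags.

```python
def xor_imm8(dest: int, imm8: int, bits: int = 32) -> int:
    """XOR a `bits`-wide operand with a sign-extended 8-bit immediate."""
    mask = (1 << bits) - 1
    imm8 &= 0xFF
    # If bit 7 is set, sign extension fills all the upper bits with ones.
    extended = (imm8 | (mask & ~0xFF)) if imm8 & 0x80 else imm8
    return (dest ^ extended) & mask
```

With an immediate of 0xFF, for example, every bit of the first operand is inverted.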
XORPD XORPD xmm1,xmm2/m128 ; 66 0F 57 /r [WILLAMETTE,SSE2]
XORPS XORPS xmm1,xmm2/m128 ; 0F 57 /r [KATMAI,SSE]