PowerPC ASM Cheatsheet
A cheatsheet of instructions, registers and general information for the 32-bit big-endian PowerPC assembly used by the Wii U.
How to read this cheatsheet
- If a right shift operation does not specify it is sign-fill, it is implicitly zero-fill by default
- If a value is referred to simply as "value" without specifying bit-count, it is implicitly 32 bits (aka a WORD, integer or float depending on context)
- C-style casts are used because they are shorter and fit in the small table cells, but treat them as static_cast<T>
- # = placeholder (either a number or a letter; a letter is a labelled placeholder for a number so that other text can refer to it)
- r# = register (shorthand for GPR#)
- f# = floating point register (shorthand for FPR#)
- i# = immediate (the subscript number next to it is its size in bits)
- ui# = unsigned immediate (above is signed)
- * = unsure of exact functionality
typedef unsigned int uint; // 32 bit integer value
typedef signed int sint; // 32 bit integer value
typedef unsigned short ushort; // 16 bit integer value
typedef signed short sshort; // 16 bit integer value
typedef unsigned char ubyte; // 8 bit integer value
typedef signed char sbyte; // 8 bit integer value
float; // 32 bit floating point value
double; // 64 bit floating point value
|
---|
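The zero-fill versus sign-fill distinction from the reading guide can be sketched in C. This is a minimal illustration, not code from the cheatsheet: the function names are hypothetical, and `uint32_t` / `int32_t` stand in for the `uint` / `sint` typedefs above. The 6-bit count handling mirrors how `srw` and `sraw` treat their shift amount.

```c
#include <assert.h>
#include <stdint.h>

/* Zero-fill (logical) right shift, in the spirit of srw/srwi:
   the vacated high bits become 0. Only the low 6 bits of count are used,
   and counts of 32..63 give 0. */
static uint32_t shift_right_zero_fill(uint32_t value, unsigned count)
{
    count &= 63u;
    return (count < 32u) ? (value >> count) : 0u;
}

/* Sign-fill (algebraic) right shift, in the spirit of sraw/srawi:
   the vacated high bits are copies of the sign bit.
   Built from unsigned operations so it does not depend on how the
   compiler implements >> on negative signed values. */
static int32_t shift_right_sign_fill(int32_t value, unsigned count)
{
    uint32_t bits = (uint32_t)value;
    uint32_t fill = (value < 0) ? 0xFFFFFFFFu : 0u;
    uint32_t out;

    count &= 63u;
    if (count == 0u)       out = bits;
    else if (count >= 32u) out = fill;
    else                   out = (bits >> count) | (fill << (32u - count));
    return (int32_t)out;
}

/* Hypothetical self-check: call it from your own main(). */
void check_shift_examples(void)
{
    assert(shift_right_zero_fill(0x80000000u, 4) == 0x08000000u);
    assert(shift_right_sign_fill(INT32_MIN, 4) == (int32_t)0xF8000000u);
    assert(shift_right_sign_fill(-16, 2) == -4);
    assert(shift_right_sign_fill(16, 2) == 4);
}
```

The same input `0x80000000` yields `0x08000000` with zero-fill and `0xF8000000` with sign-fill, which is the whole difference between the two families of shift instructions.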
Register | Name | Attributes | Bits | Purpose |
---|---|---|---|---|
General Purpose Registers (GPRs) | ||||
r0 | GPR0 | Volatile + Cross-Module | 32 | General purpose, may be used by function linkage |
r1 | GPR1 | Saved + Reserved | 32 | Reserved for storing the stack frame pointer |
r2 | GPR2 | Reserved | 32 | Reserved for usage by the system |
r3 | GPR3 | Volatile | 32 | Stores 1st argument passed to function calls and their return value |
r4 - r10 | GPR4 - GPR10 | Volatile | 32 | Store from 2nd to 8th argument passed to function calls |
r11 - r12 | GPR11 - GPR12 | Volatile + Cross-Module | 32 | General purpose, may be used by function linkage |
r13 | GPR13 | Reserved | 32 | Reserved for storing the small data area (SDA) pointer |
r14 - r31 | GPR14 - GPR31 | Saved | 32 | General purpose, store generic integer values and pointers |
Floating Point Registers (FPRs) | ||||
f0 | FPR0 | Volatile | 64 | Store generic floating point numbers |
f1 | FPR1 | Volatile | 64 | Stores 1st float argument passed to function calls and their float return value |
f2 - f8 | FPR2 - FPR8 | Volatile | 64 | Store from 2nd to 8th float argument passed to function calls |
f9 - f13 | FPR9 - FPR13 | Volatile | 64 | Store generic floating point numbers |
f14 - f30 | FPR14 - FPR30 | Saved | 64 | Store generic floating point numbers |
f31 | FPR31 | Saved | 64 | General purpose, used for static chain if needed |
Special Purpose Registers (SPRs) | ||||
PC / IAR | Program Counter / Instruction Address Register | Internal | 32 | Stores the address of the current instruction (Controlled by the CPU) |
LR | Link Register | Volatile | 32 | Stores the return address for some of the branching instructions |
CTR | CounT Register | Volatile | 32 | Stores the counter of loop iterations for most instructions that perform loops |
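XER | fiXed point Exception Register | Volatile | 32 | Holds the Summary Overflow (SO), Overflow (OV) and Carry (CA) bits, plus the byte count used by the string load/store instructions |
FPSCR | Floating Point Status and Control Register | Volatile | 32 | Holds the floating point exception flags, exception enable bits and the rounding mode |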
... | There are way more but less common SPRs which won't be listed here | - | - | For a full but undescriptive list of all SPRs, visit wiiubrew.org/Hardware/Espresso |
CR | Condition Register | Mixed (See below) | 32 | Divided in 8 bitfields of 4 bits each to hold different kinds of conditions. See below for details. |
SPR: Condition Register (CR) | ||||
cr0 | Condition Register Bitfield 0 | Volatile | 4 | Stores a condition |
cr1 | Condition Register Bitfield 1 | Volatile | 4 | Stores a condition ("floating point invalid exception") |
cr2 - cr4 | Condition Register Bitfield 2 - 4 | Saved | 4 x3 | Stores a condition |
cr5 - cr7 | Condition Register Bitfield 5 - 7 | Volatile | 4 x3 | Stores a condition |
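How the eight 4-bit fields sit inside the 32-bit CR can be sketched in C. This is an illustrative model only: the helper names are hypothetical, and the bit order inside each field (LT, GT, EQ, SO from most to least significant) and the placement of cr0 in the top four bits follow the standard PowerPC convention rather than anything stated in the table above.

```c
#include <assert.h>
#include <stdint.h>

/* Bit values inside one 4-bit CR field, most significant bit first. */
enum { CR_LT = 8, CR_GT = 4, CR_EQ = 2, CR_SO = 1 };

/* The field value a signed compare (cmpw) would produce.
   SO is copied from XER on real hardware; here it is a plain parameter. */
static uint32_t compare_signed(int32_t a, int32_t b, int xer_so)
{
    uint32_t field = (a < b) ? CR_LT : (a > b) ? CR_GT : CR_EQ;
    return field | (xer_so ? (uint32_t)CR_SO : 0u);
}

/* Replace field n (0 = cr0 ... 7 = cr7) of the 32-bit CR.
   cr0 occupies the 4 most significant bits, cr7 the 4 least significant. */
static uint32_t cr_set_field(uint32_t cr, unsigned n, uint32_t field)
{
    unsigned shift = (7u - n) * 4u;
    return (cr & ~(0xFu << shift)) | ((field & 0xFu) << shift);
}

/* Read field n back out of the 32-bit CR. */
static uint32_t cr_get_field(uint32_t cr, unsigned n)
{
    return (cr >> ((7u - n) * 4u)) & 0xFu;
}

/* Hypothetical self-check: call it from your own main(). */
void check_cr_examples(void)
{
    uint32_t cr = 0;
    cr = cr_set_field(cr, 0, compare_signed(3, 5, 0)); /* 3 < 5  sets LT in cr0 */
    cr = cr_set_field(cr, 7, compare_signed(9, 9, 0)); /* 9 == 9 sets EQ in cr7 */
    assert(cr == 0x80000002u);
    assert(cr_get_field(cr, 0) & CR_LT);  /* blt would branch on cr0 */
    assert(cr_get_field(cr, 7) & CR_EQ);  /* beq cr7 would branch    */
}
```

Conditional branches then only test a single bit of one field: the compare decides which bit is set, and the branch decides which bit to look at.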
Instruction | Name | Parameters | Pseudocode Equivalent | Additional Info |
---|---|---|---|---|
add
|
ADD operation | rA, rB, rC
|
rA = rB + rC
|
Adds the values of rB and rC together and stores the result in rA |
addi
|
ADD Immediate | rA, rB, iX₁₆
|
rA = rB + iX
|
Adds the values of rB and iX together and stores the result in rA |
addis
|
ADD Immediate Shifted | rA, rB, iX₁₆
|
rA = rB + (iX << 16)
|
Adds the values of rB and (iX << 16) together and stores the result in rA |
and
|
AND Operation | rA, rB, rC
|
rA = rB & rC
|
Performs an AND operation on rB and rC then stores the result in rA |
andc
|
AND Complement | rA, rB, rC
|
rA = rB & ~rC
|
Performs an AND operation on rB and negated rC then stores the result in rA |
andi.
|
AND Immediate | rA, rB, uiX₁₆
|
rA = rB & uiX
|
Performs an AND operation on rB and uiX then stores the result in rA |
andis.
|
AND Immediate Shifted | rA, rB, uiX₁₆
|
rA = rB & (uiX << 16)
|
Performs an AND operation on rB and (uiX << 16) then stores the result in rA |
b
|
Branch | iX₂₄
|
goto LABEL
|
Jumps from the current address to IAR + iX, either up or down |
bl
|
Branch and Link | iX₂₄
|
((void (*)())(IAR + iX))()
|
Jumps from the current address to IAR + iX, either up or down.
Also stores the address of the instruction directly below it in LR. This is the most common instruction to use for calling a function |
blr
|
Branch to Link Register | N/A | return <r3 / f1>
|
Jumps from the current address to the address stored in LR
This is essentially the return statement of a function, with the value currently loaded into either r3 or f1 holding the returned value, depending on if the return type is a fixed or floating point value |
beq
|
Branch if EQual | iX₁₄
|
if (x == y) goto LABEL
|
If the EQ bit in CR0 is 1, jumps from the current address to IAR + iX
Otherwise, does nothing |
bne
|
Branch if Not Equal | iX₁₄
|
if (x != y) goto LABEL
|
If the EQ bit in CR0 is 0, jumps from the current address to IAR + iX
Otherwise, does nothing |
bgt
|
Branch if Greater Than | iX₁₄
|
if (x > y) goto LABEL
|
If the GT bit in CR0 is 1, jumps from the current address to IAR + iX
Otherwise, does nothing |
blt
|
Branch if Less Than | iX₁₄
|
if (x < y) goto LABEL
|
If the LT bit in CR0 is 1, jumps from the current address to IAR + iX
Otherwise, does nothing |
bge
|
Branch if Greater than or Equal | iX₁₄
|
if (x >= y) goto LABEL
|
If the LT bit in CR0 is 0 (that is, the comparison was not "less than"), jumps from the current address to IAR + iX
Otherwise, does nothing |
ble
|
Branch if Less than or Equal | iX₁₄
|
if (x <= y) goto LABEL
|
If the GT bit in CR0 is 0 (that is, the comparison was not "greater than"), jumps from the current address to IAR + iX
Otherwise, does nothing |
bng
|
Branch if Not Greater than | |||
bnl
|
Branch if Not Less than | |||
bso
|
Branch if Summary Overflow | ??? | ??? | ??? |
bns
|
Branch if Not Summary overflow | ??? | ??? | ??? |
bun
|
Branch if UNordered | ??? | ??? | ??? |
bnu
|
Branch if Not Unordered | ??? | ??? | ??? |
bctr
|
Branch to CounT Register | |||
bctrl
|
Branch to CounT Register and Link | |||
bdnz
|
Branch if Decremented count register Not Zero | |||
bdnzt
|
Branch if Decremented count register Not Zero and if condition True | |||
bdnzf
|
Branch if Decremented count register Not Zero and if condition False | |||
bdz
|
Branch if Decremented count register Zero | |||
cmp
|
CoMPare | cr#, 0, rA, rB
|
||
cmpw
|
CoMPare Word | rA, rB
|
||
cmpwi
|
CoMPare Word Immediate | rA, iX₁₆
|
||
cmplwi
|
CoMPare Logical Word Immediate | |||
cntlzw
|
CouNT Leading Zeros Word | |||
divw
|
DIVide Word | rA, rB, rC
|
rA = rB / rC
|
Divides the value of rB by rC and stores the result in rA. The remainder is lost. |
eieio
|
Enforce In-order Execution of I/O | ??? | ??? | ??? |
eqv
|
EQuiValent | rA, rB, rC
|
rA = ~(rB ^ rC)
|
Performs a bitwise XNOR on rB and rC then stores the result in rA (each result bit is 1 where the corresponding bits of rB and rC are equal) |
extsb
|
EXTend Sign Byte | rA, rB
|
rA = (sint)(sbyte)rB
|
Sign-extends the low 8 bits of rB to 32 bits and stores the result in rA (the upper 24 bits are filled with the sign bit of the 8 bit value) |
extsh
|
EXTend Sign Halfword | rA, rB
|
rA = (sint)(sshort)rB
|
Sign-extends the low 16 bits of rB to 32 bits and stores the result in rA (the upper 16 bits are filled with the sign bit of the 16 bit value) |
fmr
|
Float Move Register | fA, fB
|
fA = fB
|
Copies the value of fB into fA (Despite the instruction name, fB is preserved) |
isync
|
Instruction SYNChronize | N/A | Assembly-only instruction | Delays all following instructions until all previous instructions have completed, then refetches them in the context those instructions established. |
lfs
|
Load Float Single | fA, iX₁₆(rA)
|
fA = *(float*)(rA + iX)
|
Loads the 32 bit float at the address (rA + iX) into fA (the bits are read as a float, not numerically converted from an integer). |
lfd
|
Load Float Double | fA, iX₁₆(rA)
|
fA = *(double*)(rA + iX)
|
Loads the 64 bit double at the address (rA + iX) into fA (the bits are read as a double, not numerically converted from an integer). |
lbz
|
Load Byte Zero-fill | rA, iX₁₆(rB)
|
rA = *(ubyte*)(rB + iX)
|
Loads the 8 bit value at the address (rB + iX) into rA, zero-filling the upper 24 bits |
lhz
|
Load Halfword Zero-fill | rA, iX₁₆(rB)
|
rA = *(ushort*)(rB + iX)
|
Loads the 16 bit value at the address (rB + iX) into rA, zero-filling the upper 16 bits |
li
|
Load Immediate | rA, iX₁₆
|
rA = iX
|
Loads iX into rA |
lis
|
Load Immediate Shifted | rA, iX₁₆
|
rA = iX << 16
|
Loads iX into the upper 16 bits of rA and clears the lower 16 bits |
lwz
|
Load Word Zero-fill | rA, iX₁₆(rB)
|
rA = *(rB + iX)
|
Loads the value at the address (rB + iX) into rA |
la
|
Load Address | rA, iX₁₆(rB)
|
rA = rB + iX
|
Adds iX to the address stored in rB and loads the result into rA. |
lwzu
|
Load Word Zero Update | rA, iX₁₆(rB)
|
rA = *(rB + iX);
rB = rB + iX;
|
Loads the value at the address (rB + iX) into rA. Then loads rB with the address (rB + iX) |
lwzx
|
Load Word Zero indeXed | rA, rB, rC
|
rA = *(rB + rC)
|
Loads the value at the address (rB + rC) into rA |
lmw *
|
Load Multiple Words | rA, iX₁₆(rB)
|
int EA = rB + iX;
int N = rA;
do {
GPR[N] = *(EA);
EA = EA + 4;
N = N + 1;
} while (N <= 31);
|
Loads GPR[rA] to r31 with the value at the address (rB + iX + N),
where N starts at 0 and increments by 4 for each register loaded.
|
mr
|
Move Register | rA, rB
|
rA = rB
|
Copies the value of rB into rA (Despite the instruction name, rB is preserved) |
mflr
|
Move From Link Register | rA
|
rA = LR
|
Copies the value of LR into rA |
mtlr
|
Move To Link Register | rA
|
LR = rA
|
Copies the value of rA into the LR |
mtctr
|
Move To CounT Register | rA
|
CTR = rA
|
Copies the value of rA into the CTR |
mtspr
|
Move To Special Purpose Register | SPR, rA
|
SPRs[SPR] = rA
|
Copies the value of rA into the special purpose register SPR |
mtfsf *
|
Move To FpScr Fields | UNK1, fA
|
??? | Copies the value of fA into the FPSCR under the control of the field mask in UNK1 |
mtfsb1
|
Move To FpScr Bit 1 | iX?
|
FPSCR = FPSCR | (1 << (31 - iX))
|
Sets bit iX of the FPSCR register to 1 (bit 0 is the most significant bit) |
mullw
|
MULtiply Low Word | rA, rB, rC
|
rA = ((int64_t)rB * rC) & 0xFFFFFFFF
|
Multiplies the value of rB by rC and stores the low 32 bits of the result in rA |
mulhw
|
MULtiply High Word | rA, rB, rC
|
rA = ((int64_t)rB * rC) >> 32
|
Multiplies the value of rB by rC and stores the high 32 bits of the result in rA |
mulli
|
MULtiply Low Immediate | rA, rB, iX₁₆
|
rA = rB * iX
|
Multiplies the value of rB by iX and stores the result in rA |
nand
|
NAND operation | rA, rB, rC
|
rA = ~(rB & rC)
|
Stores in rA the negated result of (rB & rC) |
neg
|
NEGate | rA, rB
|
rA = ~rB + 1
|
Stores in rA the result of negated rB with 1 added to its value afterwards |
nop
|
No OPeration | N/A | ;
|
Does nothing |
nor
|
NOR operation | rA, rB, rC
|
rA = ~(rB | rC)
|
Stores in rA the negated result of (rB | rC) |
not
|
NOT operation | rA, rB
|
rA = ~rB
|
Stores in rA the result of negated rB |
or
|
OR operation | rA, rB, rC
|
rA = rB | rC
|
Stores in rA the result of (rB | rC) |
orc
|
OR Complement | rA, rB, rC
|
rA = rB | ~rC
|
Stores in rA the result of (rB | ~rC) |
ori
|
OR Immediate | rA, rB, uiX₁₆
|
rA = rB | uiX
|
Stores in rA the result of (rB | uiX) |
oris
|
OR Immediate Shifted | rA, rB, uiX₁₆
|
rA = rB | (uiX << 16)
|
Stores in rA the result of (rB | (uiX << 16)) |
rlwinm
|
Rotate Left Word Immediate aNd Mask | rA, rB, iX₅, iY₅, iZ₅
|
uint mask = ((uint)-1) << (31 - iZ + iY) >> iY;
rA = ((rB << iX) | (rB >> (32 - iX))) & mask;
|
Rotates the value in rB by iX bits to the left.
The result is AND'ed with the mask specified by iY and iZ. iY specifies the starting bit of the 1-bits in the mask and iZ specifies the end bit (both 0-indexed, where bit 0 is the most significant bit). The mask formula shown assumes iY <= iZ. The final result is stored in rA |
sc
|
System Call | iX₇
|
N/A | Calls upon the system to perform a service identified by iX |
slw
|
Shift Left Word | rA, rB, rC
|
rA = rB << rC
|
Shifts the value in rB by the value in rC to the left and stores the result in rA |
slwi
|
Shift Left Word Immediate | rA, rB, iX₅
|
rA = rB << iX
|
Shifts the value in rB by iX to the left and stores the result in rA |
srw
|
Shift Right Word | rA, rB, rC
|
rA = (unsigned)rB >> rC
|
Shifts the value in rB by the value in rC to the right and stores the result in rA |
srwi
|
Shift Right Word Immediate | rA, rB, iX₅
|
rA = (unsigned)rB >> iX
|
Shifts the value in rB by iX to the right and stores the result in rA |
sraw
|
Shift Right Algebraic Word | rA, rB, rC
|
rA = (signed)rB >> rC
|
Shifts the value in rB by the value in rC to the right and stores the result in rA
Unlike regular zero-fill right shift operations, this one sign-fills the vacant bits |
srawi
|
Shift Right Algebraic Word Immediate | rA, rB, iX₅
|
rA = (signed)rB >> iX
|
Shifts the value in rB by iX to the right and stores the result in rA
Unlike regular zero-fill right shift operations, this one sign-fills the vacant bits |
subf
|
SUBtract From | rA, rB, rC
|
rA = rC - rB
|
Subtracts the value of rB from rC and stores the result in rA. |
subfic
|
SUBtract From Immediate Carrying | rA, rB, iX₁₆
|
rA = iX - rB
|
Subtracts the value of rB from iX and stores the result in rA. (The carry bit CA in XER is modified) |
stfs
|
STore Float Single | fA, iX₁₆(rA)
|
*(rA + iX) = (float)fA
|
Stores the value of fA casted to float at the memory address (rA + iX) |
stfd
|
STore Float Double | fA, iX₁₆(rA)
|
*(rA + iX) = fA
|
Stores the 64 bit value of fA at the memory address (rA + iX) |
stb
|
STore Byte | rA, iX₁₆(rB)
|
*(ubyte*)(rB + iX) = (ubyte)rA
|
Stores the low 8 bits of rA at the memory address (rB + iX) |
sth
|
STore Halfword | rA, iX₁₆(rB)
|
*(ushort*)(rB + iX) = (ushort)rA
|
Stores the low 16 bits of rA at the memory address (rB + iX) |
stw
|
STore Word | rA, iX₁₆(rB)
|
*(rB + iX) = rA
|
Stores the value of rA at the memory address (rB + iX) |
stwu
|
STore Word And Update | rA, iX₁₆(rB)
|
*(rB + iX) = rA;
rB = rB + iX;
|
Stores the value of rA at the memory address (rB + iX).
Then stores the computed address (rB + iX) into rB |
stwx
|
STore Word indeXed | rA, rB, rC
|
*(rB + rC) = rA
|
Stores the value of rA at the memory address (rB + rC) |
stmw *
|
STore Multiple Words | |||
xor
|
XOR operation | rA, rB, rC
|
rA = rB ^ rC
|
Performs an XOR operation on rB and rC then stores the result in rA |
xori
|
XOR Immediate | rA, rB, uiX₁₆
|
rA = rB ^ uiX
|
Performs an XOR operation on rB and uiX then stores the result in rA |
xoris
|
XOR Immediate Shifted | rA, rB, uiX₁₆
|
rA = rB ^ (uiX << 16)
|
Performs an XOR operation on rB and (uiX << 16) then stores the result in rA |
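Two of the rows above are easier to follow with a worked model: the rotate-and-mask behaviour of `rlwinm`, and the common `lis` + `ori` pair for building a full 32 bit constant. The C below is a sketch under stated assumptions, not a reference implementation. All function names are hypothetical, and the wrap-around mask case (start bit greater than end bit) and the `slwi` / `srwi` equivalences come from the general PowerPC definition rather than from this cheatsheet.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32 bit value left by n bits (n == 0 is handled separately
   because shifting a 32 bit value by 32 is undefined in C). */
static uint32_t rotl32(uint32_t value, unsigned n)
{
    n &= 31u;
    return n ? (value << n) | (value >> (32u - n)) : value;
}

/* Mask with 1-bits from bit mb through bit me, where bit 0 is the MOST
   significant bit. If mb > me the run of 1-bits wraps around. */
static uint32_t mask_mb_me(unsigned mb, unsigned me)
{
    uint32_t from_mb = 0xFFFFFFFFu >> mb;          /* bits mb..31 */
    uint32_t to_me   = 0xFFFFFFFFu << (31u - me);  /* bits 0..me  */
    return (mb <= me) ? (from_mb & to_me) : (from_mb | to_me);
}

/* Model of: rlwinm rA, rB, sh, mb, me (returns the new rA). */
static uint32_t rlwinm(uint32_t rb, unsigned sh, unsigned mb, unsigned me)
{
    return rotl32(rb, sh) & mask_mb_me(mb, me);
}

/* Model of: lis rA, hi ; ori rA, rA, lo
   lis overwrites the whole register, ori zero-extends its immediate. */
static uint32_t lis_ori(uint16_t hi, uint16_t lo)
{
    uint32_t ra = (uint32_t)hi << 16;  /* lis */
    return ra | lo;                    /* ori */
}

/* Hypothetical self-check: call it from your own main(). */
void check_rotate_examples(void)
{
    uint32_t x = 0x12345678u;
    assert(rlwinm(x, 8, 24, 31) == 0x12u);     /* extract the top byte   */
    assert(rlwinm(x, 4, 0, 27) == (x << 4));   /* same as slwi rA, rB, 4 */
    assert(rlwinm(x, 28, 4, 31) == (x >> 4));  /* same as srwi rA, rB, 4 */
    assert(mask_mb_me(30, 1) == 0xC0000003u);  /* wrap-around mask       */
    assert(lis_ori(0xDEAD, 0xBEEF) == 0xDEADBEEFu);
}
```

Pairing `lis` with `ori` rather than `addi` avoids a common pitfall: `addi` sign-extends its 16 bit immediate, so a low half of `0x8000` or above would subtract from the upper half instead of filling the lower one.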
External Resources
- http://class.ece.iastate.edu/arun/CprE281_F05/lab/labw10a/Labw10a_Files/PowerPC%20Assembly%20Quick%20Reference.htm (few but nicely explained instructions with some examples and explains a little about assembly source files)
- https://jimkatz.github.io/powerpc_for_dummies (very incomplete, has mistakes)
- http://wiibrew.org/wiki/Assembler_Tutorial (also has missing instructions but way more accurate and better worded)
- https://fail0verflow.com/media/files/ppc_750cl.pdf (official instruction set docs, hard to navigate/search)
- http://personal.denison.edu/~bressoud/cs281-s07/ppc_instructions.pdf (similar to the above but stripped of all pages not documenting instructions, easier to search, missing instructions though)
- http://refspecs.linux-foundation.org/elf/elfspec_ppc.pdf (the PowerPC ELF specification, contains some advanced in-depth information about the processor and thus its assembly mechanics)