Other:WUAPPS: Difference between revisions

Revision as of 00:29, 21 July 2023

Contents

1 !!!!! Still unfinished as of currently !!!!!

2 Terms and Definitions

2.1 YAML Types Definitions

3 Versioning

4 Environment Variables

5 Project Configuration

6 Project Folder Structure

7 Modules

8 Patches & Hooks

9 Symbol Maps

10 Conversion Maps

10.1 Auto-calculating text and data addresses

11 Configuring project.gpj

12 Console vs Emulator Compilation

12.1 Compilation Methods

13 Shift-JIS Standardization

!!!!! Still unfinished as of currently !!!!!

v3.0-DRAFT

Terms and Definitions

For the purposes of this document, the following terms and definitions apply. Plural and variant forms of the terms are implicitly included.

Term	Definition
tool	Refers to any programs implementing this standard in compliance with it, and/or the developers working on said program.
implementation-defined	Refers to behavior that tools are free to define however they please, as long as it meets the minimum criteria specified by the standard, if any.
compiler	Currently this standard only acknowledges and accounts for GHS MULTI (Green Hills Software MULTI) as the compiler toolchain. Any mention of words such as “compiler”, “linker”, “assembler”, etc. implicitly refers to the respective tools of GHS MULTI. Many assumptions and choices of default and possible values for configurations are based on the behavior of GHS MULTI and may not work correctly with other compiler toolchains. Support for other compiler toolchains may be added in the future under a new major revision, but none is planned at this time.
linker
assembler
/	In all instances within this document where slashes (`/`) are used or referenced in the context of a file or folder path, it is implied that it is interchangeable with backslashes (`\`) for compatibility, and vice-versa. Implementation note: When parsing file or folder paths tools must treat both `/` and `\` as path separators regardless of the running platform and interpret the path as valid, including paths making mixed use of both separators.
\
file name	In all instances within this document where file or folder names are used or referenced, including within file and folder paths, these names MUST meet the following requirements: they may only contain the characters `a-z`, `A-Z`, `0-9`, `-_.,+()`, and they must NOT start with a `-` NOR end with a `.`. File and folder paths inherit these rules, with the characters `/\:` additionally allowed as path separators. This ensures compatibility with all operating systems.
folder name
file path	In all instances within this document where file or folder paths are used or referenced, these paths MUST be CASE-SENSITIVE to ensure compatibility with all filesystem formats. Implementation note: This can be achieved as follows. On a case-sensitive filesystem: no action is required, process the input path as-is. On a case-insensitive filesystem: if the input path doesn’t exist, stop; otherwise, fetch the stored case of the file path and compare it against the input path. If the two differ, raise an error; if they are equal, the path is valid.
folder path
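
The file name rules above could be checked roughly as in the following Python sketch. The helper names `is_valid_name` and `split_path` are hypothetical and not defined by this standard; the sketch only validates individual names and does not attempt the case-sensitivity check.

```python
import re

# Allowed characters per the "file name" rule: a-z, A-Z, 0-9 and -_.,+()
_NAME_RE = re.compile(r"[A-Za-z0-9\-_.,+()]+")


def is_valid_name(name: str) -> bool:
    """Check a single file or folder name (hypothetical helper)."""
    if _NAME_RE.fullmatch(name) is None:
        return False
    # Names must NOT start with '-' NOR end with '.'
    return not name.startswith("-") and not name.endswith(".")


def split_path(path: str) -> list[str]:
    """Split a path on both `/` and `\\`, regardless of platform."""
    return [part for part in re.split(r"[/\\]", path) if part]
```

For example, `split_path("src\\game/main.cpp")` yields the same components as the all-slash spelling, which is the behavior the separator rule asks for.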

YAML Types Definitions

Type	Definition
YAML_NULL	^[1]
Boolean	A standard generic YAML boolean, represented by either `true` or `false` word literals.
String	A standard generic YAML string. Prior to any further restrictions applied by the field it is bound to, it may contain any character and have any length. An empty String is considered equivalent to YAML_NULL^[1] unless otherwise specified in a specific instance of String usage.
Number	A standard generic YAML number, equivalent to a floating point value.
Integer	^[2]
List<T>	^[3]: A standard generic YAML List, an array of values of any length. For all optional fields with this type, omitting the field or explicitly setting it to YAML_NULL^[1] is equivalent to falling back to the default value, if any, otherwise being YAML_NULL^[1] as fallback. An Empty List definition asserts truly no values, without using default values or fallbacks.
Record<T>	^[4]: A standard generic YAML Record, a map of indefinite key/value pairs.

Versioning

This standard’s version follows a restricted subset of SemVer, where only the Major and Minor versions are present and no other extensions are allowed, for a resulting version format of {StandardMajor}.{StandardMinor}. The standard does not have Patch versions or any other version extensions such as tags or metadata. (Draft revisions of the standard are not considered "the standard" while in draft stage and are therefore exempt from these rules.)

Tools in compliance with this standard MUST adapt part of their versioning scheme to match the version of this standard they currently support, using SemVer versioning, with the following additional conditions met:

The tool’s Major version MUST match the StandardMajor version it supports.
The tool’s Minor version MUST match the StandardMinor version it supports.
The tool’s Patch version MUST represent the tool’s own Major version.
The tool’s version Tag, if present, MUST represent the tool’s own Minor version as a valid integer.
- If omitted, the tool’s Minor version is 0.

The final required version format for compliant tools is {StandardMajor}.{StandardMinor}.{ToolMajor}-{ToolMinor}, where the trailing -{ToolMinor} is optional if it is 0. Tool Patch versions are not allowed.

An optional +{ToolMetadata} tag at the very end (After -{ToolMinor} if present) is also allowed to be included for use by the tool and may contain anything which SemVer allows on that field.
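
As an illustration of the tool version format described above, here is a minimal Python sketch of a parser. The names `ToolVersion` and `parse_tool_version` are hypothetical, and the metadata pattern is a simplification of what SemVer allows in that field.

```python
import re
from typing import NamedTuple


class ToolVersion(NamedTuple):
    standard_major: int
    standard_minor: int
    tool_major: int
    tool_minor: int  # 0 when the -{ToolMinor} tag is omitted
    metadata: str    # empty when +{ToolMetadata} is omitted


# A non-negative integer without leading zeros, as in SemVer.
_NUM = r"(0|[1-9][0-9]*)"
_VERSION_RE = re.compile(
    rf"{_NUM}\.{_NUM}\.{_NUM}(?:-{_NUM})?(?:\+([0-9A-Za-z.-]+))?"
)


def parse_tool_version(text: str) -> ToolVersion:
    """Parse {StdMajor}.{StdMinor}.{ToolMajor}[-{ToolMinor}][+{Metadata}]."""
    match = _VERSION_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a WUAPPS tool version: {text!r}")
    std_major, std_minor, tool_major, tool_minor, metadata = match.groups()
    return ToolVersion(
        int(std_major),
        int(std_minor),
        int(tool_major),
        int(tool_minor or 0),
        metadata or "",
    )
```

A tool could compare the first two fields of the result against the `WUAPPSVersion` requested by a project to decide whether it supports that project.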

Environment Variables

The following standard environment variables should be read and used by tools for their respective purposes when applicable. All environment variables are optional for the user to define, so tools should not rely on them as the ONLY source of user input.

Any environment variables set should have lower precedence than values passed through more specific sources of user input defined by the tool, such as command-line arguments.

GHS_ROOT: If set, should contain the absolute path to the GHS MULTI folder containing the multi.exe file. Used to locate the necessary GHS MULTI executables needed for a full project build.
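
The precedence rule above can be sketched as follows. The function `resolve_ghs_root` and its `cli_value` parameter are hypothetical names for illustration; how a tool actually receives command-line input is implementation-defined.

```python
import os
from typing import Mapping, Optional


def resolve_ghs_root(
    cli_value: Optional[str],
    env: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Pick the GHS MULTI folder, preferring the more specific source."""
    # A command-line argument (hypothetical) outranks the environment.
    if cli_value:
        return cli_value
    # GHS_ROOT is optional, so the result may still be None.
    return env.get("GHS_ROOT") or None
```

When the function returns `None`, the tool still needs some other way to locate the GHS MULTI executables, since the variable must not be the only source of user input.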

Project Configuration

Project configuration is done through the main configuration file, {ProjectDir}/project.yaml, containing the following fields:

YAML Key	YAML Value Type^[5]	Description	Default Value?
WUAPPSVersion	String	The presence of this field indicates the project's (not the tool's!) compliance with the specified version of this standard. It must follow the standard version format described in the Versioning chapter; any input not matching the format must error. Tools must compare this field against their internally supported WUAPPS versions, and must error if the project requests a version the tool does not support.	*REQUIRED OPTION*
Name	String	The name of the project.	*REQUIRED OPTION*
Variables	Record<String>^[4]	Key/Value pairs of indefinite custom user-defined configuration variables. KEYS are variable names, which may only contain alphanumerics^[6] or underscore. VALUES are variable contents, which by themselves may contain any character, but are still bound by the restrictions of the fields they are used within. Variables can be referenced like UNIX variables as `$VARIABLE_NAME` inside of any String YAML Value of any field or array within project.yaml or module YAMLs. YAML Keys do NOT support variable interpolation. Variables may NOT be nested within each other. The literal character `$` cannot be used, as it isn't a valid character for file names and paths anyway, which are the primary use-case of variables. If a variable which does not exist is used somewhere, an error must be thrown by the tool.	Empty Record
RpxDir	String	Path to the RPX files folder. (relative to `{ProjectDir}`^[7])^[8]	`"./rpxs"`
ModulesBaseDir	String	Base path for module folders paths to resolve from. (relative to `{ProjectDir}`^[7])^[8] If `YAML_NULL`^[1], module paths will resolve from the location of the project.yaml file.	`YAML_NULL`^[1]
SourcesBaseDir	String	Base path for source folders paths to resolve from. (relative to `{ProjectDir}`^[7])^[8] If `YAML_NULL`^[1], source paths will resolve from the location of the module file they are in.	`YAML_NULL`^[1]
IncludeDirs	List<String>^[3]	List of paths to header folders. (relative to `{ProjectDir}`^[7])^[8]	`["./include"]`
BuildOptions	List<String>^[3]	List of build options to pass to the compiler. If both buildoptions.txt and this field are present, they are merged together. Works the same as buildoptions.txt but inlined into the project.yaml. (Whenever buildoptions.txt is referenced it is interchangeable with this option)	`YAML_NULL`^[1]
ExcludeDefaultBuildOptions	List<String>^[3] Boolean	List of default build options defined by the standard (see the project.gpj chapter) to opt out of. The special value `true` excludes ALL of the default options; the value `false` is equivalent to the default empty list. The listed values MUST specify the FULL default option name, including the exact prefixing dashes, but EXCLUDING the option's value. Examples: the option "-c99" is excluded by "-c99", but not by "--c99" nor "c99"; the option "-kanji=shiftjis" is excluded by "-kanji", but not by "-kanji=" nor "-kanji=shiftjis". It is an error to attempt to exclude a non-default option or a malformed option like the invalid examples above. To override the value of a default option that takes one, such as -kanji, first exclude "-kanji" by adding it to this list, then define the new -kanji value in buildoptions.txt. Attempting to override a default option in buildoptions.txt without excluding it first is implementation-defined behavior; by default the result is undefined and depends on how the compiler driver interprets the duplicate options (some compiler options are known to specifically support multiple uses, while others do not).	Empty List
Modules	List<String>^[3]	List of extensionless file paths of global modules to compile. If a module listed is not found or invalid, an error should be thrown.	Empty List
MinAlign	Record<Integer^[2]>^[4]	Key/Value pairs of indefinite custom user-defined minimum alignments to use per ELF section, keyed by section name. Defining the record, whether empty or partial, does not void the default section values; each default stays in effect unless it is explicitly redefined. Example: `MinAlign: { .text: 0x80 }` does NOT affect the .rodata, .data and .bss default values, which will still be used. All sections without a default or explicit value shall have an alignment of `0` (no minimum alignment). Implementation notes: `moduleAlignment = max(module section alignment, global minimum section alignment)`; `sectionAlignment = max(section alignments of all modules)`.	{ .text: 0x20, .rodata: 0x20, .data: 0x20, .bss: 0x40 }
Targets	Record< Target^[9], YAML_NULL^[1] >^[4]	Key/Value pairs of indefinite target configurations. KEYS are target names. (At least 1 non-abstract target is required to be defined) To create an “empty” target using all default settings, give it (the field itself) `YAML_NULL`^[1] as the value.	*REQUIRED OPTION*
Target YAML Type Structure
Abstract	Boolean	Whether this target is an abstract template for other targets to inherit from. Prevents target from being used directly for compilation.	`false`
Extends	String List<String>^[3]	As a String, name of a target to inherit settings from, with potentially infinitely nested inheritance from the inherited target extending another, recursively. As a List, list of names of targets to multi-inherit settings from, in reverse insertion order, without nesting allowed. Example: `A: Extends: [C, B]` makes A inherit from B, and then the resulting A+B intermediate target inherits from C, producing the final A+B+C target. If either B or C have an Extends field, an error must be thrown, as multi-inheriting doesn't support multiple bases for preventing circular or self extensions. Inheritance Semantics: For non-List fields, the highest definition on the inheritance sequence (nested or multi) shall take priority. For List fields, each definition is merged onto the previous, unless they have special behavior defined, which the only 4 current List fields do have. Refer to the Notes at the end of the Target YAML Type Structure table for details on inheritance semantics for the current List fields. The fields Abstract and Extends do not participate in inheritance as their values are only in respect to the target definition itself, not the final resolved target data. For non-List fields of targets with extensions, two special String values can be used in place of their values: `@inherit`: This is the default special value used for all fields not explicitly defined in a target with extensions, it signals to use the default value for the inherited target. `@self`: Opposite to @inherit, signals that the option should use its default value for the current target. This affects options whose default value is `{TargetName}`^[10].	`YAML_NULL`^[1]
AddrMap	String	Name of the `maps/*.convmap` file to use with this target.	`{TargetName}`^[10]
BaseRpx	String	Name of the `{RpxDir}/*.rpx` file to use with this target.	`{TargetName}`^[10]
Remove/Modules	List<String>^[3]	List of global/template modules to remove from compilation with this target. Values follow same rules as the global Modules field. The following extra rules apply: It is an error to attempt to remove a module only to re-add it on the same template/target. It is an error to attempt to remove a module which exists but is not on the current modules list.	Empty List
Add/Modules	List<String>^[3]	List of additional modules to compile with this target. Values follow same rules as the global Modules field. The following extra rule applies: It is an error to attempt to add a template/target module already on the current modules list.	Empty List
Remove/BuildOptions	List<String>^[3]	List of global/template build options to remove from use with this target. Values follow same rules as the global ExcludeDefaultBuildOptions field.	Empty List
Add/BuildOptions	List<String>^[3]	List of additional build options to use with this target. Values follow same rules as the global BuildOptions field.	Empty List
Notes:		Notes for the 4 options above (Remove/Modules, Add/Modules, Remove/BuildOptions and Add/BuildOptions): These 4 options should be processed with removals first, then additions. The order in which the user arranges the fields in their YAML does not matter, only the order below. 1. Load the global Modules and BuildOptions lists. 2. If processing a target extending a template, process the template first. 3. Remove the unwanted modules and build options listed by the template/target from their lists. 4. Add the extra wanted modules and build options listed by the template/target to their lists. 5. If a template was just processed, cycle back to step 3 and process the extending target.	-
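
The processing order from the notes above might look like the following Python sketch for the module lists. It assumes a single template extended by a single target, and represents each of them as a plain dictionary keyed by the YAML field names; `apply_changes` and `resolve_modules` are hypothetical names. Build options would follow the same pattern.

```python
from typing import Optional


def apply_changes(current: list[str], remove: list[str], add: list[str]) -> list[str]:
    """Apply one template/target layer: removals first, then additions."""
    for name in remove:
        if name in add:
            raise ValueError(f"{name!r} is removed and re-added on the same target")
        if name not in current:
            raise ValueError(f"{name!r} is not on the current modules list")
    result = [name for name in current if name not in remove]
    for name in add:
        if name in result:
            raise ValueError(f"{name!r} is already on the current modules list")
        result.append(name)
    return result


def resolve_modules(
    global_modules: list[str],
    template: Optional[dict],
    target: Optional[dict],
) -> list[str]:
    """Start from the global list, then process the template, then the target."""
    modules = list(global_modules)
    for layer in (template, target):
        if layer is None:
            continue
        modules = apply_changes(
            modules,
            layer.get("Remove/Modules") or [],
            layer.get("Add/Modules") or [],
        )
    return modules
```

Because the lists are read by key, the order in which the user wrote the fields in the YAML has no effect on the result.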

Project Folder Structure

All WUAPPS-based projects must have a {ProjectDir}^[7] within which the project.yaml and other project metadata files are located. This folder has the following structure:

`conv/`	Your WUAPPS Conversion Maps files
`maps/`	Your symbol map and conversion maps files
	`main.map`	Your primary symbol map file
`buildoptions.txt`	Optional file, stores extra build options (See the project.gpj chapter)
`project.yaml`	The main project configuration file

Modules

⚠ Warning: The module system will be removed once the dynamic loader system is finished.

WUAPPS projects are currently structured in “modules”, which are enabled or disabled by the project.yaml and are themselves defined by YAML files. Each module declares source files (C++ / Assembly) to compile and assemble, as well as binary patches & hooks to apply directly to the base RPX file.

A module file may not be named project.yaml (case-insensitive) to prevent conflicts. All other names that fit within the standard-wide file name rules (See Terms and Definitions) are valid.

Each module file is structured as follows:

YAML Key	YAML Value Type^[5]	Description	Default Value?
Files	Record<List<String>^[3]>^[4]	Key/Value pairs of filetypes mapped to lists of paths to source files or folders for searching source files within. KEYS must be one of the valid filetypes: `C`, `C++`, `Assembly`, `Text` (a key not matching one of these must error). All filetypes are independently optional and can be used or omitted in any combination. Folder paths ending in nothing, a trailing `/` or `/*` will be included non-recursively. Folder paths ending in `/**` will be included recursively. The `*` character shall NOT be used as a wildcard in the middle of paths.	Record<Empty List>^[4]
Hooks	List<Hook^[9]>	List of patches & hooks to apply when this module is enabled.	Empty List
Hook YAML Type Structure
type	String	The type of the patch/hook, see the Patches & Hooks chapter for details.	*REQUIRED OPTION*
addr	String List<String>	A stringified hex number with 0x prefix, indicating where to apply the patch/hook. Can also be a list of multiple of the above, for applying the same patch/hook at multiple locations at once.	*REQUIRED OPTION*
????	-	Other hook-type specific fields exist, see the Patches & Hooks chapter for details.	-

<<< UNFINISHED BEYOND THIS POINT >>>

Patches & Hooks

This section documents the different types of structures you can write on the Hooks list of a module. The current valid types, and their extra fields besides the base type and addr common to all hook types, are as follows:

nop - A shorthand for one or multiple sequential patch hooks with 60000000 (nop) as data
- Optional field: count (Default: 1) - A positive non-zero decimal integer number, specifying how many nop’s to apply starting from addr
return - A shortcut for a patch hook with 4E800020 (blr) as data
branch - Inserts the respective branch instruction instr at addr jumping to the address of the symbol func
- Additional field: instr - The branch instruction to use: b or bl
- Additional field: func - The symbol whose address the branch instruction should jump to
funcptr - Inserts the address of the symbol func at addr
- Additional field: func - The symbol whose address to write at addr
patch - The most basic and versatile hook, simply writes data at addr (overwrites existing data)
- Additional field: data - A value to be encoded according to the datatype field and inserted at the addr(s)
- Optional field: datatype (Default: raw) - A string representing a C++ data type to interpret the value of data as, the supported types are:
  - raw: A sequence of hex bytes, value of data should be a string.
  - f32/f64/float/double: A 32/64 bit floating point number, value of data should be a numeric literal within the specified type’s range.
  - u8/u16/u32/u64/uchar/ushort/uint/ulonglong: An 8/16/32/64-bit unsigned integer, value of data should be a numeric integer literal within the specified type’s range.
  - s8/s16/s32/s64/schar/short/int/longlong: An 8/16/32/64-bit signed integer, value of data should be a numeric integer literal within the specified type’s range.
  - char: An 8-bit ASCII encoded character, value of data should be a string literal with exactly one character located inside the ASCII character set range (0x00-0x7F). Invalid ASCII characters should trigger an error.
  - wchar: A 16 bit configurable-encoding encoded character, value of data should be a string literal with exactly one character fitting inside a 2-byte space or less in the chosen encoding (after conversion from UTF8).
    - Characters which don’t fit should trigger an error. Characters which encode to only 1-byte in the chosen encoding should be big-endian null-padded.
  - string: A null-terminated configurable-encoding C string. Single byte null terminator is automatically added. Value of data should be a string literal with only valid characters for the chosen encoding (after conversion from UTF8).
    - Invalid characters of the chosen encoding should trigger an error.
  - wstring: A null-terminated configurable-encoding wide string. 2-byte null terminator is automatically added. Value of data should be a string literal with only valid characters for the chosen encoding (after conversion from UTF8).
    - Invalid characters of the chosen encoding should trigger an error.
  - #[]: Where # is any of the types above, you may suffix a type with [] to make an array of it. Value of data will be an array of values of the respective type.
    - Array types enforce their addr to be aligned by the size of its elements.
      - string and wstring are 4-byte aligned, therefore all strings in an array must be null-padded to have a byte length multiple of 4, including the last string in the array.
    - The difference between char[] and string is a char[] doesn’t automatically null-terminate, uses ASCII encoding and doesn’t align. Meanwhile string does automatically null-terminate, can use configurable-encoding and aligns by 4.
      - Likewise, the difference between wchar[] and wstring is a wchar[] doesn’t automatically null-terminate and aligns by 2. Meanwhile wstring does automatically 2-byte null-terminate and aligns by 4.
      - To write a null character on a char[] or wchar[], write down YAML_NULL^[1] or use standard string escape sequences.
    - Multidimensional arrays such as int[][] are not supported.
    - Optional conditional field: encoding (Default: Shift-JIS^[11]) - This field specifies the encoding to encode the input string value with.
      - This field may be included if datatype is either string, wchar or wstring.
        If this field is present when datatype is not one of the above, an error must be thrown.
      - Valid encoding values and their aliases are:
        UTF-8, UTF8, utf-8, utf8 (ONLY valid for string datatype, throw error if not used with it)
        
        UCS-2, UCS2, ucs-2, ucs2 (NOT valid for string datatype, throw error if used with it)
        
        Shift-JIS, ShiftJIS, shift-jis, shiftjis (valid for all 3 encoding-compatible datatypes)^[11]
        
        UCS-2 NOTES:
        UCS-2 is an obsolete predecessor of UTF-16, so it may be difficult to find encoders for UCS-2 specifically. The only difference between the two encodings is that UCS-2 does not support surrogate pairs, which UTF-16 uses to represent characters beyond U+FFFF. An implementation that uses a UTF-16 encoder and pre-validates all input characters to block any character above U+FFFF is therefore a valid implementation of UCS-2. Also note that the encoding must be big endian and that a Byte Order Mark (BOM) should NOT be included.
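
The UCS-2 approach described in the notes above can be sketched in Python as follows. The function name `encode_ucs2` and its `null_terminate` parameter are hypothetical; the sketch relies on Python's built-in `utf-16-be` codec, which writes big-endian code units without a BOM.

```python
def encode_ucs2(text: str, null_terminate: bool = False) -> bytes:
    """Encode text as big-endian UCS-2 by pre-validating a UTF-16 encode."""
    for ch in text:
        code = ord(ch)
        if code > 0xFFFF:
            # Would need a surrogate pair in UTF-16, which UCS-2 lacks.
            raise ValueError(f"U+{code:04X} is outside the UCS-2 range")
        if 0xD800 <= code <= 0xDFFF:
            raise ValueError(f"U+{code:04X} is a lone surrogate")
    data = text.encode("utf-16-be")  # big endian, no BOM
    if null_terminate:
        data += b"\x00\x00"  # 2-byte terminator, as for wstring
    return data
```

With `null_terminate=True` this matches the shape of a wstring payload, while the default matches a wchar[] style payload that is not terminated automatically.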

Symbol Maps

The primary symbol map for a project is located at {ProjectDir}/maps/main.map and has a very basic syntax similar to any other symbol map file.

Whitespace is free-form
# is used for comments
- Both full line and end-of-line comments are supported
- There is no multi-line comment support
Semicolon-separated list of key value pairs in the format: SYMBOL = VALUE;
- Where SYMBOL is the symbol’s text
- Where VALUE is the symbol’s address as a hex number with a required 0x prefix
  - Alternatively, if VALUE does not start with 0x, it shall be interpreted as a previously defined symbol named VALUE which instructs the parser to re-use that symbol’s address for the current one.
    - If the referenced symbol is not found an error must be thrown.
Anything not fitting the above syntax rules is a syntax error
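
A minimal Python sketch of a parser for the syntax rules above follows. The name `parse_symbol_map` is hypothetical, and the sketch assumes that symbol text never contains whitespace, `=`, `;` or `#`, which this document does not state explicitly.

```python
import re

# One entry: SYMBOL = VALUE, with free-form whitespace around both parts.
_ENTRY_RE = re.compile(r"\s*([^\s=]+)\s*=\s*([^\s=]+)\s*")
_HEX_RE = re.compile(r"0x[0-9A-Fa-f]+")


def parse_symbol_map(text: str) -> dict[str, int]:
    """Parse main.map style text into a symbol -> address mapping."""
    # Drop full-line and end-of-line comments.
    stripped = "\n".join(line.split("#", 1)[0] for line in text.splitlines())
    entries = stripped.split(";")
    if entries[-1].strip():
        raise SyntaxError("missing ';' after the last entry")
    symbols: dict[str, int] = {}
    for entry in entries[:-1]:
        match = _ENTRY_RE.fullmatch(entry)
        if match is None:
            raise SyntaxError(f"malformed entry: {entry.strip()!r}")
        name, value = match.groups()
        if value.startswith("0x"):
            if _HEX_RE.fullmatch(value) is None:
                raise SyntaxError(f"malformed address: {value!r}")
            symbols[name] = int(value, 16)
        elif value in symbols:
            # Re-use the address of a previously defined symbol.
            symbols[name] = symbols[value]
        else:
            raise KeyError(f"undefined symbol referenced: {value!r}")
    return symbols
```

Because aliases are resolved while reading, a symbol can only refer to one defined earlier in the file, matching the "previously defined" wording.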

The primary symbol map is not the one actually given to the compiler, as it must be converted to the format expected by the compiler (*.x) and addresses must be converted to different regions according to the build targets through the maps/*.convmap files.

Conversion Maps

The *.convmap files inside the {ProjectDir}/maps folder are used for converting the addresses of the primary symbol map ({ProjectDir}/maps/main.map) to different regions and versions of the game/app being modified.

The addresses in the primary symbol map can be of any region/version of your choosing, but must be consistent throughout the map.

For build targets of the same region/version as your primary symbol map addresses, no conversion is necessary: a matching *.convmap file is not required, and a nullish value should be used wherever an address map is requested.

The conversion map files support // comments at both start and end of lines, and /* ... */ multi-line comments.

Conversion maps are defined by the following EBNF syntax: (excluding the comments which may be arbitrarily placed within any S token)

/*
<optional>
^ = assert start of line
$ = assert end of line
*/

/* Common */
S                     = [\r\n\t\f\v ]* ; /* optional whitespace */
Z                     = [\r\n\t\f\v ]+ ; /* required whitespace */
hex_literal           = '0x' hex_literal_no_0x ;
hex_literal_no_0x     = [A-Fa-f0-9]{1,8} ;
decimal_literal       = '0' | [1-9] [0-9]* ;
integer_literal       = decimal_literal | hex_literal ;
word                  = [A-Za-z] ;

/* Convmap */
start                 = S <text_addr data_addr> statement* EOF ;
text_addr             = 'TextAddr' S '=' S hex_literal S ';' S ;
data_addr             = 'DataAddr' S '=' S hex_literal S ';' S ;

statement             = <range_offset | platform_directive> S ;

range_offset          = range S ':' S ('+' | '-') integer_literal S ';' ;
range                 = hex_literal_no_0x S '-' S hex_literal_no_0x ;

platform              = 'Emulator' | 'Console' <'=' word> ;
platform_directive    = ^ '.platform' Z platform <Z 'extends' Z platform> ;

As clarification, notice that integer_literal is a value separate from the sign (+ or -), therefore the u32 range restriction does NOT prevent negative numbers, but rather specifies the max. u32 value as the maximum value for both positive and negative inputs.
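
To illustrate the `range_offset` rule and the sign handling just described, here is a small Python sketch. The names `RangeOffset`, `parse_range_offset` and `convert` are hypothetical. The sketch also assumes that ranges are inclusive at both ends, that the first matching range wins, and that an address outside every range is reported as `None`; none of these semantics are stated in this document.

```python
import re
from typing import NamedTuple, Optional


class RangeOffset(NamedTuple):
    start: int
    end: int
    offset: int  # signed: the sign is applied to the u32 magnitude


# range S ':' S ('+' | '-') integer_literal S ';'
_RANGE_RE = re.compile(
    r"([A-Fa-f0-9]{1,8})\s*-\s*([A-Fa-f0-9]{1,8})\s*:\s*([+-])"
    r"(0x[A-Fa-f0-9]{1,8}|0|[1-9][0-9]*)\s*;"
)


def parse_range_offset(statement: str) -> RangeOffset:
    match = _RANGE_RE.fullmatch(statement.strip())
    if match is None:
        raise SyntaxError(f"malformed range offset: {statement!r}")
    start, end, sign, literal = match.groups()
    magnitude = int(literal, 16 if literal.startswith("0x") else 10)
    if magnitude > 0xFFFFFFFF:
        raise ValueError("offset magnitude exceeds the u32 range")
    return RangeOffset(
        int(start, 16),
        int(end, 16),
        -magnitude if sign == "-" else magnitude,
    )


def convert(address: int, ranges: list[RangeOffset]) -> Optional[int]:
    """Apply the first range containing the address (assumed semantics)."""
    for entry in ranges:
        if entry.start <= address <= entry.end:
            return address + entry.offset
    return None
```

Comment stripping and the `.platform` directive are left out of the sketch to keep it short.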

Besides address conversion offsets, the *.convmap files also store the start addresses at which the custom text and data section groups will be placed in memory at runtime. For targets targeting only emulators (Cemu), these are not necessary and should be omitted so that they are auto-calculated. For targets targeting real Wii U hardware, they must be provided, as they cannot be auto-calculated due to real hardware shifting the addresses on load.

TextAddr = ADDRESS;
DataAddr = ADDRESS;

Where ADDRESS is a hex (0x-prefixed) integer inside the u32 value range. If these values are not provided for a console-targeting build, the build will fail with errors.

Anything not fitting either of the two above syntax rulesets is a syntax error.

Additionally, once these two values are obtained, whether by manual setting or automatic calculation, tools should automatically add to compilation runs the standard defines TEXT_ADDR and DATA_ADDR, respectively setting them to their corresponding value as a hex u32 literal.

Auto-calculating text and data addresses

When automatically calculating the values for TextAddr and DataAddr if they are omitted, tools should follow the following steps:

Open and parse the base RPX file used, locating the ELF section of type 0x80000004 (SHT_RPL_FILEINFO)
If the section is not found, the base RPX is malformed, abort and error.
Parse the found section’s data, this does not need to be a full parse and can be implementation-defined, as long as the following field values are correctly read in their respective sizes:
1. u32 at +0x00: MAGIC_AND_VERSION
2. u32 at +0x08: TEXT_ALIGN
3. u32 at +0x10: DATA_ALIGN
If MAGIC_AND_VERSION is NOT 0xCAFE0402, the RPX version is unknown or malformed, abort and error.
Store the other two values for later use, then locate the ELF section named .text
If the section is not found, the base RPX is malformed, abort and error.
Calculate its end address through the formula: TEXT_END = section.addr + section.size + 1
Round up the result through the formula: TEXT_ADDR = ceil(TEXT_END / TEXT_ALIGN) * TEXT_ALIGN
You now have the value of TEXT_ADDR, store it and use as needed for other operations.
Find all ELF sections named .data, .rodata, and/or .bss. If any of them are not found, ignore it and move on.
If NONE are found, stop here and use 0x10000000 as default value for DATA_ADDR.
Find the one with the highest start address. (If only one was found, use it and skip the comparison.)
Calculate the chosen section’s end address through the formula: DATA_END = section.addr + section.size + 1
Round up the result through the formula: DATA_ADDR = ceil(DATA_END / DATA_ALIGN) * DATA_ALIGN
You now have the value of DATA_ADDR, store it and use as needed for other operations.
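
The arithmetic in the steps above can be sketched in Python as follows. Locating ELF sections is left out; the sketch assumes the caller already has the raw SHT_RPL_FILEINFO section bytes and the `(addr, size)` pairs of the relevant sections. All function names are hypothetical, and the fields are read as big-endian on the assumption that the RPX is a big-endian PowerPC ELF.

```python
import struct

RPL_FILEINFO_MAGIC = 0xCAFE0402
DEFAULT_DATA_ADDR = 0x10000000


def align_up(value: int, alignment: int) -> int:
    """Integer form of ceil(value / alignment) * alignment (alignment > 0)."""
    return -(-value // alignment) * alignment


def read_fileinfo(data: bytes) -> tuple[int, int]:
    """Return (TEXT_ALIGN, DATA_ALIGN) from the fileinfo section bytes."""
    (magic,) = struct.unpack_from(">I", data, 0x00)
    (text_align,) = struct.unpack_from(">I", data, 0x08)
    (data_align,) = struct.unpack_from(">I", data, 0x10)
    if magic != RPL_FILEINFO_MAGIC:
        raise ValueError(f"unknown or malformed RPX version: {magic:#010x}")
    return text_align, data_align


def next_free_address(addr: int, size: int, alignment: int) -> int:
    """END = addr + size + 1, rounded up to the alignment (as in the steps)."""
    return align_up(addr + size + 1, alignment)


def calc_data_addr(sections: list[tuple[int, int]], data_align: int) -> int:
    """sections holds (addr, size) of any found .data/.rodata/.bss."""
    if not sections:
        return DEFAULT_DATA_ADDR
    addr, size = max(sections, key=lambda section: section[0])
    return next_free_address(addr, size, data_align)
```

`TEXT_ADDR` would then be `next_free_address(text.addr, text.size, TEXT_ALIGN)` for the `.text` section, with the same rounding.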

Configuring project.gpj

Generation of the project.gpj is arguably the main task of a WUAPPS tool. The file is the final result of combining all of the information given to the tool through project.yaml and other means. It is then passed to the compiler driver, which performs the actual compilation (currently only gbuild, as GPJ is a format specific to it).

This file specifies the build options, settings to pass to the compiler, linker and assembler, the list of files to compile or perform other tasks with, among possibly other things. Please note the GPJ format syntax itself is specified by GHS MULTI and not this standard.

The generated project.gpj MUST always start with the following structure:

#!gbuild
primaryTarget=ppc_cos_ndebug.tgt
[Project]

This structure is technically all that is minimally required for a “valid GPJ file”, however it is functionally useless as it will not compile anything, essentially a no-op GPJ file.

After the starting structure, a list of tab-indented, newline separated (but with multiple space-separated entries allowed per line), global build options in CLI flags form is placed. The indentation of 1 tab at the start of each line of an option is REQUIRED, as a non-tab indented line signals the end of the global build options GPJ section. The following build options are UNCONDITIONALLY REQUIRED to be specified at all times, users should not be allowed to remove or modify them in any way:

-MD -> Enables generation of {ProjectDir}/objs/*.d files for incremental compilation in future runs. Tools that offer a build without incremental compilation should implement it by deleting the {ProjectDir}/objs folder.
-cpu=espresso -> Espresso is the name of the CPU used on the Wii U, the only currently supported target platform.
-sda=none -> The use of SDA (or ZDA) creates additional ELF data sections in the compiled output file, which are currently not accounted for anywhere in the standard and will disrupt tool operations such as binary patching, address calculations, and others.
- As such use of SDA (or ZDA) is currently unsupported and should not be allowed by tools.
--no_commons -> Required for the same reason as -sda=none.
-object_dir= -> Path relative to the project.gpj, configures the output objects (.o files) folder, the value (path) of the flag is implementation-defined but the flag itself is required to be present.

After the above options, the following REQUIRED AS DEFAULTS options are included, but each only if the user did not override or opt-out of it (the methods for overriding and opting out are defined by the ExcludeDefaultBuildOptions setting in project.yaml):

-c99
--g++
--link_once_templates
--enable_noinline
--max_inlining
--no_exceptions
--no_rtti
--no_implicit_include
-no_ansi_alias
-only_explicit_reg_use
-kanji=shiftjis
-Ospeed
-Onounroll
-Dcafe

Please note the following:

The local order (between themselves) of the options does not matter.
The usage of - or -- DOES MATTER, even for full word options, dash styles are NOT interchangeable to GHS MULTI, the exact option dash style must be used for each option (For example, -kanji is valid but --kanji is not).

After the required and default options (with exclusions and overrides applied) have been added, the user’s own settings are added from the file {ProjectDir}/buildoptions.txt. If the file does not exist, the user has no custom build options and tools should silently proceed. The format of the build options file is the same as the GPJ’s global build options section, minus the required tab indents. Tools do not need to “parse” the file contents (but may, for extended implementation-defined behavior); it is enough to copy each line and append it to the GPJ’s build options section, with the extra tab indent added to the start of the line.
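The assembly order described above (required options, then non-excluded defaults, then the user's lines) can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `build_global_options`, the `excluded_defaults` parameter (modelled here as a plain collection of option strings, since the actual ExcludeDefaultBuildOptions semantics are defined elsewhere), reading buildoptions.txt as UTF-8, and skipping blank user lines are all hypothetical choices, not requirements of the standard.

```python
from pathlib import Path
from typing import Iterable, List

# Options the standard marks as unconditionally required.
# -object_dir= is also required, but its value is implementation-defined,
# so it is added separately below.
REQUIRED_OPTIONS: List[str] = ["-MD", "-cpu=espresso", "-sda=none", "--no_commons"]

# Options required as defaults unless the user excludes or overrides them.
DEFAULT_OPTIONS: List[str] = [
    "-c99", "--g++", "--link_once_templates", "--enable_noinline",
    "--max_inlining", "--no_exceptions", "--no_rtti", "--no_implicit_include",
    "-no_ansi_alias", "-only_explicit_reg_use", "-kanji=shiftjis",
    "-Ospeed", "-Onounroll", "-Dcafe",
]


def build_global_options(project_dir: str, object_dir: str,
                         excluded_defaults: Iterable[str] = ()) -> str:
    """Return the tab-indented global build options section as text.

    `excluded_defaults` is a hypothetical stand-in for the result of
    evaluating ExcludeDefaultBuildOptions.
    """
    lines = list(REQUIRED_OPTIONS)
    lines.append(f"-object_dir={object_dir}")

    excluded = set(excluded_defaults)
    lines += [opt for opt in DEFAULT_OPTIONS if opt not in excluded]

    # A missing buildoptions.txt means "no custom options": proceed silently.
    user_file = Path(project_dir) / "buildoptions.txt"
    if user_file.is_file():
        # Assumption: the file is UTF-8 and blank lines carry no meaning.
        for line in user_file.read_text(encoding="utf-8").splitlines():
            if line.strip():
                lines.append(line)

    # Every option line gets exactly one leading tab.
    return "".join(f"\t{line}\n" for line in lines)
```

For example, `build_global_options("my_project", "objs", excluded_defaults=["-Ospeed"])` would produce a section that omits -Ospeed and keeps every other default; the `objs` value is just one possible choice for the implementation-defined object folder.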

After all global build options have been specified comes the File List section, which lists the files to be included in the compilation run as non-indented, newline-separated file paths relative to the project.gpj. The first non-indented line in the file marks the end of the global build options and the start of the File List. Tools should generate the File List by merging the Files field list of every module included in the current compilation target. Additional files may be appended from implementation-defined sources, as long as they are not required for successful compilation, so that other standard-compliant tools can still compile the project.

For each entry of the File List, the compiler driver decides what to do with the file by mapping its extension to one of several well-known File Types.

There are several types, but only the following are relevant to us:

C (Extensions: .c) Action: Compile with the C compiler
C++ (Extensions: .cc, .cpp, .cxx, .c++, .C, .CXX, .CPP) Action: Compile with the C++ compiler
Assembly (Extensions: .s, .asm, .ppc) Action: Assemble with the PowerPC assembler
- Note: The .ppc extension is special: the file is additionally preprocessed with the C preprocessor.
Text (The fallback type if it can’t recognize an extension) Action: Silently ignore file

Everything above is exact and case-sensitive. This creates an issue for users who may want to use extensions not listed here for their source files, such as .S for assembly files. For this scenario, the GPJ format allows a [TYPE] structure to be placed at the end of a File List path entry, where TYPE is the desired File Type for that file. Tools should apply the types of the Files Record on module files to this structure for explicit type mapping.
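As a rough illustration of the extension mapping and the explicit [TYPE] override, a tool might do something like the sketch below. The helper names `implicit_file_type` and `file_list_entry` are hypothetical, and two details are assumptions rather than facts from this document: that the TYPE tokens are spelled exactly like the type names listed above, and that a single space separates the path from the [TYPE] structure.

```python
from typing import Optional

# Case-sensitive extension table, copied from the list of relevant File Types.
EXTENSION_TYPES = {
    ".c": "C",
    ".cc": "C++", ".cpp": "C++", ".cxx": "C++", ".c++": "C++",
    ".C": "C++", ".CXX": "C++", ".CPP": "C++",
    ".s": "Assembly", ".asm": "Assembly", ".ppc": "Assembly",
}


def implicit_file_type(path: str) -> str:
    """Return the File Type implied by the extension alone."""
    # Treat both / and \ as path separators, as the standard requires.
    name = path.replace("\\", "/").rsplit("/", 1)[-1]
    dot = name.rfind(".")
    ext = name[dot:] if dot > 0 else ""
    # Unrecognized extensions fall back to Text, which is silently ignored.
    return EXTENSION_TYPES.get(ext, "Text")


def file_list_entry(path: str, explicit_type: Optional[str] = None) -> str:
    """Format one File List line, appending [TYPE] when a type is given."""
    if explicit_type is None:
        return path
    # Assumption: one space between the path and the [TYPE] structure.
    return f"{path} [{explicit_type}]"
```

With this sketch, `implicit_file_type("src/boot.S")` falls through to Text because the match is case-sensitive, which is exactly the situation where a tool would emit `file_list_entry("src/boot.S", "Assembly")` instead of the bare path.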

Console vs Emulator Compilation

All compilation targets and templates are, in theory, designed to be compilable for both console and emulator. In practice, this may not be possible on a per-project basis due to project-specific requirements for console and emulator targets respectively, such as special defines and modules. Implementations must support all possible scenarios: dual-compilation of the same target, console-only targets, and emulator-only targets.

Note: Simultaneous dual-compilation for both console and emulator is not required; implementations are allowed to require separate runs with different program arguments for this task.

When allowing the user to select whether the target will be compiled for console or emulator, through whichever means chosen by the tool, the tool MUST support a variable string value (NOT a boolean toggle) for the console targeting setting input, as a future-proofing mechanism to support multiple console compilation methods. (Setting a tool-chosen default is valid.)

In order to make dual-compilation of targets possible, the most common use case, a special define that determines whether compilation is being done for emulator or console, shall be built into implementations as a few standard defines:

-DPLATFORM_IS_EMULATOR=<0|1>
-DPLATFORM_IS_CONSOLE=<0|1>
-DPLATFORM_IS_CONSOLE_CAFELOADER=<0|1>

The value of each define, 0 or 1, is determined by the compilation method used as specified below.

Compilation Methods

Below are the currently valid values implementations must support as input to the console compilation setting, alongside further implementation details for each method.

CafeLoader
- -DPLATFORM_IS_EMULATOR=0 build option is set.
- -DPLATFORM_IS_CONSOLE=1 build option is set.
- -DPLATFORM_IS_CONSOLE_CAFELOADER=1 build option is set.
- Special Requirements: Requires TextAddr and DataAddr to be manually provided in the selected target’s address map.
- Minimal Expected Output: Structurally valid Addr.bin, Code.bin, Data.bin and Patches.hax files, refer to the CafeLoader documentation for further information on correctly generating these files.
none or unset (equivalent to emulator compilation)
- -DPLATFORM_IS_EMULATOR=1 build option is set.
- -DPLATFORM_IS_CONSOLE=0 build option is set.
- -DPLATFORM_IS_CONSOLE_CAFELOADER=0 build option is set.
- Special Requirements: None
- Minimal Expected Output: A structurally valid patched *.rpx or *.elf, the output name of which is implementation-defined other than the required .rpx or .elf extension.
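The mapping from compilation method to the three standard defines listed above could be expressed along these lines. This is a minimal sketch: the function name `platform_defines` is hypothetical, `None` is used here to model "unset", and matching the method string exactly (case-sensitively) is an assumption that the standard does not spell out.

```python
from typing import List, Optional


def platform_defines(method: Optional[str]) -> List[str]:
    """Return the standard PLATFORM_IS_* build options for a compilation method.

    `method` is the string-valued console targeting setting;
    None models the "unset" case.
    """
    if method is None or method == "none":
        # Emulator compilation.
        emulator, console, cafeloader = 1, 0, 0
    elif method == "CafeLoader":
        emulator, console, cafeloader = 0, 1, 1
    else:
        # Unknown values are rejected rather than guessed at.
        raise ValueError(f"unsupported compilation method: {method!r}")

    return [
        f"-DPLATFORM_IS_EMULATOR={emulator}",
        f"-DPLATFORM_IS_CONSOLE={console}",
        f"-DPLATFORM_IS_CONSOLE_CAFELOADER={cafeloader}",
    ]
```

Keeping the input a string rather than a boolean means a future console method only needs one more branch here, which is the future-proofing the standard asks for.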

Shift-JIS Standardization

Put simply, Shift-JIS has a complex and ancient history, which led to it being poorly documented and standardized, with many different variants all wanting to use the same or similar “Shift-JIS” name. The original Shift-JIS specification, known as JIS X 0201, is NOT a fully compliant superset of ASCII, since it replaces the characters \ (U+005C) and ~ (U+007E) with the characters ¥ (U+00A5) and ‾ (U+203E) respectively. This inconsistency is problematic, and for that reason, when the western world (primarily Microsoft) adopted support for it, those two characters were reverted to their original ASCII values. The name “Shift-JIS” was kept, however, creating an ambiguous definition between the ASCII compliant Shift-JIS^[11] and the original Shift-JIS.

In the present day, the Unicode Consortium has standardized both encodings separately in the ICU: ibm-943_P130-1999 (original Shift-JIS) and ibm-943_P15A-2003 (ASCII compliant Shift-JIS^[11]).

For the purposes of this document, outside of the current section (“Shift-JIS Standardization”), all references to “Shift-JIS” refer to the ASCII compliant Shift-JIS^[11] (ibm-943_P15A-2003) variant. Implementations should check the Shift-JIS variant their encoders/decoders are using to ensure it is the correct one. The tests below can be used to check:

\~ encoded to Shift-JIS^[11] must equal the byte sequence 5C 7E
¥‾ encoded to Shift-JIS^[11] must equal the byte sequence 5C 7E
the byte sequence 5C 7E decoded from Shift-JIS^[11] must equal \~
Shift-JIS^[11] decoding functionality is currently not required for implementing the standard, but is included here for future-proofing.
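The three checks above can be wrapped in a small self-test that an implementation runs against whichever encoder and decoder it intends to use. The sketch below is illustrative only: `check_shift_jis_variant` and its two callable parameters are hypothetical names, and the helper makes no claim about how any particular codec library behaves.

```python
from typing import Callable, List

# The byte sequence 5C 7E that all three checks revolve around.
EXPECTED_BYTES = bytes([0x5C, 0x7E])


def check_shift_jis_variant(encode: Callable[[str], bytes],
                            decode: Callable[[bytes], str]) -> List[str]:
    """Run the three variant checks; an empty result means all passed."""
    failures: List[str] = []

    def attempt(label, func, arg, expected):
        try:
            result = func(arg)
        except UnicodeError as exc:
            # A codec that refuses the input fails the check as well.
            failures.append(f"{label}: raised {type(exc).__name__}")
            return
        if result != expected:
            failures.append(f"{label}: got {result!r}, expected {expected!r}")

    # Backslash and tilde must encode to 5C 7E.
    attempt("encode backslash and tilde", encode, "\\~", EXPECTED_BYTES)
    # Yen sign (U+00A5) and overline (U+203E) must also encode to 5C 7E.
    attempt("encode yen sign and overline", encode, "\u00a5\u203e", EXPECTED_BYTES)
    # 5C 7E must decode back to backslash and tilde.
    attempt("decode 5C 7E", decode, EXPECTED_BYTES, "\\~")
    return failures
```

A tool could pass, for instance, `lambda s: s.encode(codec_name)` and `lambda b: b.decode(codec_name)` for the codec it plans to use; whether a given codec passes is exactly what the check is meant to reveal.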

[1] Refers to a null literal inside a YAML file using either literal null word syntax or the ~ shorthand syntax.
[2] Note that the Integer type is arbitrary and does not actually exist in YAML, as YAML treats all numbers as floats, it is required that tools instead validate (NOT coerce!) the received YAML float number as an integer, and error otherwise.
[3] The syntax List<T> indicates a YAML List whose entries contain values only of the YAML Type T
[4] The syntax Record<T> indicates a YAML Record whose keys contain values only of the YAML Type T
[5] Tools must error upon encountering any mismatched types from the ones specified here. No form of type coercion should be done. Multiple types on different lines for the same field means "OR", i.e. either type is valid.
[6] a-z, A-Z, 0-9
[7] The folder project.yaml is located within
[8] . = {ProjectDir}. Paths may also use one or more ../ to refer to folders anywhere outside of {ProjectDir}
[9] Custom YAML Record structure type, defined below.
[10] The name of the target, as specified by its YAML key.
[11] Specifically referring to the ibm-943_P15A-2003 encoding. Read Shift-JIS Standardization section for details.


@@ Line 5: / Line 5: @@
 </blockquote>v3.0-DRAFT
 ===Terms and Definitions===
-For the purposes of this document, the following terms and definitions apply.
+For the purposes of this document, the following terms and definitions apply. (Plural and variant forms of terms implicitly included)
+{| class="wikitable"
-*A "tool" or "tools" refers to any programs implementing this standard in compliance with it, and/or the developers working on said tool.
+!Term
-*"implementation-defined" means behavior that is up to tools to define however they please, as long as it meets certain minimum expectations criteria that may be specified by the standard.
+!Definition
-*Currently this standard only acknowledges and accounts for '''GHS MULTI''' as compiler toolchain for use. Any mention of words such as “compiler”, “linker”, “assembler”, etc. all refer to the respective tools of GHS MULTI implicitly. Many assumptions and choices of default and possible values for configurations are based on the behavior of GHS MULTI and may not work correctly with other compiler toolchains. Support for other compiler toolchains is open for the future under a new major revision, but there are none planned at this time.
+|-
-*In all instances within this document where slashes (<code>/</code>) are used or referenced in the context of a file or folder path, it is implied that it is interchangeable with backslashes (<code>\</code>) for compatibility. ''Implementation note:'' When parsing file or folder paths tools must treat both <code>/</code> and <code>\</code> as path separators regardless of the running platform and interpret the path as valid, including paths making mixed use of both separators.
+|'''tool'''
-*In all instances within this document where file or folder names are used or referenced, including within file and folder paths, these names MUST only contain the following characters <code>a-z</code>, <code>A-Z</code>, <code>0-9</code>, <code>-_.,+()</code>, and MUST NOT start with a <code>-</code> or end with a <code>.</code>. File and folder paths inherit these rules with only the added allowed characters <code>/\:</code> as path separators. This ensures compatibility with all operating systems.
+|Refers to any programs implementing this standard in compliance with it, and/or the developers working on said program.
-*In all instances within this document where file or folder paths are used or referenced, these paths MUST be CASE-SENSITIVE to ensure compatibility with all filesystem formats. ''Implementation note:'' This can be achieved with the following methods:
+|-
-**On a case-sensitive filesystem: No action required, process the input path as-is.
+|'''implementation-defined'''
-**On a case-insensitive filesystem: If input path doesn’t exist, end. Else, fetch the stored case of the file path and compare against the input path, if the comparison is not equal, error. Else, the comparison is equal, therefore the path is valid.
+|Refers to behavior that is up to tools to define however they please, as long as it meets certain minimum expectations criteria that may be specified by the standard, if any.
-*<code>YAML_NULL</code><ref name=":0">Refers to a null literal inside a YAML file using either literal <code>null</code> word syntax or the <code>~</code> shorthand syntax.</ref>
+|-
-*For any value of type STRING, an empty string is to be considered equivalent to having set <code>YAML_NULL</code> (which may in turn be considered invalid for values which do not allow being <code>YAML_NULL</code>), unless a special behavior for empty strings on that specific value is explicitly specified.
+|'''compiler'''
-*Any text in the form <code>{PLACEHOLDER}</code> is an instance of a dynamic text segment named <code>PLACEHOLDER</code> which can have user-defined value, with or without restrictions on a case by case basis specified near the definition of the dynamic text segment.
+| rowspan="3" |Currently this standard only acknowledges and accounts for '''GHS MULTI''' ''<sub>(Green Hills Software MULTI)</sub>'' as compiler toolchain for use. Any mention of words such as “compiler”, “linker”, “assembler”, etc. all refer to the respective tools of '''GHS MULTI''' implicitly. Many assumptions and choices of default and possible values for configurations are based on the behavior of '''GHS MULTI''' and may not work correctly with other compiler toolchains. Support for other compiler toolchains is open for the future under a new major revision, but there are none planned at this time.
-*For all optional list-based YAML fields, omitting the field or explicitly setting it to <code>YAML_NULL</code> falls back to the default value, if any, otherwise being <code>YAML_NULL</code> as implicit default fallback.
+|-
-**Setting it to an '''empty list''' (<code>[]</code>) asserts truly no values, without using the default value, if any.
+|'''linker'''
-*For all optional non-list YAML fields (including maps <code>{}</code>), only omitting the field falls back to the default value, if any, otherwise being <code>YAML_NULL</code> as implicit default fallback. The behavior of explicitly setting it to to <code>YAML_NULL</code> is defined by each field.
+|-
-**If a non-list YAML field does not specify a behavior for <code>YAML_NULL</code> then it is not allowed to be set to <code>YAML_NULL</code>.
+|'''assembler'''
-*“Valid C++98 macro identifiers”<ref>Strings which may only contain <code>A-Z</code>, <code>a-z</code>, <code>0-9</code>, and <code>_</code>, as well as not starting with <code>0-9</code>.</ref>
+|-
-*References of <code>Shift-JIS</code> <ref name=":8">Specifically referring to the [https://icu4c-demos.unicode.org/icu-bin/convexp?conv=ibm-943_P15A-2003&s=ALL ibm-943_P15A-2003] encoding. Read [[#shift-jis-standardization|Shift-JIS Standardization]] section for details.</ref>
+|'''/'''
+| rowspan="2" |In all instances within this document where slashes (<code>/</code>) are used or referenced in the context of a file or folder path, it is implied that it is interchangeable with backslashes (<code>\</code>) for compatibility, and vice-versa.
+''Implementation note:'' When parsing file or folder paths tools must treat both <code>/</code> and <code>\</code> as path separators regardless of the running platform and interpret the path as valid, including paths making mixed use of both separators.
+|-
+|'''\'''
+|-
+|'''file name'''
+| rowspan="2" |In all instances within this document where file or folder '''names''' are used or referenced, including within file and folder paths, these names MUST meet the following requirements:
+* Only contain the characters <code>a-z</code>, <code>A-Z</code>, <code>0-9</code>, <code>-_.,+()</code>
+* NOT start with a <code>-</code> NOR end with a <code>.</code>
+File and folder paths inherit these rules with only the added allowed characters <code>/\:</code> as path separators. This ensures compatibility with all operating systems.
+|-
+|'''folder name'''
+|-
+|'''file path'''
+| rowspan="2" |In all instances within this document where file or folder '''paths''' are used or referenced, these paths MUST be CASE-SENSITIVE to ensure compatibility with all filesystem formats.
+''Implementation note:'' This can be achieved with the following methods:
+*'''On a case-sensitive filesystem:''' No action required, process the input path as-is.
+*'''On a case-insensitive filesystem:''' If input path doesn’t exist, end. Else, fetch the stored case of the file path and compare against the input path, if the comparison is not equal, error. Else, the comparison is equal, therefore the path is valid.
+|-
+|'''folder path'''
+|}
+==== YAML Types Definitions ====
+{| class="wikitable mw-collapsible mw-collapsed"
+!Type
+!Definition
+|-
+|YAML_NULL
+|<ref name=":0">Refers to a null literal inside a YAML file using either literal <code>null</code> word syntax or the <code>~</code> shorthand syntax.</ref>
+|-
+|Boolean
+|A standard generic YAML boolean, represented by either <code>true</code> or <code>false</code> word literals.
+|-
+|String
+|A standard generic YAML string, prior to further restrictions possibly applied to it by the field it's bound to, it may contain any character and have any length.
+An empty String is considered equivalent to YAML_NULL<ref name=":0" /> unless otherwise specified in a specific instance of String usage.
+|-
+|Number
+|A standard generic YAML number, equivalent to a floating point value.
+|-
+|Integer
+|<ref name=":9" />
+|-
+|List<T>
+|<ref name=":2" />: A standard generic YAML List, an array of values of any length.
+For all optional fields with this type, omitting the field or explicitly setting it to YAML_NULL<ref name=":0" /> is equivalent to falling back to the default value, if any, otherwise being YAML_NULL<ref name=":0" /> as fallback.
+An Empty List definition asserts truly no values, without using default values or fallbacks.
+|-
+|Record<T>
+|<ref name=":1" />: A standard generic YAML Record, a map of indefinite key/value pairs.
+|}
 ===Versioning===
 This standard’s version follows a restricted subset of [https://semver.org/ SemVer], where only '''Major''' and '''Minor''' version are present, and no other extensions allowed, for a resulting version format of <code>{StandardMajor}.{StandardMinor}</code>. The standard does not have '''Patch''' versions or any other version extensions such as tags or metadata. (<small>''Draft revisions'' of the standard not effectively considered as "the standard" while in draft stage and therefore are exempt from these rules)</small>
@@ Line 140: / Line 195: @@
 |-
 |MinAlign
-|Record<Integer<ref>Note that the Integer type is arbitrary and does not actually exist in YAML, as YAML treats all numbers as floats, it is required that tools instead '''validate''' (NOT coerce!) the received YAML float number as an integer, and error otherwise.</ref>><ref name=":1" />
+|Record<Integer<ref name=":9">Note that the Integer type is arbitrary and does not actually exist in YAML, as YAML treats all numbers as floats, it is required that tools instead '''validate''' (NOT coerce!) the received YAML float number as an integer, and error otherwise.</ref>><ref name=":1" />
 |Key/Value pairs of indefinite custom user-defined minimum alignments to use per ELF section by section name.
 The definition of the record itself, empty or partial, does not void the default section's values unless they are explicitly redefined each.
@@ Line 168: / Line 223: @@
 ><ref name=":1" />
 |Key/Value pairs of indefinite target configurations. '''KEYS''' are target names. (At least 1 non-abstract target is required to be defined)
-To create an “empty” target using all default settings, give it (the field itself) <code>YAML_NULL</code> as the value.
+To create an “empty” target using all default settings, give it (the field itself) <code>YAML_NULL</code><ref name=":0" /> as the value.
 |'''<small>''REQUIRED OPTION''</small>'''
 |-
@@ Line 322: / Line 377: @@
 This section documents the different types of structures you can write on the <code>Hooks</code> list of a [[#modules|module]]. The current valid types, and their extra fields besides the base <code>type</code> and <code>addr</code> common to all hook types, are as follows:
+* <code>nop</code> - A shorthand for one or multiple sequential <code>patch</code> hooks with <code>60000000</code> (<code>nop</code>) as <code>data</code>
+** '''Optional field:''' <code>count</code> (Default: <code>1</code>) - A positive non-zero decimal integer number, specifying how many <code>nop</code>’s to apply starting from <code>addr</code>
+* <code>return</code> - A shortcut for a <code>patch</code> hook with <code>4E800020</code> (<code>blr</code>) as <code>data</code>
+* <code>branch</code> - Inserts the respective branch instruction <code>instr</code> at <code>addr</code> jumping to the address of the symbol <code>func</code>
+** '''Additional field:''' <code>instr</code> - The branch instruction to use: <code>b</code> or <code>bl</code>
+** '''Additional field:''' <code>func</code> - The symbol whose address the branch instruction should jump to
+* <code>funcptr</code> - Inserts the address of the symbol <code>func</code> at <code>addr</code>
+** '''Additional field:''' <code>func</code> - The symbol whose address to write at <code>addr</code>
 * <code>patch</code> - The most basic and versatile hook, simply writes <code>data</code> at <code>addr</code> (overwrites existing data)
 ** '''Additional field:''' <code>data</code> - A value to be encoded according to the <code>datatype</code> field and inserted at the <code>addr</code>(s)
@@ Line 341: / Line 404: @@
 **** The difference between <code>char[]</code> and <code>string</code> is a <code>char[]</code> doesn’t automatically null-terminate, uses '''ASCII''' encoding and doesn’t align. Meanwhile <code>string</code> does automatically null-terminate, can use configurable-encoding and aligns by 4.
 ***** Likewise, the difference between <code>wchar[]</code> and <code>wstring</code> is a <code>wchar[]</code> doesn’t automatically null-terminate and aligns by 2. Meanwhile <code>wstring</code> does automatically 2-byte null-terminate and aligns by 4.
-***** To write a null character on a <code>char[]</code> or <code>wchar[]</code>, write down ''YAML_NULL'' or use standard string escape sequences.
+***** To write a null character on a <code>char[]</code> or <code>wchar[]</code>, write down <code>YAML_NULL</code>''<ref name=":0" />'' or use standard string escape sequences.
 **** Multidimensional arrays such as <code>int[][]</code> are not supported.
-**** '''Optional conditional field:''' <code>encoding</code> (Default: <code>Shift-JIS</code><ref name=":8" />) - This field specifies the encoding to encode the input string value with.
+**** '''Optional conditional field:''' <code>encoding</code> (Default: <code>Shift-JIS</code><ref name=":8">Specifically referring to the [https://icu4c-demos.unicode.org/icu-bin/convexp?conv=ibm-943_P15A-2003&s=ALL ibm-943_P15A-2003] encoding. Read [[#shift-jis-standardization|Shift-JIS Standardization]] section for details.</ref>) - This field specifies the encoding to encode the input string value with.
 ***** This field may be included if <code>datatype</code> is either <code>string</code>, <code>wchar</code> or <code>wstring</code>.
 ****** If this field is present when <code>datatype</code> is not one of the above, an error must be thrown.
@@ Line 349: / Line 412: @@
 ****** <code>UTF-8</code>, <code>UTF8</code>, <code>utf-8</code>, <code>utf8</code> (ONLY valid for <code>string</code> datatype, throw error if not used with it)
 ****** <code>UCS-2</code>, <code>UCS2</code>, <code>ucs-2</code>, <code>ucs2</code> (NOT valid for <code>string</code> datatype, throw error if used with it)
-****** <code>Shift-JIS</code>, <code>ShiftJIS</code>, <code>shift-jis</code>, <code>shiftjis</code> (valid for all 3 encoding-compatible datatypes)
+****** <code>Shift-JIS</code>, <code>ShiftJIS</code>, <code>shift-jis</code>, <code>shiftjis</code> (valid for all 3 encoding-compatible datatypes)<ref name=":8" />
 ****** '''UCS-2 NOTES:'''
-******* <code>UCS-2</code> is an obsolete predecessor of <code>UTF-16</code>, as such it may be difficult to find encoders for <code>UCS-2</code> specifically. The difference between the two encodings is simply <code>UCS-2</code> does not support multi-character codepoints (surrogates) to support characters beyond <code>U+FFFF</code>, therefore an implementation using a <code>UTF-16</code> encoder simply pre-validating all input characters to block any characters above <code>U+FFFF</code> is a valid implementation of <code>UCS-2</code>. Other things to note about <code>UCS-2</code> is that the encoding must be in '''big endian''' and a Byte Order Mark (BOM) should NOT be included.
+******* ''<code>UCS-2</code> is an obsolete predecessor of <code>UTF-16</code>, as such it may be difficult to find encoders for <code>UCS-2</code> specifically. The difference between the two encodings is simply <code>UCS-2</code> does not support multi-character codepoints (surrogates) to support characters beyond <code>U+FFFF</code>, therefore an implementation using a <code>UTF-16</code> encoder simply pre-validating all input characters to block any characters above <code>U+FFFF</code> is a valid implementation of <code>UCS-2</code>. Other things to note about <code>UCS-2</code> is that the encoding must be in '''big endian''' and a Byte Order Mark (BOM) should NOT be included.''
-** <code>nop</code> - A shorthand for one or multiple sequential <code>patch</code> hooks with <code>60000000</code> (<code>nop</code>) as <code>data</code>
-*** '''Optional field:''' <code>count</code> (Default: <code>1</code>) - A positive non-zero decimal integer number, specifying how many <code>nop</code>’s to apply starting from <code>addr</code>
-** <code>return</code> - A shortcut for a <code>patch</code> hook with <code>4E800020</code> (<code>blr</code>) as <code>data</code>
-** <code>branch</code> - Inserts the respective branch instruction <code>instr</code> at <code>addr</code> jumping to the address of the symbol <code>func</code>
-*** '''Additional field:''' <code>instr</code> - The branch instruction to use: <code>b</code> or <code>bl</code>
-*** '''Additional field:''' <code>func</code> - The symbol whose address the branch instruction should jump to
-** <code>funcptr</code> - Inserts the address of the symbol <code>func</code> at <code>addr</code>
-*** '''Additional field:''' <code>func</code> - The symbol whose address to write at <code>addr</code>
-''All hook additional fields are required unless explicitly marked as optional.''
 ===Symbol Maps===
 The primary symbol map for a project is located at <code>{ProjectDir}/maps/main.map</code> and has a very basic syntax similar to any other symbol map file.
@@ Line 522: / Line 575: @@
 ** '''Minimal Expected Output:''' A structurally valid patched <code>*.rpx</code> or <code>*.elf</code>, the output name of which is implementation-defined other than the required <code>.rpx</code> or <code>.elf</code> extension.
 ===Shift-JIS Standardization===
-Put simply, [https://en.wikipedia.org/wiki/Shift_JIS Shift-JIS] has a complex and ancient history, which led to it being poorly documented and standardized, with many different variants all wanting to use the same or similar “Shift-JIS” name. The '''original Shift-JIS specification''', known as [https://www.sljfaq.org/afaq/encodings.html#encodings-JIS-X-0201 JIS X 0201] is NOT a fully compliant superset of ASCII, since it replaces the characters <code>\</code> (<code>U+005C</code>) and <code>~</code> (<code>U+007E</code>) with the characters <code>¥</code> (<code>U+00A5</code>) and <code>‾</code> (<code>U+203E</code>) respectively. This is inconsistency is problematic and for that reason when the western world adopted support for it, primarily Microsoft, those two characters were reverted to their original ASCII values, but the name “Shift-JIS” was kept, creating an ambiguous definition for the ASCII compliant Shift-JIS and the original Shift-JIS.
+Put simply, [https://en.wikipedia.org/wiki/Shift_JIS Shift-JIS] has a complex and ancient history, which led to it being poorly documented and standardized, with many different variants all wanting to use the same or similar “Shift-JIS” name. The '''original Shift-JIS specification''', known as [https://www.sljfaq.org/afaq/encodings.html#encodings-JIS-X-0201 JIS X 0201] is NOT a fully compliant superset of ASCII, since it replaces the characters <code>\</code> (<code>U+005C</code>) and <code>~</code> (<code>U+007E</code>) with the characters <code>¥</code> (<code>U+00A5</code>) and <code>‾</code> (<code>U+203E</code>) respectively. This is inconsistency is problematic and for that reason when the western world adopted support for it, primarily Microsoft, those two characters were reverted to their original ASCII values, but the name “Shift-JIS” was kept, creating an ambiguous definition for the ASCII compliant Shift-JIS<ref name=":8" /> and the original Shift-JIS.
-In the present day, the Unicode Consortium has standardized both encodings in the [https://icu.unicode.org/ ICU] separately: [https://icu4c-demos.unicode.org/icu-bin/convexp?conv=ibm-943_P130-1999&s=ALL ibm-943_P130-1999] (original Shift-JIS) and [https://icu4c-demos.unicode.org/icu-bin/convexp?conv=ibm-943_P15A-2003&s=ALL ibm-943_P15A-2003] (ASCII compliant Shift-JIS)
+In the present day, the Unicode Consortium has standardized both encodings in the [https://icu.unicode.org/ ICU] separately: [https://icu4c-demos.unicode.org/icu-bin/convexp?conv=ibm-943_P130-1999&s=ALL ibm-943_P130-1999] (original Shift-JIS) and [https://icu4c-demos.unicode.org/icu-bin/convexp?conv=ibm-943_P15A-2003&s=ALL ibm-943_P15A-2003] (ASCII compliant Shift-JIS<ref name=":8" />)
-For the purposes of this document, outside of the current section (“Shift-JIS Standardization”), '''all references to “Shift-JIS” refer to the ASCII compliant Shift-JIS (ibm-943_P15A-2003) variant'''. Implementations should check the Shift-JIS variant their encoders/decoders are using to ensure it is the correct one. The below tests can be used to check:
+For the purposes of this document, outside of the current section (“Shift-JIS Standardization”), '''all references to “Shift-JIS” refer to the ASCII compliant Shift-JIS'''<ref name=":8" /> '''(ibm-943_P15A-2003) variant'''. Implementations should check the Shift-JIS variant their encoders/decoders are using to ensure it is the correct one. The below tests can be used to check:
 * <code>\~</code> encoded to Shift-JIS<ref name=":8" /> '''must equal''' to the byte sequence <code>5C 7E</code>