Other:WUAPPS: Difference between revisions
Jhmaster2000 (talk | contribs) m (fhursksdemqwio938t2) |
Jhmaster2000 (talk | contribs) (proper yaml type defs) |
||
Line 5: | Line 5: | ||
</blockquote>v3.0-DRAFT |
</blockquote>v3.0-DRAFT |
||
===Terms and Definitions=== |
===Terms and Definitions=== |
||
For the purposes of this document, the following terms and definitions apply. |
For the purposes of this document, the following terms and definitions apply. (Plural and variant forms of terms implicitly included) |
||
{| class="wikitable" |
|||
*A "tool" or "tools" refers to any programs implementing this standard in compliance with it, and/or the developers working on said tool. |
|||
!Term |
|||
*"implementation-defined" means behavior that is up to tools to define however they please, as long as it meets certain minimum expectations criteria that may be specified by the standard. |
|||
!Definition |
|||
*Currently this standard only acknowledges and accounts for '''GHS MULTI''' as compiler toolchain for use. Any mention of words such as “compiler”, “linker”, “assembler”, etc. all refer to the respective tools of GHS MULTI implicitly. Many assumptions and choices of default and possible values for configurations are based on the behavior of GHS MULTI and may not work correctly with other compiler toolchains. Support for other compiler toolchains is open for the future under a new major revision, but there are none planned at this time. |
|||
|- |
|||
*In all instances within this document where slashes (<code>/</code>) are used or referenced in the context of a file or folder path, it is implied that it is interchangeable with backslashes (<code>\</code>) for compatibility. ''Implementation note:'' When parsing file or folder paths tools must treat both <code>/</code> and <code>\</code> as path separators regardless of the running platform and interpret the path as valid, including paths making mixed use of both separators. |
|||
|'''tool''' |
|||
*In all instances within this document where file or folder names are used or referenced, including within file and folder paths, these names MUST only contain the following characters <code>a-z</code>, <code>A-Z</code>, <code>0-9</code>, <code>-_.,+()</code>, and MUST NOT start with a <code>-</code> or end with a <code>.</code>. File and folder paths inherit these rules with only the added allowed characters <code>/\:</code> as path separators. This ensures compatibility with all operating systems. |
|||
|Refers to any programs implementing this standard in compliance with it, and/or the developers working on said program. |
|||
*In all instances within this document where file or folder paths are used or referenced, these paths MUST be CASE-SENSITIVE to ensure compatibility with all filesystem formats. ''Implementation note:'' This can be achieved with the following methods: |
|||
|- |
|||
**On a case-sensitive filesystem: No action required, process the input path as-is. |
|||
|'''implementation-defined''' |
|||
**On a case-insensitive filesystem: If input path doesn’t exist, end. Else, fetch the stored case of the file path and compare against the input path, if the comparison is not equal, error. Else, the comparison is equal, therefore the path is valid. |
|||
|Refers to behavior that is up to tools to define however they please, as long as it meets certain minimum expectations criteria that may be specified by the standard, if any. |
|||
*<code>YAML_NULL</code><ref name=":0">Refers to a null literal inside a YAML file using either literal <code>null</code> word syntax or the <code>~</code> shorthand syntax.</ref> |
|||
|- |
|||
*For any value of type STRING, an empty string is to be considered equivalent to having set <code>YAML_NULL</code> (which may in turn be considered invalid for values which do not allow being <code>YAML_NULL</code>), unless a special behavior for empty strings on that specific value is explicitly specified. |
|||
|'''compiler''' |
|||
*Any text in the form <code>{PLACEHOLDER}</code> is an instance of a dynamic text segment named <code>PLACEHOLDER</code> which can have user-defined value, with or without restrictions on a case by case basis specified near the definition of the dynamic text segment. |
|||
| rowspan="3" |Currently this standard only acknowledges and accounts for '''GHS MULTI''' ''<sub>(Green Hills Software MULTI)</sub>'' as compiler toolchain for use. Any mention of words such as “compiler”, “linker”, “assembler”, etc. all refer to the respective tools of '''GHS MULTI''' implicitly. Many assumptions and choices of default and possible values for configurations are based on the behavior of '''GHS MULTI''' and may not work correctly with other compiler toolchains. Support for other compiler toolchains is open for the future under a new major revision, but there are none planned at this time. |
|||
*For all optional list-based YAML fields, omitting the field or explicitly setting it to <code>YAML_NULL</code> falls back to the default value, if any, otherwise being <code>YAML_NULL</code> as implicit default fallback. |
|||
|- |
|||
**Setting it to an '''empty list''' (<code>[]</code>) asserts truly no values, without using the default value, if any. |
|||
|'''linker''' |
|||
*For all optional non-list YAML fields (including maps <code>{}</code>), only omitting the field falls back to the default value, if any, otherwise being <code>YAML_NULL</code> as implicit default fallback. The behavior of explicitly setting it to <code>YAML_NULL</code> is defined by each field. |
|||
|- |
|||
**If a non-list YAML field does not specify a behavior for <code>YAML_NULL</code> then it is not allowed to be set to <code>YAML_NULL</code>. |
|||
|'''assembler''' |
|||
*“Valid C++98 macro identifiers”<ref>Strings which may only contain <code>A-Z</code>, <code>a-z</code>, <code>0-9</code>, and <code>_</code>, as well as not starting with <code>0-9</code>.</ref> |
|||
|- |
|||
*References of <code>Shift-JIS</code> <ref name=":8">Specifically referring to the [https://icu4c-demos.unicode.org/icu-bin/convexp?conv=ibm-943_P15A-2003&s=ALL ibm-943_P15A-2003] encoding. Read [[#shift-jis-standardization|Shift-JIS Standardization]] section for details.</ref> |
|||
|'''/''' |
|||
| rowspan="2" |In all instances within this document where slashes (<code>/</code>) are used or referenced in the context of a file or folder path, it is implied that it is interchangeable with backslashes (<code>\</code>) for compatibility, and vice-versa. |
|||
''Implementation note:'' When parsing file or folder paths tools must treat both <code>/</code> and <code>\</code> as path separators regardless of the running platform and interpret the path as valid, including paths making mixed use of both separators. |
|||
|- |
|||
|'''\''' |
|||
|- |
|||
|'''file name''' |
|||
| rowspan="2" |In all instances within this document where file or folder '''names''' are used or referenced, including within file and folder paths, these names MUST meet the following requirements: |
|||
* Only contain the characters <code>a-z</code>, <code>A-Z</code>, <code>0-9</code>, <code>-_.,+()</code> |
|||
* NOT start with a <code>-</code> NOR end with a <code>.</code> |
|||
File and folder paths inherit these rules with only the added allowed characters <code>/\:</code> as path separators. This ensures compatibility with all operating systems. |
|||
|- |
|||
|'''folder name''' |
|||
|- |
|||
|'''file path''' |
|||
| rowspan="2" |In all instances within this document where file or folder '''paths''' are used or referenced, these paths MUST be CASE-SENSITIVE to ensure compatibility with all filesystem formats. |
|||
''Implementation note:'' This can be achieved with the following methods: |
|||
*'''On a case-sensitive filesystem:''' No action required, process the input path as-is. |
|||
*'''On a case-insensitive filesystem:''' If input path doesn’t exist, end. Else, fetch the stored case of the file path and compare against the input path, if the comparison is not equal, error. Else, the comparison is equal, therefore the path is valid. |
|||
|- |
|||
|'''folder path''' |
|||
|} |
|||
==== YAML Types Definitions ==== |
|||
{| class="wikitable mw-collapsible mw-collapsed" |
|||
!Type |
|||
!Definition |
|||
|- |
|||
|YAML_NULL |
|||
|<ref name=":0">Refers to a null literal inside a YAML file using either literal <code>null</code> word syntax or the <code>~</code> shorthand syntax.</ref> |
|||
|- |
|||
|Boolean |
|||
|A standard generic YAML boolean, represented by either <code>true</code> or <code>false</code> word literals. |
|||
|- |
|||
|String |
|||
|A standard generic YAML string, prior to further restrictions possibly applied to it by the field it's bound to, it may contain any character and have any length. |
|||
An empty String is considered equivalent to YAML_NULL<ref name=":0" /> unless otherwise specified in a specific instance of String usage. |
|||
|- |
|||
|Number |
|||
|A standard generic YAML number, equivalent to a floating point value. |
|||
|- |
|||
|Integer |
|||
|<ref name=":9" /> |
|||
|- |
|||
|List<T> |
|||
|<ref name=":2" />: A standard generic YAML List, an array of values of any length. |
|||
For all optional fields with this type, omitting the field or explicitly setting it to YAML_NULL<ref name=":0" /> is equivalent to falling back to the default value, if any, otherwise being YAML_NULL<ref name=":0" /> as fallback. |
|||
An Empty List definition asserts truly no values, without using default values or fallbacks. |
|||
|- |
|||
|Record<T> |
|||
|<ref name=":1" />: A standard generic YAML Record, a map of indefinite key/value pairs. |
|||
|} |
|||
===Versioning=== |
===Versioning=== |
||
This standard’s version follows a restricted subset of [https://semver.org/ SemVer], where only '''Major''' and '''Minor''' version are present, and no other extensions allowed, for a resulting version format of <code>{StandardMajor}.{StandardMinor}</code>. The standard does not have '''Patch''' versions or any other version extensions such as tags or metadata. (<small>''Draft revisions'' of the standard not effectively considered as "the standard" while in draft stage and therefore are exempt from these rules)</small> |
This standard’s version follows a restricted subset of [https://semver.org/ SemVer], where only '''Major''' and '''Minor''' version are present, and no other extensions allowed, for a resulting version format of <code>{StandardMajor}.{StandardMinor}</code>. The standard does not have '''Patch''' versions or any other version extensions such as tags or metadata. (<small>''Draft revisions'' of the standard not effectively considered as "the standard" while in draft stage and therefore are exempt from these rules)</small> |
||
Line 140: | Line 195: | ||
|- |
|- |
||
|MinAlign |
|MinAlign |
||
|Record<Integer<ref>Note that the Integer type is arbitrary and does not actually exist in YAML, as YAML treats all numbers as floats, it is required that tools instead '''validate''' (NOT coerce!) the received YAML float number as an integer, and error otherwise.</ref>><ref name=":1" /> |
|Record<Integer<ref name=":9">Note that the Integer type is arbitrary and does not actually exist in YAML, as YAML treats all numbers as floats, it is required that tools instead '''validate''' (NOT coerce!) the received YAML float number as an integer, and error otherwise.</ref>><ref name=":1" /> |
||
|Key/Value pairs of indefinite custom user-defined minimum alignments to use per ELF section by section name. |
|Key/Value pairs of indefinite custom user-defined minimum alignments to use per ELF section by section name. |
||
The definition of the record itself, empty or partial, does not void the default section's values unless they are explicitly redefined each. |
The definition of the record itself, empty or partial, does not void the default section's values unless they are explicitly redefined each. |
||
Line 168: | Line 223: | ||
><ref name=":1" /> |
><ref name=":1" /> |
||
|Key/Value pairs of indefinite target configurations. '''KEYS''' are target names. (At least 1 non-abstract target is required to be defined) |
|Key/Value pairs of indefinite target configurations. '''KEYS''' are target names. (At least 1 non-abstract target is required to be defined) |
||
To create an “empty” target using all default settings, give it (the field itself) <code>YAML_NULL</code> as the value. |
To create an “empty” target using all default settings, give it (the field itself) <code>YAML_NULL</code><ref name=":0" /> as the value. |
||
|'''<small>''REQUIRED OPTION''</small>''' |
|'''<small>''REQUIRED OPTION''</small>''' |
||
|- |
|- |
||
Line 322: | Line 377: | ||
This section documents the different types of structures you can write on the <code>Hooks</code> list of a [[#modules|module]]. The current valid types, and their extra fields besides the base <code>type</code> and <code>addr</code> common to all hook types, are as follows: |
This section documents the different types of structures you can write on the <code>Hooks</code> list of a [[#modules|module]]. The current valid types, and their extra fields besides the base <code>type</code> and <code>addr</code> common to all hook types, are as follows: |
||
* <code>nop</code> - A shorthand for one or multiple sequential <code>patch</code> hooks with <code>60000000</code> (<code>nop</code>) as <code>data</code> |
|||
** '''Optional field:''' <code>count</code> (Default: <code>1</code>) - A positive non-zero decimal integer number, specifying how many <code>nop</code>’s to apply starting from <code>addr</code> |
|||
* <code>return</code> - A shortcut for a <code>patch</code> hook with <code>4E800020</code> (<code>blr</code>) as <code>data</code> |
|||
* <code>branch</code> - Inserts the respective branch instruction <code>instr</code> at <code>addr</code> jumping to the address of the symbol <code>func</code> |
|||
** '''Additional field:''' <code>instr</code> - The branch instruction to use: <code>b</code> or <code>bl</code> |
|||
** '''Additional field:''' <code>func</code> - The symbol whose address the branch instruction should jump to |
|||
* <code>funcptr</code> - Inserts the address of the symbol <code>func</code> at <code>addr</code> |
|||
** '''Additional field:''' <code>func</code> - The symbol whose address to write at <code>addr</code> |
|||
* <code>patch</code> - The most basic and versatile hook, simply writes <code>data</code> at <code>addr</code> (overwrites existing data) |
* <code>patch</code> - The most basic and versatile hook, simply writes <code>data</code> at <code>addr</code> (overwrites existing data) |
||
** '''Additional field:''' <code>data</code> - A value to be encoded according to the <code>datatype</code> field and inserted at the <code>addr</code>(s) |
** '''Additional field:''' <code>data</code> - A value to be encoded according to the <code>datatype</code> field and inserted at the <code>addr</code>(s) |
||
Line 341: | Line 404: | ||
**** The difference between <code>char[]</code> and <code>string</code> is a <code>char[]</code> doesn’t automatically null-terminate, uses '''ASCII''' encoding and doesn’t align. Meanwhile <code>string</code> does automatically null-terminate, can use configurable-encoding and aligns by 4. |
**** The difference between <code>char[]</code> and <code>string</code> is a <code>char[]</code> doesn’t automatically null-terminate, uses '''ASCII''' encoding and doesn’t align. Meanwhile <code>string</code> does automatically null-terminate, can use configurable-encoding and aligns by 4. |
||
***** Likewise, the difference between <code>wchar[]</code> and <code>wstring</code> is a <code>wchar[]</code> doesn’t automatically null-terminate and aligns by 2. Meanwhile <code>wstring</code> does automatically 2-byte null-terminate and aligns by 4. |
***** Likewise, the difference between <code>wchar[]</code> and <code>wstring</code> is a <code>wchar[]</code> doesn’t automatically null-terminate and aligns by 2. Meanwhile <code>wstring</code> does automatically 2-byte null-terminate and aligns by 4. |
||
***** To write a null character on a <code>char[]</code> or <code>wchar[]</code>, write down '' |
***** To write a null character on a <code>char[]</code> or <code>wchar[]</code>, write down <code>YAML_NULL</code>''<ref name=":0" />'' or use standard string escape sequences. |
||
**** Multidimensional arrays such as <code>int[][]</code> are not supported. |
**** Multidimensional arrays such as <code>int[][]</code> are not supported. |
||
**** '''Optional conditional field:''' <code>encoding</code> (Default: <code>Shift-JIS</code><ref name=":8" />) - This field specifies the encoding to encode the input string value with. |
**** '''Optional conditional field:''' <code>encoding</code> (Default: <code>Shift-JIS</code><ref name=":8">Specifically referring to the [https://icu4c-demos.unicode.org/icu-bin/convexp?conv=ibm-943_P15A-2003&s=ALL ibm-943_P15A-2003] encoding. Read [[#shift-jis-standardization|Shift-JIS Standardization]] section for details.</ref>) - This field specifies the encoding to encode the input string value with. |
||
***** This field may be included if <code>datatype</code> is either <code>string</code>, <code>wchar</code> or <code>wstring</code>. |
***** This field may be included if <code>datatype</code> is either <code>string</code>, <code>wchar</code> or <code>wstring</code>. |
||
****** If this field is present when <code>datatype</code> is not one of the above, an error must be thrown. |
****** If this field is present when <code>datatype</code> is not one of the above, an error must be thrown. |
||
Line 349: | Line 412: | ||
****** <code>UTF-8</code>, <code>UTF8</code>, <code>utf-8</code>, <code>utf8</code> (ONLY valid for <code>string</code> datatype, throw error if not used with it) |
****** <code>UTF-8</code>, <code>UTF8</code>, <code>utf-8</code>, <code>utf8</code> (ONLY valid for <code>string</code> datatype, throw error if not used with it) |
||
****** <code>UCS-2</code>, <code>UCS2</code>, <code>ucs-2</code>, <code>ucs2</code> (NOT valid for <code>string</code> datatype, throw error if used with it) |
****** <code>UCS-2</code>, <code>UCS2</code>, <code>ucs-2</code>, <code>ucs2</code> (NOT valid for <code>string</code> datatype, throw error if used with it) |
||
****** <code>Shift-JIS</code>, <code>ShiftJIS</code>, <code>shift-jis</code>, <code>shiftjis</code> (valid for all 3 encoding-compatible datatypes) |
****** <code>Shift-JIS</code>, <code>ShiftJIS</code>, <code>shift-jis</code>, <code>shiftjis</code> (valid for all 3 encoding-compatible datatypes)<ref name=":8" /> |
||
****** '''UCS-2 NOTES:''' |
****** '''UCS-2 NOTES:''' |
||
******* <code>UCS-2</code> is an obsolete predecessor of <code>UTF-16</code>, as such it may be difficult to find encoders for <code>UCS-2</code> specifically. The difference between the two encodings is simply <code>UCS-2</code> does not support multi-character codepoints (surrogates) to support characters beyond <code>U+FFFF</code>, therefore an implementation using a <code>UTF-16</code> encoder simply pre-validating all input characters to block any characters above <code>U+FFFF</code> is a valid implementation of <code>UCS-2</code>. Other things to note about <code>UCS-2</code> is that the encoding must be in '''big endian''' and a Byte Order Mark (BOM) should NOT be included. |
******* ''<code>UCS-2</code> is an obsolete predecessor of <code>UTF-16</code>, as such it may be difficult to find encoders for <code>UCS-2</code> specifically. The difference between the two encodings is simply <code>UCS-2</code> does not support multi-character codepoints (surrogates) to support characters beyond <code>U+FFFF</code>, therefore an implementation using a <code>UTF-16</code> encoder simply pre-validating all input characters to block any characters above <code>U+FFFF</code> is a valid implementation of <code>UCS-2</code>. Other things to note about <code>UCS-2</code> is that the encoding must be in '''big endian''' and a Byte Order Mark (BOM) should NOT be included.'' |
||
** <code>nop</code> - A shorthand for one or multiple sequential <code>patch</code> hooks with <code>60000000</code> (<code>nop</code>) as <code>data</code> |
|||
*** '''Optional field:''' <code>count</code> (Default: <code>1</code>) - A positive non-zero decimal integer number, specifying how many <code>nop</code>’s to apply starting from <code>addr</code> |
|||
** <code>return</code> - A shortcut for a <code>patch</code> hook with <code>4E800020</code> (<code>blr</code>) as <code>data</code> |
|||
** <code>branch</code> - Inserts the respective branch instruction <code>instr</code> at <code>addr</code> jumping to the address of the symbol <code>func</code> |
|||
*** '''Additional field:''' <code>instr</code> - The branch instruction to use: <code>b</code> or <code>bl</code> |
|||
*** '''Additional field:''' <code>func</code> - The symbol whose address the branch instruction should jump to |
|||
** <code>funcptr</code> - Inserts the address of the symbol <code>func</code> at <code>addr</code> |
|||
*** '''Additional field:''' <code>func</code> - The symbol whose address to write at <code>addr</code> |
|||
''All hook additional fields are required unless explicitly marked as optional.'' |
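
As a rough illustration of the hook types listed above, the sketch below shows how a tool might turn <code>nop</code>, <code>return</code> and <code>branch</code> hooks into 32-bit words. The function names are hypothetical, and the assumption that <code>branch</code> uses a relative PowerPC I-form branch (opcode 18, AA=0) is not stated by this standard; only the <code>60000000</code> and <code>4E800020</code> constants come from the text.

```python
from typing import List, Tuple

NOP = 0x60000000  # data used by the "nop" hook, as listed above
BLR = 0x4E800020  # data used by the "return" hook (blr), as listed above


def expand_nop(addr: int, count: int = 1) -> List[Tuple[int, int]]:
    """Expand a nop hook into sequential (address, word) patches."""
    if count < 1:
        raise ValueError("count must be a positive non-zero integer")
    return [(addr + 4 * i, NOP) for i in range(count)]


def encode_branch(instr: str, addr: int, target: int) -> int:
    """Encode a relative PowerPC b/bl placed at addr that jumps to target.

    Assumption: relative addressing (AA=0); the standard does not say
    which addressing mode tools should use.
    """
    if instr not in ("b", "bl"):
        raise ValueError("instr must be 'b' or 'bl'")
    offset = target - addr
    if offset % 4 != 0:
        raise ValueError("branch offset must be 4-byte aligned")
    if not -0x2000000 <= offset < 0x2000000:
        raise ValueError("target is out of range for a relative branch")
    link = 1 if instr == "bl" else 0  # LK bit: set for bl
    return 0x48000000 | (offset & 0x03FFFFFC) | link
```

For example, <code>encode_branch("b", 0x1000, 0x1008)</code> yields <code>0x48000008</code>, and a <code>funcptr</code> hook would simply write the resolved symbol address itself instead of an encoded instruction.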
|||
===Symbol Maps=== |
===Symbol Maps=== |
||
The primary symbol map for a project is located at <code>{ProjectDir}/maps/main.map</code> and has a very basic syntax similar to any other symbol map file. |
The primary symbol map for a project is located at <code>{ProjectDir}/maps/main.map</code> and has a very basic syntax similar to any other symbol map file. |
||
Line 522: | Line 575: | ||
** '''Minimal Expected Output:''' A structurally valid patched <code>*.rpx</code> or <code>*.elf</code>, the output name of which is implementation-defined other than the required <code>.rpx</code> or <code>.elf</code> extension. |
** '''Minimal Expected Output:''' A structurally valid patched <code>*.rpx</code> or <code>*.elf</code>, the output name of which is implementation-defined other than the required <code>.rpx</code> or <code>.elf</code> extension. |
||
===Shift-JIS Standardization=== |
===Shift-JIS Standardization=== |
||
Put simply, [https://en.wikipedia.org/wiki/Shift_JIS Shift-JIS] has a complex and ancient history, which led to it being poorly documented and standardized, with many different variants all wanting to use the same or similar “Shift-JIS” name. The '''original Shift-JIS specification''', known as [https://www.sljfaq.org/afaq/encodings.html#encodings-JIS-X-0201 JIS X 0201] is NOT a fully compliant superset of ASCII, since it replaces the characters <code>\</code> (<code>U+005C</code>) and <code>~</code> (<code>U+007E</code>) with the characters <code>¥</code> (<code>U+00A5</code>) and <code>‾</code> (<code>U+203E</code>) respectively. This inconsistency is problematic, and for that reason when the western world adopted support for it, primarily Microsoft, those two characters were reverted to their original ASCII values, but the name “Shift-JIS” was kept, creating an ambiguous definition for the ASCII compliant Shift-JIS and the original Shift-JIS. |
Put simply, [https://en.wikipedia.org/wiki/Shift_JIS Shift-JIS] has a complex and ancient history, which led to it being poorly documented and standardized, with many different variants all wanting to use the same or similar “Shift-JIS” name. The '''original Shift-JIS specification''', known as [https://www.sljfaq.org/afaq/encodings.html#encodings-JIS-X-0201 JIS X 0201] is NOT a fully compliant superset of ASCII, since it replaces the characters <code>\</code> (<code>U+005C</code>) and <code>~</code> (<code>U+007E</code>) with the characters <code>¥</code> (<code>U+00A5</code>) and <code>‾</code> (<code>U+203E</code>) respectively. This inconsistency is problematic, and for that reason when the western world adopted support for it, primarily Microsoft, those two characters were reverted to their original ASCII values, but the name “Shift-JIS” was kept, creating an ambiguous definition for the ASCII compliant Shift-JIS<ref name=":8" /> and the original Shift-JIS. |
||
In the present day, the Unicode Consortium has standardized both encodings in the [https://icu.unicode.org/ ICU] separately: [https://icu4c-demos.unicode.org/icu-bin/convexp?conv=ibm-943_P130-1999&s=ALL ibm-943_P130-1999] (original Shift-JIS) and [https://icu4c-demos.unicode.org/icu-bin/convexp?conv=ibm-943_P15A-2003&s=ALL ibm-943_P15A-2003] (ASCII compliant Shift-JIS) |
In the present day, the Unicode Consortium has standardized both encodings in the [https://icu.unicode.org/ ICU] separately: [https://icu4c-demos.unicode.org/icu-bin/convexp?conv=ibm-943_P130-1999&s=ALL ibm-943_P130-1999] (original Shift-JIS) and [https://icu4c-demos.unicode.org/icu-bin/convexp?conv=ibm-943_P15A-2003&s=ALL ibm-943_P15A-2003] (ASCII compliant Shift-JIS<ref name=":8" />) |
||
For the purposes of this document, outside of the current section (“Shift-JIS Standardization”), '''all references to “Shift-JIS” refer to the ASCII compliant Shift-JIS (ibm-943_P15A-2003) variant'''. Implementations should check the Shift-JIS variant their encoders/decoders are using to ensure it is the correct one. The below tests can be used to check: |
For the purposes of this document, outside of the current section (“Shift-JIS Standardization”), '''all references to “Shift-JIS” refer to the ASCII compliant Shift-JIS'''<ref name=":8" /> '''(ibm-943_P15A-2003) variant'''. Implementations should check the Shift-JIS variant their encoders/decoders are using to ensure it is the correct one. The below tests can be used to check: |
||
* <code>\~</code> encoded to Shift-JIS<ref name=":8" /> '''must equal''' to the byte sequence <code>5C 7E</code> |
* <code>\~</code> encoded to Shift-JIS<ref name=":8" /> '''must equal''' to the byte sequence <code>5C 7E</code> |
Revision as of 00:29, 21 July 2023
!!!!! Still unfinished as of currently !!!!!
v3.0-DRAFT
Terms and Definitions
For the purposes of this document, the following terms and definitions apply. (Plural and variant forms of terms implicitly included)
Term | Definition |
---|---|
tool | Refers to any programs implementing this standard in compliance with it, and/or the developers working on said program. |
implementation-defined | Refers to behavior that tools may define however they please, as long as it meets the minimum criteria specified by the standard, if any. |
compiler | Currently this standard only acknowledges and accounts for GHS MULTI (Green Hills Software MULTI) as the compiler toolchain. Any mention of words such as “compiler”, “linker”, or “assembler” implicitly refers to the respective tools of GHS MULTI. Many assumptions, defaults, and possible configuration values are based on the behavior of GHS MULTI and may not work correctly with other compiler toolchains. Support for other compiler toolchains may be added in a future major revision, but none is planned at this time. |
linker | |
assembler | |
/ | In all instances within this document where slashes (/) are used or referenced in the context of a file or folder path, they are interchangeable with backslashes (\) for compatibility, and vice versa.
Implementation note: When parsing file or folder paths, tools must treat both / and \ as path separators regardless of the running platform and interpret the path as valid, including paths that mix both separators. |
\ | |
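file name | In all instances within this document where file or folder names are used or referenced, including within file and folder paths, these names MUST meet the following requirements:
- Only contain the characters a-z, A-Z, 0-9, and -_.,+()
- NOT start with a - NOR end with a .
File and folder paths inherit these rules, with the characters /\: additionally allowed as path separators. This ensures compatibility with all operating systems. |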
folder name | |
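file path | In all instances within this document where file or folder paths are used or referenced, these paths MUST be CASE-SENSITIVE to ensure compatibility with all filesystem formats.
Implementation note: This can be achieved with the following methods:
- On a case-sensitive filesystem: No action required, process the input path as-is.
- On a case-insensitive filesystem: If the input path does not exist, stop. Otherwise, fetch the stored case of the file path and compare it against the input path; if they differ, raise an error, otherwise the path is valid. |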
folder path |
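
The name and path rules in the table above can be sketched roughly as follows. This is only an illustration under stated assumptions, not a reference implementation: all function names are hypothetical, empty components (as in `C:\` or `a//b`) are ignored, and the `.` and `..` components are accepted as relative navigation because the default `./rpxs` path elsewhere in this document uses one.

```python
import os
import re
from typing import List

# Allowed characters for file and folder names: a-z, A-Z, 0-9, and -_.,+()
_NAME_RE = re.compile(r"[A-Za-z0-9\-_.,+()]+")


def is_valid_name(name: str) -> bool:
    """Check a single file or folder name against the naming rules."""
    if _NAME_RE.fullmatch(name) is None:
        return False
    return not name.startswith("-") and not name.endswith(".")


def split_path(path: str) -> List[str]:
    """Split on /, \\ and : regardless of platform; mixed separators are fine.

    Assumption: empty components are dropped.
    """
    return [part for part in re.split(r"[/\\:]", path) if part]


def is_valid_path(path: str) -> bool:
    """Validate every component of a path (assumption: '.' and '..' pass)."""
    parts = split_path(path)
    return bool(parts) and all(
        part in (".", "..") or is_valid_name(part) for part in parts
    )


def exists_with_exact_case(directory: str, name: str) -> bool:
    """Case-sensitive existence check that also works on case-insensitive
    filesystems, by comparing against the names as stored on disk."""
    return name in os.listdir(directory)
```

With these helpers, `split_path("src\\game/main.cpp")` gives `["src", "game", "main.cpp"]`, and a tool could call `exists_with_exact_case` once per component to enforce case sensitivity along a whole path.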
YAML Types Definitions
Type | Definition |
---|---|
YAML_NULL | [1] |
Boolean | A standard generic YAML boolean, represented by either true or false word literals. |
String | A standard generic YAML string. Before any further restrictions applied by the field it is bound to, it may contain any character and have any length.
An empty String is considered equivalent to YAML_NULL[1] unless otherwise specified for a specific use of String. |
Number | A standard generic YAML number, equivalent to a floating point value. |
Integer | [2] |
List<T> | [3]: A standard generic YAML List, an array of values of any length.
For all optional fields with this type, omitting the field or explicitly setting it to YAML_NULL[1] is equivalent to falling back to the default value, if any, otherwise being YAML_NULL[1] as fallback. An Empty List definition asserts truly no values, without using default values or fallbacks. |
Record<T> | [4]: A standard generic YAML Record, a map of indefinite key/value pairs. |
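
A minimal sketch of three of the type rules above, assuming the YAML file has already been parsed into plain Python values by some YAML library. The helper names and the `_MISSING` sentinel are hypothetical and only serve the illustration.

```python
from typing import Any, Optional

_MISSING = object()  # sentinel meaning "the field was omitted entirely"


def validate_integer(value: Any) -> int:
    """Accept only numbers that are already whole; never round or truncate."""
    if isinstance(value, bool):
        raise ValueError("booleans are not integers")
    if isinstance(value, int):
        return value
    if isinstance(value, float) and value.is_integer():
        return int(value)  # e.g. 4.0 is a valid Integer, 4.5 is not
    raise ValueError(f"not an integer: {value!r}")


def normalize_string(value: Optional[str]) -> Optional[str]:
    """Treat an empty String as YAML_NULL (represented here as None)."""
    return None if value == "" else value


def resolve_list(value: Any = _MISSING, default: Optional[list] = None) -> Optional[list]:
    """Resolve an optional List field.

    Omitted or YAML_NULL falls back to the default (or None if there is
    no default); an explicit empty list means "truly no values".
    """
    if value is _MISSING or value is None:
        return default
    if not isinstance(value, list):
        raise TypeError("expected a list")
    return value
```

For instance, `resolve_list([], default=["./include"])` returns `[]`, while `resolve_list(None, default=["./include"])` returns the default.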
Versioning
This standard’s version follows a restricted subset of SemVer, where only the Major and Minor versions are present and no other extensions are allowed, for a resulting version format of {StandardMajor}.{StandardMinor}. The standard does not have Patch versions or any other version extensions such as tags or metadata. (Draft revisions of the standard are not considered "the standard" while in draft stage and are therefore exempt from these rules.)
Tools in compliance with this standard MUST adapt part of their versioning scheme to match the version of this standard they currently support, using SemVer versioning, with the following additional conditions met:
- The tool’s Major version MUST match the StandardMajor version it supports.
- The tool’s Minor version MUST match the StandardMinor version it supports.
- The tool’s Patch version MUST represent the tool’s own Major version.
- The tool’s version Tag, if present, MUST represent the tool’s own Minor version as a valid integer.
  - If omitted, the tool’s Minor version is 0.
The final required version format for compliant tools is {StandardMajor}.{StandardMinor}.{ToolMajor}-{ToolMinor}, with the trailing -{ToolMinor} being optional if it is 0. Tool Patch versions are not allowed.
An optional +{ToolMetadata} tag at the very end (after -{ToolMinor}, if present) may also be included for use by the tool and may contain anything SemVer allows in that field.
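
The two version formats described in this chapter could be parsed along these lines. This is a sketch only: the function names are hypothetical, and the metadata pattern follows SemVer's build metadata rule (dot-separated identifiers made of ASCII alphanumerics and hyphens).

```python
import re
from typing import Optional, Tuple

_NUM = r"(?:0|[1-9][0-9]*)"  # SemVer numeric identifier, no leading zeros

_STANDARD_RE = re.compile(rf"({_NUM})\.({_NUM})")
_TOOL_RE = re.compile(
    rf"({_NUM})\.({_NUM})\.({_NUM})"              # StandardMajor.StandardMinor.ToolMajor
    rf"(?:-({_NUM}))?"                            # optional -ToolMinor
    r"(?:\+([0-9A-Za-z-]+(?:\.[0-9A-Za-z-]+)*))?"  # optional +ToolMetadata
)


def parse_standard_version(text: str) -> Tuple[int, int]:
    """Parse {StandardMajor}.{StandardMinor}; anything else is an error."""
    m = _STANDARD_RE.fullmatch(text)
    if m is None:
        raise ValueError(f"invalid standard version: {text!r}")
    return int(m.group(1)), int(m.group(2))


def parse_tool_version(text: str) -> Tuple[int, int, int, int, Optional[str]]:
    """Parse a tool version into
    (StandardMajor, StandardMinor, ToolMajor, ToolMinor, ToolMetadata)."""
    m = _TOOL_RE.fullmatch(text)
    if m is None:
        raise ValueError(f"invalid tool version: {text!r}")
    tool_minor = int(m.group(4)) if m.group(4) is not None else 0  # omitted means 0
    return int(m.group(1)), int(m.group(2)), int(m.group(3)), tool_minor, m.group(5)
```

A tool could then compare the first two fields of its own parsed version against the standard version a project requests.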
Environment Variables
The following standard environment variables should be read and used by tools for their respective purposes when applicable. All environment variables are optional for the user to define; tools should not rely on them as the ONLY source of user input.
Any environment variable that is set should take lower precedence than values passed through more specific sources of user input defined by the tool, such as command-line arguments.
GHS_ROOT: If set, should contain the absolute path to the GHS MULTI folder containing the multi.exe file. Used to locate the GHS MULTI executables needed for a full project build.
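
The precedence rule above might look like this in practice. The function name and the idea of a command-line value are hypothetical, since the standard leaves the more specific input sources up to each tool.

```python
import os
from typing import Mapping, Optional


def resolve_ghs_root(
    cli_value: Optional[str] = None,
    environ: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Return the GHS MULTI folder to use, or None if nothing was provided.

    A value from a more specific source (here: a hypothetical command-line
    argument) wins over the GHS_ROOT environment variable.
    """
    if cli_value:
        return cli_value
    return environ.get("GHS_ROOT") or None
```

Passing the environment as a parameter keeps the lookup easy to test and makes the fallback order explicit.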
Project Configuration
Project configuration is done through the main configuration file, {ProjectDir}/project.yaml, containing the following fields:
YAML Key | YAML Value Type[5] | Description | Default Value? |
---|---|---|---|
WUAPPSVersion | String | The presence of this field indicates the project's (not the tool's!) compliance with the specified version of this standard.
Must follow the standard version format described in the Versioning chapter. Any input not matching the format must produce an error. Tools must compare this field against the WUAPPS versions they support; if the project requests a version the tool does not support, the tool must error. | REQUIRED OPTION |
Name | String | The name of the project. | REQUIRED OPTION |
Variables | Record<String>[4] | Key/Value pairs of indefinite custom user-defined configuration variables.
KEYS are variable names, which may only contain alphanumerics[6] or underscore. VALUES are variable contents, which by themselves may contain any character, but are still bound by the restrictions of the fields they are used within. Variables can be referenced like UNIX variables as YAML Keys do NOT support variable interpolation. Variables may NOT be nested within each other. The literal character If a variable which does not exist is used somewhere, an error must be thrown by the tool. |
Empty Record |
RpxDir | String | Path to the RPX files folder. (relative to {ProjectDir} [7])[8]
|
"./rpxs"
|
ModulesBaseDir | String | Base path for module folders paths to resolve from. (relative to {ProjectDir} [7])[8]
If |
YAML_NULL [1]
|
SourcesBaseDir | String | Base path for source folders paths to resolve from. (relative to {ProjectDir} [7])[8]
If |
YAML_NULL [1]
|
IncludeDirs | List<String>[3] | List of paths to header folders. (relative to {ProjectDir} [7])[8]
|
["./include"]
|
BuildOptions | List<String>[3] | List of build options to pass to the compiler. If both buildoptions.txt and this field are present, they are merged together.
Works the same as buildoptions.txt but inlined into the project.yaml. (Whenever buildoptions.txt is referenced it is interchangeable with this option) |
YAML_NULL [1]
|
ExcludeDefaultBuildOptions | List<String>[3]
Boolean |
List of default build options defined by the standard (See the project.gpj chapter) to opt-out of.
The special value of The listed values MUST specify the FULL default option name, including the exact prefixing dashes, but EXCLUDING the option's value, as examples:
It is an error to attempt to exclude a non-default option or a malformed option like the above invalid examples. For default options with values, such as -kanji, where the user desires to override the value used by the option, the process is as one would expect:
Attempting to override a default option in buildoptions.txt without excluding it first is implementation-defined behavior. The default behavior being undefined behavior dependent on how the compiler driver interprets the duplicate options. (given some compiler options are known to specifically support multiple uses, while others do not) |
Empty List |
Modules | List<String>[3] | List of extensionless file paths of global modules to compile. If a module listed is not found or invalid, an error should be thrown. | Empty List |
MinAlign | Record<Integer[2]>[4] | Key/Value pairs of indefinite custom user-defined minimum alignments to use per ELF section by section name.
The definition of the record itself, empty or partial, does not void the default section's values unless they are explicitly redefined each. Example: All sections not declared, whether by defaults or explicit, shall have a default alignment of Implementation Notes:
|
{ .text: 0x20,
.rodata: 0x20,
.data: 0x20,
.bss: 0x40 }
|
Targets | Record<
Target[9], YAML_NULL[1] >[4] |
Key/Value pairs of indefinite target configurations. KEYS are target names. (At least 1 non-abstract target is required to be defined)
To create an “empty” target using all default settings, give it (the field itself) |
REQUIRED OPTION |
Target YAML Type Structure | |||
Abstract | Boolean | Whether this target is an abstract template for other targets to inherit from. Prevents target from being used directly for compilation. | false
|
Extends | String
List<String>[3] |
As a String, name of a target to inherit settings from, with potentially infinitely nested inheritance from the inherited target extending another, recursively.
As a List, list of names of targets to multi-inherit settings from, in reverse insertion order, without nesting allowed. Example: If either B or C have an Extends field, an error must be thrown, as multi-inheriting doesn't support multiple bases for preventing circular or self extensions. Inheritance Semantics: For non-List fields, the highest definition on the inheritance sequence (nested or multi) shall take priority. For List fields, each definition is merged onto the previous, unless they have special behavior defined, which the only 4 current List fields do have. Refer to the Notes at the end of the Target YAML Type Structure table for details on inheritance semantics for the current List fields. The fields Abstract and Extends do not participate in inheritance as their values are only in respect to the target definition itself, not the final resolved target data. For non-List fields of targets with extensions, two special String values can be used in place of their values:
|
YAML_NULL [1]
|
AddrMap | String | Name of maps/*.convmap file to use with this target. | {TargetName} [10]
|
BaseRpx | String | Name of {RpxDir}/*.rpx file to use with this target. | {TargetName} [10]
|
Remove/Modules | List<String>[3] | List of global/template modules to remove from compilation with this target. Values follow same rules as the global Modules field. The following extra rules apply:
It is an error to attempt to remove a module only to re-add it on the same template/target. It is an error to attempt to remove a module which exists but is not on the current modules list. |
Empty List |
Add/Modules | List<String>[3] | List of additional modules to compile with this target. Values follow same rules as the global Modules field.
The following extra rule applies: It is an error to attempt to add a template/target module already on the current modules list. |
Empty List |
Remove/BuildOptions | List<String>[3] | List of global/template build options to remove from use with this target. Values follow same rules as the global ExcludeDefaultBuildOptions field. | Empty List |
Add/BuildOptions | List<String>[3] | List of additional build options to use with this target. Values follow same rules as the global BuildOptions field. | Empty List |
Notes: | Notes for the 4 options above (Remove/Modules, Add/Modules, Remove/BuildOptions and Add/BuildOptions):
These 4 options should be processed in the order of removals first, then additions. It doesn't matter the order the user arranges the fields on their YAML, only the order below.
|
- |
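The processing order and error rules for Remove/Modules and Add/Modules described in the table above can be sketched as below. The function name `apply_module_overrides` is hypothetical, and the sketch assumes the module names were already validated against the rules of the global Modules field (for example, that the module files exist).

```python
from typing import Iterable, List


def apply_module_overrides(
    current: Iterable[str], remove: Iterable[str], add: Iterable[str]
) -> List[str]:
    """Apply a target's Remove/Modules, then its Add/Modules, to the current list."""
    modules = list(current)
    remove, add = list(remove), list(add)
    # Removing a module only to re-add it on the same template/target is an error.
    readded = sorted(set(remove) & set(add))
    if readded:
        raise ValueError(f"modules both removed and added: {readded}")
    # Removals are always processed first, regardless of the field order in the YAML.
    for name in remove:
        if name not in modules:
            raise ValueError(f"cannot remove {name!r}: not on the current modules list")
        modules.remove(name)
    # Additions come second; adding a module already on the list is an error.
    for name in add:
        if name in modules:
            raise ValueError(f"cannot add {name!r}: already on the current modules list")
        modules.append(name)
    return modules
```

The same shape would apply to Remove/BuildOptions and Add/BuildOptions, with the option-name rules of ExcludeDefaultBuildOptions in place of module names.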
Project Folder Structure
All WUAPPS-based projects must have a {ProjectDir}[7] within which the project.yaml and other project metadata files are located. This folder has the following structure:
conv/
|
Your WUAPPS Conversion Maps files | |
maps/
|
Your symbol map and conversion maps files | |
main.map
|
Your primary symbol map file | |
buildoptions.txt
|
Optional file, stores extra build options (See the project.gpj chapter) | |
project.yaml
|
The main project configuration file |
Modules
⚠ Warning: The module system will be removed once the dynamic loader system is finished.
WUAPPS projects are currently structured in “modules”, which are enabled/disabled by the project.yaml and are themselves defined by YAML files. Each module file declares source files (C++ / Assembly) to compile and assemble, as well as binary patches & hooks to apply directly to the base RPX file.
A module file may not be named project.yaml (case-insensitive) to prevent conflicts. All other names that fit within the standard-wide file name rules (See Terms and Definitions) are valid.
Each module file is structured as follows:
YAML Key | YAML Value Type[5] | Description | Default Value? |
---|---|---|---|
Files | Record<List<String>[3]>[4] | Key/Value pairs of filetypes mapped to lists of paths to source files or folders for searching source files within.
KEYS must be one of the valid filetypes: All filetypes are independently optional and can be used or omitted in any combination. Folder paths ending in nothing, a trailing The |
Record<Empty List>[4] |
Hooks | List<Hook[9]> | List of patches & hooks to apply when this module is enabled. | Empty List |
Hook YAML Type Structure | |||
type | String | The type of the patch/hook, see the Patches & Hooks chapter for details. | REQUIRED OPTION |
addr | String
List<String> |
A stringified hex number with 0x prefix, indicating where to apply the patch/hook.
Can also be a list of multiple of the above, for applying the same patch/hook at multiple locations at once. |
REQUIRED OPTION |
???? | - | Other hook-type specific fields exist, see the Patches & Hooks chapter for details. | - |
<<< UNFINISHED BEYOND THIS POINT >>>
Patches & Hooks
This section documents the different types of structures you can write in the Hooks list of a module. The current valid types, and their extra fields besides the base type and addr fields common to all hook types, are as follows:
- nop - A shorthand for one or multiple sequential patch hooks with 60000000 (nop) as data
  - Optional field: count (Default: 1) - A positive non-zero decimal integer number, specifying how many nop’s to apply starting from addr
- return - A shortcut for a patch hook with 4E800020 (blr) as data
- branch - Inserts the respective branch instruction instr at addr jumping to the address of the symbol func
  - Additional field: instr - The branch instruction to use: b or bl
  - Additional field: func - The symbol whose address the branch instruction should jump to
- funcptr - Inserts the address of the symbol func at addr
  - Additional field: func - The symbol whose address to write at addr
- patch - The most basic and versatile hook, simply writes data at addr (overwrites existing data)
  - Additional field: data - A value to be encoded according to the datatype field and inserted at the addr(s)
  - Optional field: datatype (Default: raw) - A string representing a C++ data type to interpret the value of data as. The supported types are:
    - raw: A sequence of hex bytes, value of data should be a string.
    - f32 / f64 / float / double: A 32/64 bit floating point number, value of data should be a numeric literal within the specified type’s range.
    - u8 / u16 / u32 / u64 / uchar / ushort / uint / ulonglong: An 8/16/32/64 bit unsigned integer, value of data should be a numeric integer literal within the specified type’s range.
    - s8 / s16 / s32 / s64 / schar / short / int / longlong: An 8/16/32/64 bit signed integer, value of data should be a numeric integer literal within the specified type’s range.
    - char: An 8 bit ASCII encoded character, value of data should be a string literal with exactly one character located inside the ASCII character set range (0x00-0x7F). Invalid ASCII characters should trigger an error.
    - wchar: A 16 bit configurable-encoding encoded character, value of data should be a string literal with exactly one character fitting inside a 2-byte space or less in the chosen encoding (after conversion from UTF8).
      - Characters which don’t fit should trigger an error. Characters which encode to only 1-byte in the chosen encoding should be big-endian null-padded.
    - string: A null-terminated configurable-encoding C string. Single byte null terminator is automatically added. Value of data should be a string literal with only valid characters for the chosen encoding (after conversion from UTF8).
      - Invalid characters of the chosen encoding should trigger an error.
    - wstring: A null-terminated configurable-encoding wide string. 2-byte null terminator is automatically added. Value of data should be a string literal with only valid characters for the chosen encoding (after conversion from UTF8).
      - Invalid characters of the chosen encoding should trigger an error.
    - #[]: Where # is any of the types above, you may suffix a type with [] to make an array of it. Value of data will be an array of values of the respective type.
      - Array types enforce their addr to be aligned by the size of its elements. string and wstring are 4-byte aligned, therefore all strings in an array must be null-padded to have a byte length multiple of 4, including the last string in the array.
      - The difference between char[] and string is a char[] doesn’t automatically null-terminate, uses ASCII encoding and doesn’t align. Meanwhile string does automatically null-terminate, can use configurable-encoding and aligns by 4.
        - Likewise, the difference between wchar[] and wstring is a wchar[] doesn’t automatically null-terminate and aligns by 2. Meanwhile wstring does automatically 2-byte null-terminate and aligns by 4.
        - To write a null character on a char[] or wchar[], write down YAML_NULL[1] or use standard string escape sequences.
      - Multidimensional arrays such as int[][] are not supported.
  - Optional conditional field: encoding (Default: Shift-JIS[11]) - This field specifies the encoding to encode the input string value with.
    - This field may be included if datatype is either string, wchar or wstring.
      - If this field is present when datatype is not one of the above, an error must be thrown.
    - Valid encoding values and their aliases are:
      - UTF-8, UTF8, utf-8, utf8 (ONLY valid for string datatype, throw error if not used with it)
      - UCS-2, UCS2, ucs-2, ucs2 (NOT valid for string datatype, throw error if used with it)
      - Shift-JIS, ShiftJIS, shift-jis, shiftjis (valid for all 3 encoding-compatible datatypes)[11]
    - UCS-2 NOTES: UCS-2 is an obsolete predecessor of UTF-16, as such it may be difficult to find encoders for UCS-2 specifically. The difference between the two encodings is simply that UCS-2 does not support multi-character codepoints (surrogates) to support characters beyond U+FFFF, therefore an implementation using a UTF-16 encoder that simply pre-validates all input characters to block any characters above U+FFFF is a valid implementation of UCS-2. Other things to note about UCS-2 are that the encoding must be in big endian and a Byte Order Mark (BOM) should NOT be included.
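To make some of the hook rules above concrete, the following Python sketch expands a nop hook into patch writes, encodes scalar patch values, and encodes a UCS-2 wstring through a UTF-16 encoder as the UCS-2 notes allow. All function names are hypothetical. Two assumptions are not stated in this chapter: numeric values are written big-endian (matching the big-endian PowerPC target), and sequential nop’s are spaced 4 bytes apart (the size of one PowerPC instruction).

```python
import struct
from typing import List, Tuple

NOP = bytes.fromhex("60000000")
BLR = bytes.fromhex("4E800020")  # data used by the "return" hook

# Assumption: big-endian (">") encoding for all numeric datatypes.
_FORMATS = {
    "u8": ">B", "u16": ">H", "u32": ">I", "u64": ">Q",
    "s8": ">b", "s16": ">h", "s32": ">i", "s64": ">q",
    "f32": ">f", "f64": ">d",
}
_ALIASES = {
    "uchar": "u8", "ushort": "u16", "uint": "u32", "ulonglong": "u64",
    "schar": "s8", "short": "s16", "int": "s32", "longlong": "s64",
    "float": "f32", "double": "f64",
}


def expand_nop(addr: int, count: int = 1) -> List[Tuple[int, bytes]]:
    """Expand a nop hook into (address, data) patch writes."""
    if count < 1:
        raise ValueError("count must be a positive non-zero integer")
    return [(addr + 4 * i, NOP) for i in range(count)]


def encode_scalar(datatype: str, value) -> bytes:
    """Encode one non-array, non-string patch value."""
    datatype = _ALIASES.get(datatype, datatype)
    if datatype == "raw":
        if not isinstance(value, str):
            raise TypeError("raw data must be a string of hex bytes")
        return bytes.fromhex(value)
    fmt = _FORMATS.get(datatype)
    if fmt is None:
        raise ValueError(f"unsupported datatype in this sketch: {datatype!r}")
    if datatype[0] in "us" and (isinstance(value, bool) or not isinstance(value, int)):
        raise TypeError("integer datatypes need an integer literal")
    try:
        return struct.pack(fmt, value)
    except (struct.error, OverflowError) as exc:
        raise ValueError(f"{value!r} is out of range for {datatype}") from exc


def encode_ucs2_wstring(text: str) -> bytes:
    """Encode a wstring as big-endian UCS-2 without BOM, plus a 2-byte terminator."""
    for ch in text:
        if ord(ch) > 0xFFFF:  # pre-validation that turns UTF-16 into UCS-2
            raise ValueError(f"U+{ord(ch):04X} is not representable in UCS-2")
    return text.encode("utf-16-be") + b"\x00\x00"
```

The char, wchar, string, array and Shift-JIS cases are left out to keep the sketch short; they follow the same validate-then-encode pattern.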
Symbol Maps
The primary symbol map for a project is located at {ProjectDir}/maps/main.map and has a very basic syntax similar to any other symbol map file.
- Whitespace is free-form
- # is used for comments
  - Both full line and end-of-line comments are supported
  - There is no multi-line comment support
- Semicolon-separated list of key value pairs in the format: SYMBOL = VALUE;
  - Where SYMBOL is the symbol’s text
  - Where VALUE is the symbol’s address as a hex number with a required 0x prefix
    - Alternatively, if VALUE does not start with 0x, it shall be interpreted as a previously defined symbol named VALUE, which instructs the parser to re-use that symbol’s address for the current one.
      - If the referenced symbol is not found an error must be thrown.
- Anything not fitting the above syntax rules is a syntax error
The primary symbol map is not the one actually given to the compiler, as it must be converted to the format expected by the compiler (*.x) and addresses must be converted to different regions according to the build targets through the maps/*.convmap files.
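The symbol map syntax rules above are small enough to sketch as a parser. The names `parse_symbol_map` and `SymbolMapError` are hypothetical. The sketch assumes a symbol’s text is any run of characters other than whitespace, =, ; and #, and that a later definition of the same symbol replaces an earlier one; neither point is specified above.

```python
import re
from typing import Dict

_HEX_RE = re.compile(r"0x[0-9A-Fa-f]+")
# Assumption: symbol text is any run of characters except whitespace, '=', ';' and '#'.
_STATEMENT_RE = re.compile(r"([^\s=;#]+)\s*=\s*([^\s=;#]+)")


class SymbolMapError(ValueError):
    """Raised for syntax errors and unresolved symbol references."""


def parse_symbol_map(text: str) -> Dict[str, int]:
    # Drop full-line and end-of-line '#' comments.
    stripped = "\n".join(line.split("#", 1)[0] for line in text.splitlines())
    statements = stripped.split(";")
    # Everything after the last ';' must be whitespace only.
    if statements[-1].strip():
        raise SymbolMapError("missing ';' after the final statement")
    symbols: Dict[str, int] = {}
    for statement in statements[:-1]:
        match = _STATEMENT_RE.fullmatch(statement.strip())
        if match is None:
            raise SymbolMapError(f"syntax error in statement: {statement.strip()!r}")
        name, value = match.groups()
        if value.startswith("0x"):
            if _HEX_RE.fullmatch(value) is None:
                raise SymbolMapError(f"invalid hex address: {value!r}")
            symbols[name] = int(value, 16)
        elif value in symbols:
            # Re-use the address of a previously defined symbol.
            symbols[name] = symbols[value]
        else:
            raise SymbolMapError(f"unknown symbol referenced: {value!r}")
    return symbols
```

Because only previously defined symbols are looked up, a forward reference is reported as an error, which matches the rule that the referenced symbol must already exist.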
Conversion Maps
The *.convmap files inside the {ProjectDir}/maps folder are used for converting the addresses of the primary symbol map ({ProjectDir}/maps/main.map) to different regions and versions of the game/app being modified.
The addresses in the primary symbol map can be of any region/version of your choosing, but must be consistent throughout the map.
For build targets of the same region/version as your primary symbol map addresses, where no conversion is necessary, a matching *.convmap file is not required; use a nullish value wherever an address map is requested.
The conversion map files support // comments at both start and end of lines, and /* ... */ multi-line comments.
Conversion maps are defined by the following EBNF syntax (excluding comments, which may be arbitrarily placed within any S token):
/*
<optional>
^ = assert start of line
$ = assert end of line
*/
/* Common */
S = [\r\n\t\f\v ]* ; /* optional whitespace */
Z = [\r\n\t\f\v ]+ ; /* required whitespace */
hex_literal = '0x' hex_literal_no_0x ;
hex_literal_no_0x = [A-Fa-f0-9]{1,8} ;
decimal_literal = '0' | [1-9] [0-9]* ;
integer_literal = decimal_literal | hex_literal ;
word = [A-Za-z] ;
/* Convmap */
start = S <text_addr data_addr> statement* EOF ;
text_addr = 'TextAddr' S '=' S hex_literal S ';' S ;
data_addr = 'DataAddr' S '=' S hex_literal S ';' S ;
statement = <range_offset | platform_directive> S ;
range_offset = range S ':' S ('+' | '-') integer_literal S ';' ;
range = hex_literal_no_0x S '-' S hex_literal_no_0x ;
platform = 'Emulator' | 'Console' <'=' word> ;
platform_directive = ^ '.platform' Z platform <Z 'extends' Z platform> ;
- As clarification, notice that integer_literal is a value separate from the sign (+ or -), therefore the u32 range restriction does NOT prevent negative numbers, but rather specifies the max. u32 value as the maximum value for both positive and negative inputs.
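As a rough illustration of the range_offset rule in the grammar above, the sketch below parses a single statement and applies a list of parsed offsets to an address. The names are hypothetical. The grammar does not say how offsets are applied, so the `convert_address` part rests on labeled assumptions: range bounds are inclusive, the first matching range wins, and an address outside every range is returned unchanged.

```python
import re
from typing import Iterable, NamedTuple

U32_MAX = 0xFFFFFFFF
# range S ':' S ('+' | '-') integer_literal S ';' with comments already stripped.
_RANGE_OFFSET_RE = re.compile(
    r"([A-Fa-f0-9]{1,8})\s*-\s*([A-Fa-f0-9]{1,8})\s*:\s*"
    r"([+-])(0x[A-Fa-f0-9]{1,8}|0|[1-9][0-9]*)\s*;"
)


class RangeOffset(NamedTuple):
    start: int
    end: int
    offset: int


def parse_range_offset(statement: str) -> RangeOffset:
    match = _RANGE_OFFSET_RE.fullmatch(statement.strip())
    if match is None:
        raise ValueError(f"not a range_offset statement: {statement!r}")
    start, end = int(match[1], 16), int(match[2], 16)
    magnitude = int(match[4], 0)  # base 0 accepts both decimal and 0x-prefixed hex
    # The u32 limit applies to the magnitude; the sign is handled separately.
    if magnitude > U32_MAX:
        raise ValueError("offset magnitude exceeds the u32 range")
    return RangeOffset(start, end, -magnitude if match[3] == "-" else magnitude)


def convert_address(addr: int, offsets: Iterable[RangeOffset]) -> int:
    # Assumptions: inclusive bounds, first match wins, unmatched addresses unchanged.
    for entry in offsets:
        if entry.start <= addr <= entry.end:
            return addr + entry.offset
    return addr
```

Platform directives and the TextAddr/DataAddr header are not handled here.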
Besides address conversion offsets, the *.convmap files also store the start address of where the custom text and data section groups will be placed in memory at runtime. For targets targeting only emulators (Cemu), these are not necessary and should be omitted so that they are auto-calculated, but for targets targeting real Wii U hardware, they must be provided, as they cannot be auto-calculated due to real hardware shifting the addresses on load.
TextAddr = ADDRESS;
DataAddr = ADDRESS;
Where ADDRESS is a hex (0x-prefixed) integer inside the u32 value range. If these values are not provided for a console-targeting build, it will cause build errors.
Anything not fitting either of the two above syntax rulesets is a syntax error.
Additionally, once these two values are obtained, whether by manual setting or automatic calculation, tools should automatically add the standard defines TEXT_ADDR and DATA_ADDR to compilation runs, setting each to its corresponding value as a hex u32 literal.
Auto-calculating text and data addresses
When automatically calculating the values for TextAddr and DataAddr if they are omitted, tools should follow these steps:
- Open and parse the base RPX file used, locating the ELF section of type 0x80000004 (SHT_RPL_FILEINFO)
- If the section is not found, the base RPX is malformed, abort and error.
- Parse the found section’s data. This does not need to be a full parse and can be implementation-defined, as long as the following field values are correctly read in their respective sizes:
  - u32 at +0x00: MAGIC_AND_VERSION
  - u32 at +0x08: TEXT_ALIGN
  - u32 at +0x10: DATA_ALIGN
- If MAGIC_AND_VERSION is NOT 0xCAFE0402, the RPX version is unknown or malformed, abort and error.
- Store the other two values for later use, then locate the ELF section named .text
- If the section is not found, the base RPX is malformed, abort and error.
- Calculate its end address through the formula: TEXT_END = section.addr + section.size + 1
- Round up the result through the formula: TEXT_ADDR = ceil(TEXT_END / TEXT_ALIGN) * TEXT_ALIGN
- You now have the value of TEXT_ADDR, store it and use as needed for other operations.
- Find all ELF sections named .data, .rodata, and/or .bss. If any of them are not found, ignore it and move on.
- If NONE are found, stop here and use 0x10000000 as default value for DATA_ADDR.
- If only one is found, skip this step. Find the one with the highest start address.
- Calculate the chosen section’s end address through the formula: DATA_END = section.addr + section.size + 1
- Round up the result through the formula: DATA_ADDR = ceil(DATA_END / DATA_ALIGN) * DATA_ALIGN
- You now have the value of DATA_ADDR, store it and use as needed for other operations.
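The steps above can be sketched in Python as follows, assuming the ELF parsing has already been done elsewhere. The function name and its inputs are hypothetical: `fileinfo` is the raw data of the SHT_RPL_FILEINFO section and `sections` maps section names to (addr, size) pairs. The fields are read big-endian, and the rejection of a zero alignment is an extra safeguard that is not part of the steps.

```python
import struct
from typing import Dict, Tuple

RPL_FILEINFO_MAGIC = 0xCAFE0402
DEFAULT_DATA_ADDR = 0x10000000


def _align_up(value: int, align: int) -> int:
    """Integer form of ceil(value / align) * align."""
    if align <= 0:
        raise ValueError("alignment must be positive")  # safeguard added by this sketch
    return -(-value // align) * align


def compute_text_and_data_addr(
    fileinfo: bytes, sections: Dict[str, Tuple[int, int]]
) -> Tuple[int, int]:
    if len(fileinfo) < 0x14:
        raise ValueError("file info section is too short")
    (magic,) = struct.unpack_from(">I", fileinfo, 0x00)
    (text_align,) = struct.unpack_from(">I", fileinfo, 0x08)
    (data_align,) = struct.unpack_from(">I", fileinfo, 0x10)
    if magic != RPL_FILEINFO_MAGIC:
        raise ValueError("unknown or malformed RPX version")
    if ".text" not in sections:
        raise ValueError("base RPX is malformed: no .text section")

    addr, size = sections[".text"]
    text_addr = _align_up(addr + size + 1, text_align)

    found = [sections[name] for name in (".data", ".rodata", ".bss") if name in sections]
    if not found:
        return text_addr, DEFAULT_DATA_ADDR
    # Use the data section with the highest start address.
    addr, size = max(found, key=lambda section: section[0])
    return text_addr, _align_up(addr + size + 1, data_align)
```

Using integer arithmetic for the rounding avoids floating point precision concerns with ceil on large addresses.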
Configuring project.gpj
Generation of the project.gpj is arguably the main task of a WUAPPS tool: the final result combines all of the information given to it through project.yaml and other means into a GPJ file, which is then given to the compiler driver (currently only gbuild, as GPJ is a format specific to it in the first place) for the actual compilation process to be performed.
This file specifies the build options, settings to pass to the compiler, linker and assembler, the list of files to compile or perform other tasks with, among possibly other things. Please note the GPJ format syntax itself is specified by GHS MULTI and not this standard.
The generated project.gpj MUST always start with the following structure:
#!gbuild primaryTarget=ppc_cos_ndebug.tgt [Project]
This structure is technically all that is minimally required for a “valid GPJ file”, however it is functionally useless as it will not compile anything, essentially a no-op GPJ file.
After the starting structure, a list of tab-indented, newline separated (but with multiple space-separated entries allowed per line), global build options in CLI flags form is placed. The indentation of 1 tab at the start of each line of an option is REQUIRED, as a non-tab indented line signals the end of the global build options GPJ section. The following build options are UNCONDITIONALLY REQUIRED to be specified at all times, users should not be allowed to remove or modify them in any way:
- -MD -> Enables generation of {ProjectDir}/objs/*.d files for incremental compilation in future runs. To optionally provide the option to make a build without incremental compilation, tools should do so by deleting the {ProjectDir}/objs folder.
- -cpu=espresso -> Espresso is the name of the CPU used on the Wii U, the only currently supported target platform.
- -sda=none -> The use of SDA (or ZDA) creates additional ELF data sections in the compiled output file, which are currently not accounted for anywhere in the standard and will disrupt tool operations such as binary patching, address calculations, and others.
  - As such, use of SDA (or ZDA) is currently unsupported and should not be allowed by tools.
- --no_commons -> Required for the same reason as -sda=none.
- -object_dir= -> Path relative to the project.gpj, configures the output objects (.o files) folder. The value (path) of the flag is implementation-defined, but the flag itself is required to be present.
After the above options, the following REQUIRED AS DEFAULTS options are included, but each only if the user did not override or opt out of it (the methods for overriding and opting out are defined by the ExcludeDefaultBuildOptions setting in project.yaml):
-c99
--g++
--link_once_templates
--enable_noinline
--max_inlining
--no_exceptions
--no_rtti
--no_implicit_include
-no_ansi_alias
-only_explicit_reg_use
-kanji=shiftjis
-Ospeed
-Onounroll
-Dcafe
Please note the following:
- The local order (between themselves) of the options does not matter.
- The usage of - or -- DOES MATTER, even for full word options. Dash styles are NOT interchangeable to GHS MULTI; the exact option dash style must be used for each option (for example, -kanji is valid but --kanji is not).
After the required and default options (with exclusions and overrides performed) have been added, the user’s own settings are added from the file {ProjectDir}/buildoptions.txt. If the file does not exist, the user has no custom build options and tools should silently proceed. The format for the build options file is the same as the GPJ’s global build options section, minus the required tab indents. Tools should not need (but may, for extended implementation-defined behavior) to “parse” the file contents beyond simply copying each line of it and appending them to the GPJ’s build options section, with the extra tab indent added to the start of each line.
After all global build options have been specified comes the File List section, in which the files to be included in the compilation run are listed, in the form of non-indented (the first non-indented line in the file marks the end of the global build options and start of the File List), newline-separated file paths relative to the project.gpj. For tools, the File List should be generated by merging the Files field lists of every module included in the current compilation target. Additional files may be appended from implementation-defined sources, as long as they are not required for successful compilation, so other standard-compliant tools can still compile the project.
For each entry of the File List, the compiler driver generally assumes what to do with each file by mapping certain well-known file extensions to groups of File Types, from which it determines what to do with the file.
There are several types, but only the following are relevant to us:
- C (Extensions: .c) Action: Compile with the C compiler
- C++ (Extensions: .cc, .cpp, .cxx, .c++, .C, .CXX, .CPP) Action: Compile with the C++ compiler
- Assembly (Extensions: .s, .asm, .ppc) Action: Assemble with the PowerPC assembler
  - Note: The .ppc extension is special and enables further behavior of preprocessing the file with the C preprocessor.
- Text (The fallback type if it can’t recognize an extension) Action: Silently ignore file
Everything above is exact and case-sensitive. This creates an issue for users who may want to use extensions not listed here for their source files, such as .S for assembly files. For this scenario, the GPJ format allows a [TYPE] structure to be placed at the end of a File List path entry, where TYPE is the appropriate File Type desired for that file. Tools should apply the types of the Files Record on module files to this structure for explicit type mapping.
Console vs Emulator Compilation
All compilation targets and templates are, in theory, designed to be compilable for both console and emulator. In practice, this may not be possible on a per-project basis due to project-specific requirements for targets of console and emulator respectively, such as special defines and modules. Implementations must support all possible scenarios: dual-compilation of the same target, console-only targets, and emulator-only targets.
Note: Simultaneous dual-compilation for both console and emulator is not required, implementations are allowed to require separate runs with different program arguments for this task.
When allowing the user to select if the target will be compiled for console or emulator, through whichever means chosen by the tool, it MUST support a variable string value (NOT a boolean toggle) for the console targeting setting input, as a future-proofing mechanism to support multiple console compilation methods. (But setting a tool-chosen default is valid)
In order to make dual-compilation of targets possible, the most common use-case of having a special define to determine whether compilation is being done for emulator or console shall be built into implementations under a few standard defines:
-DPLATFORM_IS_EMULATOR=<0|1>
-DPLATFORM_IS_CONSOLE=<0|1>
-DPLATFORM_IS_CONSOLE_CAFELOADER=<0|1>
The value of each define, 0 or 1, is determined by the compilation method used as specified below.
Compilation Methods
Below are the currently valid values implementations must support as input to the console compilation input setting, alongside further information on implementation details of each method.
CafeLoader
- -DPLATFORM_IS_EMULATOR=0 build option is set.
- -DPLATFORM_IS_CONSOLE=1 build option is set.
- -DPLATFORM_IS_CONSOLE_CAFELOADER=1 build option is set.
- Special Requirements: Requires TextAddr and DataAddr to be manually provided in the selected target’s address map.
- Minimal Expected Output: Structurally valid Addr.bin, Code.bin, Data.bin and Patches.hax files. Refer to the CafeLoader documentation for further information on correctly generating these files.
none or unset (equivalent to emulator compilation)
- -DPLATFORM_IS_EMULATOR=1 build option is set.
- -DPLATFORM_IS_CONSOLE=0 build option is set.
- -DPLATFORM_IS_CONSOLE_CAFELOADER=0 build option is set.
- Special Requirements: None
- Minimal Expected Output: A structurally valid patched *.rpx or *.elf, the output name of which is implementation-defined other than the required .rpx or .elf extension.
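The mapping from compilation method to standard defines listed above is simple enough to sketch directly. The function name `platform_defines` is hypothetical, and matching the method names exactly (case-sensitively) is an assumption, since the text does not say whether other spellings are accepted.

```python
from typing import List, Optional


def platform_defines(method: Optional[str]) -> List[str]:
    """Return the standard platform defines for a console compilation method."""
    if method is None or method == "none":
        emulator, console, cafeloader = 1, 0, 0
    elif method == "CafeLoader":
        emulator, console, cafeloader = 0, 1, 1
    else:
        # A string input leaves room for future methods; unknown ones are rejected.
        raise ValueError(f"unknown console compilation method: {method!r}")
    return [
        f"-DPLATFORM_IS_EMULATOR={emulator}",
        f"-DPLATFORM_IS_CONSOLE={console}",
        f"-DPLATFORM_IS_CONSOLE_CAFELOADER={cafeloader}",
    ]
```

Keeping the input a string rather than a boolean means a future method only needs one more branch.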
Shift-JIS Standardization
Put simply, Shift-JIS has a complex and ancient history, which led to it being poorly documented and standardized, with many different variants all wanting to use the same or similar “Shift-JIS” name. The original Shift-JIS specification, known as JIS X 0201, is NOT a fully compliant superset of ASCII, since it replaces the characters \ (U+005C) and ~ (U+007E) with the characters ¥ (U+00A5) and ‾ (U+203E) respectively. This inconsistency is problematic, and for that reason, when the western world (primarily Microsoft) adopted support for it, those two characters were reverted to their original ASCII values, but the name “Shift-JIS” was kept, creating an ambiguous definition for the ASCII compliant Shift-JIS[11] and the original Shift-JIS.
In the present day, the Unicode Consortium has standardized both encodings in the ICU separately: ibm-943_P130-1999 (original Shift-JIS) and ibm-943_P15A-2003 (ASCII compliant Shift-JIS[11]).
For the purposes of this document, outside of the current section (“Shift-JIS Standardization”), all references to “Shift-JIS” refer to the ASCII compliant Shift-JIS[11] (ibm-943_P15A-2003) variant. Implementations should check the Shift-JIS variant their encoders/decoders are using to ensure it is the correct one. The tests below can be used to check:
- \~ encoded to Shift-JIS[11] must equal the byte sequence 5C 7E
- ¥‾ encoded to Shift-JIS[11] must equal the byte sequence 5C 7E
- the byte sequence 5C 7E decoded from Shift-JIS[11] must equal \~
  - Shift-JIS[11] decoding functionality is currently not required for implementing the standard, but is included here for future-proofing.
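The three checks above can be wrapped in a small helper that accepts any encoder/decoder pair. The function name `check_shift_jis_variant` is hypothetical, and the helper makes no claim about which real-world codec passes; it only reports which checks a given pair fails.

```python
from typing import Callable, List

EXPECTED = bytes.fromhex("5C7E")


def check_shift_jis_variant(
    encode: Callable[[str], bytes], decode: Callable[[bytes], str]
) -> List[str]:
    """Run the three variant checks and return descriptions of the failures."""
    checks = [
        ("encode backslash and tilde", lambda: encode("\\~") == EXPECTED),
        ("encode yen sign and overline", lambda: encode("\u00a5\u203e") == EXPECTED),
        ("decode 5C 7E", lambda: decode(EXPECTED) == "\\~"),
    ]
    failures = []
    for description, check in checks:
        try:
            passed = check()
        except UnicodeError:
            passed = False  # a codec that rejects the input also fails the check
        if not passed:
            failures.append(description)
    return failures
```

A candidate codec can be tested by passing it in, for example `check_shift_jis_variant(lambda s: s.encode("shift_jis"), lambda b: b.decode("shift_jis"))` with Python's built-in codec; an empty list means all three checks passed.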
- [1] Refers to a null literal inside a YAML file using either literal null word syntax or the ~ shorthand syntax.
- [2] Note that the Integer type is arbitrary and does not actually exist in YAML, as YAML treats all numbers as floats, it is required that tools instead validate (NOT coerce!) the received YAML float number as an integer, and error otherwise.
- [3] The syntax List<T> indicates a YAML List whose entries contain values only of the YAML Type T
- [4] The syntax Record<T> indicates a YAML Record whose keys contain values only of the YAML Type T
- [5] Tools must error upon encountering any mismatched types from the ones specified here. No form of type coercion should be done. Multiple types on different lines for the same field means "OR", i.e. either type is valid.
- [6] a-z, A-Z, 0-9
- [7] The folder project.yaml is located within
- [8] . = {ProjectDir}. Paths may also use one or more ../ to refer to folders anywhere outside of {ProjectDir}
- [9] Custom YAML Record structure type, defined below.
- [10] The name of the target, as specified by its YAML key.
- [11] Specifically referring to the ibm-943_P15A-2003 encoding. Read Shift-JIS Standardization section for details.