Other:WUAPPS: Difference between revisions

From Zenith
(dont need index anymore)
(monkey (not done))
Line 46: Line 46:
{| class="wikitable"
{| class="wikitable"
!YAML Key
!YAML Key
!YAML Value Type<ref>Tools must error upon encountering any mismatched types from the ones specified here. No form of type coercion should be done.
!YAML Value Type<ref name=":3">Tools must error upon encountering any mismatched types from the ones specified here. No form of type coercion should be done.


Multiple types on different lines for the same field means "OR", i.e. either type is valid.</ref>
Multiple types on different lines for the same field means "OR", i.e. either type is valid.</ref>
!Description
!Description
!Default Value
!Default Value?
|-
|-
|WUAPPSVersion
|WUAPPSVersion
Line 56: Line 56:
|The presence of this field indicates the ''project's'' (not the tool's!) compliance with the specified version of this standard.
|The presence of this field indicates the ''project's'' (not the tool's!) compliance with the specified version of this standard.
Must follow the standard version format described in the [[Other:WUAPPS#Versioning|Versioning]] chapter. Any input not matching the format must error.
Must follow the standard version format described in the [[Other:WUAPPS#Versioning|Versioning]] chapter. Any input not matching the format must error.

Tools must compare this field against their internally supported WUAPPS versions. If the project requests a version not supported by the tool, the tool must error.
Tools must compare this field against their internally supported WUAPPS versions. If the project requests a version not supported by the tool, the tool must error.
|'''<small>''REQUIRED OPTION''</small>'''
|'''<small>''REQUIRED OPTION''</small>'''
Line 68: Line 69:
|Key/Value pairs of indefinite custom user-defined configuration variables.
|Key/Value pairs of indefinite custom user-defined configuration variables.
'''KEYS''' are variable names, which may only contain alphanumerics<ref><code>a-z</code>, <code>A-Z</code>, <code>0-9</code></ref> or underscore.
'''KEYS''' are variable names, which may only contain alphanumerics<ref><code>a-z</code>, <code>A-Z</code>, <code>0-9</code></ref> or underscore.

'''VALUES''' are variable contents, which by themselves may contain any character, but are still bound by the restrictions of the fields they are used within.
'''VALUES''' are variable contents, which by themselves may contain any character, but are still bound by the restrictions of the fields they are used within.

These variables can be referenced like UNIX variables in the form <code>$VARIABLE_NAME</code> inside of any '''String YAML Value''' of any field or array within '''project.yaml''' or module YAMLs.
Variables can be referenced like UNIX variables as <code>$VARIABLE_NAME</code> inside of any '''String YAML Value''' of any field or array within '''project.yaml''' or module YAMLs.

'''YAML Keys''' do NOT support variable interpolation. Variables may NOT be nested within each other.
'''YAML Keys''' do NOT support variable interpolation. Variables may NOT be nested within each other.

The literal character <code>$</code> cannot be used, as it isn't a valid character for file names and paths anyway, which are the primary use-case of variables.
The literal character <code>$</code> cannot be used, as it isn't a valid character for file names and paths anyway, which are the primary use-case of variables.

If a variable which does not exist is used somewhere, an error must be thrown by the tool.
If a variable which does not exist is used somewhere, an error must be thrown by the tool.
|Empty Record
|Empty Record
Line 77: Line 83:
|RpxDir
|RpxDir
|String
|String
|Path to the RPX files folder. (relative to <code>{ProjectDir}</code><ref name=":4">The folder <code>project.yaml</code> is located within</ref>)<ref name=":5"><code>.</code> = <code>{ProjectDir}</code>. Paths may also use one or more <code>../</code> to refer to folders anywhere outside of <code>{ProjectDir}</code></ref>
|Path to the RPX files folder
|<code>"./rpxs"</code>
|<code>"./rpxs"</code>
|-
|-
|ModulesBaseDir
|ModulesBaseDir
|String
|String
|Base path for module folder paths to resolve from. If <code>YAML_NULL</code><ref name=":0" />, module paths will resolve from the location of the project.yaml file.
|Base path for module folder paths to resolve from. (relative to <code>{ProjectDir}</code><ref name=":4" />)<ref name=":5" />
If <code>YAML_NULL</code><ref name=":0" />, module paths will resolve from the location of the project.yaml file.
|<code>YAML_NULL</code><ref name=":0" />
|<code>YAML_NULL</code><ref name=":0" />
|-
|-
|SourcesBaseDir
|SourcesBaseDir
|String
|String
|Base path for source folder paths to resolve from. If <code>YAML_NULL</code><ref name=":0" />, source paths will resolve from the location of the module file they are in.
|Base path for source folder paths to resolve from. (relative to <code>{ProjectDir}</code><ref name=":4" />)<ref name=":5" />
If <code>YAML_NULL</code><ref name=":0" />, source paths will resolve from the location of the module file they are in.
|<code>YAML_NULL</code><ref name=":0" />
|<code>YAML_NULL</code><ref name=":0" />
|-
|-
|IncludeDirs
|IncludeDirs
|List<String><ref name=":2">The syntax <code>List<T></code> indicates a YAML List whose entries contain values only of the YAML Type <code>T</code></ref>
|List<String><ref name=":2">The syntax <code>List<T></code> indicates a YAML List whose entries contain values only of the YAML Type <code>T</code></ref>
|List of paths to header folders
|List of paths to header folders. (relative to <code>{ProjectDir}</code><ref name=":4" />)<ref name=":5" />
|<code>["./include"]</code>
|<code>["./include"]</code>
|-
|-
Line 106: Line 114:
|List of default build options defined by the standard (See the [[Other:WUAPPS#Configuring project.gpj|project.gpj]] chapter) to opt-out of.
|List of default build options defined by the standard (See the [[Other:WUAPPS#Configuring project.gpj|project.gpj]] chapter) to opt-out of.
The special value of <code>true</code> can be used to easily exclude ALL of the default options. The value of <code>false</code> is equivalent to the default empty list.
The special value of <code>true</code> can be used to easily exclude ALL of the default options. The value of <code>false</code> is equivalent to the default empty list.

The listed values MUST specify the FULL default option name, including the exact prefixing dashes, but EXCLUDING the option's value, as examples:
The listed values MUST specify the FULL default option name, including the exact prefixing dashes, but EXCLUDING the option's value, as examples:


Line 116: Line 125:


* First exclude "-kanji" by adding it to this list, then define the new -kanji value in '''buildoptions.txt'''
* First exclude "-kanji" by adding it to this list, then define the new -kanji value in '''buildoptions.txt'''
* Attempting to override a default option in '''buildoptions.txt''' without excluding it first is implementation-defined behavior. By default, this is '''undefined behavior''' that depends on how the compiler driver interprets the duplicate options, given that some compiler options are known to specifically support multiple uses, while others do not.
Attempting to override a default option in '''buildoptions.txt''' without excluding it first is implementation-defined behavior.

By default, this is '''undefined behavior''' that depends on how the compiler driver interprets the duplicate options.

(Some compiler options are known to specifically support multiple uses, while others do not.)
|Empty List
|Empty List
|-
|-
Line 127: Line 140:
|Record<Integer<ref>Note that the Integer type is arbitrary and does not actually exist in YAML, as YAML treats all numbers as floats, it is required that tools instead '''validate''' (NOT coerce!) the received YAML float number as an integer, and error otherwise.</ref>><ref name=":1" />
|Record<Integer<ref>Note that the Integer type is arbitrary and does not actually exist in YAML, as YAML treats all numbers as floats, it is required that tools instead '''validate''' (NOT coerce!) the received YAML float number as an integer, and error otherwise.</ref>><ref name=":1" />
|Key/Value pairs of indefinite custom user-defined minimum alignments to use per ELF section by section name.
|Key/Value pairs of indefinite custom user-defined minimum alignments to use per ELF section by section name.
The definition of the record itself, empty or partial, does not void the default sections' values unless each of them is explicitly redefined. Example:
The definition of the record itself, empty or partial, does not void the default sections' values unless each of them is explicitly redefined.

<code>MinAlign: { .text: 0x80 }</code> does NOT affect the '''.rodata''', '''.data''' and '''.bss''' default values.
''Example:'' <code>MinAlign: { .text: 0x80 }</code> does NOT affect the '''.rodata''', '''.data''' and '''.bss''' default values; the defaults will still be used.


All sections not declared, whether by default or explicitly, shall have a default alignment of <code>0</code> (no minimum alignment).
All sections not declared, whether by default or explicitly, shall have a default alignment of <code>0</code> (no minimum alignment).

''Implementation Notes:''<code>moduleAlignment = max(module section alignment, global minimum section alignment)</code><code>sectionAlignment = max(section alignments of all modules)</code>
''Implementation Notes:''

<code>moduleAlignment = max(module section alignment, global minimum section alignment)</code>

<code>sectionAlignment = max(section alignments of all modules)</code>
|<syntaxhighlight lang="yaml">
|<syntaxhighlight lang="yaml">
{ .text: 0x20,
{ .text: 0x20,
Line 140: Line 159:
|-
|-
|Targets
|Targets
|Record<
|Record<Target<ref>Custom struct type, defined below.</ref>>
Target<ref name=":6">Custom YAML Record structure type, defined below.</ref>
|Key/Value pairs of indefinite target configurations.
YAML_NULL<ref name=":0" />
><ref name=":1" />
|Key/Value pairs of indefinite target configurations. '''KEYS''' are target names. (At least 1 non-abstract target is required to be defined)
To create an “empty” target using all default settings, give it (the field itself) <code>YAML_NULL</code> as the value.
|'''<small>''REQUIRED OPTION''</small>'''
|'''<small>''REQUIRED OPTION''</small>'''
|}
|-
! colspan="4" |Target YAML Type Structure
// TODO Finish this //<syntaxhighlight lang="yaml">Targets: # key/value pairs of indefinite target configurations
|-
Template/{TemplateName}: # a target configuration template, identified by the "Template/" prefix (use of templates is optional)
|Abstract
AddrMap: # name of conv/*.offs file to use with this target (default: {TemplateName})
|Boolean
# if YAML_NULL, no address map is used. The raw addresses from main.map are used directly.
|Whether this target is an abstract template for other targets to inherit from. Prevents target from being used directly for compilation.
BaseRpx: # name of {RpxDir}/*.rpx file to use with this target (default: {TemplateName})
|<code>false</code>
|-
|Extends
|String
List<String><ref name=":2" />
|As a '''String''', name of a target to inherit settings from, with potentially infinitely nested inheritance from the inherited target extending another, recursively.
As a '''List''', list of names of targets to multi-inherit settings from, in reverse insertion order, without nesting allowed.
''Example:'' <code>A: Extends: [C, B]</code> makes '''A''' inherit from '''B''', and then the resulting '''A+B''' intermediate target inherits from '''C''', producing the final '''A+B+C''' target.


If either '''B''' or '''C''' has an '''Extends''' field, an error must be thrown: targets named in a multi-inheritance list may not extend other targets themselves, which prevents circular or self extensions.
# (Remove|Add)/(Modules|BuildOptions) fields should be processed in the order of removals first, then additions.
'''''Inheritance Semantics:''''' For non-List fields, the highest definition in the inheritance sequence (nested or multi) shall take priority.
# It doesn't matter the order the user arranges the fields on their YAML, only the order specified here.
For List fields, each definition is merged onto the previous, unless they have special behavior defined, '''which the only 4 current List fields do have.'''
#
Refer to the '''Notes''' at the end of the '''Target YAML Type Structure''' table for details on inheritance semantics for the current List fields.
# 1. Before all, load the global Modules and BuildOptions lists.
# 2. If processing a target extending a template, process the template first.
# 3. Remove the unwanted modules and build options listed by the template/target from their lists.
# 4. Add the extra wanted modules and build options listed by the template/target to their lists.
# 5. If just processed a template, cycle back to step 3 and process the extending target.
#
# See each of the 4 fields below for additional rules for each step.


The fields '''Abstract''' and '''Extends''' do not participate in inheritance as their values are only in respect to the target definition itself, not the final resolved target data.
Remove/Modules:
For non-List fields of targets with extensions, two special String values can be used in place of their values:
- # list of global/template modules to exclude from compilation with this target (default: [])
- # Same rules as global Modules option apply. Plus additional ones below:
- # It is an error to attempt to exclude a module only to re-add it on the same template/target.
- # It is an error to attempt to exclude a module which exists but is not on the current modules list.
- # ...
Add/Modules:
- # list of additional modules to compile with this target (default: [])
- # Same rules as global Modules option apply. Plus additional ones below:
- # It is an error to attempt to add a template/target module already on the current modules list.
- # ...
Remove/BuildOptions:
- # works the same as ExcludeDefaultBuildOptions but excluding from all the build options
- # currently in effect for the target being processed (default: [])
- # ...
Add/BuildOptions:
- # works the same as BuildOptions but appending them to the existing ones (default: {})
- # ...


* <code>@inherit</code>: This is the default special value used for all fields not explicitly defined in a target with extensions, it signals to use the default value for the inherited target.
{TargetName}: # a single target configuration (At least 1 required)
* <code>@self</code>: Opposite to '''@inherit''', signals that the option should use its default value for the current target. This affects options whose default value is <code>{TargetName}</code><ref name=":7">The name of the target, as specified by its YAML key.</ref>.
Extends: # name of a template target to inherit settings from (default: none)
|<code>YAML_NULL</code><ref name=":0" />
# templates cannot be nested, so the Extends setting is not allowed on them.
|-
#
|AddrMap
# besides Extends, all template settings (see above) are also valid here.
|String
#
|Name of '''conv/*.offs''' file to use with this target.
# AddrMap and BaseRpx will default to {TargetName} instead of {TemplateName} here.
|<code>{TargetName}</code><ref name=":7" />
# if AddrMap/BaseRpx are set on both template and an extending target, the extending target has priority.
|-
#
|BaseRpx
# all the list settings are merged together if set on both template and extending target,
|String
# following the rules specified on the Template fields documentation above.</syntaxhighlight>To create an “empty” target/template using all default settings, give it (the field itself) <code>YAML_NULL</code> as the value.
|Name of '''{RpxDir}/*.rpx''' file to use with this target.
|<code>{TargetName}</code><ref name=":7" />
|-
|Remove/Modules
|List<String><ref name=":2" />
|List of global/template modules to remove from compilation with this target. Values follow same rules as the global '''Modules''' field. The following extra rules apply:
It is an error to attempt to remove a module only to re-add it on the same template/target.
It is an error to attempt to remove a module which exists but is not on the current modules list.
|Empty List
|-
|Add/Modules
|List<String><ref name=":2" />
|List of additional modules to compile with this target. Values follow same rules as the global '''Modules''' field.
The following extra rule applies: It is an error to attempt to add a template/target module already on the current modules list.
|Empty List
|-
|Remove/BuildOptions
|List<String><ref name=":2" />
|List of global/template build options to remove from use with this target. Values follow same rules as the global '''ExcludeDefaultBuildOptions''' field.
|Empty List
|-
|Add/BuildOptions
|List<String><ref name=":2" />
|List of additional build options to use with this target. Values follow same rules as the global '''BuildOptions''' field.
|Empty List
|-
! colspan="2" |Notes:
|'''Notes for the 4 options above (Remove/Modules, Add/Modules, Remove/BuildOptions and Add/BuildOptions):''' These 4 options should be processed in the order of removals first, then additions.
The order in which the user arranges the fields in their YAML does not matter, only the order specified here.


# Before all, load the global Modules and BuildOptions lists.
It is an error for a user to attempt to use a Template as the build target.
# If processing a target extending a template, process the template first.

# Remove the unwanted modules and build options listed by the template/target from their lists.
In the values for '''ModulesBaseDir''', '''SourcesBaseDir''', '''IncludeDirs''' and '''RpxDir''', the paths are all relative to <code>{ProjectDir}</code>, which is simply the folder <code>project.yaml</code> is located within.
# Add the extra wanted modules and build options listed by the template/target to their lists.
*<code>.</code> = <code>{ProjectDir}</code>
# If just processed a template, cycle back to step 3 and process the extending target.
*<code>./foo</code> or <code>foo</code> (implicit <code>./</code>) refers to a folder next to <code>project.yaml</code> in the same folder as it.
| -
*Paths may also use one or more <code>../</code> to refer to folders anywhere outside of <code>{ProjectDir}</code>
|}
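The removal-then-addition order and the three module errors described in the notes above can be sketched in Python. This is only an illustration under stated assumptions: the function names <code>apply_target_lists</code> and <code>resolve_modules</code> are hypothetical, targets are assumed to be plain dictionaries already parsed from YAML, and only the '''Modules''' lists are covered (build options follow their own exclusion rules).

```python
def apply_target_lists(current, remove, add):
    """Return a copy of `current` with `remove` taken out, then `add` appended.

    Hypothetical helper: removals are always processed before additions,
    regardless of the order the fields appear in the YAML.
    """
    result = list(current)
    for name in remove:
        if name in add:
            raise ValueError(f"{name!r} is removed and re-added on the same target")
        if name not in result:
            raise ValueError(f"{name!r} is not on the current modules list")
        result.remove(name)
    for name in add:
        if name in result:
            raise ValueError(f"{name!r} is already on the current modules list")
        result.append(name)
    return result


def resolve_modules(global_modules, template, target):
    """Start from the global list, process the template first, then the target."""
    modules = list(global_modules)
    for layer in (template, target):
        if layer is None:  # an "empty" layer given as YAML_NULL
            continue
        modules = apply_target_lists(
            modules,
            layer.get("Remove/Modules", []),
            layer.get("Add/Modules", []),
        )
    return modules
```

For example, with global modules <code>a, b, c</code>, a template that removes <code>b</code> and adds <code>d</code>, and a target that removes <code>a</code> and adds <code>b</code>, the sketch yields <code>c, d, b</code>.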
===Project Folder Structure===
===Project Folder Structure===
All WUAPPS-based projects must have a <code>{ProjectDir}</code> within which the <code>project.yaml</code> and other project metadata files are located. This folder has the following structure:<syntaxhighlight lang="cmake">{ProjectDir}/
All WUAPPS-based projects must have a <code>{ProjectDir}</code><ref name=":4" /> within which the <code>project.yaml</code> and other project metadata files are located. This folder has the following structure:
{| class="wikitable"
conv/
|'''<code>conv/</code>'''
# your WUAPPS Address Offsets (.offs) files
| colspan="2" |Your [[Other:WUAPPS#Address Offsets Maps|WUAPPS Address Offsets (.offs)]] files
syms/
|-
main.map # your primary symbol map file
| rowspan="2" |'''<code>syms/</code>'''
# temporary region-converted symbol maps by compliant tools will be placed here in the form of {TargetName}.x
| colspan="2" |Your symbol map files
linker/
|-
# temporary .ld files generated by compliant tools will be placed here in the form of {TargetName}.ld (autogenerated folder)
|'''<code>main.map</code>'''
objs/
|Your primary symbol map file
# temporary intermediate compiler object files will be placed here (autogenerated folder)
|-
project.gpj # the autogenerated project configuration file after tool processing to be given to the compiler driver
|'''<code>buildoptions.txt</code>'''
buildoptions.txt # optional file, stores extra build options (see the project.gpj section of the standard for more info)
| colspan="2" |Optional file, stores extra build options (See the [[Other:WUAPPS#Configuring project.gpj|project.gpj]] chapter)
project.yaml # the main project configuration file</syntaxhighlight>
|-
|'''<code>project.yaml</code>'''
| colspan="2" |The main project configuration file
|}

===Modules===
===Modules===
<blockquote>⚠ '''Warning: The module system will be removed once the dynamic loader system is finished.'''</blockquote>WUAPPS projects are currently structured in “modules”, enabled/disabled by the <code>project.yaml</code>, which are also defined by YAML files, which in turn declare source files (C++ / Assembly) to compile and assemble as well as [[Other:WUAPPS#Patches & Hooks|binary patches & hooks]] to apply directly to the base RPX file.
<blockquote>'''Warning''':


A module file may not be named <code>project.yaml</code> (case-insensitive) to prevent conflicts. All other names that fit within the standard-wide file name rules (See [[Other:WUAPPS#Terms and Definitions|Terms and Definitions]]) are valid.
The module system will be removed once the dynamic loader system is finished.</blockquote>WUAPPS projects are structured in “modules”, each defined by a YAML file and enabled or disabled by <code>project.yaml</code>. A module declares source files (<code>.cpp</code> / <code>.S</code>) to compile and assemble, as well as binary patches/hooks to apply directly to the base RPX file.


A module file may not be named <code>project.yaml</code> (case-insensitive) to prevent conflicts. All other names are valid.



Each module file is structured as follows:<syntaxhighlight lang="yaml"># Files and Hooks are both optional. However:
Each module file is structured as follows:
# Since without either field the module will be empty, compilation runs with empty modules are likely user mistakes,
{| class="wikitable"
# and as such it is recommended (but not required) that tools display a warning to the user about the empty module.
!YAML Key
---
!YAML Value Type<ref name=":3" />
Files:
!Description
- # list of paths to .cpp and .S files to compile/assemble when this module is enabled (default: [])
!Default Value?
- # list may also include folders to implicitly include all .cpp and .S files within the folder.
|-
- # folders ending in nothing, a trailing /, or /* will be included non-recursively.
|Files
- # folders ending in /** will be included recursively.
|Record<List<String><ref name=":2" />><ref name=":1" />
- # * shall NOT be used as a wildcard in the middle of paths.
|Key/Value pairs of filetypes mapped to lists of paths to source files or folders for searching source files within.
- # ...
KEYS must be one of the valid filetypes: <code>C</code>, <code>C++</code>, <code>Assembly</code>, <code>Text</code> (A key not matching one of these must error)
Hooks:
All filetypes are independently optional and can be used or omitted in any combination.
- # list of patches/hooks to apply when this module is enabled (default: [])
Folder paths ending in nothing, a trailing <code>/</code> or <code>/*</code> will be included '''non-recursively'''.
- # ...
Folder paths ending in <code>/**</code> will be included recursively.
# example patch/hook structure

- type: # the type of the patch/hook, see the Patches & Hooks section of the docs for details
The <code>*</code> character shall NOT be used as a wildcard in the middle of paths.
addr: # a stringified hex number with 0x prefix, indicating where to apply the patch
|Record<Empty List><ref name=":1" />
????: # other hook-type specific fields exist, see the Patches & Hooks section of the docs for details</syntaxhighlight>
|-
|Hooks
|List<Hook<ref name=":6" />>
|List of [[Other:WUAPPS#Patches & Hooks|patches & hooks]] to apply when this module is enabled.
|Empty List
|-
! colspan="4" |Hook YAML Type Structure
|-
|type
|String
|The type of the patch/hook, see the [[Other:WUAPPS#Patches & Hooks|Patches & Hooks]] chapter for details.
|'''<small>''REQUIRED OPTION''</small>'''
|-
|addr
|String
|A stringified hex number with 0x prefix, indicating where to apply the patch/hook.
|'''<small>''REQUIRED OPTION''</small>'''
|-
|'''''????'''''
|'''<small>''unknown''</small>'''
|Other hook-type specific fields exist, see the [[Other:WUAPPS#Patches & Hooks|Patches & Hooks]] chapter for details.
| -
|}
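The folder suffix rules of the '''Files''' field can be sketched as a small Python helper. This is a non-authoritative illustration: the name <code>parse_folder_entry</code> is hypothetical, and it assumes the entry is already known to refer to a folder rather than a single source file.

```python
def parse_folder_entry(entry):
    """Split a folder entry into (folder path, recursive flag).

    Hypothetical helper: "/**" means recursive; nothing, a trailing "/"
    or "/*" means non-recursive; "*" anywhere else is rejected.
    """
    if entry.endswith("/**"):
        folder, recursive = entry[:-3], True
    elif entry.endswith("/*"):
        folder, recursive = entry[:-2], False
    else:
        folder, recursive = entry.rstrip("/"), False
    if "*" in folder:
        raise ValueError(f"'*' may not be used as a wildcard inside a path: {entry!r}")
    return folder, recursive
```

With this sketch, <code>src</code>, <code>src/</code> and <code>src/*</code> all resolve to a non-recursive search of <code>src</code>, while <code>src/**</code> resolves to a recursive one.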
<<< UNFINISHED BEYOND THIS POINT >>>

===Patches & Hooks===
===Patches & Hooks===
This section documents the different types of structures you can write on the <code>Hooks</code> list of a [[#modules|module]]. The current valid types, and their extra fields besides the base <code>type</code> and <code>addr</code> common to all hook types, are as follows:

* <code>patch</code> - The most basic and versatile hook, simply writes <code>data</code> at <code>addr</code> (overwrites existing data)
** '''Additional field:''' <code>data</code> - A value to be encoded according to <code>datatype</code> and inserted at the <code>addr</code>
** '''Optional field:''' <code>datatype</code> (Default: <code>raw</code>) - A string representing a C++ data type to interpret the value of <code>data</code> as. The supported types are:
*** <code>raw</code>: A sequence of hex bytes, value of <code>data</code> should be a string.
*** <code>f32</code>/<code>f64</code>/<code>float</code>/<code>double</code>: A 32/64-bit floating point number, value of <code>data</code> should be a numeric literal within the specified type’s range.
*** <code>u8</code>/<code>u16</code>/<code>u32</code>/<code>u64</code>/<code>uchar</code>/<code>ushort</code>/<code>uint</code>/<code>ulonglong</code>: An 8/16/32/64-bit unsigned integer, value of <code>data</code> should be a numeric integer literal within the specified type’s range.
*** <code>s8</code>/<code>s16</code>/<code>s32</code>/<code>s64</code>/<code>schar</code>/<code>short</code>/<code>int</code>/<code>longlong</code>: An 8/16/32/64-bit signed integer, value of <code>data</code> should be a numeric integer literal within the specified type’s range.
*** <code>char</code>: An 8-bit ASCII encoded character, value of <code>data</code> should be a string literal with exactly one character located inside the ASCII character set range (<code>0x00-0x7F</code>). Invalid ASCII characters should trigger an error.
*** <code>wchar</code>: A 16-bit configurable-encoding encoded character, value of <code>data</code> should be a string literal with exactly one character fitting inside a 2-byte space or less in the chosen encoding (after conversion from UTF8). Characters which don’t fit should trigger an error. Characters which encode to only 1 byte in the chosen encoding should be big-endian null-padded.
*** <code>string</code>: A null-terminated configurable-encoding C string. Single byte null terminator is automatically added. Value of <code>data</code> should be a string literal with only valid characters for the chosen encoding (after conversion from UTF8). Invalid characters of the chosen encoding should trigger an error.
*** <code>wstring</code>: A null-terminated configurable-encoding wide string. 2-byte null terminator is automatically added. Value of <code>data</code> should be a string literal with only valid characters for the chosen encoding (after conversion from UTF8). Invalid characters of the chosen encoding should trigger an error.
*** <code>#[]</code>: Where <code>#</code> is any of the types above, you may suffix a type with <code>[]</code> to make an array of it. Value of <code>data</code> will be an array of values of the respective type.
**** Array types enforce their <code>addr</code> to be aligned by the size of its elements.
**** <code>string</code> and <code>wstring</code> are 4-byte aligned, therefore all strings in an array must be null-padded to have a byte length multiple of 4, including the last string in the array.
**** The difference between <code>char[]</code> and <code>string</code> is that a <code>char[]</code> doesn’t automatically null-terminate, uses '''ASCII''' encoding and doesn’t align. Meanwhile <code>string</code> does automatically null-terminate, can use configurable-encoding and aligns by 4.
**** Likewise, the difference between <code>wchar[]</code> and <code>wstring</code> is that a <code>wchar[]</code> doesn’t automatically null-terminate and aligns by 2. Meanwhile <code>wstring</code> does automatically 2-byte null-terminate and aligns by 4.
**** To write a null character on a <code>char[]</code> or <code>wchar[]</code>, write down ''YAML_NULL'' or use standard string escape sequences.
**** Multidimensional arrays such as <code>int[][]</code> are not supported.
** '''Optional conditional field:''' <code>encoding</code> (Default: <code>Shift-JIS</code>) - This field specifies the encoding to encode the input string value with.
*** This field may be included if <code>datatype</code> is either <code>string</code>, <code>wchar</code> or <code>wstring</code>.
*** If this field is present when <code>datatype</code> is not one of the above, an error must be thrown.
*** Valid encoding values and their aliases are:
**** <code>UTF-8</code>, <code>UTF8</code>, <code>utf-8</code>, <code>utf8</code> (ONLY valid for <code>string</code> datatype, throw error if not used with it)
**** <code>UCS-2</code>, <code>UCS2</code>, <code>ucs-2</code>, <code>ucs2</code> (NOT valid for <code>string</code> datatype, throw error if used with it)
**** <code>Shift-JIS</code>, <code>ShiftJIS</code>, <code>shift-jis</code>, <code>shiftjis</code> (valid for all 3 encoding-compatible datatypes)
*** '''UCS-2 NOTES:''' <code>UCS-2</code> is an obsolete predecessor of <code>UTF-16</code>, as such it may be difficult to find encoders for <code>UCS-2</code> specifically. The difference between the two encodings is simply that <code>UCS-2</code> does not support multi-character codepoints (surrogates) to support characters beyond <code>U+FFFF</code>, therefore an implementation using a <code>UTF-16</code> encoder simply pre-validating all input characters to block any characters above <code>U+FFFF</code> is a valid implementation of <code>UCS-2</code>. Other things to note about <code>UCS-2</code> are that the encoding must be in '''big endian''' and a Byte Order Mark (BOM) should NOT be included.
* <code>nop</code> - A shorthand for one or multiple sequential <code>patch</code> hooks with <code>60000000</code> (<code>nop</code>) as <code>data</code>
** '''Optional field:''' <code>count</code> (Default: <code>1</code>) - A positive non-zero decimal integer number, specifying how many <code>nop</code>’s to apply starting from <code>addr</code>
* <code>return</code> - A shortcut for a <code>patch</code> hook with <code>4E800020</code> (<code>blr</code>) as <code>data</code>
* <code>branch</code> - Inserts the respective branch instruction <code>instr</code> at <code>addr</code> jumping to the address of the symbol <code>func</code>
** '''Additional field:''' <code>instr</code> - The branch instruction to use: <code>b</code> or <code>bl</code>
** '''Additional field:''' <code>func</code> - The symbol whose address the branch instruction should jump to
* <code>funcptr</code> - Inserts the address of the symbol <code>func</code> at <code>addr</code>
** '''Additional field:''' <code>func</code> - The symbol whose address to write at <code>addr</code>
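As a rough illustration of how a tool might turn some of the hook types above into bytes, here is a minimal Python sketch. It is not a reference implementation. The names <code>encode_patch_data</code> and <code>hook_bytes</code> are hypothetical, hooks are assumed to be dictionaries already parsed from YAML, numeric values are assumed to be written big-endian (the text above only states this explicitly for <code>UCS-2</code> and <code>wchar</code>), and only <code>raw</code>, the short numeric type names and <code>string</code> are covered. Arrays, <code>wchar</code>, <code>wstring</code>, <code>branch</code> and <code>funcptr</code> are left out.

```python
import struct

NOP = bytes.fromhex("60000000")  # the nop encoding named in the text
BLR = bytes.fromhex("4E800020")  # the blr encoding named in the text

# Assumption: big-endian layouts for the numeric datatypes.
NUMERIC_FORMATS = {
    "u8": ">B", "u16": ">H", "u32": ">I", "u64": ">Q",
    "s8": ">b", "s16": ">h", "s32": ">i", "s64": ">q",
    "f32": ">f", "f64": ">d",
}
# Encoding aliases accepted for the `string` datatype (UCS-2 is not valid for it).
STRING_CODECS = {
    "UTF-8": "utf-8", "UTF8": "utf-8", "utf-8": "utf-8", "utf8": "utf-8",
    "Shift-JIS": "shift_jis", "ShiftJIS": "shift_jis",
    "shift-jis": "shift_jis", "shiftjis": "shift_jis",
}
ENCODING_DATATYPES = {"string", "wchar", "wstring"}


def encode_patch_data(data, datatype="raw", encoding="Shift-JIS"):
    """Encode the `data` value of a patch hook for a subset of datatypes."""
    if datatype == "raw":
        return bytes.fromhex(data)  # raises ValueError on non-hex input
    if datatype in NUMERIC_FORMATS:
        # struct.error is raised when the value is outside the type's range
        return struct.pack(NUMERIC_FORMATS[datatype], data)
    if datatype == "string":
        if encoding not in STRING_CODECS:
            raise ValueError(f"encoding not valid for string: {encoding!r}")
        # UnicodeEncodeError is raised for characters the encoding cannot hold
        return data.encode(STRING_CODECS[encoding]) + b"\x00"
    raise ValueError(f"datatype not covered by this sketch: {datatype!r}")


def hook_bytes(hook):
    """Return the bytes a patch, nop or return hook writes at its addr."""
    kind = hook["type"]
    if kind == "nop":
        count = hook.get("count", 1)
        if not isinstance(count, int) or count < 1:
            raise ValueError("count must be a positive non-zero integer")
        return NOP * count
    if kind == "return":
        return BLR
    if kind == "patch":
        datatype = hook.get("datatype", "raw")
        if "encoding" in hook and datatype not in ENCODING_DATATYPES:
            raise ValueError("encoding is only allowed with string, wchar or wstring")
        return encode_patch_data(hook["data"], datatype, hook.get("encoding", "Shift-JIS"))
    raise ValueError(f"hook type not covered by this sketch: {kind!r}")
```

Treating <code>nop</code> and <code>return</code> as fixed byte sequences mirrors their description as shorthands for <code>patch</code> hooks. Writing the bytes at <code>addr</code> and resolving symbols are deliberately out of scope here.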
- <code>nop</code> - A shorthand for one or multiple sequential <code>patch</code> hooks with <code>60000000</code> (<code>nop</code>) as <code>data</code> - '''Optional field:''' <code>count</code> (Default: <code>1</code>) - A positive non-zero decimal integer number, specifying how many <code>nop</code>’s to apply starting from <code>addr</code> - <code>return</code> - A shortcut for a <code>patch</code> hook with <code>4E800020</code> (<code>blr</code>) as <code>data</code> - <code>branch</code> - Inserts the respective branch instruction <code>instr</code> at <code>addr</code> jumping to the address of the symbol <code>func</code> - '''Additional field:''' <code>instr</code> - The branch instruction to use: <code>b</code> or <code>bl</code> - '''Additional field:''' <code>func</code> - The symbol whose address the branch instruction should jump to - <code>funcptr</code> - Inserts the address of the symbol <code>func</code> at <code>addr</code> - '''Additional field:''' <code>func</code> - The symbol whose address to write at <code>addr</code>

Revision as of 22:25, 20 July 2023

!!!!! Still unfinished as of currently !!!!!

v3.0-DRAFT

Terms and Definitions

For the purposes of this document, the following terms and definitions apply.

  • A "tool" or "tools" refers to any programs implementing this standard in compliance with it, and/or the developers working on said tool.
  • "implementation-defined" means behavior that is up to tools to define however they please, as long as it meets certain minimum expectations criteria that may be specified by the standard.
  • Currently this standard only acknowledges and accounts for GHS MULTI as compiler toolchain for use. Any mention of words such as “compiler”, “linker”, “assembler”, etc. all refer to the respective tools of GHS MULTI implicitly. Many assumptions and choices of default and possible values for configurations are based on the behavior of GHS MULTI and may not work correctly with other compiler toolchains. Support for other compiler toolchains is open for the future under a new major revision, but there are none planned at this time.
  • In all instances within this document where slashes (/) are used or referenced in the context of a file or folder path, it is implied that it is interchangeable with backslashes (\) for compatibility. Implementation note: When parsing file or folder paths tools must treat both / and \ as path separators regardless of the running platform and interpret the path as valid, including paths making mixed use of both separators.
  • In all instances within this document where file or folder names are used or referenced, including within file and folder paths, these names MUST only contain the following characters a-z, A-Z, 0-9, -_.,+(), and MUST NOT start with a - or end with a .. File and folder paths inherit these rules with only the added allowed characters /\: as path separators. This ensures compatibility with all operating systems.
  • In all instances within this document where file or folder paths are used or referenced, these paths MUST be CASE-SENSITIVE to ensure compatibility with all filesystem formats. Implementation note: This can be achieved with the following methods:
    • On a case-sensitive filesystem: No action required, process the input path as-is.
    • On a case-insensitive filesystem: If input path doesn’t exist, end. Else, fetch the stored case of the file path and compare against the input path, if the comparison is not equal, error. Else, the comparison is equal, therefore the path is valid.
  • YAML_NULL[1]
  • For any value of type STRING, an empty string is to be considered equivalent to having set YAML_NULL (which may in turn be considered invalid for values which do not allow being YAML_NULL), unless a special behavior for empty strings on that specific value is explicitly specified.
  • Any text in the form {PLACEHOLDER} is an instance of a dynamic text segment named PLACEHOLDER which can have user-defined value, with or without restrictions on a case by case basis specified near the definition of the dynamic text segment.
  • For all optional list-based YAML fields, omitting the field or explicitly setting it to YAML_NULL falls back to the default value, if any, otherwise being YAML_NULL as implicit default fallback.
    • Setting it to an empty list ([]) asserts truly no values, without using the default value, if any.
  • For all optional non-list YAML fields (including maps {}), only omitting the field falls back to the default value, if any, otherwise being YAML_NULL as implicit default fallback. The behavior of explicitly setting it to YAML_NULL is defined by each field.
    • If a non-list YAML field does not specify a behavior for YAML_NULL then it is not allowed to be set to YAML_NULL.
  • “Valid C++98 macro identifiers”[2]
  • References of Shift-JIS [3]
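
As an illustration of the file and folder name rule in the list above, the check could look roughly like the following. This is a minimal Python sketch and not part of the standard: the helper name is_valid_name is hypothetical, and it validates a single name only. Splitting full paths on / and \, handling relative components, and drive prefixes are deliberately left out.

```python
import re

# Characters the standard allows in a file or folder name:
# a-z, A-Z, 0-9 and the punctuation -_.,+()
_ALLOWED_NAME = re.compile(r"[A-Za-z0-9\-_.,+()]+")


def is_valid_name(name: str) -> bool:
    """Return True if `name` satisfies the name rule (hypothetical helper)."""
    if not _ALLOWED_NAME.fullmatch(name):
        return False  # empty, or contains a character outside the allowed set
    # A name must not start with "-" and must not end with "."
    return not name.startswith("-") and not name.endswith(".")
```

A tool would run a check like this on every path component before touching the filesystem, so that invalid names are rejected identically on every platform.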

Versioning

This standard’s version follows a restricted subset of SemVer, where only the Major and Minor versions are present and no other extensions are allowed, for a resulting version format of {StandardMajor}.{StandardMinor}. The standard does not have Patch versions or any other version extensions such as tags or metadata. (Draft revisions of the standard are not effectively considered as "the standard" while in draft stage and therefore are exempt from these rules)

Tools in compliance with this standard MUST adapt part of their versioning scheme to match the version of this standard they currently support, using SemVer versioning, with the following additional conditions met:

  • The tool’s Major version MUST match the StandardMajor version it supports.
  • The tool’s Minor version MUST match the StandardMinor version it supports.
  • The tool’s Patch version MUST represent the tool’s own Major version.
  • The tool’s version Tag, if present, MUST represent the tool’s own Minor version as a valid integer.
    • If omitted, the tool’s Minor version is 0.

The final required version format for compliant tools is {StandardMajor}.{StandardMinor}.{ToolMajor}-{ToolMinor}, with the ending -{ToolMinor} being optional if it is 0. Tool Patch versions are not allowed.

An optional +{ToolMetadata} tag at the very end (After -{ToolMinor} if present) is also allowed to be included for use by the tool and may contain anything which SemVer allows on that field.
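
The tool version format described above can be parsed with a single regular expression. The following is a minimal Python sketch under stated assumptions: the names ToolVersion and parse_tool_version are hypothetical, numeric parts follow the SemVer rule of no leading zeros, and the metadata part accepts dot-separated identifiers made of alphanumerics and hyphens, as SemVer build metadata does.

```python
import re
from typing import NamedTuple, Optional

_NUM = r"(0|[1-9][0-9]*)"  # SemVer numeric identifier, no leading zeros
_TOOL_VERSION = re.compile(
    rf"{_NUM}\.{_NUM}\.{_NUM}(?:-{_NUM})?"
    r"(?:\+([0-9A-Za-z-]+(?:\.[0-9A-Za-z-]+)*))?"
)


class ToolVersion(NamedTuple):
    standard_major: int
    standard_minor: int
    tool_major: int
    tool_minor: int          # 0 when the -{ToolMinor} part is omitted
    metadata: Optional[str]  # None when the +{ToolMetadata} part is omitted


def parse_tool_version(text: str) -> ToolVersion:
    """Parse {StandardMajor}.{StandardMinor}.{ToolMajor}[-{ToolMinor}][+{ToolMetadata}]."""
    match = _TOOL_VERSION.fullmatch(text)
    if match is None:
        raise ValueError(f"not a valid tool version: {text!r}")
    s_major, s_minor, t_major, t_minor, metadata = match.groups()
    return ToolVersion(
        int(s_major), int(s_minor), int(t_major), int(t_minor or 0), metadata
    )
```

With this shape, checking a project's WUAPPSVersion against the tool amounts to comparing the first two fields of the parsed tuple.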

Environment Variables

The following standard environment variables should be read and used by tools for their respective purposes when applicable. All environment variables are optional to be defined or not by the user, tools should not rely on them as the ONLY source of user input.

Any environment variables set should have lower precedence against values passed through more specific sources of user input defined by the tool, such as command-line arguments.

  • GHS_ROOT: If set, should contain the absolute path to the GHS MULTI folder containing the multi.exe file. Used to locate the necessary GHS MULTI executables needed for a full project build.
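
The precedence rule above (more specific sources of input win over environment variables) might be implemented along these lines. This is only a sketch: the function name resolve_ghs_root and the cli_value parameter are hypothetical, standing in for whatever command-line option a tool chooses to offer.

```python
import os
from typing import Mapping, Optional


def resolve_ghs_root(
    cli_value: Optional[str], env: Mapping[str, str] = os.environ
) -> Optional[str]:
    """Pick the GHS MULTI folder, preferring the more specific source."""
    if cli_value:
        return cli_value  # command-line input takes precedence
    # Fall back to the optional GHS_ROOT environment variable, if set
    return env.get("GHS_ROOT") or None
```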

Project Configuration

Project configuration is done through the main configuration file, {ProjectDir}/project.yaml, containing the following fields:

YAML Key YAML Value Type[4] Description Default Value?
WUAPPSVersion String The presence of this field indicates the project's (not the tool's!) compliance with the specified version of this standard.

Must follow the standard version format described in the Versioning chapter. Any input not matching the format must error.

Tools must compare this field against their internal supported WUAPPS versions, if the project requests a version not supported by the tool, it must error.

REQUIRED OPTION
Name String The name of the project. REQUIRED OPTION
Variables Record<String>[5] Key/Value pairs of indefinite custom user-defined configuration variables.

KEYS are variable names, which may only contain alphanumerics[6] or underscore.

VALUES are variable contents, which by themselves may contain any character, but are still bound by the restrictions of the fields they are used within.

Variables can be referenced like UNIX variables as $VARIABLE_NAME inside of any String YAML Value of any field or array within project.yaml or module YAMLs.

YAML Keys do NOT support variable interpolation. Variables may NOT be nested within each other.

The literal character $ cannot be used, as it isn't a valid character for file names and paths anyway, which are the primary use-case of variables.

If a variable which does not exist is used somewhere, an error must be thrown by the tool.

Empty Record
RpxDir String Path to the RPX files folder. (relative to {ProjectDir}[7])[8] "./rpxs"
ModulesBaseDir String Base path for module folders paths to resolve from. (relative to {ProjectDir}[7])[8]

If YAML_NULL[1], module paths will resolve from the location of the project.yaml file.

YAML_NULL[1]
SourcesBaseDir String Base path for source folders paths to resolve from. (relative to {ProjectDir}[7])[8]

If YAML_NULL[1], source paths will resolve from the location of the module file they are in.

YAML_NULL[1]
IncludeDirs List<String>[9] List of paths to header folders. (relative to {ProjectDir}[7])[8] ["./include"]
BuildOptions List<String>[9] List of build options to pass to the compiler. If both buildoptions.txt and this field are present, they are merged together.

Works the same as buildoptions.txt but inlined into the project.yaml. (Whenever buildoptions.txt is referenced it is interchangeable with this option)

YAML_NULL[1]
ExcludeDefaultBuildOptions List<String>[9]

Boolean

List of default build options defined by the standard (See the project.gpj chapter) to opt-out of.

The special value of true can be used to easily exclude ALL of the default options. The value of false equals to the default empty list.

The listed values MUST specify the FULL default option name, including the exact prefixing dashes, but EXCLUDING the option's value, as examples:

  • The option "-c99" is excluded by "-c99", but not by "--c99" nor "c99"
  • The option "-kanji=shiftjis" is excluded by "-kanji", but not by "-kanji=" nor "-kanji=shiftjis"

It is an error to attempt to exclude a non-default option or a malformed option like the above invalid examples.

For default options with values, such as -kanji, where the user desires to override the value used by the option, the process is as one would expect:

  • First exclude "-kanji" by adding it on this list, then define the new -kanji value on buildoptions.txt

Attempting to override a default option in buildoptions.txt without excluding it first is implementation-defined behavior.

If a tool does not define it, the result is undefined behavior that depends on how the compiler driver interprets the duplicate options (some compiler options are known to specifically support multiple uses, while others do not).

Empty List
Modules List<String>[9] List of extensionless file paths of global modules to compile. If a module listed is not found or invalid, an error should be thrown. Empty List
MinAlign Record<Integer[10]>[5] Key/Value pairs of indefinite custom user-defined minimum alignments to use per ELF section by section name.

The definition of the record itself, empty or partial, does not void the default section's values unless they are explicitly redefined each.

Example: MinAlign: { .text: 0x80 } does NOT affect .rodata, .data and .bss default values, the defaults will still be used.

All sections not declared, whether by defaults or explicit, shall have a default alignment of 0 (no minimum alignment)

Implementation Notes:

moduleAlignment = max(module section alignment, global minimum section alignment)

sectionAlignment = max(section alignments of all modules)

{ .text:   0x20,
  .rodata: 0x20,
  .data:   0x20,
  .bss:    0x40 }
Targets Record<Target[11] or YAML_NULL[1]>[5] Key/Value pairs of indefinite target configurations. KEYS are target names. (At least 1 non-abstract target is required to be defined)

To create an “empty” target using all default settings, give it (the field itself) YAML_NULL as the value.

REQUIRED OPTION
Target YAML Type Structure
Abstract Boolean Whether this target is an abstract template for other targets to inherit from. Prevents target from being used directly for compilation. false
Extends String

List<String>[9]

As a String, name of a target to inherit settings from, with potentially infinitely nested inheritance from the inherited target extending another, recursively.

As a List, list of names of targets to multi-inherit settings from, in reverse insertion order, without nesting allowed. Example: A: Extends: [C, B] makes A inherit from B, and then the resulting A+B intermediate target inherits from C, producing the final A+B+C target.

If either B or C has an Extends field, an error must be thrown: targets named in a multi-inheritance list may not extend other targets themselves, which prevents circular or self extensions.

Inheritance Semantics: For non-List fields, the highest definition on the inheritance sequence (nested or multi) shall take priority. For List fields, each definition is merged onto the previous, unless they have special behavior defined, which the only 4 current List fields do have. Refer to the Notes at the end of the Target YAML Type Structure table for details on inheritance semantics for the current List fields.

The fields Abstract and Extends do not participate in inheritance as their values are only in respect to the target definition itself, not the final resolved target data.

For non-List fields of targets with extensions, two special String values can be used in place of their values:

  • @inherit: This is the default special value used for all fields not explicitly defined in a target with extensions, it signals to use the default value for the inherited target.
  • @self: Opposite to @inherit, signals that the option should use its default value for the current target. This affects options whose default value is {TargetName}[12].
YAML_NULL[1]
AddrMap String Name of conv/*.offs file to use with this target. {TargetName}[12]
BaseRpx String Name of {RpxDir}/*.rpx file to use with this target. {TargetName}[12]
Remove/Modules List<String>[9] List of global/template modules to remove from compilation with this target. Values follow same rules as the global Modules field. The following extra rules apply:

It is an error to attempt to remove a module only to re-add it on the same template/target. It is an error to attempt to remove a module which exists but is not on the current modules list.

Empty List
Add/Modules List<String>[9] List of additional modules to compile with this target. Values follow same rules as the global Modules field.

The following extra rule applies: It is an error to attempt to add a template/target module already on the current modules list.

Empty List
Remove/BuildOptions List<String>[9] List of global/template build options to remove from use with this target. Values follow same rules as the global ExcludeDefaultBuildOptions field. Empty List
Add/BuildOptions List<String>[9] List of additional build options to use with this target. Values follow same rules as the global BuildOptions field. Empty List
Notes for the 4 options above (Remove/Modules, Add/Modules, Remove/BuildOptions and Add/BuildOptions): These 4 options should be processed in the order of removals first, then additions.

The order in which the user arranges the fields in their YAML does not matter, only the order specified here applies.

  1. Before all, load the global Modules and BuildOptions lists.
  2. If processing a target extending a template, process the template first.
  3. Remove the unwanted modules and build options listed by the template/target from their lists.
  4. Add the extra wanted modules and build options listed by the template/target to their lists.
  5. If a template was just processed, cycle back to step 3 and process the extending target.
-
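
The processing order in the Notes above can be sketched for the module lists as follows. This is a hedged Python illustration, not the standard's reference algorithm: the name resolve_modules and the "layers" representation (one (Remove/Modules, Add/Modules) pair per template or target, template first) are assumptions, and checking that a module file actually exists is left to the caller.

```python
from typing import Iterable, List, Sequence, Tuple

# One layer per template/target: (Remove/Modules, Add/Modules)
Layer = Tuple[Sequence[str], Sequence[str]]


def resolve_modules(global_modules: Iterable[str], layers: Iterable[Layer]) -> List[str]:
    """Apply removals, then additions, for each layer in inheritance order."""
    current = list(global_modules)  # step 1: start from the global Modules list
    for remove, add in layers:      # steps 2 and 5: template first, then the target
        for name in remove:         # step 3: removals
            if name in add:
                raise ValueError(f"{name!r} is removed and re-added on the same target")
            if name not in current:
                raise ValueError(f"{name!r} is not on the current modules list")
            current.remove(name)
        for name in add:            # step 4: additions
            if name in current:
                raise ValueError(f"{name!r} is already on the current modules list")
            current.append(name)
    return current
```

Because removals and additions are taken from separate arguments, the order in which a user writes the fields in their YAML has no effect on the result.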

Project Folder Structure

All WUAPPS-based projects must have a {ProjectDir}[7] within which the project.yaml and other project metadata files are located, this folder follows the following structure:

conv/ Your WUAPPS Address Offsets (.offs) files
syms/ Your symbol map files
main.map Your primary symbol map file
buildoptions.txt Optional file, stores extra build options (See the project.gpj chapter)
project.yaml The main project configuration file

Modules

Warning: The module system will be removed once the dynamic loader system is finished.

WUAPPS projects are currently structured in “modules”, enabled/disabled by the project.yaml, which are also defined by YAML files, which in turn declare source files (C++ / Assembly) to compile and assemble as well as binary patches & hooks to apply directly to the base RPX file.

A module file may not be named project.yaml (case-insensitive) to prevent conflicts. All other names that fit within the standard-wide file name rules (See Terms and Definitions) are valid.


Each module file is structured as follows:

YAML Key YAML Value Type[4] Description Default Value?
Files Record<List<String>[9]>[5] Key/Value pairs of filetypes mapped to lists of paths to source files or folders for searching source files within.

KEYS must be one of the valid filetypes: C, C++, Assembly, Text. (A key not matching one of these must error.) All filetypes are independently optional and can be used or omitted in any combination. Folder paths ending in nothing, a trailing / or /* will be included non-recursively. Folder paths ending in /** will be included recursively.

The * character shall NOT be used as a wildcard in the middle of paths.

Record<Empty List>[5]
Hooks List<Hook[11]> List of patches & hooks to apply when this module is enabled. Empty List
Hook YAML Type Structure
type String The type of the patch/hook, see the Patches & Hooks chapter for details. REQUIRED OPTION
addr String A stringified hex number with 0x prefix, indicating where to apply the patch/hook. REQUIRED OPTION
???? unknown Other hook-type specific fields exist, see the Patches & Hooks chapter for details. -
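
The folder path suffix rules of the Files field described in the table above could be interpreted by a helper like the one below. It is a minimal sketch with a hypothetical name (classify_source_path). It only classifies the path string and does not touch the filesystem or decide whether the path is a file or a folder.

```python
from typing import Tuple


def classify_source_path(path: str) -> Tuple[str, bool]:
    """Return (base_path, recursive) for one entry of a Files list."""
    normalized = path.replace("\\", "/")  # both separators are equivalent
    if normalized.endswith("/**"):
        base, recursive = normalized[:-3], True   # recursive folder include
    elif normalized.endswith("/*"):
        base, recursive = normalized[:-2], False  # non-recursive folder include
    else:
        base, recursive = normalized.rstrip("/"), False
    if "*" in base:
        # "*" is only meaningful as a trailing /* or /** suffix
        raise ValueError(f"wildcard not allowed in the middle of a path: {path!r}")
    return base, recursive
```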

<<< UNFINISHED BEYOND THIS POINT >>>

Patches & Hooks

This section documents the different types of structures you can write on the Hooks list of a module. The current valid types, and their extra fields besides the base type and addr common to all hook types, are as follows:

  • patch - The most basic and versatile hook, simply writes data at addr (overwrites existing data)
    • Additional field: data - A value to be encoded according to datatype and inserted at the addr
    • Optional field: datatype (Default: raw) - A string representing a C++ data type to interpret the value of data as, the supported types are:
      • raw: A sequence of hex bytes, value of data should be a string.
      • f32/f64/float/double: A 32/64 bit floating point number, value of data should be a numeric literal within the specified type’s range.
      • u8/u16/u32/u64/uchar/ushort/uint/ulonglong: An 8/16/32/64 bit unsigned integer, value of data should be a numeric integer literal within the specified type’s range.
      • s8/s16/s32/s64/schar/short/int/longlong: An 8/16/32/64 bit signed integer, value of data should be a numeric integer literal within the specified type’s range.
      • char: An 8 bit ASCII encoded character, value of data should be a string literal with exactly one character located inside the ASCII character set range (0x00-0x7F). Invalid ASCII characters should trigger an error.
      • wchar: A 16 bit configurable-encoding encoded character, value of data should be a string literal with exactly one character fitting inside a 2-byte space or less in the chosen encoding (after conversion from UTF8), characters which don’t fit should trigger an error. Characters which encode to only 1-byte in the chosen encoding should be big-endian null-padded.
      • string: A null-terminated configurable-encoding C string. Single byte null terminator is automatically added. Value of data should be a string literal with only valid characters for the chosen encoding (after conversion from UTF8). Invalid characters of the chosen encoding should trigger an error.
      • wstring: A null-terminated configurable-encoding wide string. 2-byte null terminator is automatically added. Value of data should be a string literal with only valid characters for the chosen encoding (after conversion from UTF8). Invalid characters of the chosen encoding should trigger an error.
      • #[]: Where # is any of the types above, you may suffix a type with [] to make an array of it. Value of data will be an array of values of the respective type.
        • Array types enforce their addr to be aligned by the size of its elements.
        • string and wstring are 4-byte aligned, therefore all strings in an array must be null-padded to have a byte length multiple of 4, including the last string in the array.
        • The difference between char[] and string is that a char[] doesn’t automatically null-terminate, uses ASCII encoding and doesn’t align. Meanwhile string does automatically null-terminate, can use configurable-encoding and aligns by 4.
        • Likewise, the difference between wchar[] and wstring is that a wchar[] doesn’t automatically null-terminate and aligns by 2. Meanwhile wstring does automatically 2-byte null-terminate and aligns by 4.
        • To write a null character on a char[] or wchar[], write down YAML_NULL or use standard string escape sequences.
        • Multidimensional arrays such as int[][] are not supported.
    • Optional conditional field: encoding (Default: Shift-JIS) - This field specifies the encoding to encode the input string value with.
      • This field may be included if datatype is either string, wchar or wstring.
      • If this field is present when datatype is not one of the above, an error must be thrown.
      • Valid encoding values and their aliases are:
        • UTF-8, UTF8, utf-8, utf8 (ONLY valid for string datatype, throw error if not used with it)
        • UCS-2, UCS2, ucs-2, ucs2 (NOT valid for string datatype, throw error if used with it)
        • Shift-JIS, ShiftJIS, shift-jis, shiftjis (valid for all 3 encoding-compatible datatypes)
      • UCS-2 NOTES: UCS-2 is an obsolete predecessor of UTF-16, as such it may be difficult to find encoders for UCS-2 specifically. The difference between the two encodings is simply that UCS-2 does not support surrogate pairs, which UTF-16 uses to encode characters beyond U+FFFF, therefore an implementation using a UTF-16 encoder and simply pre-validating all input characters to block any characters above U+FFFF is a valid implementation of UCS-2. Other things to note about UCS-2 are that the encoding must be in big endian and a Byte Order Mark (BOM) should NOT be included.
  • nop - A shorthand for one or multiple sequential patch hooks with 60000000 (nop) as data
    • Optional field: count (Default: 1) - A positive non-zero decimal integer number, specifying how many nop’s to apply starting from addr
  • return - A shortcut for a patch hook with 4E800020 (blr) as data
  • branch - Inserts the respective branch instruction instr at addr jumping to the address of the symbol func
    • Additional field: instr - The branch instruction to use: b or bl
    • Additional field: func - The symbol whose address the branch instruction should jump to
  • funcptr - Inserts the address of the symbol func at addr
    • Additional field: func - The symbol whose address to write at addr
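
The UCS-2 notes above describe building a UCS-2 encoder out of a UTF-16 one. A minimal Python sketch of that approach follows. The function names are hypothetical, and rejecting lone surrogate code points (U+D800 to U+DFFF) is an extra assumption of this sketch rather than something the text spells out.

```python
def encode_ucs2_be(text: str) -> bytes:
    """Encode `text` as big-endian UCS-2 without a BOM."""
    for ch in text:
        code = ord(ch)
        # Pre-validate: block anything UCS-2 cannot represent in one 16-bit unit
        if code > 0xFFFF or 0xD800 <= code <= 0xDFFF:
            raise ValueError(f"U+{code:04X} cannot be encoded as UCS-2")
    # The explicit "-be" codec fixes the byte order and does not emit a BOM
    return text.encode("utf-16-be")


def encode_wstring_ucs2(text: str) -> bytes:
    """Encode a wstring value: UCS-2 payload plus the 2-byte null terminator."""
    return encode_ucs2_be(text) + b"\x00\x00"
```

The 4-byte padding required for strings inside arrays is not handled here and would be applied on top of the returned bytes.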

All hook fields are required unless explicitly marked as optional.
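
To make the shorthand hook types concrete, the sketch below expands nop hooks into raw patches and encodes a branch hook. It is an illustration under assumptions, not the standard's definition: the function names are hypothetical, and the branch is assumed to be encoded as a PowerPC relative branch (I-form, primary opcode 18, AA bit clear), which the text does not state explicitly.

```python
from typing import List, Tuple

NOP = bytes.fromhex("60000000")
BLR = bytes.fromhex("4E800020")  # data of a "return" hook


def expand_nop(addr: int, count: int = 1) -> List[Tuple[int, bytes]]:
    """Expand a nop hook into `count` sequential (addr, data) patches."""
    if count < 1:
        raise ValueError("count must be a positive non-zero integer")
    return [(addr + 4 * i, NOP) for i in range(count)]


def encode_branch(addr: int, target: int, instr: str) -> bytes:
    """Encode a relative `b` or `bl` placed at `addr` that jumps to `target`."""
    if instr not in ("b", "bl"):
        raise ValueError(f"unsupported branch instruction: {instr!r}")
    offset = target - addr
    # The 24-bit LI field holds a word-aligned signed offset (roughly +/- 32 MiB)
    if offset % 4 != 0 or not -0x2000000 <= offset <= 0x1FFFFFC:
        raise ValueError("branch target is misaligned or out of range")
    link = 1 if instr == "bl" else 0  # LK bit: set for bl
    word = 0x48000000 | (offset & 0x03FFFFFC) | link
    return word.to_bytes(4, "big")  # instructions are stored big-endian
```

In this sketch a branch hook reduces to a patch hook whose data is the returned 4 bytes, once the address of func has been looked up in the symbol map.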

Symbol Maps

The primary symbol map for a project is located at {ProjectDir}/syms/main.map and has a very basic syntax similar to any other symbol map file.

  • Whitespace is free-form
  • # is used for comments
    • Both full line and end-of-line comments are supported
    • There is no multi-line comment support
  • Semicolon-separated list of key value pairs in the format: SYMBOL = VALUE;
    • Where SYMBOL is the symbol’s text
    • Where VALUE is the symbol’s address as a hex number with a required 0x prefix
      • Alternatively, if VALUE does not start with 0x, it shall be interpreted as a previously defined symbol named VALUE which instructs the parser to re-use that symbol’s address for the current one. If the referenced symbol is not found an error must be thrown.
  • Anything not fitting the above syntax rules is a syntax error
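
The syntax rules above are small enough to parse by hand. The following Python sketch shows one way to do it. The name parse_symbol_map is hypothetical, and it assumes that symbol text contains no whitespace and none of the characters #, = or ;, which the standard does not state.

```python
import re
from typing import Dict

_HEX = re.compile(r"0x[0-9A-Fa-f]+")


def parse_symbol_map(text: str) -> Dict[str, int]:
    """Parse `SYMBOL = VALUE;` entries into a symbol -> address mapping."""
    body = re.sub(r"#[^\n]*", "", text)  # drop full-line and end-of-line comments
    entries = body.split(";")
    if entries.pop().strip():
        raise ValueError("trailing text without a terminating ';'")
    symbols: Dict[str, int] = {}
    for entry in entries:
        name, sep, value = (part.strip() for part in entry.partition("="))
        if not sep or not name or not value or re.search(r"\s", name + value):
            raise ValueError(f"malformed entry: {entry.strip()!r}")
        if value.startswith("0x"):
            if not _HEX.fullmatch(value):
                raise ValueError(f"invalid hex address: {value!r}")
            symbols[name] = int(value, 16)
        elif value in symbols:
            symbols[name] = symbols[value]  # re-use a previously defined symbol
        else:
            raise ValueError(f"unknown symbol reference: {value!r}")
    return symbols
```

For example, the input "foo = 0x02000000; bar = foo;" yields both foo and bar mapped to 0x02000000.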

The primary symbol map is not the one actually given to the compiler, as it must be converted to the format expected by the compiler (*.x) and addresses must be converted to different regions according to the build targets through the conv/*.offs files. The resulting converted symbol maps of a compilation are placed in syms/{Target}.x. Those are temporary and can be safely deleted after compilation if desired.

Address Offsets Maps

The *.offs files inside the {ProjectDir}/conv folder are used for converting the addresses of the primary symbol map (syms/main.map) to different regions and versions of the game/app being modified.

The addresses in the primary symbol map can be of any region/version of your choosing, but must be consistent throughout the map. For build targets of the same region/version as your primary symbol map addresses, where no conversion is necessary, a matching *.offs file is not required. In that case, use a nullish value wherever an address map is requested.

The offset files support // comments at both start and end of lines, and /* ... */ multi-line comments.

Address conversion offsets are defined by the following EBNF syntax:

/*
<optional>
^ = assert start of line
$ = assert end of line
*/

/* Common */
S                     = [\r\n\t\f\v ]* ; /* optional whitespace */
Z                     = [\r\n\t\f\v ]+ ; /* required whitespace */
hex_literal           = '0x' hex_literal_no_0x ;
hex_literal_no_0x     = [A-Fa-f0-9]{1,8} ;
decimal_literal       = '0' | [1-9] [0-9]* ;
integer_literal       = decimal_literal | hex_literal ;
word                  = [A-Za-z] ;

/* Convmap */
start                 = S <text_addr data_addr> statement* EOF ;
text_addr             = 'TextAddr' S '=' S hex_literal S ';' S ;
data_addr             = 'DataAddr' S '=' S hex_literal S ';' S ;

statement             = <range_offset | platform_directive> S ;

range_offset          = range S ':' S ('+' | '-') integer_literal S ';' ;
range                 = hex_literal_no_0x S '-' S hex_literal_no_0x ;

platform              = 'Emulator' | 'Console' <'=' word> ;
platform_directive    = ^ '.platform' Z platform <Z 'extends' Z platform> ;
  • As clarification, notice that integer_literal is a value separate from the SIGN, therefore the u32 range restriction does NOT prevent negative numbers, but rather specifies the maximum u32 value as the maximum value for both positive and negative inputs.
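
As an illustration of the range_offset rule in the grammar above, a single statement can be parsed as shown below. This is a hedged Python sketch: the function names are hypothetical, and the way an offset is applied (inclusive range bounds, first matching range wins, unmatched addresses left unchanged) is an assumption, since those semantics are not spelled out here.

```python
import re
from typing import Iterable, Tuple

U32_MAX = 0xFFFFFFFF

# range S ':' S ('+' | '-') integer_literal S ';'
_RANGE_OFFSET = re.compile(
    r"([0-9A-Fa-f]{1,8})\s*-\s*([0-9A-Fa-f]{1,8})\s*:\s*"
    r"([+-])(0x[0-9A-Fa-f]{1,8}|0|[1-9][0-9]*)\s*;"
)


def parse_range_offset(statement: str) -> Tuple[int, int, int]:
    """Return (range_start, range_end, signed_offset) for one statement."""
    match = _RANGE_OFFSET.fullmatch(statement.strip())
    if match is None:
        raise ValueError(f"not a range_offset statement: {statement!r}")
    start_text, end_text, sign, literal = match.groups()
    magnitude = int(literal, 0)  # base 0 accepts both decimal and 0x-prefixed hex
    if magnitude > U32_MAX:
        raise ValueError("offset magnitude exceeds the u32 range")
    offset = -magnitude if sign == "-" else magnitude
    return int(start_text, 16), int(end_text, 16), offset


def convert_address(address: int, ranges: Iterable[Tuple[int, int, int]]) -> int:
    """Apply the first matching range offset (assumed semantics, see above)."""
    for start, end, offset in ranges:
        if start <= address <= end:
            return address + offset
    return address
```

Note that the sign is parsed separately from the literal, so the u32 limit bounds the magnitude of both positive and negative offsets.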

Besides address conversion offsets, the *.offs files also store the start address of where the custom text and data section groups will be placed in memory at runtime. For targets aimed only at emulators (Cemu), these are not necessary and should be omitted to be auto-calculated, but for targets aimed at real Wii U hardware, they must be provided as they cannot be auto-calculated due to real hardware shifting the addresses on load.

TextAddr = ADDRESS;
DataAddr = ADDRESS;

Where ADDRESS is a hex (0x-prefixed) integer inside the u32 value range. If these values are not provided for a console-targeting build it will cause build errors.

Anything not fitting either of the two above syntax rulesets is a syntax error.

Additionally, once these two values are obtained, whether by manual setting or automatic calculation, tools should automatically add to compilation runs the standard defines TEXT_ADDR and DATA_ADDR, respectively setting them to their corresponding value as a hex u32 literal.

Auto-calculating text and data addresses

When automatically calculating the values for TextAddr and DataAddr if they are omitted, tools should follow the following steps:

  1. Open and parse the base RPX file used.
  2. Locate the ELF section of type 0x80000004 (SHT_RPL_FILEINFO).
  3. If the section is not found, the base RPX is malformed, abort and error.
  4. Parse the found section’s data. This does not need to be a full parse and can be implementation-defined, as long as the following field values are correctly read in their respective sizes:
    • u32 at +0x00: MAGIC_AND_VERSION
    • u32 at +0x08: TEXT_ALIGN
    • u32 at +0x10: DATA_ALIGN
  5. If MAGIC_AND_VERSION is NOT 0xCAFE0402, the RPX version is unknown or malformed, abort and error.
  6. Store the other two values for later use.
  7. Locate the ELF section named .text
  8. If the section is not found, the base RPX is malformed, abort and error.
  9. Calculate its end address through the formula: TEXT_END = section.addr + section.size + 1
  10. Round up the result through the formula: TEXT_ADDR = ceil(TEXT_END / TEXT_ALIGN) * TEXT_ALIGN
  11. You now have the value of TEXT_ADDR, store it and use as needed for other operations.
  12. Find all ELF sections named .data, .rodata, and/or .bss
  13. If any of them are not found, ignore it and move on.
  14. If NONE are found, stop here and use 0x10000000 as default value for DATA_ADDR.
  15. If only one is found, skip step 16.
  16. Find the one with the highest start address.
  17. Calculate the chosen section’s end address through the formula: DATA_END = section.addr + section.size + 1
  18. Round up the result through the formula: DATA_ADDR = ceil(DATA_END / DATA_ALIGN) * DATA_ALIGN
  19. You now have the value of DATA_ADDR, store it and use as needed for other operations.
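
The arithmetic of the steps above can be sketched in Python as follows. Parsing the RPX itself is out of scope here: the function names are hypothetical, and the sketch assumes the caller has already read TEXT_ALIGN and DATA_ALIGN (both non-zero) and the addr/size of the relevant sections.

```python
from typing import Iterable, Tuple

DEFAULT_DATA_ADDR = 0x10000000  # used when no data section is found


def align_up(value: int, alignment: int) -> int:
    """ceil(value / alignment) * alignment, using integer arithmetic only."""
    return -(-value // alignment) * alignment


def calc_text_addr(text_addr: int, text_size: int, text_align: int) -> int:
    """Compute TEXT_ADDR from the .text section's addr and size."""
    text_end = text_addr + text_size + 1
    return align_up(text_end, text_align)


def calc_data_addr(sections: Iterable[Tuple[int, int]], data_align: int) -> int:
    """Compute DATA_ADDR from the (addr, size) pairs of the found data sections."""
    found = list(sections)
    if not found:
        return DEFAULT_DATA_ADDR
    # Pick the section with the highest start address
    addr, size = max(found, key=lambda section: section[0])
    return align_up(addr + size + 1, data_align)
```

Integer ceiling division is used instead of a floating point ceil so that large u32 addresses are never rounded incorrectly.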

Linker Directives

The {Target}.ld files inside the {ProjectDir}/linker folder are temporary files produced during a compilation run. They should never be edited and can be safely deleted after each run, including the folder itself. (Both the files and the folder are always re-created on every compilation run.)

Configuring project.gpj

Generating project.gpj is arguably the main task of a WUAPPS tool. The final result combines all of the information given to the tool through project.yaml and other means, and the GPJ file is then given to the compiler driver (currently only gbuild, as GPJ is a format specific to it in the first place) for the actual compilation process to be performed.

This file specifies the build options, settings to pass to the compiler, linker and assembler, the list of files to compile or perform other tasks with, among possibly other things. Please note the GPJ format syntax itself is specified by GHS MULTI and not this standard.

The generated project.gpj MUST always start with the following structure:

#!gbuild
primaryTarget=ppc_cos_ndebug.tgt
[Project]

This structure is technically all that is minimally required for a “valid GPJ file”; however, it is functionally useless as it will not compile anything, making it essentially a no-op GPJ file.

After the starting structure, a list of tab-indented, newline-separated (but with multiple space-separated entries allowed per line) global build options in CLI flag form is placed. The indentation of 1 tab at the start of each line of an option is REQUIRED, as a non-tab indented line signals the end of the global build options GPJ section. The following build options are UNCONDITIONALLY REQUIRED to be specified at all times; users should not be allowed to remove or modify them in any way:
- -object_dir=objs -> Path relative to the project.gpj, configures the {ProjectDir}/objs folder.
- -MD -> Enables generation of {ProjectDir}/objs/*.d files for incremental compilation in future runs. To optionally provide the option to make a build without incremental compilation, tools should do so by deleting the {ProjectDir}/objs folder.
- -cpu=espresso -> Espresso is the name of the CPU used on the Wii U, the only currently supported target platform.
- -sda=none -> The use of SDA (or ZDA) creates additional ELF data sections in the compiled output file, which are currently not accounted for anywhere in the standard and will disrupt tool operations such as binary patching, address calculations, and others. As such, use of SDA (or ZDA) is currently unsupported and should not be allowed by tools.
- --no_commons -> Required for the same reason as -sda=none.

After the above options, the following REQUIRED AS DEFAULTS options are included, but each only if the user did not override or opt out of it (the methods for overriding and opting out are defined by the ExcludeDefaultBuildOptions setting in project.yaml):
- -c99
- --g++
- --link_once_templates
- --enable_noinline
- --max_inlining
- --no_exceptions
- --no_rtti
- --no_implicit_include
- -no_ansi_alias
- -only_explicit_reg_use
- -kanji=shiftjis
- -Ospeed
- -Onounroll
- -Dcafe

Please note the following:
- The local order (between themselves) of the options does not matter.
- The usage of - or -- DOES MATTER, even for full word options. Dash styles are NOT interchangeable to GHS MULTI; the exact dash style must be used for each option (for example, -kanji is valid but --kanji is not).

After the required and default options (with exclusions and overrides performed) have been added, the user’s own settings are added from the file {ProjectDir}/buildoptions.txt. If the file does not exist, the user has no custom build options and tools should silently proceed. The format for the build options file is the same as the GPJ’s global build options section, minus the required tab indents. Tools should not need (but may, for extended implementation-defined behavior) to “parse” the file contents beyond simply copying each line and appending it to the GPJ’s build options section, with the extra tab indent added to the start of each line.
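As a rough illustration of how the starting structure and the global build options section fit together, here is a minimal Python sketch. The function name and its parameters are hypothetical. The excluded_defaults parameter only models opting out of a default by its exact option string, which is a simplification of whatever ExcludeDefaultBuildOptions actually specifies. Skipping blank lines from buildoptions.txt is likewise an assumption of this sketch, not something the text above requires.

```python
from typing import Iterable

GPJ_HEADER = ["#!gbuild", "primaryTarget=ppc_cos_ndebug.tgt", "[Project]"]

# Unconditionally required options, never removable by the user.
REQUIRED_OPTIONS = [
    "-object_dir=objs", "-MD", "-cpu=espresso", "-sda=none", "--no_commons",
]

# Required-as-defaults options; dash styles are copied exactly.
DEFAULT_OPTIONS = [
    "-c99", "--g++", "--link_once_templates", "--enable_noinline",
    "--max_inlining", "--no_exceptions", "--no_rtti", "--no_implicit_include",
    "-no_ansi_alias", "-only_explicit_reg_use", "-kanji=shiftjis",
    "-Ospeed", "-Onounroll", "-Dcafe",
]


def build_gpj_options(excluded_defaults: Iterable[str] = (),
                      user_lines: Iterable[str] = ()) -> str:
    """Return the GPJ text up to the end of the global build options section."""
    excluded = set(excluded_defaults)  # hypothetical opt-out model
    lines = list(GPJ_HEADER)
    lines += ["\t" + opt for opt in REQUIRED_OPTIONS]
    lines += ["\t" + opt for opt in DEFAULT_OPTIONS if opt not in excluded]
    for line in user_lines:
        line = line.rstrip("\r\n")
        if line.strip():  # assumption: blank lines are dropped
            lines.append("\t" + line)  # copied verbatim, tab indent added
    return "\n".join(lines) + "\n"
```

A caller would typically pass the lines of {ProjectDir}/buildoptions.txt as user_lines when the file exists, and an empty sequence otherwise.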

After all global build options have been specified comes the File List section, in which the files to be included in the compilation run are listed, in the form of non-indented, newline-separated file paths relative to the project.gpj (the first non-indented line in the file marks the end of the global build options and the start of the File List). For tools, the File List should be generated by merging the Files field list of every module included in the current compilation target. Additional files may be appended from implementation-defined sources, as long as they are not required for successful compilation, so that other standard-compliant tools can still compile the project.

For each entry of the File List, the compiler driver generally decides what to do with the file by mapping certain well-known file extensions to groups of File Types. There are several types, but only the following are relevant to us:
- C (Extensions: .c) Action: Compile with the C compiler
- C++ (Extensions: .cc, .cpp, .cxx, .c++, .C, .CXX, .CPP) Action: Compile with the C++ compiler
- Assembly (Extensions: .s, .asm, .ppc) Action: Assemble with the PowerPC assembler
  - Note: The .ppc extension is special and enables the further behavior of preprocessing the file with the C preprocessor.
- Text (The fallback type if it can’t recognize an extension) Action: Silently ignore file

Everything above is exact and case-sensitive. This creates an issue for users who may want to use extensions not listed here for their source files, such as .S for assembly files. For this scenario, the GPJ format allows a [TYPE] structure to be placed at the end of a File List path entry, where TYPE is the appropriate File Type desired for that file.
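The extension mapping can be modelled with a small lookup, sketched below in Python. This is not how gbuild is implemented; it only mirrors the list above, assuming extensions are matched exactly as written (so .C is C++ while .S is unrecognized, as the .S example implies). The function name is hypothetical, and the sketch deliberately stops at classification without emitting any [TYPE] syntax.

```python
import os

# File Type -> extensions, copied from the list above (matched exactly).
FILE_TYPES = {
    "C": {".c"},
    "C++": {".cc", ".cpp", ".cxx", ".c++", ".C", ".CXX", ".CPP"},
    "Assembly": {".s", ".asm", ".ppc"},
}


def classify(path: str) -> str:
    """Return the File Type inferred from the path's extension."""
    ext = os.path.splitext(path)[1]
    for file_type, extensions in FILE_TYPES.items():
        if ext in extensions:
            return file_type
    # Fallback type: such files would be silently ignored.
    return "Text"
```

A tool could use a check like classify(path) == "Text" to detect source files that would otherwise be silently ignored and therefore need an explicit type on their File List entry.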

<<<<< UNFINISHED SECTION (need to modify some other things before finishing this) >>>>>

Console vs Emulator Compilation

All compilation targets and templates are, in theory, designed to be compilable for both console and emulator. In practice, this may not be possible on a per-project basis due to project-specific requirements for console and emulator targets respectively, such as special defines and modules. Implementations must support all possible scenarios: dual-compilation of the same target, console-only targets, and emulator-only targets. Note: Simultaneous dual-compilation for both console and emulator is not required; implementations are allowed to require separate runs with different program arguments for this task.

When allowing the user to select whether the target will be compiled for console or emulator, through whichever means chosen by the tool, it MUST support a variable string value (NOT a boolean toggle) for the console targeting setting input, as a future-proofing mechanism to support multiple console compilation methods. (Setting a tool-chosen default is still valid.)

In order to make dual-compilation of targets possible, the most common use case of having a special define to determine whether compilation is being done for emulator or console shall be built into implementations under a standard define CONSOLE. When targeting console, the value of this define is set to the console compilation method specified by the user, with aliases resolved to their primary value. When the console method is none or unset, meaning emulator compilation, the CONSOLE define is unset.

# -DPLATFORM_IS_EMULATOR=1
# -DPLATFORM_IS_CONSOLE=0
# -DPLATFORM_IS_CONSOLE_CAFELOADER=0

Compilation Methods

Below are the currently valid values implementations must support as input to the console compilation input setting, alongside further information on implementation details of each method.
* cafeloader
  * -DPLATFORM_IS_EMULATOR=0 build option is set.
  * -DPLATFORM_IS_CONSOLE=1 build option is set.
  * -DPLATFORM_IS_CONSOLE_CAFELOADER=1 build option is set.
  * Requires: TextAddr and DataAddr to be manually provided in the selected target’s address map.
  * Minimal Output Files: Structurally valid Addr.bin, Code.bin, Data.bin and Patches.hax files. Refer to the CafeLoader documentation for further information on correctly generating these files.
* none or unset (equivalent to emulator compilation)
  * -DPLATFORM_IS_EMULATOR=1 build option is set.
  * -DPLATFORM_IS_CONSOLE=0 build option is set.
  * -DPLATFORM_IS_CONSOLE_CAFELOADER=0 build option is set.
  * Requires: nothing
  * Minimal Output Files: A structurally valid patched *.rpx or *.elf, the output name of which is implementation-defined other than the required .rpx or .elf extension.
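The mapping from compilation method to build options in the list above could be sketched in Python as follows. The function name and its parameters are hypothetical, and the sketch follows only the PLATFORM_IS_* options listed in this section. Alias resolution is not modelled, since no aliases are listed here.

```python
from typing import List, Optional


def platform_defines(method: Optional[str],
                     text_addr: Optional[int] = None,
                     data_addr: Optional[int] = None) -> List[str]:
    """Return the -D build options for a console compilation method."""
    if method is None or method == "none":
        # Emulator compilation: no further requirements.
        emulator, console, cafeloader = 1, 0, 0
    elif method == "cafeloader":
        # TextAddr and DataAddr must be manually provided for this method.
        if text_addr is None or data_addr is None:
            raise ValueError("cafeloader requires TextAddr and DataAddr")
        emulator, console, cafeloader = 0, 1, 1
    else:
        raise ValueError("unsupported console compilation method: %r" % method)
    return [
        "-DPLATFORM_IS_EMULATOR=%d" % emulator,
        "-DPLATFORM_IS_CONSOLE=%d" % console,
        "-DPLATFORM_IS_CONSOLE_CAFELOADER=%d" % cafeloader,
    ]
```

Taking the method as an optional string, rather than a boolean, keeps the sketch in line with the requirement that the setting be a variable string value.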

Shift-JIS Standardization

Put simply, Shift-JIS has a complex and ancient history, which led to it being poorly documented and standardized, with many different variants all wanting to use the same or similar “Shift-JIS” name. The original Shift-JIS specification, known as JIS X 0201, is NOT a fully compliant superset of ASCII, since it replaces the characters \ (U+005C) and ~ (U+007E) with the characters ¥ (U+00A5) and ‾ (U+203E) respectively. This inconsistency is problematic, and for that reason when the western world, primarily Microsoft, adopted support for it, those two characters were reverted to their original ASCII values, but the name “Shift-JIS” was kept, creating an ambiguity between the ASCII compliant Shift-JIS and the original Shift-JIS.

In the present day, the Unicode Consortium has standardized both encodings in the ICU separately:
- ibm-943_P130-1999 (original Shift-JIS)
- ibm-943_P15A-2003 (ASCII compliant Shift-JIS)

For the purposes of this document, outside of the current section (“Shift-JIS Standardization”), all references to “Shift-JIS” refer to the ASCII compliant Shift-JIS (ibm-943_P15A-2003) variant. Implementations should check the Shift-JIS variant their encoders/decoders are using to ensure it is the correct one. The tests below can be used to check:
- \~ encoded to Shift-JIS must equal the byte sequence 5C 7E
- ¥‾ encoded to Shift-JIS must equal the byte sequence 5C 7E
- the byte sequence 5C 7E decoded from Shift-JIS must equal \~
  - Shift-JIS decoding functionality is currently not required for implementing the standard, but is included here for future-proofing.
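The three tests above can be wrapped in a small, codec-agnostic check, sketched here in Python. The function name is hypothetical; the encoder and decoder are passed in as callables, so the sketch makes no claim about which variant any particular library implements. Treating a UnicodeError raised by the codec as a failed test is an assumption of this sketch.

```python
from typing import Callable, List


def failed_shift_jis_checks(encode: Callable[[str], bytes],
                            decode: Callable[[bytes], str]) -> List[str]:
    """Return the names of the variant tests that the given codec fails."""
    expected_bytes = b"\x5c\x7e"
    checks = [
        ("encode backslash/tilde", lambda: encode("\\~") == expected_bytes),
        ("encode yen/overline", lambda: encode("\u00a5\u203e") == expected_bytes),
        ("decode 5C 7E", lambda: decode(expected_bytes) == "\\~"),
    ]
    failures = []
    for name, check in checks:
        try:
            passed = check()
        except UnicodeError:
            # Assumption: a codec that rejects the input fails the test.
            passed = False
        if not passed:
            failures.append(name)
    return failures
```

A tool would pass thin wrappers around its own Shift-JIS encoder and decoder; an empty result means all three tests passed.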

Compatibility Appendix

This section documents features that exist solely for compatibility with projects pre-dating the creation of this standard. Using any of the features listed here on new projects is strongly advised against. Standard-compliant tools must still fully support these features.

Non-standard Address Offset Maps Extensions

A project with address offset maps predating the standard may not use the correct standardized extension (.offs). This can be remedied with the optional project.yaml setting AddrMapFileExtension. Its default value is offs (the . is not included), and it may be overridden to any value needed. All AddrMap references which implicitly assume .offs will then switch to assuming the set non-standard extension .{AddrMapFileExtension}.

TODO: Should distribution files (TYPF, would be renamed) also be standardized?


  1. Refers to a null literal inside a YAML file using either literal null word syntax or the ~ shorthand syntax.
  2. Strings which may only contain A-Z, a-z, 0-9, and _, as well as not starting with 0-9.
  3. Specifically referring to the ibm-943_P15A-2003 encoding. Read Shift-JIS Standardization section for details.
  4. Tools must error upon encountering any mismatched types from the ones specified here. No form of type coercion should be done. Multiple types on different lines for the same field means "OR", i.e. either type is valid.
  5. The syntax Record<T> indicates a YAML Record whose keys contain values only of the YAML Type T
  6. a-z, A-Z, 0-9
  7. The folder project.yaml is located within
  8. . = {ProjectDir}. Paths may also use one or more ../ to refer to folders anywhere outside of {ProjectDir}
  9. The syntax List<T> indicates a YAML List whose entries contain values only of the YAML Type T
  10. Note that the Integer type is arbitrary and does not actually exist in YAML, as YAML treats all numbers as floats. Tools are instead required to validate (NOT coerce!) the received YAML float number as an integer, and error otherwise.
  11. Custom YAML Record structure type, defined below.
  12. The name of the target, as specified by its YAML key.