higan Manifests
2019-12-05 08:14:43
Manifests are used to describe games, as well as their PCB components.
The base format is a minimal encoding to express every possible variation of PCB that exists in licensed form, as well as most unlicensed PCBs by extension.
The format is extensible, so that new fields can be added on an as-needed basis, yet in a backward-compatible fashion for existing games.
I feel it's best to start with examples, and then move on to explanations.
game
  sha256:   b7209ec3a5a0d28724f5867343195aef7cb85aeb453aa84a6cbe201b61b0d083
  label:    ドレミファンタジー ミロンのドキドキ大冒険
  name:     DoReMi Fantasy - Milon no Dokidoki Daibouken
  region:   SHVC-AM4J-JPN
  revision: SHVC-AM4J-0
  board:    SHVC-1J0N-20
    memory
      type: ROM
      size: 0x200000
      content: Program

game
  sha256:   89ad4ba02a2518ca792cf96b61b36613f86baac92344c9c10d7fab5433bebc16
  label:    Super Mario Kart
  name:     Super Mario Kart
  region:   SNS-MK-USA
  revision: SNS-MK-0
  board:    SHVC-1K1B-01
    memory
      type: ROM
      size: 0x80000
      content: Program
    memory
      type: RAM
      size: 0x800
      content: Save
    memory
      type: ROM
      size: 0x1800
      content: Program
      manufacturer: NEC
      architecture: uPD7725
    memory
      type: ROM
      size: 0x800
      content: Data
      manufacturer: NEC
      architecture: uPD7725
    memory
      type: RAM
      size: 0x200
      content: Data
      manufacturer: NEC
      architecture: uPD7725
      volatile
    oscillator
      frequency: 7600000
  note:     DSP1
Manifests are stored in plain-text files named manifest.bml. The BML extension refers to the markup language used. For manifests specifically, only a simplified subset of the BML syntax is used.
The file format is always UTF-8 without a BOM (byte order mark). If a BOM is present (e.g. one added by Notepad), the document will be considered invalid.
Line endings should consist only of line feeds (0x0a), although carriage return + line feed pairs (0x0d, 0x0a) are permissible. Carriage returns alone (e.g. the Mac OS 9 format) are not permitted.
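The encoding rules above can be checked before any parsing takes place. What follows is a minimal sketch in Python, not higan's own loader; the function name check_manifest_bytes is hypothetical, and normalizing CR+LF to LF is my own choice for convenience.

```python
def check_manifest_bytes(data: bytes) -> str:
    """Decode raw manifest bytes, enforcing the encoding rules described above."""
    # A UTF-8 BOM is the three bytes EF BB BF at the very start of the file.
    if data.startswith(b"\xef\xbb\xbf"):
        raise ValueError("manifest must not begin with a UTF-8 BOM")
    # Raises UnicodeDecodeError (a ValueError subclass) on invalid UTF-8.
    text = data.decode("utf-8")
    # Assumption: CR+LF pairs are normalized to LF for later processing.
    text = text.replace("\r\n", "\n")
    # Any carriage return still present was not part of a CR+LF pair.
    if "\r" in text:
        raise ValueError("lone carriage returns are not permitted")
    return text
```

A caller would read manifest.bml in binary mode and pass the bytes through this function before splitting the result into lines.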
Tree-structure nesting is permitted by way of indenting, either by tabs or by spaces. It is up to the user how many tabs or spaces each indentation level uses, but the choice must be applied consistently to the entire document for it to be considered valid. In the above example, two spaces are used. This is the recommended default choice. Tabs have a tendency to display at inconsistent widths between editors, and to be eaten by web forum software.
Each node name can contain the characters [A-Za-z0-9-_], but in practice only [a-z] is used for manifests.
Each node can contain a value. To specify a value, a : separator is used. After the : separator, any characters other than carriage returns and line feeds may appear, although control characters should not be used.
For manifests specifically, all surrounding whitespace (both tabs and spaces)
will be erased from values. In the above examples, extra whitespace is used to
vertically align the root-node elements, but this is strictly optional. If
desired, each node can be written with no padding after the : separator.
Each node can have child nodes, even if the parent node is assigned a value.
BML's attributes and multi-line values are not used in manifests, so that parsing manifests is simpler than parsing full BML syntax.
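To make the subset concrete, here is a rough Python sketch of a parser for it. This is not higan's parser: parse_manifest and the dict layout are hypothetical, blank lines are assumed to be skippable, values are assumed to have only leading and trailing whitespace trimmed, and the sketch does not verify node-name characters or that indentation is applied consistently.

```python
def parse_manifest(text: str) -> list:
    """Parse the simplified BML subset into a list of root-level nodes.

    Each node is a dict: {"name": str, "value": str, "children": list}.
    """
    roots = []
    # Stack of (indent width, child list) for every currently open ancestor.
    stack = [(-1, roots)]
    for raw in text.replace("\r\n", "\n").split("\n"):
        if not raw.strip():
            continue  # assumption: blank lines carry no meaning
        body = raw.lstrip(" \t")
        indent = len(raw) - len(body)
        # Split on the first ':' only, so values may themselves contain ':'.
        name, _, value = body.partition(":")
        node = {"name": name.strip(), "value": value.strip(), "children": []}
        # Close every open node indented at least as deeply as this line.
        while stack[-1][0] >= indent:
            stack.pop()
        stack[-1][1].append(node)
        stack.append((indent, node["children"]))
    return roots
```

Returning a list rather than a single node leaves room for documents with more than one root-level node, and a node such as board can carry a value and children at the same time.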
Every manifest must have a root-level
game node. All relevant information is
stored under this node, so as to enable direct transposition of game database
entries into individual manifest files.
The sha256 field is optional for individual games, and may be omitted for
rewritable media such as BS-X Satellaview memory cartridges and Nintendo Power
cartridges. The SHA256 value itself is the combined hash of all relevant ROM
files. The hashing order for this is program ROM, then data ROM, then character
ROM, then firmware boot ROM, then firmware program ROM, then firmware data ROM.
No cases exist of games with multiple coprocessor firmware. Volatile data such
as EEPROM, Flash, and RAM components are not considered for the SHA256 hash.
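As an illustration of the hashing order, here is a small Python sketch. It assumes that "combined hash" means a single SHA256 over the ROM contents fed in the order listed above; the role names and the combined_sha256 helper are hypothetical labels of my own, not identifiers from higan or icarus.

```python
import hashlib

# Hashing order as listed above. The role names are illustrative labels only.
HASH_ORDER = [
    "program ROM",
    "data ROM",
    "character ROM",
    "firmware boot ROM",
    "firmware program ROM",
    "firmware data ROM",
]

def combined_sha256(roms: dict) -> str:
    """Hash whichever ROM images are present, in the documented order."""
    digest = hashlib.sha256()
    for role in HASH_ORDER:
        if role in roms:
            digest.update(roms[role])  # assumption: contents are concatenated
    return digest.hexdigest()
```

Volatile components (EEPROM, Flash, RAM) have no entry in HASH_ORDER, so they never contribute to the result.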
The label field is mandatory. This is used by higan to display the game
title in the title bar. The idea is that this field represents what is printed
on the game labels. It should be in the native language of the game itself, and
should include any special characters that can be encoded in UTF-8, excluding
emoji. The simple example above has the name of the game written in Japanese.
The name field is optional for standalone gamepaks, but is mandatory for
database entries, so that icarus can import the games and name them
appropriately. This field represents a filesystem-safe filename. It should only
use characters between 0x20 and 0x7e, and should not use reserved filename
characters such as
<>/:*?|. It is very strongly recommended to have a name
field, even if it is identical to the label field as in the complex example
above. It is especially useful in the case of the simple example, so that the
label can be read by those who are not fluent in Japanese.
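The constraints on the name field are easy to check mechanically. The sketch below is only an illustration: is_safe_name is a hypothetical helper, and its reserved set contains just the characters listed above, which the text presents as examples rather than a complete list.

```python
# Reserved filename characters named above; possibly not exhaustive.
RESERVED = set("<>/:*?|")

def is_safe_name(name: str) -> bool:
    """Return True if every character is in 0x20..0x7e and not reserved."""
    return bool(name) and all(
        0x20 <= ord(ch) <= 0x7E and ch not in RESERVED for ch in name
    )
```

With this check, the name from the simple example passes, while its Japanese label would be rejected as a filename.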
The region field is mandatory, and is used by higan to determine the
regional hardware and TV frequency variants used for proper emulation. When game
serial numbers are present that include the regional information within them,
these values are used. In the simple example above, the region ends in -JPN,
which indicates the game is from Japan, and uses the NTSC standard. In the
complex example, -USA obviously implies the game is from the US. If the game
serials are not known, the value will be up to the emulated system's
requirements. The NES/Famicom will need either NTSC or PAL, while the Mega
Drive/Genesis will need NTSC-J, NTSC-U, or PAL. The PC Engine will always be
NTSC. Technically, the region could be omitted for the WonderSwan, but for
consistency, it should always be included.
The revision field is mandatory, and is used to uniquely identify game
revisions. If the game cartridge ROM chip has a serial number that also includes
revision information, this should be used. In the simple example above, the -0
indicates a 1.0 revision. A -1 would indicate a 1.1 revision, and so forth. If
a game does not have clear ROM serials, then the revision will start at 1.0 and
increment with each newer release sequentially. No known cases exist of a
revision field exceeding 1.5. In the case of the Sufami Turbo, there are no
known revisions of any games released for the device, and so although the field
is technically not required here, it is included anyway for consistency. higan
will not actually change its emulation based on this field.
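The mapping from a serial suffix to a revision number can be sketched as follows. This is only an illustration of the convention described above, where -0 means 1.0 and -1 means 1.1; revision_from_serial is a hypothetical helper, and it assumes the revision is the numeric part after the final hyphen.

```python
def revision_from_serial(serial: str) -> str:
    """Map a revision serial such as 'SHVC-AM4J-0' to a '1.x' version string."""
    # Assumption: the revision digit(s) follow the final hyphen.
    suffix = serial.rsplit("-", 1)[-1]
    if not (suffix.isascii() and suffix.isdigit()):
        raise ValueError(f"no numeric revision suffix in {serial!r}")
    return f"1.{int(suffix)}"
```

Since higan does not change its emulation based on this field, a helper like this would only matter for display or database tooling.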
The note field is optional. Anything can be included here.
The ordering of these fields is unimportant, and is up to user preference, so long as all required fields are present.
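A check for the mandatory fields might look like the sketch below. It assumes the children of the game node have already been collected into a plain dict of name to value, which is a hypothetical representation and not higan's internal one; missing_fields is likewise a made-up name.

```python
# Fields described above as mandatory under the game node.
REQUIRED_FIELDS = ("label", "region", "revision")

def missing_fields(fields: dict) -> list:
    """Return the mandatory field names that are absent or empty."""
    return [name for name in REQUIRED_FIELDS if not fields.get(name)]
```

Because the check looks fields up by name, it does not care about the order in which they appear in the manifest.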
Specification — Boards
The board field is mandatory for games that are distributed on printed
circuit boards. It describes the components (memory, oscillators, etc.) that are
present on each individual game board.
If the game PCB contains a serial code that can be used to deduce the board layout, this is provided. In the complex example above, the 1K1B component indicates a DSP1 game with a 74LS memory mapper.
If the game does not contain a serial code, or it is not known due to poor game preservation of the given system, then it will be up to the individual system what appropriate value should go here to inform higan how to emulate the cartridge in question properly.
The long-term intention is to have per-system documentation to describe the board value for each. In the cases that all PCBs are uniform, as in the PC Engine, this field would just be left blank, but still present, so that child node components could be present.
The memory node is used to describe memory. Instead of being per-physical
chip, it is meant to describe per-purpose functionality. So for instance, the
same game might be released initially on two 8mbit program ROM chips, but later
re-released on a single 16mbit program ROM chip. Instead of trying to encode a
detail not used by emulation, only one memory entry will be present to describe
the full 16mbit program ROM. However, a data ROM, say from the SPC7110, or a
character ROM, say from the Famicom/NES, will be described separately, as this
detail is very important to emulation.
The type field encodes the type of memory used. ROM indicates truly
read-only memory; typically mask ROM or EPROM. RAM indicates rewritable memory,
such as SRAM or PSRAM. RTC indicates a real-time clock. EEPROM and Flash
represent memory which can be reprogrammed by software. Very obviously, this
field is mandatory.
Both RAM and RTC types can have an optional child specifier to indicate if a battery is connected to ensure non-volatile memory. If the battery child node is not present on type, then the memory contents will not be loaded or saved by higan. This is important as there exist PCBs that have CR2032 battery connection slots, yet are not populated. As such, this detail has to be encoded on a per-game basis.
The size field encodes the number of bytes present on the memory chip. This
can be expressed as either a decimal value (e.g. 256) or a hex value (e.g.
0x100). Hex is preferred. This field does not support logical units such as
8mbit, because many memory sizes do not fit neatly into these descriptions,
such as the 0xc00-byte data ROM for the Cx4 coprocessor. The size field is
required because some boards can support multiple ROM sizes.
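Reading a size value in either notation takes only a few lines. The following is a sketch, with parse_size being a hypothetical name; it treats a 0x prefix as hex and everything else as decimal.

```python
def parse_size(value: str) -> int:
    """Parse a manifest size given as decimal ('256') or hex ('0x100')."""
    value = value.strip()
    if value.lower().startswith("0x"):
        return int(value[2:], 16)
    return int(value, 10)  # rejects unit suffixes such as '8mbit'
```

For example, the 0x200000 program ROM in the simple example is 2097152 bytes, and the Cx4 data ROM of 0xc00 bytes is 3072 bytes, which no round megabit figure describes.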
The content field is mandatory, and describes the purpose of the memory
chip. This can contain values such as Program, Data, Save, Time, and so forth.
The manufacturer field is only sometimes mandatory, to disambiguate
coprocessor firmware by the same content names, as well as to indicate specific
flash-based chips with different protocols in combination with the memory size.
The architecture field is only sometimes mandatory. It should be
populated when the manufacturer field is, if the value is known, and represents
the ISA of a given CPU. So for instance, the NEC uPD7725 contains internal
program and data ROM, plus internal data RAM. This memory may exist inside of
another chip, but as the memory nodes contain memory by use, they are listed
separately, and are tagged as being part of said coprocessor. The architecture
field can also further disambiguate edge cases such as NEC releasing both the
uPD7725 and the uPD96050.
With all of these fields combined, the exact ordering of memory nodes becomes position-independent. That is to say, it doesn't matter how they are ordered, each should be uniquely identifiable.
higan locates memory files via lowercase(architecture.content.type), or if the
architecture field is empty, lowercase(content.type). So for instance, the
complex example above yields program.rom and upd7725.data.rom, among others.
Overriding of filenames is not permitted, so as to enable manifest-free operation and fewer database fields, for easier hashing and loading of individual component files.
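The filename rule can be written out directly. This Python sketch simply restates the lowercase(architecture.content.type) rule from above; memory_filename is a hypothetical helper, not a function from higan.

```python
def memory_filename(content: str, type_: str, architecture: str = "") -> str:
    """Build the filename higan looks for, per the rule described above."""
    # The architecture prefix is used only when the field is non-empty.
    parts = [architecture, content, type_] if architecture else [content, type_]
    return ".".join(parts).lower()
```

Applied to the memory nodes of the complex example, this gives program.rom, save.ram, upd7725.program.rom, upd7725.data.rom, and upd7725.data.ram, which shows why the combination of fields must be unique per node.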
The oscillator node is used to describe clocks that exist on PCBs. A game
may optionally include a quartz crystal or ceramic clock, and this component
node documents such cases. For example, the NEC uPD96050 may ship with an
oscillator with an effective frequency of either 11MHz or 15MHz. This field
disambiguates this case.
The frequency field describes the exact oscillator frequency in Hz (hertz).
No suffix is added. So in the complex example above, 7600000 implies a clock
rate of 7.6MHz.
Specification — Extensibility
Manifests may contain any number of additional fields. higan does not use them at present, but it may add fields in the future as needed to emulate newly discovered edge cases.
As an example, one may wish to describe an oscillator in more detail:
oscillator
  frequency: 11000000
  divider: 2
  type: quartz
This is still a valid manifest; the extra fields will simply be ignored.
There can even be additional root-level nodes present, as unlike XML, BML does not require only one root-level node per document.