mirror of
https://github.com/Abdess/retroarch_system.git
synced 2026-04-13 12:22:33 -05:00
docs: add wiki pages for all audiences, fix .old.yml leak
9 new wiki pages: getting-started, faq, troubleshooting, advanced-usage, verification-modes, adding-a-platform, adding-a-scraper, testing-guide, release-process. Updated architecture.md with mermaid diagrams, tools.md with full pipeline and target/exporter sections, profiling.md with missing fields, index.md with glossary and nav links. Expanded CONTRIBUTING.md from stub to full contributor guide. Filter .old.yml from load_emulator_profiles, generate_db alias collection, and generate_readme counts. Fix BizHawk sha1 mode in tools.md, fix RetroPie path, fix export_truth.py typos.
@@ -9,6 +9,34 @@ The source code is the reference because it reflects actual behavior.
|
||||
Documentation, .info files, and wikis are useful starting points
|
||||
but are verified against the code.
|
||||
|
||||
### Source hierarchy
|
||||
|
||||
Documentation and metadata are valuable starting points, but they can
|
||||
fall out of sync with the actual code over time. The desmume2015 .info
|
||||
file is a good illustration: it declares `firmware_count=3`, but the
|
||||
source code at the pinned version opens zero firmware files. Cross-checking
|
||||
against the source helps catch that kind of gap early.
|
||||
|
||||
When sources conflict, priority follows the chain of actual execution:
|
||||
|
||||
1. **Original emulator source** (ground truth, what the code actually does)
2. **Libretro port** (may adapt paths, add compatibility shims, or drop features)
3. **.info metadata** (declarative, may be outdated or copied from another core)
|
||||
|
||||
For standalone emulators like BizHawk or amiberry, there is only one
|
||||
level. The emulator's own codebase is the single source of truth. No
|
||||
.info, no wrapper, no divergence to track.
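
The priority chain above can be sketched as a small lookup. This is an illustration only: `SOURCE_PRIORITY`, `Claim`, and `resolve` are hypothetical names, not part of this repository's tooling, and the firmware filenames are placeholders.

```python
from typing import NamedTuple

# Lower number wins. Hypothetical labels for the three levels listed above.
SOURCE_PRIORITY = {"upstream": 0, "libretro_port": 1, "info": 2}


class Claim(NamedTuple):
    source: str            # one of the SOURCE_PRIORITY keys
    firmware_files: tuple  # filenames this source says the core loads


def resolve(claims):
    """Return the claim coming from the highest-priority source present."""
    return min(claims, key=lambda claim: SOURCE_PRIORITY[claim.source])


# Shaped like the desmume2015 case: the .info declares three firmware
# files, while the core source opens none. Filenames are placeholders.
claims = [
    Claim("info", ("fw_a.bin", "fw_b.bin", "fw_c.bin")),
    Claim("libretro_port", ()),
]
winner = resolve(claims)
```

Here `winner` is the claim traced from the core source, so the profile would list no firmware files even though the .info declares three.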
|
||||
|
||||
A note on libretro port differences: the most common change is path
|
||||
resolution. The upstream emulator loads files from the current working
|
||||
directory; the libretro wrapper redirects to `retro_system_directory`.
|
||||
This is normal adaptation, not a divergence worth documenting. Similarly,
|
||||
filename changes like `naomi2_eeprom.bin` becoming `n2_eeprom.bin` are
|
||||
often deliberate. RetroArch uses a single shared system directory for
|
||||
all cores, so the port renames files to prevent collisions between cores
|
||||
that emulate different systems but happen to use the same generic
|
||||
filenames. The upstream name goes in `aliases:`.
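
As a rough sketch of why the upstream name is kept in `aliases:`, the snippet below indexes a file entry under every name it is known by. `build_name_index` is a hypothetical helper written for this example; the repository's own indexing code may work differently.

```python
def build_name_index(files):
    """Map every known filename (name plus aliases) to its file entry."""
    index = {}
    for entry in files:
        for filename in [entry["name"], *entry.get("aliases", [])]:
            index[filename] = entry
    return index


# The port-specific name is the entry's name; the upstream name is an alias.
files = [
    {"name": "n2_eeprom.bin", "aliases": ["naomi2_eeprom.bin"], "mode": "libretro"},
]
index = build_name_index(files)
```

Looking up either `n2_eeprom.bin` or `naomi2_eeprom.bin` returns the same entry, so a collection that stores the file under its upstream name still resolves.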
|
||||
|
||||
## Steps
|
||||
|
||||
### 1. Find the source code
|
||||
@@ -21,9 +49,27 @@ Check these locations in order:
|
||||
|
||||
Always clone both upstream and libretro port to compare.
|
||||
|
||||
For libretro cores, cloning both repositories and diffing them reveals
|
||||
what the port changed. Path changes (fopen of a relative path becoming
|
||||
a system_dir lookup) are expected. What matters are file additions the
|
||||
port introduces, files the port dropped, or hash values that differ
|
||||
between the two codebases.
|
||||
|
||||
If the source is hosted outside GitHub, it's worth exploring further. Emulator
source also turns up on GitLab, Codeberg, SourceForge, Bitbucket, archive.org
snapshots, and community mirror tarballs. Inspecting copyright headers
or license strings in the libretro fork often points to the original
author's site. The upstream code exists somewhere; it's worth continuing
the search before concluding the source is unavailable.
|
||||
|
||||
One thing worth noting: even when the same repository was analyzed for
|
||||
a related profile (e.g., fbneo for arcade systems), it helps to do a
|
||||
fresh pass for each new profile. When fbneo_neogeo was profiled, the
|
||||
NeoGeo subset referenced BIOS files that the main arcade analysis
|
||||
hadn't encountered. A fresh look avoids carrying over blind spots.
|
||||
|
||||
### 2. Trace file loading
|
||||
|
||||
Read the code flow. Don't grep keywords by assumption.
|
||||
Read the code flow, tracing from the entry point.
|
||||
Each emulator has its own way of loading files.
|
||||
|
||||
Look for:
|
||||
@@ -34,6 +80,19 @@ Look for:
|
||||
- Hash validation (MD5, CRC32, SHA1 comparisons in code)
|
||||
- Size validation (`fseek`/`ftell`, `stat`, fixed buffer sizes)
|
||||
|
||||
Grepping for "bios" or "firmware" across the source tree can be a
|
||||
useful first pass, but it may miss emulators that use different terms
|
||||
(bootrom, system ROM, IPL, program.rom) and can surface false matches
|
||||
from test fixtures or comments.
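
If a keyword scan is used as a first pass, a broader term list and a directory filter reduce both problems. The sketch below is one possible shape for it; the skipped directory names and file extensions are assumptions to adjust per codebase, and the results are only leads to confirm by reading the code.

```python
import os
import re

# Terms from the paragraph above. "ipl" is guarded so it does not match
# inside ordinary words such as "multiple".
TERMS = re.compile(
    r"bios|firmware|bootrom|system[ _]rom|(?<![a-z])ipl(?![a-z])|program\.rom",
    re.IGNORECASE,
)
SKIP_DIRS = {"test", "tests", "docs", ".git"}      # assumed exclusion list
SOURCE_EXTS = (".c", ".cpp", ".h", ".hpp", ".cs")  # assumed source extensions


def first_pass(root):
    """Yield (path, line_number, line) for source lines mentioning a term."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune skipped directories in place so os.walk does not descend.
        dirnames[:] = [d for d in dirnames if d.lower() not in SKIP_DIRS]
        for fname in filenames:
            if not fname.endswith(SOURCE_EXTS):
                continue
            path = os.path.join(dirpath, fname)
            with open(path, encoding="utf-8", errors="replace") as handle:
                for number, line in enumerate(handle, 1):
                    if TERMS.search(line):
                        yield path, number, line.strip()
```

Matches inside comments still appear in the output, which is one reason the scan cannot replace tracing the load path.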
|
||||
|
||||
A more reliable approach is starting from the entry point
(`retro_load_game` for libretro, `main()` for standalone) and tracing
the actual file-open calls forward. The loading flow varies widely
between emulators: Dolphin loads region-specific IPL files through a
boot sequence object, BlastEm reads a list of ROM paths from a
configuration structure, and same_cdi opens CD-i BIOS files through a
machine initialization routine.
|
||||
|
||||
### 3. Determine required vs optional
|
||||
|
||||
This is decided by code behavior, not by judgment:
|
||||
@@ -42,6 +101,18 @@ This is decided by code behavior, not by judgment:
|
||||
- **optional**: the core works with degraded functionality without it
|
||||
- **hle_fallback: true**: the core has a high-level emulation path when the file is missing
|
||||
|
||||
The decision is based on the code's behavior. If the core crashes or
|
||||
refuses to boot without the file, it is required. If it continues with
|
||||
degraded functionality (missing boot animation, different fonts, reduced
|
||||
audio in menus), it is optional. This keeps the classification objective
|
||||
and consistent across all profiles.
|
||||
|
||||
When a core has HLE (high-level emulation), the real BIOS typically
|
||||
gives better accuracy, but the core functions without it. These files
|
||||
are marked with `hle_fallback: true` and `required: false`. The file
|
||||
still ships in packs (better experience for the user), but its absence
|
||||
does not raise alarms during verification.
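
The rule can be written down as a small function, which makes the mapping from observed behavior to profile fields explicit. `classify` and its two parameters are hypothetical names for this sketch; the inputs are whatever the source analysis established.

```python
def classify(boots_without_file, has_hle_path):
    """Translate observed code behavior into file entry fields.

    boots_without_file: True if the core still runs when the file is missing.
    has_hle_path: True if the source has a high-level emulation fallback.
    """
    if has_hle_path:
        # The core functions without the real BIOS, so it is not required.
        return {"required": False, "hle_fallback": True}
    return {"required": not boots_without_file}
```

For example, `classify(False, False)` gives `{"required": True}`, while a core with an HLE path gives `{"required": False, "hle_fallback": True}`.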
|
||||
|
||||
### 4. Document divergences
|
||||
|
||||
When the libretro port differs from the upstream:
|
||||
@@ -54,6 +125,18 @@ Path differences (current dir vs system_dir) are normal adaptation,
|
||||
not a divergence. Name changes (e.g. `naomi2_` to `n2_`) may be intentional
|
||||
to avoid conflicts in the shared system directory.
|
||||
|
||||
RetroArch's system directory is shared by every installed core. When
|
||||
the libretro port renames a file, it is usually solving a real problem:
|
||||
two cores that both expect `bios.rom` would overwrite each other's file. The
upstream name goes in `aliases:`, and the port-specific name is marked
`mode: libretro`, so both names are indexed.
|
||||
|
||||
True divergences worth documenting are: files the port adds that the
|
||||
upstream never loads, files the upstream loads that the port dropped
|
||||
(a gap in the port), and hash differences in embedded ROM data between
|
||||
the two codebases. These get noted in the profile because they affect
|
||||
what the user actually needs to provide.
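
One way to separate true divergences from deliberate renames is to compare the file lists traced from each codebase. The sketch below assumes each list has already been reduced to a `{filename: hash}` mapping by hand; `find_divergences` and the sample filenames are hypothetical.

```python
def find_divergences(upstream, port, renames=None):
    """Compare {filename: hash} maps traced from each codebase.

    renames maps port names back to upstream names, so a deliberate
    rename is not reported as one added file plus one dropped file.
    """
    renames = renames or {}
    port_by_upstream_name = {
        renames.get(name, name): digest for name, digest in port.items()
    }
    shared = set(upstream) & set(port_by_upstream_name)
    return {
        "added": sorted(set(port_by_upstream_name) - set(upstream)),
        "dropped": sorted(set(upstream) - set(port_by_upstream_name)),
        "hash_mismatch": sorted(
            name for name in shared
            if upstream[name] != port_by_upstream_name[name]
        ),
    }


# Placeholder data: one rename, one dropped file, one added file,
# and one file whose embedded hash differs.
upstream = {"naomi2_eeprom.bin": "aa", "font.rom": "bb", "boot.bin": "cc"}
port = {"n2_eeprom.bin": "aa", "boot.bin": "dd", "extra.dat": "ee"}
report = find_divergences(upstream, port, {"n2_eeprom.bin": "naomi2_eeprom.bin"})
```

The rename does not show up in `report`; only `extra.dat`, `font.rom`, and `boot.bin` do, and those are the items that belong in the profile notes.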
|
||||
|
||||
### 5. Write the YAML profile
|
||||
|
||||
```yaml
|
||||
@@ -80,6 +163,46 @@ files:
|
||||
source_ref: Source/Core/Core/Boot/Boot_BS2Emu.cpp:42
|
||||
```
|
||||
|
||||
### Writing style
|
||||
|
||||
Notes in a profile describe what the core does and stay focused on
which files get loaded, how, and from where. Comparisons with other
cores, disclaimers, and feature coverage beyond file requirements
belong in external documentation. The profile is a technical spec.
|
||||
|
||||
Profiles are standalone documentation. Someone should be able to take
|
||||
a single YAML file and integrate it into their own project without
|
||||
knowing anything about this repository's database, directory layout,
|
||||
or naming conventions. The YAML documents what the emulator expects.
|
||||
The tooling resolves the YAML against the local file collection
|
||||
separately.
|
||||
|
||||
A few field conventions that protect the toolchain:
|
||||
|
||||
- `type:` is operational. `resolve_platform_cores()` uses it to filter
|
||||
which profiles apply to a platform. Valid values are `libretro`,
|
||||
`standalone + libretro`, `standalone`, `alias`, `launcher`, `game`,
|
||||
`utility`, `test`. Putting a classification concept here (like
|
||||
"bizhawk-native") breaks the filtering. A BizHawk core is
|
||||
`type: standalone`.
|
||||
|
||||
- `core_classification:` is descriptive. It documents the relationship
|
||||
between the core and the original emulator (pure_libretro,
|
||||
official_port, community_fork, frozen_snapshot, etc.). It has no
|
||||
effect on tooling behavior.
|
||||
|
||||
- Alternative filenames go in `aliases:` on the file entry (rather than
|
||||
as separate entries in platform YAMLs or `_shared.yml`). When the same
|
||||
physical ROM is known by three names across different platforms, one
|
||||
name is `name:` and the rest are `aliases:`.
|
||||
|
||||
- Hashes come from source code. If the source has a hardcoded hex
|
||||
string (like emuscv's `635a978...` in memory.cpp), that goes in. If
|
||||
the source embeds ROM data as byte arrays (like ep128emu's roms.hpp),
|
||||
the bytes can be extracted and hashed. If the source performs no hash
|
||||
check at all, the hash is omitted from the profile. The .info or docs
|
||||
may list an MD5, but source confirmation makes it more reliable.
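
For the embedded byte array case, the hashes can be computed with the Python standard library once the bytes are extracted. This is a minimal sketch; `hash_rom_bytes` is a hypothetical helper, and the four-byte `rom` value stands in for data copied out of a source header.

```python
import hashlib
import zlib


def hash_rom_bytes(data):
    """Compute the size and hash fields a file entry can carry."""
    return {
        "size": len(data),
        "crc32": format(zlib.crc32(data) & 0xFFFFFFFF, "08x"),
        "md5": hashlib.md5(data).hexdigest(),
        "sha1": hashlib.sha1(data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
    }


# Stand-in for a byte array extracted from the emulator source.
rom = bytes([0x00, 0x01, 0x02, 0x03])
fields = hash_rom_bytes(rom)
```

The keys match the `size`, `crc32`, `md5`, `sha1`, and `sha256` file entry fields, so the result can be copied into the profile alongside a `source_ref` pointing at the array.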
|
||||
|
||||
### 6. Validate
|
||||
|
||||
```bash
|
||||
@@ -87,6 +210,38 @@ python scripts/cross_reference.py --emulator dolphin --json
|
||||
python scripts/verify.py --emulator dolphin
|
||||
```
|
||||
|
||||
### Lessons learned
|
||||
|
||||
These are patterns that have come up while building profiles. They are
shared here in case they save time.
|
||||
|
||||
**.info metadata can lag behind the code.** The desmume2015 .info
|
||||
declares `firmware_count=3`, but the core source at the pinned version
|
||||
never opens any firmware file. The .info is useful as a starting point
|
||||
but benefits from a cross-check against the actual code.
|
||||
|
||||
**Fresh analysis per profile helps.** When fbneo was profiled for
|
||||
arcade systems, NeoGeo-specific BIOS files were outside the analysis
|
||||
scope. Profiling fbneo_neogeo later surfaced files the first pass
|
||||
hadn't covered. Doing a fresh pass for each profile, even on a
|
||||
familiar codebase, avoids carrying over blind spots.
|
||||
|
||||
**Path adaptation vs real divergence.** The libretro wrapper changing
|
||||
`fopen("./rom.bin")` to load from `system_dir` is the standard
|
||||
porting pattern. The file is the same; only the directory resolution
|
||||
changed. True divergences (added/removed files, different embedded
|
||||
data) are the ones worth documenting.
|
||||
|
||||
**Each core has its own loading logic.** snes9x and bsnes both
|
||||
emulate the Super Nintendo, but they handle the Super Game Boy BIOS
|
||||
and DSP firmware through different code paths. Checking the actual
|
||||
code for each core avoids assumptions based on a related profile.
|
||||
|
||||
**Code over docs.** Wiki pages and README files sometimes reference
|
||||
files from older versions or a different fork. If the source code
|
||||
does not load a particular file, it can be left out of the profile
|
||||
even if documentation mentions it.
|
||||
|
||||
## YAML field reference
|
||||
|
||||
### Profile fields
|
||||
@@ -94,18 +249,22 @@ python scripts/verify.py --emulator dolphin
|
||||
| Field | Required | Description |
|-------|----------|-------------|
| `emulator` | yes | display name |
| `type` | yes | `libretro`, `standalone`, `standalone + libretro`, `alias`, `launcher` |
| `type` | yes | `libretro`, `standalone`, `standalone + libretro`, `alias`, `launcher`, `game`, `utility`, `test` |
| `core_classification` | no | `pure_libretro`, `official_port`, `community_fork`, `frozen_snapshot`, `enhanced_fork`, `game_engine`, `embedded_hle`, `alias`, `launcher` |
| `source` | yes | libretro core repository URL |
| `upstream` | no | original emulator repository URL |
| `profiled_date` | yes | date of source analysis |
| `core_version` | yes | version analyzed |
| `display_name` | no | full display name (e.g. "Sega - Mega Drive (BlastEm)") |
| `systems` | yes | list of system IDs this core handles |
| `cores` | no | list of core names (default: profile filename) |
| `cores` | no | list of upstream core names for buildbot/target matching |
| `mode` | no | default mode: `standalone`, `libretro`, or `both` |
| `verification` | no | how the core verifies BIOS: `existence` or `md5` |
| `files` | yes | list of file entries |
| `notes` | no | free-form technical notes |
| `exclusion_note` | no | why the profile has no files |
| `data_directories` | no | references to data dirs in `_data_dirs.yml` |
| `exclusion_note` | no | why the profile has no files despite .info declaring firmware |
| `analysis` | no | structured per-subsystem analysis (capabilities, supported modes) |
| `platform_details` | no | per-system platform-specific details (paths, romsets, forced systems) |
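
A quick self-check against the table can catch a missing field or an invalid `type` before running the real validation scripts. The sketch below works on an already-parsed profile dict; `check_profile` is a hypothetical helper, and treating mere presence of a key as sufficient is an assumption made for brevity.

```python
# Values and required fields as listed in the table above.
VALID_TYPES = {
    "libretro", "standalone", "standalone + libretro", "alias",
    "launcher", "game", "utility", "test",
}
REQUIRED_FIELDS = (
    "emulator", "type", "source", "profiled_date",
    "core_version", "systems", "files",
)


def check_profile(profile):
    """Return a list of problems found in a parsed profile dict."""
    problems = [
        f"missing field: {field}"
        for field in REQUIRED_FIELDS
        if field not in profile
    ]
    if "type" in profile and profile["type"] not in VALID_TYPES:
        problems.append(f"invalid type: {profile['type']!r}")
    return problems
```

A profile with `type: bizhawk-native` would be reported here, which is the kind of value that breaks platform filtering.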
|
||||
|
||||
### File entry fields
|
||||
|
||||
@@ -113,20 +272,20 @@ python scripts/verify.py --emulator dolphin
|
||||
|-------|-------------|
| `name` | filename as the core expects it |
| `required` | true if the core needs this file to function |
| `system` | system ID this file belongs to |
| `system` | system ID this file belongs to (for multi-system profiles) |
| `size` | expected size in bytes |
| `min_size`, `max_size` | size range when the code accepts a range |
| `md5`, `sha1`, `crc32`, `sha256` | expected hashes from source code |
| `validation` | list of checks the code performs: `size`, `crc32`, `md5`, `sha1` |
| `validation` | checks the code performs: `size`, `crc32`, `md5`, `sha1`, `adler32`, `signature`, `crypto`. Can be a list or dict `{core: [...], upstream: [...]}` for divergent checks |
| `aliases` | alternate filenames for the same file |
| `mode` | `libretro`, `standalone`, or `both` |
| `hle_fallback` | true if a high-level emulation path exists |
| `category` | `bios` (default), `game_data`, `bios_zip` |
| `region` | geographic region (e.g. `north-america`, `japan`) |
| `source_ref` | source file and line number |
| `path` | path relative to system directory |
| `source_ref` | source file and line number (e.g. `boot.cpp:42`) |
| `path` | destination path relative to system directory |
| `description` | what this file is |
| `note` | additional context |
| `archive` | parent ZIP if this file is inside an archive |
| `contents` | structure of files inside a BIOS ZIP |
| `storage` | `embedded` (default), `external`, `user_provided` |
| `contents` | structure of files inside a BIOS ZIP (`name`, `description`, `size`, `crc32`) |
| `storage` | `large_file` for files > 50 MB stored as release assets |
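
Because `validation` can be either a list or a `{core: [...], upstream: [...]}` dict, code that consumes profiles has to handle both shapes. The sketch below normalizes to the dict form; `normalize_validation` is a hypothetical helper, and reading a plain list as "both codebases perform the same checks" is an assumption.

```python
def normalize_validation(value):
    """Return validation checks as {"core": [...], "upstream": [...]}."""
    if value is None:
        return {"core": [], "upstream": []}
    if isinstance(value, dict):
        return {
            "core": list(value.get("core", [])),
            "upstream": list(value.get("upstream", [])),
        }
    # Assumption: a plain list applies to both the core and the upstream.
    return {"core": list(value), "upstream": list(value)}
```

With this, `["size", "crc32"]` and `{"core": ["size"], "upstream": ["size", "md5"]}` can be processed by the same downstream code.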