# Architecture - RetroBIOS

## Directory structure

```
bios/                     BIOS and firmware files, organized by Manufacturer/Console/
  Manufacturer/Console/   canonical files (one per unique content)
    .variants/            alternate versions (different hash, same purpose)
emulators/                one YAML profile per core/engine
platforms/                one YAML config per platform (scraped from upstream)
  _shared.yml             shared file groups across platforms
  _registry.yml           platform metadata (logos, scrapers, status, install config)
  _data_dirs.yml          data directory definitions (Dolphin Sys, PPSSPP...)
  targets/                hardware target configs + _overrides.yml
scripts/                  all tooling (Python, pyyaml only dependency)
  scraper/                upstream scrapers (libretro, batocera, recalbox...)
  scraper/targets/        hardware target scrapers (retroarch, batocera, emudeck, retropie)
  exporter/               native format exporters (batocera, recalbox, emudeck...)
install/                  JSON install manifests per platform
  targets/                JSON target manifests per platform (cores per architecture)
data/                     cached data directories (not BIOS, fetched at build)
schemas/                  JSON schemas for validation
tests/                    E2E test suite with synthetic fixtures
_mame_clones.json         MAME parent/clone set mappings
dist/                     generated packs (gitignored)
.cache/                   hash cache and large file downloads (gitignored)
```

## Data flow

```
Upstream sources           Scrapers parse         generate_db.py scans
  System.dat (libretro)    + fetch versions       bios/ on disk
  batocera-systems                                builds database.json
  es_bios.xml (recalbox)                          (SHA1 primary key,
  core-info .info files                           indexes: by_md5, by_name,
  FirmwareDatabase.cs                             by_crc32, by_path_suffix)
  MAME/FBNeo source

emulators/*.yml            verify.py checks       generate_pack.py resolves
  source-verified          platform-native        files by hash, builds ZIP
  from code                verification           packs per platform

truth.py generates         diff_truth.py          export_native.py
  ground truth from        compares truth vs      exports to native formats
  emulator profiles        scraped platform       (DAT, XML, JSON, Bash)
```

The pipeline runs all steps in sequence: DB, data dirs, MAME/FBNeo hashes,
verify, packs, install manifests, target manifests, consistency check,
pack integrity, README, site. See [tools](tools.md) for the full pipeline reference.

```mermaid
graph LR
    A[generate_db] --> B[refresh_data_dirs]
    B --> C[MAME/FBNeo hashes]
    C --> D[verify --all]
    D --> E[generate_pack --all]
    E --> F[install manifests]
    F --> G[target manifests]
    G --> H[consistency check]
    H --> H2[pack integrity]
    H2 --> I[generate_readme]
    I --> J[generate_site]

    style A fill:#2d333b,stroke:#adbac7,color:#adbac7
    style D fill:#2d333b,stroke:#adbac7,color:#adbac7
    style E fill:#2d333b,stroke:#adbac7,color:#adbac7
    style H2 fill:#2d333b,stroke:#adbac7,color:#adbac7
    style J fill:#2d333b,stroke:#adbac7,color:#adbac7
```

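
The database step in the data flow (SHA1 as primary key, with secondary indexes) can be illustrated with a minimal sketch. This is not the actual `generate_db.py`; the function names and the entry layout (`path`, `name`, `size`) are assumptions made for illustration, and the `by_path_suffix` index is omitted.

```python
import hashlib
import zlib
from pathlib import Path


def hash_file(path: Path) -> dict:
    """Compute the hashes used as keys and indexes (hypothetical entry layout)."""
    data = path.read_bytes()
    return {
        "sha1": hashlib.sha1(data).hexdigest(),
        "md5": hashlib.md5(data).hexdigest(),
        "crc32": f"{zlib.crc32(data) & 0xFFFFFFFF:08x}",
        "size": len(data),
    }


def build_database(bios_root: Path) -> dict:
    """Scan a directory tree and index every file by SHA1, MD5, CRC32 and name."""
    db = {"files": {}, "by_md5": {}, "by_crc32": {}, "by_name": {}}
    for path in sorted(p for p in bios_root.rglob("*") if p.is_file()):
        entry = hash_file(path)
        entry["path"] = path.relative_to(bios_root).as_posix()
        entry["name"] = path.name
        sha1 = entry["sha1"]
        # SHA1 is the primary key: identical content is stored only once.
        db["files"].setdefault(sha1, entry)
        db["by_md5"][entry["md5"]] = sha1
        db["by_crc32"][entry["crc32"]] = sha1
        names = db["by_name"].setdefault(path.name, [])
        if sha1 not in names:
            names.append(sha1)
    return db
```

Keeping every secondary index as a pointer to the SHA1 key means a lookup by MD5, CRC32, or name always lands on the same canonical entry.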
## Three layers of data

| Layer | Source | Role |
|-------|--------|------|
| Platform YAML | Scraped from upstream | What the platform declares it needs |
| `_shared.yml` | Curated | Shared files across platforms, reflects actual behavior |
| Emulator profiles | Source-verified | What the code actually loads. Used for cross-reference and gap detection |

The pack combines the platform baseline (layer 1) with core requirements (layer 3).
It contains neither too many files (none from unused cores) nor too few (none missing for active cores).

The emulator's source code serves as ground truth for what files are needed,
what names they use, and what validation the emulator performs. Platform YAML
configs are scraped from upstream and are generally accurate, though they can
occasionally have gaps or stale entries. The emulator profiles complement the
platform data by documenting what the code actually loads. When the two disagree,
the profile takes precedence for pack generation: files the code needs are included
even if the platform does not declare them. Files the platform declares but no
profile references are kept as well (flagged during cross-reference), since the
upstream may cover cases not yet profiled.

```mermaid
graph TD
    PY[Platform YAML<br/>scraped from upstream] --> PG[Pack generation]
    EP[Emulator profiles<br/>source-verified] --> PG
    SH[_shared.yml<br/>curated shared files] --> PY
    SH --> EP
    PG --> ZIP[ZIP pack per platform]

    style PY fill:#2d333b,stroke:#adbac7,color:#adbac7
    style EP fill:#2d333b,stroke:#adbac7,color:#adbac7
    style SH fill:#2d333b,stroke:#adbac7,color:#adbac7
    style PG fill:#2d333b,stroke:#adbac7,color:#adbac7
    style ZIP fill:#2d333b,stroke:#adbac7,color:#adbac7
```

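
The precedence rule for combining layer 1 and layer 3 can be expressed with plain set operations. The sketch below is illustrative only: `plan_pack`, its arguments, and the file and core names in the usage example are hypothetical and do not reflect the real data model.

```python
def plan_pack(platform_declared, profiles, active_cores):
    """Combine the platform baseline with what active cores actually load.

    platform_declared: set of file names declared by the platform YAML
    profiles: dict mapping core name -> set of file names the code loads
    active_cores: set of core names in use on this platform
    """
    needed_by_code = set()
    for core in active_cores:
        needed_by_code |= profiles.get(core, set())

    # Files the code needs are included even when the platform omits them.
    included = platform_declared | needed_by_code
    undeclared = needed_by_code - platform_declared

    # Files no profile references are kept, but flagged for cross-reference.
    all_profiled = set().union(*profiles.values())
    unreferenced = platform_declared - all_profiled

    return sorted(included), sorted(undeclared), sorted(unreferenced)


included, undeclared, unreferenced = plan_pack(
    platform_declared={"a.bin", "b.bin", "legacy.rom"},
    profiles={"core_a": {"a.bin"}, "core_b": {"b.bin", "c.bin"}, "core_x": {"x.bin"}},
    active_cores={"core_a", "core_b"},
)
```

In this example `c.bin` is added because `core_b` loads it, `legacy.rom` is kept but flagged, and `x.bin` is left out because `core_x` is not active.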
## Pack grouping

Platforms that produce identical packs are grouped automatically.
RetroArch and Lakka share the same files and `base_destination` (`system/`),
so they produce one combined pack (`RetroArch_Lakka_BIOS_Pack.zip`).
RetroPie uses `BIOS/` as its base path, so it gets a separate pack.
With `--target`, the fingerprint also includes the target's cores, so platforms
with different hardware filters get separate packs.

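
One way to picture the grouping is a fingerprint computed over everything that affects a pack's content. The sketch below is an assumption about how such a fingerprint could work, not the project's implementation; `pack_fingerprint`, `group_platforms`, and the config keys are hypothetical.

```python
import hashlib
from collections import defaultdict


def pack_fingerprint(base_destination, files, target_cores=None):
    """Hash the inputs that determine pack content (hypothetical scheme)."""
    h = hashlib.sha1()
    h.update(base_destination.encode())
    for dest, sha1 in sorted(files):
        h.update(f"\0{dest}\0{sha1}".encode())
    if target_cores is not None:
        # Only present when a hardware target is selected.
        h.update(("\0cores:" + ",".join(sorted(target_cores))).encode())
    return h.hexdigest()


def group_platforms(platforms):
    """Group platform names whose packs would be byte-for-byte equivalent."""
    groups = defaultdict(list)
    for name, cfg in platforms.items():
        fp = pack_fingerprint(
            cfg["base_destination"], cfg["files"], cfg.get("target_cores")
        )
        groups[fp].append(name)
    return sorted(sorted(members) for members in groups.values())


files = [("scph5501.bin", "aa" * 20)]
groups = group_platforms({
    "RetroArch": {"base_destination": "system", "files": files},
    "Lakka": {"base_destination": "system", "files": files},
    "RetroPie": {"base_destination": "BIOS", "files": files},
})
```

Here RetroArch and Lakka land in one group while RetroPie stays alone because its base path differs.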
## Storage tiers

| Tier | Meaning |
|------|---------|
| `embedded` (default) | file is in the `bios/` directory, included in packs |
| `external` | file has a `source_url`, downloaded at pack build time |
| `user_provided` | user must provide the file (instructions included in pack) |

## Verification severity

How missing or mismatched files are reported:

| Mode | required + missing | optional + missing | hash mismatch |
|------|-------------------|-------------------|--------------|
| existence | WARNING | INFO | N/A |
| md5 | CRITICAL | WARNING | UNTESTED |

Files with `hle_fallback: true` are downgraded to INFO when missing
(the emulator has a software fallback).

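
The "missing file" columns of the table, together with the HLE downgrade, reduce to a small lookup. The function name and signature below are hypothetical; only the severity values come from the table above.

```python
# Severity for a missing file, keyed by (verification mode, required?).
MISSING_SEVERITY = {
    ("existence", True): "WARNING",
    ("existence", False): "INFO",
    ("md5", True): "CRITICAL",
    ("md5", False): "WARNING",
}


def missing_severity(mode, required, hle_fallback=False):
    """Return the severity reported when a file is missing."""
    if hle_fallback:
        # The emulator has a software fallback, so a missing file is only INFO.
        return "INFO"
    return MISSING_SEVERITY[(mode, required)]
```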
## Discrepancy detection

When a file passes platform verification (MD5 match) but fails
emulator-level validation (wrong CRC32, wrong size), a DISCREPANCY is reported.
The pack generator searches the repo for a variant that satisfies both.
If none exists, the platform version is kept.

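
A rough sketch of the variant search follows. It assumes the platform may accept several MD5 values (or none, in existence mode) and that the emulator checks CRC32 and size; the function names and the dict layout of each candidate are hypothetical.

```python
def emulator_ok(candidate, crc32=None, size=None):
    """Emulator-level validation: only the checks that are specified apply."""
    return (crc32 is None or candidate["crc32"] == crc32) and (
        size is None or candidate["size"] == size
    )


def choose_file(current, variants, accepted_md5s, crc32=None, size=None):
    """Pick a file satisfying both platform and emulator checks if one exists."""
    if emulator_ok(current, crc32, size):
        return current, "ok"
    for variant in variants:
        platform_ok = not accepted_md5s or variant["md5"] in accepted_md5s
        if platform_ok and emulator_ok(variant, crc32, size):
            return variant, "variant"
    # No variant satisfies both: keep the platform version and report it.
    return current, "discrepancy"
```

The fallback branch mirrors the rule above: when nothing satisfies both sides, the platform version stays in the pack and the discrepancy is only reported.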
## Security

- `safe_extract_zip()` prevents zip-slip path traversal attacks
- `deterministic_zip` rebuilds MAME ZIPs so same ROMs always produce the same hash
- `crypto_verify.py` and `sect233r1.py` verify 3DS RSA-2048 signatures and AES-128-CBC integrity
- ZIP inner ROM verification via `checkInsideZip()` replicates Batocera's behavior
- `md5_composite()` replicates Recalbox's composite ZIP hash

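
As an illustration of the zip-slip defence mentioned in the first bullet, the sketch below rejects any archive member whose resolved path would escape the destination directory. It is a generic example of the technique, not the project's `safe_extract_zip()`; the name `safe_extract_sketch` is hypothetical, and `Path.is_relative_to` requires Python 3.9 or newer.

```python
import zipfile
from pathlib import Path


def safe_extract_sketch(zip_path, dest):
    """Extract an archive only if every member stays inside dest."""
    dest = Path(dest).resolve()
    with zipfile.ZipFile(zip_path) as zf:
        # Validate all members first so nothing is written on failure.
        for info in zf.infolist():
            target = (dest / info.filename).resolve()
            if not target.is_relative_to(dest):
                raise ValueError(f"unsafe path in archive: {info.filename}")
        zf.extractall(dest)
        return [info.filename for info in zf.infolist()]
```

Checking every member before extracting anything avoids leaving a half-extracted tree behind when a malicious entry appears late in the archive.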
## Edge cases

| Case | Handling |
|------|---------|
| Batocera truncated MD5 (29 chars) | prefix match in resolution |
| `zippedFile` entries | MD5 is of the ROM inside the ZIP, not the ZIP itself |
| Regional variants (same filename) | `by_path_suffix` index disambiguates |
| MAME BIOS ZIPs | `contents` field documents inner structure |
| RPG Maker/ScummVM | excluded from dedup (NODEDUP) to preserve directory structure |
| `strip_components` in data dirs | flattens cache prefix to match expected path |
| case-insensitive dedup | prevents `font.rom` + `FONT.ROM` conflicts on Windows/macOS |
| frozen snapshot cores | `.info` may reflect current version while code is pinned to an old one. Only the frozen source at the pinned tag is reliable (e.g. desmume2015, mame2003) |

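
The truncated-MD5 case in the first row amounts to a prefix comparison. A minimal sketch, with a hypothetical helper name:

```python
def md5_matches(expected, actual):
    """Compare an expected MD5 (possibly truncated) with a full computed MD5."""
    expected = expected.strip().lower()
    actual = actual.strip().lower()
    if len(expected) == 32:
        return expected == actual
    # Truncated value (e.g. 29 characters): accept a prefix match.
    return 0 < len(expected) < 32 and actual.startswith(expected)
```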
### File resolution chain

`resolve_local_file` in `common.py` tries each strategy in order, returning the
first match. Used by both `verify.py` and `generate_pack.py`.

```mermaid
graph TD
    START([resolve_local_file]) --> S0{path_suffix<br/>exact match?}
    S0 -- yes --> EXACT([exact])
    S0 -- no --> S1{SHA1<br/>exact match?}
    S1 -- yes --> EXACT
    S1 -- no --> S2{MD5 direct<br/>or truncated?}
    S2 -- yes --> MD5([md5_exact])
    S2 -- no --> S3{name + aliases<br/>no MD5?}
    S3 -- yes --> EXACT
    S3 -- no --> S4{name + aliases<br/>md5_composite /<br/>direct MD5?}
    S4 -- match --> EXACT
    S4 -- name only --> HM([hash_mismatch])
    S4 -- no --> S5{zippedFile<br/>inner ROM MD5?}
    S5 -- yes --> ZE([zip_exact])
    S5 -- no --> S6{MAME clone<br/>map lookup?}
    S6 -- yes --> MC([mame_clone])
    S6 -- no --> S7{data_dir<br/>cache scan?}
    S7 -- yes --> DD([data_dir])
    S7 -- no --> S8{agnostic<br/>fallback?}
    S8 -- yes --> AG([agnostic_fallback])
    S8 -- no --> NF([not_found])

    style START fill:#2d333b,stroke:#adbac7,color:#adbac7
    style EXACT fill:#2d333b,stroke:#adbac7,color:#adbac7
    style MD5 fill:#2d333b,stroke:#adbac7,color:#adbac7
    style HM fill:#2d333b,stroke:#adbac7,color:#adbac7
    style ZE fill:#2d333b,stroke:#adbac7,color:#adbac7
    style MC fill:#2d333b,stroke:#adbac7,color:#adbac7
    style DD fill:#2d333b,stroke:#adbac7,color:#adbac7
    style AG fill:#2d333b,stroke:#adbac7,color:#adbac7
    style NF fill:#2d333b,stroke:#adbac7,color:#adbac7
```

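
The "first match wins" structure of the chain can be sketched with only three of the strategies (SHA1, MD5 with truncation, and name plus aliases). This is a simplified illustration, not `resolve_local_file` itself; the function name and the `db` layout (`files`, `by_md5`, `by_name`) are assumptions.

```python
def resolve_sketch(entry, db):
    """Return (path, status) using a reduced subset of the resolution chain."""
    # SHA1 exact match.
    sha1 = entry.get("sha1")
    if sha1 and sha1 in db["files"]:
        return db["files"][sha1]["path"], "exact"

    # MD5 match, accepting truncated values as a prefix.
    md5 = (entry.get("md5") or "").lower()
    if md5:
        for known_md5, key in db["by_md5"].items():
            if known_md5.startswith(md5):
                return db["files"][key]["path"], "md5_exact"

    # Name or alias match. With an expected MD5 that matched nothing,
    # a name-only hit is reported as a hash mismatch.
    for name in [entry["name"], *entry.get("aliases", [])]:
        candidates = db["by_name"].get(name)
        if candidates:
            status = "hash_mismatch" if md5 else "exact"
            return db["files"][candidates[0]]["path"], status

    return None, "not_found"
```

Ordering the strategies from strictest to loosest means a weaker match is only reported when every stronger check has already failed.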
## Platform inheritance

Platform configs support `inherits:` to share definitions.
Lakka inherits from RetroArch; RetroPie inherits from RetroArch with `base_destination: BIOS`.
`overrides:` allows child platforms to modify specific systems from the parent.

Core resolution (`resolve_platform_cores`) uses three strategies:

- `cores: all_libretro` - include all profiles with `libretro` in their type
- `cores: [list]` - include only named profiles
- `cores:` absent - fall back to system ID intersection between platform and profiles

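
The three core-resolution strategies listed above could look roughly like this. The sketch is not `resolve_platform_cores`; the profile fields `type` and `systems` and the platform field `systems` are assumed names used only for illustration.

```python
def resolve_cores_sketch(platform, profiles):
    """Return the sorted profile names that apply to a platform config."""
    cores = platform.get("cores")
    if cores == "all_libretro":
        # Every profile whose type mentions libretro.
        return sorted(
            name for name, prof in profiles.items()
            if "libretro" in prof.get("type", "")
        )
    if isinstance(cores, list):
        # Only the explicitly named profiles.
        return sorted(name for name in cores if name in profiles)
    # No `cores:` key: intersect platform system IDs with each profile's systems.
    platform_systems = set(platform.get("systems", {}))
    return sorted(
        name for name, prof in profiles.items()
        if platform_systems & set(prof.get("systems", []))
    )
```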
## Hardware target filtering

`--target TARGET` filters packs and verification by hardware (e.g. `switch`, `rpi4`, `x86_64`).
Target configs are in `platforms/targets/`. Overrides in `_overrides.yml` add aliases and
adjust core lists per target. `filter_systems_by_target` excludes systems whose cores are
not available on the target. Without `--target`, all systems are included.

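
Conceptually, the filter keeps a system only when at least one of its cores exists on the target. The sketch below assumes each system lists its cores under a `cores` key, which is a hypothetical layout; it is not the real `filter_systems_by_target`.

```python
def filter_systems_sketch(systems, target_cores=None):
    """Keep systems that have at least one core available on the target."""
    if target_cores is None:
        # No --target given: every system is included.
        return dict(systems)
    available = set(target_cores)
    return {
        system_id: system
        for system_id, system in systems.items()
        if available & set(system.get("cores", []))
    }
```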
## MAME clone map

`_mame_clones.json` at repo root maps MAME clone ROM names to their canonical parent.
When a clone ZIP was deduplicated, `resolve_local_file` uses this map to find the canonical file.

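
The lookup can be pictured as a single dictionary indirection. The sketch assumes the JSON is a flat `clone name -> parent name` mapping, which may not match the real file layout; the function and ZIP names are hypothetical.

```python
def resolve_clone_sketch(rom_name, clone_map, available):
    """Return the file to use for rom_name, following the clone map if needed."""
    if rom_name in available:
        return rom_name
    # The clone was deduplicated: fall back to its canonical parent.
    parent = clone_map.get(rom_name)
    if parent in available:
        return parent
    return None
```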
## Install manifests

`generate_pack.py --manifest` produces JSON manifests in `install/` for each platform.
These contain file lists with SHA1 hashes, platform detection config, and standalone copy
instructions. `install/targets/` contains per-architecture core availability.
The cross-platform installer (`install.py`) uses these manifests to auto-detect the
user's platform, filter files by hardware target, and download with SHA1 verification.

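
SHA1 verification against a manifest can be sketched as follows. The manifest shape used here (`files` entries with `dest` and `sha1` keys) is an assumption for illustration and may differ from the real manifests; the download step is left out.

```python
import hashlib
from pathlib import Path


def sha1_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files are not loaded into memory."""
    digest = hashlib.sha1()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_manifest(manifest, root):
    """Return (dest, problem) pairs for files that are missing or corrupted."""
    problems = []
    for item in manifest["files"]:
        path = Path(root) / item["dest"]
        if not path.is_file():
            problems.append((item["dest"], "missing"))
        elif sha1_of(path) != item["sha1"].lower():
            problems.append((item["dest"], "sha1_mismatch"))
    return problems
```

An empty result means every listed file is present and matches its recorded hash.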
## Tests

5 test files, 259 tests total:

| File | Tests | Coverage |
|------|-------|----------|
| `test_e2e.py` | 196 | file resolution, verification, severity, cross-reference, aliases, inheritance, shared groups, data dirs, storage tiers, HLE, launchers, platform grouping, core resolution, target filtering, truth/diff, exporters |
| `test_pack_integrity.py` | 8 | extract ZIP packs to disk, verify paths + hashes per platform's native mode |
| `test_mame_parser.py` | 22 | BIOS root set detection, ROM block parsing, macro expansion |
| `test_fbneo_parser.py` | 16 | BIOS set detection, ROM info parsing |
| `test_hash_merge.py` | 17 | MAME/FBNeo YAML merge, diff detection |

```bash
python -m unittest tests.test_e2e -v
```

## CI workflows

| Workflow | File | Trigger | Role |
|----------|------|---------|------|
| Build & Release | `build.yml` | `workflow_dispatch` (manual) | restore large files, build packs, create GitHub release |
| Deploy Site | `deploy-site.yml` | push to main (platforms, emulators, wiki, scripts) + manual | generate site, build with MkDocs, deploy to GitHub Pages |
| PR Validation | `validate.yml` | pull request on `bios/`/`platforms/` | validate BIOS hashes, schema check, run tests, auto-label PR |
| Weekly Sync | `watch.yml` | cron (Monday 6 AM UTC) + manual | scrape upstream sources, detect changes, create update PR |

The build workflow enforces a 7-day rate limit between releases and keeps the 3 most recent releases.

## License

See `LICENSE` at repo root. Files are provided for personal backup and archival.