Files
libretro/wiki/architecture.md

145 lines
6.2 KiB
Markdown

# Architecture - RetroBIOS
## Directory structure
```
bios/ BIOS and firmware files, organized by Manufacturer/Console/
Manufacturer/Console/ canonical files (one per unique content)
.variants/ alternate versions (different hash, same purpose)
emulators/ one YAML profile per core (285 profiles)
platforms/ one YAML config per platform (scraped from upstream)
_shared.yml shared file groups across platforms
_registry.yml platform metadata (logos, scrapers, status)
_data_dirs.yml data directory definitions (Dolphin Sys, PPSSPP...)
scripts/ all tooling (Python, pyyaml only dependency)
scraper/ upstream scrapers (libretro, batocera, recalbox...)
data/ cached data directories (not BIOS, fetched at build)
schemas/ JSON schemas for validation
tests/ E2E test suite with synthetic fixtures
dist/ generated packs (gitignored)
.cache/ hash cache and large file downloads (gitignored)
```
## Data flow
```
Upstream sources Scrapers parse generate_db.py scans
System.dat (libretro) + fetch versions bios/ on disk
batocera-systems builds database.json
es_bios.xml (recalbox) (SHA1 primary key,
core-info .info files indexes: by_md5, by_name,
by_crc32, by_path_suffix)
emulators/*.yml verify.py checks generate_pack.py resolves
source-verified platform-native files by hash, builds ZIP
from code verification packs per platform
```
## Three layers of data
| Layer | Source | Role |
|-------|--------|------|
| Platform YAML | Scraped from upstream | What the platform declares it needs |
| `_shared.yml` | Curated | Shared files across platforms, reflects actual behavior |
| Emulator profiles | Source-verified | What the code actually loads. Used for cross-reference and gap detection |
The pack combines platform baseline (layer 1) with core requirements (layer 3).
Neither too much (no files from unused cores) nor too few (no missing files for active cores).
## Pack grouping
Platforms that produce identical packs are grouped automatically.
RetroArch and Lakka share the same files and `base_destination` (`system/`),
so they produce one combined pack (`RetroArch_Lakka_BIOS_Pack.zip`).
RetroPie uses `BIOS/` as base path, so it gets a separate pack.
## Storage tiers
| Tier | Meaning |
|------|---------|
| `embedded` (default) | file is in the `bios/` directory, included in packs |
| `external` | file has a `source_url`, downloaded at pack build time |
| `user_provided` | user must provide the file (instructions included in pack) |
## Verification severity
How missing or mismatched files are reported:
| Mode | required + missing | optional + missing | hash mismatch |
|------|-------------------|-------------------|--------------|
| existence | WARNING | INFO | N/A |
| md5 | CRITICAL | WARNING | UNTESTED |
Files with `hle_fallback: true` are downgraded to INFO when missing
(the emulator has a software fallback).
## Discrepancy detection
When a file passes platform verification (MD5 match) but fails
emulator-level validation (wrong CRC32, wrong size), a DISCREPANCY is reported.
The pack generator searches the repo for a variant that satisfies both.
If none exists, the platform version is kept.
## Security
- `safe_extract_zip()` prevents zip-slip path traversal attacks
- `deterministic_zip` rebuilds MAME ZIPs so same ROMs always produce the same hash
- `crypto_verify.py` and `sect233r1.py` verify 3DS RSA-2048 signatures and AES-128-CBC integrity
- ZIP inner ROM verification via `checkInsideZip()` replicates Batocera's behavior
- `md5_composite()` replicates Recalbox's composite ZIP hash
## Edge cases
| Case | Handling |
|------|---------|
| Batocera truncated MD5 (29 chars) | prefix match in resolution |
| `zippedFile` entries | MD5 is of the ROM inside the ZIP, not the ZIP itself |
| Regional variants (same filename) | `by_path_suffix` index disambiguates |
| MAME BIOS ZIPs | `contents` field documents inner structure |
| RPG Maker/ScummVM | excluded from dedup (NODEDUP) to preserve directory structure |
| `strip_components` in data dirs | flattens cache prefix to match expected path |
| case-insensitive dedup | prevents `font.rom` + `FONT.ROM` conflicts on Windows/macOS |
## Platform inheritance
Platform configs support `inherits:` to share definitions.
Lakka inherits from RetroArch, RetroPie inherits from RetroArch with `base_destination: BIOS`.
`overrides:` allows child platforms to modify specific systems from the parent.
Core resolution (`resolve_platform_cores`) uses three strategies:
- `cores: all_libretro` - include all profiles with `libretro` in their type
- `cores: [list]` - include only named profiles
- `cores:` absent - fallback to system ID intersection between platform and profiles
## MAME clone map
`_mame_clones.json` at repo root maps MAME clone ROM names to their canonical parent.
When a clone ZIP was deduplicated, `resolve_local_file` uses this map to find the canonical file.
## Tests
`tests/test_e2e.py` contains 75 end-to-end tests with synthetic fixtures.
Covers: file resolution, verification, severity, cross-reference, aliases,
inheritance, shared groups, data dirs, storage tiers, HLE, launchers,
platform grouping, core resolution (3 strategies + alias exclusion).
```bash
python -m unittest tests.test_e2e -v
```
## CI workflows
| Workflow | File | Trigger | Role |
|----------|------|---------|------|
| Build & Release | `build.yml` | `workflow_dispatch` (manual) | restore large files, build packs, deploy site, create GitHub release |
| PR Validation | `validate.yml` | pull request on `bios/`/`platforms/` | validate BIOS hashes, schema check, run tests, auto-label PR |
| Weekly Sync | `watch.yml` | cron (Monday 6 AM UTC) + manual | scrape upstream sources, detect changes, create update PR |
Build workflow has a 7-day rate limit between releases and keeps the 3 most recent.
## License
See `LICENSE` at repo root. Files are provided for personal backup and archival.