9 new wiki pages: getting-started, faq, troubleshooting, advanced-usage, verification-modes, adding-a-platform, adding-a-scraper, testing-guide, release-process. Updated architecture.md with mermaid diagrams, tools.md with full pipeline and target/exporter sections, profiling.md with missing fields, index.md with glossary and nav links. Expanded CONTRIBUTING.md from stub to full contributor guide. Filter .old.yml from load_emulator_profiles, generate_db alias collection, and generate_readme counts. Fix BizHawk sha1 mode in tools.md, fix RetroPie path, fix export_truth.py typos.
8.1 KiB
Advanced Usage
Fine-grained control over pack generation, hardware filtering, truth analysis, and verification.
Custom Packs
Build from hash
Look up a single MD5 in the database:
python scripts/generate_pack.py --from-md5 d8f1206299c48946e6ec5ef96d014eaa
Build a pack containing only files matching hashes from a list (one MD5 per line, # for comments):
python scripts/generate_pack.py --platform batocera --from-md5-file missing.txt
This is useful when a platform reports missing files and you want to generate a targeted pack rather than re-downloading the full archive.
Split packs
Generate one ZIP per system instead of a single monolithic pack:
python scripts/generate_pack.py --platform retroarch --split
Group the split ZIPs by manufacturer (Sony, Nintendo, Sega, etc.):
python scripts/generate_pack.py --platform retroarch --split --group-by manufacturer
System-specific packs
Extract only the files for a single system within a platform:
python scripts/generate_pack.py --platform retroarch --system sony-playstation
Required only
Exclude optional files from the pack:
python scripts/generate_pack.py --platform batocera --required-only
What counts as "required" depends on the platform YAML. For existence-mode platforms
(RetroArch), the distinction comes from the .info file's required field.
For MD5-mode platforms (Batocera), all declared files are treated as required unless
explicitly marked optional.
Hardware Target Filtering
What targets are
A target represents a hardware architecture where a platform runs. Each architecture has a different set of available cores. For example, the RetroArch Switch target has fewer cores than the x86_64 target because some cores are not ported to ARM.
Target data is scraped from upstream sources (buildbot nightly listings, board configs,
scriptmodules) and stored in platforms/targets/<platform>.yml.
Usage
Filter packs or verification to only include systems reachable by cores available on the target hardware:
python scripts/generate_pack.py --platform retroarch --target switch
python scripts/generate_pack.py --all --target x86_64
python scripts/verify.py --platform batocera --target rpi4
When combined with --all, platforms that define the target are filtered. Platforms
without a target file for that name are left unfiltered (no information to exclude anything).
Platforms that have target data but not the requested target are skipped with an INFO message.
How it works
The filtering pipeline has three stages:
-
load_target_config()readsplatforms/targets/<platform>.ymland returns the set of cores available on the target. Aliases from_overrides.ymlare resolved (e.g.,--target rpi4may matchbcm2711in the target file). -
resolve_platform_cores()determines which emulator profiles are relevant for the platform, then intersects the result with the target's core set. The intersection uses a reverse index built from each profile'scores:field, so that upstream names (e.g.,mednafen_psxon the buildbot) map to profile keys (e.g.,beetle_psx). -
filter_systems_by_target()removes platform systems where every core that emulates them is absent from the target. Systems with no core information are kept (benefit of the doubt). System ID normalization strips manufacturer prefixes and separators so thatxboxmatchesmicrosoft-xbox.
List available targets
python scripts/verify.py --platform retroarch --list-targets
Overrides
platforms/targets/_overrides.yml provides two mechanisms:
- Aliases: map user-facing names to internal target IDs
(e.g.,
rpi4->bcm2711). - add/remove cores: patch the scraped core list for a specific target without overwriting the entire file. Useful when a core is known to work but is not listed on the buildbot, or vice versa.
Single-target platforms
Platforms with only one target (e.g., RetroBat with windows, RomM with browser)
treat --target <their-only-target> as a no-op: the output is identical to running
without --target.
Truth Generation and Diffing
What truth is
Truth data is ground truth generated from emulator profiles. It represents what each core actually needs based on source code analysis, independent of what platform scrapers declare. The purpose is gap analysis: finding files that platforms miss or declare incorrectly.
Generate truth
Build truth YAMLs from emulator profiles for a platform or all platforms:
python scripts/generate_truth.py --platform retroarch
python scripts/generate_truth.py --all --output-dir dist/truth/
Each truth YAML lists every system with its files, hashes, and the emulator profiles that reference them. The output mirrors the platform YAML structure so the two can be diffed directly.
Diff truth vs scraped
Find divergences between generated truth and scraped platform data:
python scripts/diff_truth.py --platform retroarch
python scripts/diff_truth.py --all
The diff reports:
- Files present in truth but absent from the platform YAML (undeclared).
- Files present in the platform YAML but absent from truth (orphaned or from cores not profiled yet).
- Hash mismatches between truth and platform data.
Export to native formats
Convert truth data to the native format each platform consumes:
python scripts/export_native.py --platform batocera # Python dict (batocera-systems)
python scripts/export_native.py --platform recalbox # XML (es_bios.xml)
python scripts/export_native.py --all --output-dir dist/upstream/
This allows submitting corrections upstream in the format maintainers expect.
Emulator-Level Verification
Per-emulator checks
Verify files against a single emulator's ground truth (size, hashes, crypto):
python scripts/verify.py --emulator handy
python scripts/verify.py --emulator handy --verbose
Default output shows aggregate results per file: the core name and which checks apply.
With --verbose, each file expands to one line per core with the exact validation
parameters and source code reference:
lynxboot.img
handy validates size=512 crc32=0x0d973c9d [src/handy/system.h:45]
Per-system checks
Aggregate verification across all cores that emulate a system:
python scripts/verify.py --system atari-lynx
Standalone mode
Some cores have both libretro and standalone modes with different file requirements. Filter to standalone-only:
python scripts/verify.py --emulator dolphin --standalone
Ground truth in verbose output
The verbose report includes a coverage footer:
Ground truth: 142/160 files have emulator validation (88%)
This indicates how many files in the platform can be cross-checked against source-verified
emulator profiles. Files without ground truth rely solely on platform-level verification.
JSON output (--json) always includes the full per-emulator detail regardless of verbosity.
Offline Workflow
Full offline pipeline
Run the entire pipeline without network access:
python scripts/pipeline.py --offline
This skips data directory refresh, MAME/FBNeo hash fetch, and buildbot staleness checks. All other steps (database generation, verification, pack building, consistency check, README, site generation) run normally using cached data.
Partial runs
Skip pack generation when you only need verification results:
python scripts/pipeline.py --offline --skip-packs
Skip documentation generation:
python scripts/pipeline.py --offline --skip-docs
Truth pipeline
Include truth generation and diffing in the pipeline:
python scripts/pipeline.py --offline --with-truth
Include truth + native format export:
python scripts/pipeline.py --offline --with-export
Combining flags
Flags compose freely:
python scripts/pipeline.py --offline --skip-docs --with-truth --target switch
This runs: database generation, verification (filtered to Switch cores), truth generation and diff, consistency check. Packs and docs are skipped, no network access.