# Advanced Usage Fine-grained control over pack generation, hardware filtering, truth analysis, and verification. ## Custom Packs ### Build from hash Look up a single MD5 in the database: ```bash python scripts/generate_pack.py --from-md5 d8f1206299c48946e6ec5ef96d014eaa ``` Build a pack containing only files matching hashes from a list (one MD5 per line, `#` for comments): ```bash python scripts/generate_pack.py --platform batocera --from-md5-file missing.txt ``` This is useful when a platform reports missing files and you want to generate a targeted pack rather than re-downloading the full archive. ### Split packs Generate one ZIP per system instead of a single monolithic pack: ```bash python scripts/generate_pack.py --platform retroarch --split ``` Group the split ZIPs by manufacturer (Sony, Nintendo, Sega, etc.): ```bash python scripts/generate_pack.py --platform retroarch --split --group-by manufacturer ``` ### System-specific packs Extract only the files for a single system within a platform: ```bash python scripts/generate_pack.py --platform retroarch --system sony-playstation ``` ### Required only Exclude optional files from the pack: ```bash python scripts/generate_pack.py --platform batocera --required-only ``` What counts as "required" depends on the platform YAML. For existence-mode platforms (RetroArch), the distinction comes from the `.info` file's `required` field. For MD5-mode platforms (Batocera), all declared files are treated as required unless explicitly marked optional. ## Hardware Target Filtering ### What targets are A target represents a hardware architecture where a platform runs. Each architecture has a different set of available cores. For example, the RetroArch Switch target has fewer cores than the x86_64 target because some cores are not ported to ARM. Target data is scraped from upstream sources (buildbot nightly listings, board configs, scriptmodules) and stored in `platforms/targets/.yml`. ### Usage Filter packs or verification to only include systems reachable by cores available on the target hardware: ```bash python scripts/generate_pack.py --platform retroarch --target switch python scripts/generate_pack.py --all --target x86_64 python scripts/verify.py --platform batocera --target rpi4 ``` When combined with `--all`, platforms that define the target are filtered. Platforms without a target file for that name are left unfiltered (no information to exclude anything). Platforms that have target data but not the requested target are skipped with an INFO message. ### How it works The filtering pipeline has three stages: 1. **`load_target_config()`** reads `platforms/targets/.yml` and returns the set of cores available on the target. Aliases from `_overrides.yml` are resolved (e.g., `--target rpi4` may match `bcm2711` in the target file). 2. **`resolve_platform_cores()`** determines which emulator profiles are relevant for the platform, then intersects the result with the target's core set. The intersection uses a reverse index built from each profile's `cores:` field, so that upstream names (e.g., `mednafen_psx` on the buildbot) map to profile keys (e.g., `beetle_psx`). 3. **`filter_systems_by_target()`** removes platform systems where every core that emulates them is absent from the target. Systems with no core information are kept (benefit of the doubt). System ID normalization strips manufacturer prefixes and separators so that `xbox` matches `microsoft-xbox`. ### List available targets ```bash python scripts/verify.py --platform retroarch --list-targets ``` ### Overrides `platforms/targets/_overrides.yml` provides two mechanisms: - **Aliases**: map user-facing names to internal target IDs (e.g., `rpi4` -> `bcm2711`). - **add/remove cores**: patch the scraped core list for a specific target without overwriting the entire file. Useful when a core is known to work but is not listed on the buildbot, or vice versa. ### Single-target platforms Platforms with only one target (e.g., RetroBat with `windows`, RomM with `browser`) treat `--target ` as a no-op: the output is identical to running without `--target`. ## Truth Generation and Diffing ### What truth is Truth data is ground truth generated from emulator profiles. It represents what each core actually needs based on source code analysis, independent of what platform scrapers declare. The purpose is gap analysis: finding files that platforms miss or declare incorrectly. ### Generate truth Build truth YAMLs from emulator profiles for a platform or all platforms: ```bash python scripts/generate_truth.py --platform retroarch python scripts/generate_truth.py --all --output-dir dist/truth/ ``` Each truth YAML lists every system with its files, hashes, and the emulator profiles that reference them. The output mirrors the platform YAML structure so the two can be diffed directly. ### Diff truth vs scraped Find divergences between generated truth and scraped platform data: ```bash python scripts/diff_truth.py --platform retroarch python scripts/diff_truth.py --all ``` The diff reports: - Files present in truth but absent from the platform YAML (undeclared). - Files present in the platform YAML but absent from truth (orphaned or from cores not profiled yet). - Hash mismatches between truth and platform data. ### Export to native formats Convert truth data to the native format each platform consumes: ```bash python scripts/export_native.py --platform batocera # Python dict (batocera-systems) python scripts/export_native.py --platform recalbox # XML (es_bios.xml) python scripts/export_native.py --all --output-dir dist/upstream/ ``` This allows submitting corrections upstream in the format maintainers expect. ## Emulator-Level Verification ### Per-emulator checks Verify files against a single emulator's ground truth (size, hashes, crypto): ```bash python scripts/verify.py --emulator handy python scripts/verify.py --emulator handy --verbose ``` Default output shows aggregate results per file: the core name and which checks apply. With `--verbose`, each file expands to one line per core with the exact validation parameters and source code reference: ``` lynxboot.img handy validates size=512 crc32=0x0d973c9d [src/handy/system.h:45] ``` ### Per-system checks Aggregate verification across all cores that emulate a system: ```bash python scripts/verify.py --system atari-lynx ``` ### Standalone mode Some cores have both libretro and standalone modes with different file requirements. Filter to standalone-only: ```bash python scripts/verify.py --emulator dolphin --standalone ``` ### Ground truth in verbose output The verbose report includes a coverage footer: ``` Ground truth: 142/160 files have emulator validation (88%) ``` This indicates how many files in the platform can be cross-checked against source-verified emulator profiles. Files without ground truth rely solely on platform-level verification. JSON output (`--json`) always includes the full per-emulator detail regardless of verbosity. ## Offline Workflow ### Full offline pipeline Run the entire pipeline without network access: ```bash python scripts/pipeline.py --offline ``` This skips data directory refresh, MAME/FBNeo hash fetch, and buildbot staleness checks. All other steps (database generation, verification, pack building, consistency check, README, site generation) run normally using cached data. ### Partial runs Skip pack generation when you only need verification results: ```bash python scripts/pipeline.py --offline --skip-packs ``` Skip documentation generation: ```bash python scripts/pipeline.py --offline --skip-docs ``` ### Truth pipeline Include truth generation and diffing in the pipeline: ```bash python scripts/pipeline.py --offline --with-truth ``` Include truth + native format export: ```bash python scripts/pipeline.py --offline --with-export ``` ### Combining flags Flags compose freely: ```bash python scripts/pipeline.py --offline --skip-docs --with-truth --target switch ``` This runs: database generation, verification (filtered to Switch cores), truth generation and diff, consistency check. Packs and docs are skipped, no network access.