Move 17 case-colliding BIOS variants to .variants/ so Windows
and macOS clones no longer lose files. Existence-based platforms
keep the primary, MD5-based platforms resolve from .variants/.
Also fix resolve_local_file zipped_file resolution: when multiple
ZIPs share a name, verify the inner ROM exists before accepting
a candidate. Fixes adam_fdc.zip resolving to the wrong archive.
Batocera upstream has a truncated 29-char MD5 for zx48.rom.
The scraper now resolves truncated hashes via prefix match
against database.json, preventing schema validation failures.
resolve_platform_cores() links platforms to their cores via
three strategies: all_libretro, explicit list, system ID
fallback. Pack generation always includes core requirements
beyond platform baseline. Case-insensitive dedup prevents
conflicts on Windows/macOS. Data dir strip_components fixes
doubled paths for Dolphin and PPSSPP caches.
- fix KeyError in compute_coverage (generate_readme, generate_site)
- fix comma-separated MD5 handling in generate_pack check_inside_zip
- fix _verify_file_hash to handle multi-MD5 for large files
- fix external downloads not tracked in seen_destinations/file_status
- fix tar path traversal in _is_safe_tar_member (refresh_data_dirs)
- fix predictable tmp path in download.py
- fix _sanitize_path to filter "." components
- remove blanket data_dir suppression in find_undeclared_files
- remove blanket data_dir suppression in cross_reference
- add status_counts to verify_platform return value
- add md5_composite cache for repeated ZIP hashing
Added exclusion_note field to emulator profiles. verify.py reads
this field instead of parsing notes text with fragile keywords.
desmume2015: explains .info vs code discrepancy
dolphin_launcher: explains standalone BIOS management
All exclusion messages now come from YAML data, not Python strings.
New section "Intentional exclusions" explains why certain emulator
files are NOT in the pack:
- [frozen_snapshot]: code doesn't load .info firmware (desmume2015)
- [launcher]: BIOS managed by standalone emulator (dolphin_launcher)
- [standalone_only]: files for standalone mode, not libretro
Makes it clear that omissions are by design, not bugs.
Created desmume2015.yml with files: [] (code doesn't load BIOS).
Removed desmume2015 from desmume.yml cores list.
Removed cdi2015 from same_cdi.yml (separate profile exists).
Frozen snapshots must have their own profiles because their
firmware behavior differs from the current version.
_collect_emulator_extras() now uses find_undeclared_files() from
verify.py instead of manual emulator name lists. This gives:
- System-overlap matching (automatic, no manual config needed)
- mode: standalone filtering (no standalone files in libretro packs)
- type: launcher filtering (no launcher BIOS in system_dir)
- data_directories coverage (no false gaps)
- hle_fallback propagation
- Works for ANY platform (same logic for RetroArch, Batocera, etc.)
RetroArch --include-extras now discovers 91 extra files from
emulator profiles automatically.
Added hle_fallback: true/false per file in emulator profiles.
When a core has HLE and the file is missing, severity downgrades
to INFO instead of CRITICAL — core works without it.
verify.py builds an HLE index from emulator profiles and applies
it during severity computation. Cross-reference now skips launcher
profiles (type: launcher) and includes hle_fallback in undeclared
file reports.
33 E2E tests (4 new: HLE severity, HLE index, launcher skip,
cross-ref HLE). 0 regressions.
Based on source code analysis:
- RetroArch core_info.c:2233 — existence check only, no blocking
- PCSX ReARMed psxbios.c:28 — full HLE BIOS replacement
- Dolphin CommonPaths.h — all files optional with HLE
- snes9x — DSP HLE built-in, coprocessor files optional
Removed 7 test files (135 tests) fully covered by test_e2e.py.
The E2E creates all fixtures in setUp, exercises every code path
with real files and real functions. Single source of regression.
17 new tests covering remaining scenarios:
- md5_composite: manual calc, dirs excluded, compression-independent,
used in resolve_local_file
- storage tiers: external, user_provided, embedded, fetch_large_file
with cache hit and hash rejection
- data_directories: suppress cross-ref gaps when matched, show gaps
when unmatched
- shared groups: _shared.yml includes injection, empty group safe
- YAML inheritance: systems inherited, verification_mode override
- platform grouping: identical merged, different base_dest separated
verify.py output now uses platform-native terminology:
- md5 platforms: X/Y OK, N untested, M missing
- existence platforms: X/Y present, M missing
Each problem shows (required/optional) from platform YAML.
Core gaps section summarizes undeclared files by severity:
- required NOT in repo: critical gaps needing sourcing
- required in repo: can be added to platform config
- optional: informational
Consistency check in pipeline.py updated to match new format.
All 7 platforms verified, consistency OK across verify and pack.
Batocera uses exactly 2 statuses (batocera-systems:967-969):
- MISSING: file not found on disk
- UNTESTED: file present but hash not confirmed
Removed the wrong_hash/untested split — both are UNTESTED per
Batocera's design (file accepted by emulator, just not guaranteed
correct). Fixed duplicate count bug from rename. Reason detail
(MD5 mismatch vs inner file not found) preserved in the message.
Verified against Batocera source: checkBios() lines 1062-1091,
checkInsideZip() lines 978-1009, BiosStatus class lines 967-969.
scripts/pipeline.py runs the full retrobios pipeline in one command:
1. generate_db --force (rebuild database.json)
2. refresh_data_dirs (update data directories, skippable with --offline)
3. verify --all (check all platforms)
4. generate_pack --all (build ZIP packs)
5. consistency check (verify counts == pack counts per platform)
Flags: --offline, --skip-packs, --include-archived, --include-extras.
Summary table shows OK/FAILED per step with total elapsed time.
verify.py now uses the same platform listing as generate_pack.py:
--all shows active platforms, --include-archived adds archived ones.
Before, verify --all listed all .yml files without filtering.
Platforms sharing the same pack (same files + base_destination)
are grouped on one line: "Lakka / RetroArch: 449/449 files OK".
RetroPie stays separate (different base_destination BIOS/ vs system/).
Archived platforms (RetroPie) excluded from --all, available via
--platform retropie. Grouping matches generate_pack behavior.
Both tools now count by unique destination (what the user sees on
disk), not by YAML entry or internal check. Same file shared by
multiple systems = counted once. Same file checked for multiple
inner ROMs = counted once with worst-case status.
Output format:
verify: "Platform: X/Y files OK, N wrong hash, M missing [mode]"
pack: "pack.zip: P files packed, X/Y files OK, N wrong hash [mode]"
X/Y is the same number in both tools for the same platform.
"files packed" differs from "files OK" when data_directories or
EmuDeck MD5-only entries are involved — this is expected and clear
from the numbers (e.g. 34 packed but 161 verified for EmuDeck).
Both tools now report: X files, Y/Z checks verified (N duplicate/inner
checks), with the same check counts for the same platform. The
duplicate/inner detail explains why checks > files (multiple YAML
entries per ZIP for inner ROM verification, EmuDeck MD5 whitelists).
File counts differ legitimately (verify counts resolved files on disk,
pack counts files in the ZIP including data_directories).
generate_pack now reports both file count and verification check
count, matching verify.py's accounting. All YAML entries are counted
as checks, including duplicate destinations (verified but not packed
twice) and EmuDeck-style no-filename entries (verified by MD5 in DB).
Before: verify 679/680 vs pack 358/359 (confusing discrepancy)
After: verify 679/680 vs pack 679/680 checks (consistent)
generate_pack.py skipped duplicate destination entries before
running verification, hiding untested files that verify.py caught.
Now all entries are verified even when the file is already packed,
ensuring both tools report the same untested count.
Batocera: verify 679/680 (1 untested), pack 358/359 (1 untested).
Both report sc3000.zip as the single untested file.
resolve_local_file returns hash_mismatch for zipped_file entries
because container MD5 differs from inner ROM MD5. This is expected.
Reverted the flawed deferral approach in common.py that resolved
to wrong ZIPs via zip_contents flat index (electron64.zip instead
of bbcb.zip when inner ROMs share the same MD5).
Fixed generate_pack.py to verify inner ZIP content via
check_inside_zip before marking as untested, matching verify.py
behavior. pc6001/bbcb/fm7 ZIPs now correctly verified.
verify.py: 679/680 Batocera (1 untested: sc3000 true mismatch)
generate_pack.py: 359/359 Batocera (0 untested)
generate_db.py now reads aliases from emulator YAMLs and indexes
them in database.json by_name. resolve_local_file in common.py
tries all alias names when the primary name fails to match.
beetle_psx alt_names renamed to aliases (was not indexed before).
snes9x BS-X.bios, np2kai FONT.ROM/ide.rom/pci.rom fallbacks,
all now formally declared as aliases and indexed.
verify --all and generate_pack --all pass with 0 regressions.