post-generation verification: reopen each ZIP, hash every file,
check against database.json. inject manifest.json inside each pack
(self-documenting: path, sha1, md5, size, status per file).
generate SHA256SUMS.txt alongside packs for download verification.
validation index now uses sets for hashes and sizes to support
multiple valid ROM versions (MT-32 v1.04-v2.07, CM-32L variants).
69 tests pass, pipeline complete.
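
A minimal sketch of the post-generation pass described above: reopen a pack, hash every member, and emit one SHA256SUMS line. The function names and the manifest entry layout are illustrative assumptions, not the project's actual code.

```python
import hashlib
import io
import zipfile


def hash_zip_members(zip_bytes: bytes) -> dict:
    """Return {member_name: {"sha1", "md5", "size"}} for every file in a ZIP."""
    entries = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue  # directories carry no content to hash
            data = zf.read(info)
            entries[info.filename] = {
                "sha1": hashlib.sha1(data).hexdigest(),
                "md5": hashlib.md5(data).hexdigest(),
                "size": len(data),
            }
    return entries


def sha256sums_line(name: str, data: bytes) -> str:
    """Format one line as '<hex digest>  <name>' (two spaces, as sha256sum -c expects)."""
    return f"{hashlib.sha256(data).hexdigest()}  {name}"
```

The per-member dict can then be compared against the expected values from database.json and serialized as manifest.json.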
pure python RSA-2048 PKCS1v15 SHA256 for SecureInfo_A,
LocalFriendCodeSeed_B, movable.sed. AES-128-CBC + SHA256 for otp.bin.
keys extracted from azahar default_keys.h, added RSA/ECC sections
to aes_keys.txt. sect233r1 ECC not reproducible (binary field curve).
adler32 hash via zlib.adler32(), min_size/max_size range checks,
signature/crypto tracked as non-reproducible (console-specific keys).
compute_hashes now returns adler32. 69 tests pass including 3 new
tests for adler32, size ranges, and crypto tracking.
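
A rough sketch of the adler32 and size-range checks. The real compute_hashes signature is not shown in this log, so hash_bytes and size_in_range below are hypothetical stand-ins that take bytes and plain integers.

```python
import hashlib
import zlib


def hash_bytes(data: bytes) -> dict:
    """Hypothetical stand-in for compute_hashes: hex digests, adler32 included."""
    return {
        "sha1": hashlib.sha1(data).hexdigest(),
        "md5": hashlib.md5(data).hexdigest(),
        "crc32": f"{zlib.crc32(data) & 0xFFFFFFFF:08x}",
        "adler32": f"{zlib.adler32(data) & 0xFFFFFFFF:08x}",
    }


def size_in_range(size: int, min_size=None, max_size=None) -> bool:
    """True when size falls inside the optional [min_size, max_size] bounds."""
    if min_size is not None and size < min_size:
        return False
    if max_size is not None and size > max_size:
        return False
    return True
```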
add --emulator, --system, --standalone, --list-emulators, --list-systems
to verify.py and generate_pack.py. packs are ready to use (RTU), with
data directories, regional BIOS variants, and archive support.
per-file validation field (size, crc32, md5, sha1) with conflict
detection. by_path_suffix index in database.json for regional variant
resolution via dest_hint. restructure GameCube IPL to regional subdirs.
66 E2E tests, full pipeline verified.
resolve_platform_cores() links platforms to their cores via
three strategies: all_libretro, explicit list, system ID
fallback. Pack generation always includes core requirements
beyond platform baseline. Case-insensitive dedup prevents
conflicts on Windows/macOS. Data dir strip_components fixes
doubled paths for Dolphin and PPSSPP caches.
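
The case-insensitive dedup could look roughly like this. The function name and the choice to keep the first spelling seen are assumptions made for illustration.

```python
def dedup_case_insensitive(destinations):
    """Keep the first spelling of each destination, comparing case-insensitively."""
    seen = set()
    kept = []
    for dest in destinations:
        # Normalize separators and case so BIOS.bin and bios.bin collide,
        # as they would on a case-insensitive filesystem.
        key = dest.replace("\\", "/").casefold()
        if key in seen:
            continue
        seen.add(key)
        kept.append(dest)
    return kept
```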
- fix KeyError in compute_coverage (generate_readme, generate_site)
- fix comma-separated MD5 handling in generate_pack check_inside_zip
- fix _verify_file_hash to handle multi-MD5 for large files
- fix external downloads not tracked in seen_destinations/file_status
- fix tar path traversal in _is_safe_tar_member (refresh_data_dirs)
- fix predictable tmp path in download.py
- fix _sanitize_path to filter "." components
- remove blanket data_dir suppression in find_undeclared_files
- remove blanket data_dir suppression in cross_reference
- add status_counts to verify_platform return value
- add md5_composite cache for repeated ZIP hashing
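
For the two path fixes in the list above, a sketch under stated assumptions: sanitize_path and is_safe_member are hypothetical names, and dropping ".." alongside "." is an assumption (the log only mentions "."). The containment check is one common way to reject traversal, not necessarily the one used in refresh_data_dirs.

```python
import os


def sanitize_path(raw: str) -> str:
    """Drop empty, '.' and '..' components and normalize separators to '/'."""
    parts = []
    for part in raw.replace("\\", "/").split("/"):
        if part in ("", ".", ".."):  # '..' handling is an assumption
            continue
        parts.append(part)
    return "/".join(parts)


def is_safe_member(name: str, dest_root: str) -> bool:
    """True when extracting `name` under dest_root cannot escape dest_root."""
    root = os.path.realpath(dest_root)
    target = os.path.realpath(os.path.join(root, name))
    return os.path.commonpath([root, target]) == root
```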
Added exclusion_note field to emulator profiles. verify.py reads
this field instead of parsing notes text with fragile keywords.
desmume2015: explains .info vs code discrepancy
dolphin_launcher: explains standalone BIOS management
All exclusion messages now come from YAML data, not Python strings.
New section "Intentional exclusions" explains why certain emulator
files are NOT in the pack:
- [frozen_snapshot]: code doesn't load .info firmware (desmume2015)
- [launcher]: BIOS managed by standalone emulator (dolphin_launcher)
- [standalone_only]: files for standalone mode, not libretro
Makes it clear that omissions are by design, not bugs.
Added hle_fallback: true/false per file in emulator profiles.
When a core has HLE and the file is missing, severity downgrades
to INFO instead of CRITICAL, since the core works without it.
verify.py builds an HLE index from emulator profiles and applies
it during severity computation. Cross-reference now skips launcher
profiles (type: launcher) and includes hle_fallback in undeclared
file reports.
33 E2E tests (4 new: HLE severity, HLE index, launcher skip,
cross-ref HLE). 0 regressions.
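
A sketch of the HLE index and the severity downgrade. Only hle_fallback comes from the profiles described above; the "files" and "name" keys and both function names are assumed for illustration.

```python
def build_hle_index(profiles):
    """Collect names of files whose profile entry sets hle_fallback: true."""
    index = set()
    for profile in profiles:
        for entry in profile.get("files", []):  # 'files'/'name' keys are assumed
            if entry.get("hle_fallback"):
                index.add(entry["name"])
    return index


def severity_for_missing(name, hle_index):
    """A missing file is INFO when HLE covers it, CRITICAL otherwise."""
    return "INFO" if name in hle_index else "CRITICAL"
```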
Based on source code analysis:
- RetroArch core_info.c:2233 — existence check only, no blocking
- PCSX ReARMed psxbios.c:28 — full HLE BIOS replacement
- Dolphin CommonPaths.h — all files optional with HLE
- snes9x — DSP HLE built-in, coprocessor files optional
verify.py output now uses platform-native terminology:
- md5 platforms: X/Y OK, N untested, M missing
- existence platforms: X/Y present, M missing
Each problem shows (required/optional) from platform YAML.
Core gaps section summarizes undeclared files by severity:
- required NOT in repo: critical gaps needing sourcing
- required in repo: can be added to platform config
- optional: informational
Consistency check in pipeline.py updated to match new format.
All 7 platforms verified, consistency OK across verify and pack.
Batocera uses exactly 2 statuses (batocera-systems:967-969):
- MISSING: file not found on disk
- UNTESTED: file present but hash not confirmed
Removed the wrong_hash/untested split — both are UNTESTED per
Batocera's design (file accepted by emulator, just not guaranteed
correct). Fixed duplicate count bug from rename. Reason detail
(MD5 mismatch vs inner file not found) preserved in the message.
Verified against Batocera source: checkBios() lines 1062-1091,
checkInsideZip() lines 978-1009, BiosStatus class lines 967-969.
verify.py now uses the same platform listing as generate_pack.py:
--all shows active platforms, --include-archived adds archived ones.
Previously, verify --all listed every .yml file without filtering.
Platforms sharing the same pack (same files + base_destination)
are grouped on one line: "Lakka / RetroArch: 449/449 files OK".
RetroPie stays separate (different base_destination BIOS/ vs system/).
Archived platforms (RetroPie) excluded from --all, available via
--platform retropie. Grouping matches generate_pack behavior.
Both tools now count by unique destination (what the user sees on
disk), not by YAML entry or internal check. Same file shared by
multiple systems = counted once. Same file checked for multiple
inner ROMs = counted once with worst-case status.
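
Counting by unique destination with worst-case status can be sketched as below. The status ordering and the (destination, status) tuple shape are assumptions; the status names follow the OK / UNTESTED / MISSING vocabulary used elsewhere in this log.

```python
# Higher number = worse status; the ordering is an assumption.
SEVERITY = {"OK": 0, "UNTESTED": 1, "MISSING": 2}


def summarize_by_destination(checks):
    """checks: iterable of (destination, status). Returns (ok, total, worst_by_dest)."""
    worst = {}
    for dest, status in checks:
        prev = worst.get(dest)
        if prev is None or SEVERITY[status] > SEVERITY[prev]:
            worst[dest] = status  # keep the worst status seen for this file
    ok = sum(1 for status in worst.values() if status == "OK")
    return ok, len(worst), worst
```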
Output format:
verify: "Platform: X/Y files OK, N wrong hash, M missing [mode]"
pack: "pack.zip: P files packed, X/Y files OK, N wrong hash [mode]"
X/Y is the same number in both tools for the same platform.
"files packed" differs from "files OK" when data_directories or
EmuDeck MD5-only entries are involved; this is expected and clear
from the numbers (e.g. 34 packed but 161 verified for EmuDeck).
Both tools now report: X files, Y/Z checks verified (N duplicate/inner
checks), with the same check counts for the same platform. The
duplicate/inner detail explains why checks > files (multiple YAML
entries per ZIP for inner ROM verification, EmuDeck MD5 whitelists).
File counts differ legitimately (verify counts resolved files on disk,
pack counts files in the ZIP including data_directories).
1. fetch_large_file moved to last resort (avoids HTTP before name lookup)
2. fetch_large_file receives first MD5 only (not comma-separated string)
3. verify.py MD5 lookup now splits comma-separated + lowercases (matches generate_pack)
4. seen_destinations simplified to set (stored hash was dead data)
5. Variable suffix renamed to file_ext (fixes shadowing)
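
Items 2 and 3 both come down to treating the expected MD5 as a comma-separated list. A minimal sketch, with hypothetical helper names:

```python
def split_md5(expected):
    """Split a comma-separated MD5 field into lowercase hashes, dropping blanks."""
    if not expected:
        return []
    return [h.strip().lower() for h in expected.split(",") if h.strip()]


def md5_matches(actual, expected):
    """True when actual equals any of the expected hashes, ignoring case."""
    return actual.lower() in split_md5(expected)


def first_md5(expected):
    """First hash only, for callers that accept a single value (item 2)."""
    hashes = split_md5(expected)
    return hashes[0] if hashes else None
```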
Fix variant name indexing: files in .variants/ now indexed under
canonical name (naomi2.zip instead of naomi2.zip.da79eca4).
Fix .zip detection for variant paths in verify.py.
Add composite MD5 matching in resolver for ZIP variants.
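
The canonical-name fix might be sketched as follows. It assumes variant files always end in a dot plus eight hex digits, as in the naomi2.zip.da79eca4 example; the actual suffix rule is not stated in this log.

```python
import re

# Assumed suffix shape: ".<8 hex digits>" at the end of the filename.
_VARIANT_SUFFIX = re.compile(r"\.[0-9a-fA-F]{8}$")


def canonical_name(filename: str) -> str:
    """Strip a trailing variant hash suffix; other names pass through unchanged."""
    return _VARIANT_SUFFIX.sub("", filename)


def is_zip_variant(filename: str) -> bool:
    """Detect .zip on the canonical name rather than on the raw variant path."""
    return canonical_name(filename).lower().endswith(".zip")
```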
Add hikaru.zip (MAME 0.285, 6 ROMs) and segaboot.gcm (Triforce)
from archive.org. Both match Batocera expected MD5s.
Batocera 679/680 (1 untested: sc3000 private dump)
Recalbox 346/346 (100%)
- batocera_scraper: fix OrderedDict parsing for ast.literal_eval
- auto_fetch: fix TypeError when sha1/md5 is None
- verify: filter non-ZIP files for zipped_file entries (F2)
- verify: distinguish ZIP read errors from hash mismatches (F5)
- generate_pack: track seen_destinations with source hash (F7)
Batocera ep64.zip/ep128.zip now correctly reported as MISSING
instead of a false UNTESTED (lookup resolved to .rom instead of .zip)
Critical: stream large file downloads (OOM fix), fix basename match
in auto_fetch, include hashes in pack grouping fingerprint, handle
not_in_zip status in verify, fix escaped quotes in batocera parser.
Important: deduplicate shared group includes, catch coreinfo network
errors, fix NODEDUP path component match, fix CI word splitting on
spaces, replace bare except Exception in 3 files.
Minor: argparse in list_platforms, specific exceptions in download.py.
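
The streaming fix amounts to copying in fixed-size chunks and hashing as the data passes through. The sketch below works on any file-like source; wiring it to an HTTP response object is left out, and the function name and the 1 MiB default are assumptions.

```python
import hashlib
import io


def stream_copy(src, dst, chunk_size=1 << 20):
    """Copy src to dst in chunks; return (sha1 hex digest, bytes written)."""
    sha1 = hashlib.sha1()
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break  # end of stream
        dst.write(chunk)
        sha1.update(chunk)
        total += len(chunk)
    return sha1.hexdigest(), total
```

Memory use stays bounded by chunk_size regardless of the file size.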
When a file exists under multiple SHA1s (e.g. awbios.zip in both
Arcade/ and Sega/Dreamcast/), prefer the candidate whose MD5
matches the expected hash instead of always picking the first.
Batocera: 589 -> 661 verified (+72), RetroBat: 341 -> 343 (100%)
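
The candidate preference described above, sketched with a hypothetical record shape (dicts with "path" and "md5" keys) and a fallback to the first candidate when nothing matches:

```python
def pick_candidate(candidates, expected_md5s):
    """Prefer the candidate whose MD5 is expected; otherwise fall back to the first."""
    wanted = {h.lower() for h in expected_md5s}
    for cand in candidates:
        if cand["md5"].lower() in wanted:
            return cand
    return candidates[0] if candidates else None
```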