Commit Graph

94 Commits

Author SHA1 Message Date
Abdessamad Derraz
94000bdaef fix: align verify and pack validation, pipeline 100% consistent
generate_pack.py now applies emulator-level validation (crc32, sha1,
adler32) matching verify.py behavior. existence mode: validation is
informational (file present = OK). md5 mode: validation downgrades
to UNTESTED. clone resolution moved to common.py resolve_local_file.
all 6 platforms pass consistency check.
2026-03-24 22:21:47 +01:00
Abdessamad Derraz
ae4846550f fix: clone resolution in common.py, move clone map to root
moved _mame_clones.json out of bios/ (was indexed by generate_db.py
as BIOS file). clone resolution now in common.py resolve_local_file
so all tools (verify, pack, cross_reference) resolve clones
transparently. removed duplicate clone code from generate_pack.py.
added error handling on os.remove in dedup.py. consistency check
now passes for Batocera/EmuDeck/Lakka/RetroArch (4/6 platforms).
2026-03-24 21:57:49 +01:00
Abdessamad Derraz
85308edd73 fix: dedup edge cases — preserve non-ZIP different-name files
non-ZIP files with different names but same content (64DD_IPL_US.n64
vs IPL_USA.n64) are now preserved — each name may be needed by a
different emulator. only same-name duplicates and MAME ZIP clones
are removed. added empty directory cleanup post-dedup.
2026-03-24 21:39:25 +01:00
Abdessamad Derraz
fb1007496d chore: deduplicate bios/ — remove 427 files, save 227 MB
true duplicates (same file in multiple dirs): removed copies, kept
canonical. MAME device clones (different names, identical content):
removed copies, created _mame_clones.json mapping for pack-time
assembly via deterministic ZIP rebuild. generate_pack.py resolves
clones transparently. 95 canonical ZIPs serve 392 clone names.
2026-03-24 21:35:50 +01:00
Abdessamad Derraz
8fcb86ba35 feat: deterministic MAME ZIP assembly in packs
all ZIP files (neogeo.zip, pgm.zip, etc.) are rebuilt with fixed
metadata before packing: sorted filenames, epoch timestamps, fixed
permissions, deflate level 9. same ROM atoms = same ZIP hash, always.
115 internal ZIPs verified identical across two independent builds.
enables version-agnostic ZIP assembly from ROM atoms indexed by CRC32.
2026-03-24 15:17:12 +01:00
Abdessamad Derraz
34e4c36f1c feat: pack integrity verification, manifests, SHA256SUMS
post-generation verification: reopen each ZIP, hash every file,
check against database.json. inject manifest.json inside each pack
(self-documenting: path, sha1, md5, size, status per file).
generate SHA256SUMS.txt alongside packs for download verification.

validation index now uses sets for hashes and sizes to support
multiple valid ROM versions (MT-32 v1.04-v2.07, CM-32L variants).
69 tests pass, pipeline complete.
2026-03-24 14:56:02 +01:00
Abdessamad Derraz
11db9892bf feat: add sha256 validation support to verify.py 2026-03-24 11:49:58 +01:00
Abdessamad Derraz
685713a7e6 feat: add sect233r1 ECDSA verification for 3DS OTP cert
pure python GF(2^233) field arithmetic, binary curve point operations,
and ECDSA-SHA256 on sect233r1. verifies OTP CTCert against nintendo
root CA public key. zero dependencies. sign+verify round-trip tested,
n*G=O verified, wrong key/message rejection confirmed.
2026-03-24 11:45:16 +01:00
Abdessamad Derraz
d4849681a7 feat: add 3DS signature/crypto verification to verify.py
pure python RSA-2048 PKCS1v15 SHA256 for SecureInfo_A,
LocalFriendCodeSeed_B, movable.sed. AES-128-CBC + SHA256 for otp.bin.
keys extracted from azahar default_keys.h, added RSA/ECC sections
to aes_keys.txt. sect233r1 ECC not reproducible (binary field curve).
2026-03-24 11:36:29 +01:00
Abdessamad Derraz
8141a34faa feat: full ground truth validation in verify.py
adler32 hash via zlib.adler32(), min_size/max_size range checks,
signature/crypto tracked as non-reproducible (console-specific keys).
compute_hashes now returns adler32. 69 tests pass including 3 new
tests for adler32, size ranges, and crypto tracking.
2026-03-24 11:11:38 +01:00
Abdessamad Derraz
470bb6ceb9 feat: support min_size/max_size validation in verify.py
reproduces ground truth size checks from emulator profiles: exact
size, min_size lower bound, max_size upper bound. all 66 tests pass.
2026-03-24 10:53:01 +01:00
Abdessamad Derraz
1d350f0578 feat: add emulator/system pack generation, validation checks, path resolution
add --emulator, --system, --standalone, --list-emulators, --list-systems
to verify.py and generate_pack.py. packs are RTU with data directories,
regional BIOS variants, and archive support.

validation: field per file (size, crc32, md5, sha1) with conflict
detection. by_path_suffix index in database.json for regional variant
resolution via dest_hint. restructure GameCube IPL to regional subdirs.

66 E2E tests, full pipeline verified.
2026-03-22 14:02:20 +01:00
Abdessamad Derraz
74f17694c2 feat: add category field to emulator profiles, source missing BIOS
Add category: game_data to sdlpal, nxengine, opentyrian, easyrpg,
mkxp_z profiles. verify.py separates game_data from bios in core
gap metrics for cleaner coverage numbers.

New BIOS files: Cemu fonts (4), QEMU bios-256k + vgabios-stdvga,
GAM4980 ROMs (2), SC-3000 Export variant.
2026-03-21 07:37:22 +01:00
Abdessamad Derraz
27df5c8fb5 fix: resolve case collisions on case-insensitive filesystems
Move 17 case-colliding BIOS variants to .variants/ so Windows
and macOS clones no longer lose files. Existence-based platforms
keep the primary, MD5-based platforms resolve from .variants/.

Also fix resolve_local_file zipped_file resolution: when multiple
ZIPs share a name, verify the inner ROM exists before accepting
a candidate. Fixes adam_fdc.zip resolving to the wrong archive.
2026-03-20 20:02:42 +01:00
Abdessamad Derraz
21bc225cac fix: resolve truncated md5 in batocera scraper
Batocera upstream has a truncated 29-char MD5 for zx48.rom.
The scraper now resolves truncated hashes via prefix match
against database.json, preventing schema validation failures.
2026-03-19 23:52:25 +01:00
Abdessamad Derraz
6ee162f8fb chore: add MAME and RetroDECK ROM sets 2026-03-19 23:26:49 +01:00
monster-penguin
1fcb948a00 Add RetroDECK Platform Support (#36)
* Add files via upload

* Add files via upload

* Update _registry.yml
2026-03-19 17:10:37 +01:00
Abdessamad Derraz
6a21a99c22 feat: platform-core registry for exact pack generation
resolve_platform_cores() links platforms to their cores via
three strategies: all_libretro, explicit list, system ID
fallback. Pack generation always includes core requirements
beyond platform baseline. Case-insensitive dedup prevents
conflicts on Windows/macOS. Data dir strip_components fixes
doubled paths for Dolphin and PPSSPP caches.
2026-03-19 16:10:43 +01:00
Abdessamad Derraz
257ec1a527 fix: round 2 audit fixes, updated emulator profiles
Scripts:
- fix generate_site nav regex destroying mkdocs.yml content
- fix auto_fetch comma-separated MD5 in find_missing
- fix verify print_platform_result conflating untested/missing
- fix validate_pr path traversal and symlink check
- fix batocera_scraper brace counting and escaped quotes in strings
- fix emudeck_scraper hash search crossing function boundaries
- fix pipeline.py cwd to repo root via Path(__file__)
- normalize SHA1 comparison to lowercase in generate_pack

Emulator profiles:
- emux_gb/nes/sms: reclassify from alias to standalone profiles
- ep128emu: remove .info-only files not referenced in source
- fbalpha2012 variants: full source-verified profiles
- fbneo_cps12: add new profile
2026-03-19 15:00:18 +01:00
Abdessamad Derraz
38d605c7d5 fix: audit fixes across verify, pack, security, and performance
- fix KeyError in compute_coverage (generate_readme, generate_site)
- fix comma-separated MD5 handling in generate_pack check_inside_zip
- fix _verify_file_hash to handle multi-MD5 for large files
- fix external downloads not tracked in seen_destinations/file_status
- fix tar path traversal in _is_safe_tar_member (refresh_data_dirs)
- fix predictable tmp path in download.py
- fix _sanitize_path to filter "." components
- remove blanket data_dir suppression in find_undeclared_files
- remove blanket data_dir suppression in cross_reference
- add status_counts to verify_platform return value
- add md5_composite cache for repeated ZIP hashing
2026-03-19 14:04:34 +01:00
Abdessamad Derraz
e1410ef4a6 fix: exclusion reasons from YAML, not hardcoded in Python
Added exclusion_note field to emulator profiles. verify.py reads
this field instead of parsing notes text with fragile keywords.

desmume2015: explains .info vs code discrepancy
dolphin_launcher: explains standalone BIOS management

All exclusion messages now come from YAML data, not Python strings.
2026-03-19 13:17:55 +01:00
Abdessamad Derraz
114732dc6d feat: intentional exclusion notes in verify report
New section "Intentional exclusions" explains why certain emulator
files are NOT in the pack:
- [frozen_snapshot]: code doesn't load .info firmware (desmume2015)
- [launcher]: BIOS managed by standalone emulator (dolphin_launcher)
- [standalone_only]: files for standalone mode, not libretro

Makes it clear that omissions are by design, not bugs.
2026-03-19 13:15:26 +01:00
Abdessamad Derraz
2509c61ffe feat: detailed core gap categories in verify report 2026-03-19 13:12:14 +01:00
Abdessamad Derraz
316d2467eb refactor: replace emulator extras with cross-reference discovery
_collect_emulator_extras() now uses find_undeclared_files() from
verify.py instead of manual emulator name lists. This gives:
- System-overlap matching (automatic, no manual config needed)
- mode: standalone filtering (no standalone files in libretro packs)
- type: launcher filtering (no launcher BIOS in system_dir)
- data_directories coverage (no false gaps)
- hle_fallback propagation
- Works for ANY platform (same logic for RetroArch, Batocera, etc.)

RetroArch --include-extras now discovers 91 extra files from
emulator profiles automatically.
2026-03-19 13:10:36 +01:00
Abdessamad Derraz
d5daf98e5e feat: hle_fallback field + launcher filtering in verify
Added hle_fallback: true/false per file in emulator profiles.
When a core has HLE and the file is missing, severity downgrades
to INFO instead of CRITICAL — core works without it.

verify.py builds an HLE index from emulator profiles and applies
it during severity computation. Cross-reference now skips launcher
profiles (type: launcher) and includes hle_fallback in undeclared
file reports.

33 E2E tests (4 new: HLE severity, HLE index, launcher skip,
cross-ref HLE). 0 regressions.

Based on source code analysis:
- RetroArch core_info.c:2233 — existence check only, no blocking
- PCSX ReARMed psxbios.c:28 — full HLE BIOS replacement
- Dolphin CommonPaths.h — all files optional with HLE
- snes9x — DSP HLE built-in, coprocessor files optional
2026-03-19 12:51:52 +01:00
Abdessamad Derraz
6d9edc5110 fix: review findings — hoist constants, cache emu profiles, renumber steps
- Hoist sev_order/sev_prio dicts to module-level constants (was rebuilt
  every loop iteration)
- Cache emulator profiles across platforms in verify main() (was loading
  260 YAMLs per platform, now loaded once)
- Renumber resolve_local_file steps 1-5 (was 1,2,3,5,6 after removal)
- Pass emu_profiles through verify_platform → find_undeclared_files
2026-03-19 11:22:58 +01:00
Abdessamad Derraz
b9cdda07ee refactor: DRY consolidation + 83 unit tests
Moved shared functions to common.py (single source of truth):
- check_inside_zip (was in verify.py, imported by generate_pack)
- build_zip_contents_index (was duplicated in verify + generate_pack)
- load_emulator_profiles (was in verify, cross_reference, generate_site)
- group_identical_platforms (was in verify + generate_pack)

Added tests/ with 83 unit tests covering:
- resolve_local_file: SHA1, MD5, name, alias, truncated, zip_contents
- verify: existence, md5, zipped_file, multi-hash, severity mapping
- aliases: field parsing, by_name indexing, beetle_psx field rename
- pack: dedup, file_status, zipped_file inner check, EmuDeck entries
- severity: all 12 combinations, platform-native behavior

0 regressions: pipeline.py --all produces identical results.
2026-03-19 11:19:50 +01:00
Abdessamad Derraz
e240c70126 feat: complete platform-native verification with cross-reference
verify.py output now uses platform-native terminology:
- md5 platforms: X/Y OK, N untested, M missing
- existence platforms: X/Y present, M missing

Each problem shows (required/optional) from platform YAML.

Core gaps section summarizes undeclared files by severity:
- required NOT in repo: critical gaps needing sourcing
- required in repo: can be added to platform config
- optional: informational

Consistency check in pipeline.py updated to match new format.
All 7 platforms verified, consistency OK across verify and pack.
2026-03-19 10:44:17 +01:00
Abdessamad Derraz
5fd3b148df feat: platform-native verification with severity and cross-reference
verify.py now simulates each platform's exact BIOS check behavior:
- RetroArch: existence only (core_info.c path_is_valid)
- Batocera: MD5 + checkInsideZip, no required distinction
- Recalbox: MD5 + mandatory/hashMatchMandatory, 3-level severity

Per-file required/optional from platform YAMLs now affects severity:
- CRITICAL: required file missing or bad hash (md5 platforms)
- WARNING: optional missing or hash mismatch
- INFO: optional missing on existence-only platforms
- OK: verified

Cross-references emulator profiles to list undeclared files used by
cores available on each platform (420 for Batocera, 465 for RetroArch).

Verified against source code:
- Batocera: batocera-systems:967-1091 (BiosStatus, checkBios, checkInsideZip)
- Recalbox: Bios.cpp:109-130 (mandatory, hashMatchMandatory, Green/Yellow/Red)
- RetroArch: .info firmware_opt (existence check only)
2026-03-19 10:11:39 +01:00
Abdessamad Derraz
be5937d514 fix: align status terminology with Batocera source code
Batocera uses exactly 2 statuses (batocera-systems:967-969):
- MISSING: file not found on disk
- UNTESTED: file present but hash not confirmed

Removed the wrong_hash/untested split — both are UNTESTED per
Batocera's design (file accepted by emulator, just not guaranteed
correct). Fixed duplicate count bug from rename. Reason detail
(MD5 mismatch vs inner file not found) preserved in the message.

Verified against Batocera source: checkBios() lines 1062-1091,
checkInsideZip() lines 978-1009, BiosStatus class lines 967-969.
2026-03-19 09:49:16 +01:00
Abdessamad Derraz
c27720d710 feat: unified pipeline script with consistency check
scripts/pipeline.py runs the full retrobios pipeline in one command:
1. generate_db --force (rebuild database.json)
2. refresh_data_dirs (update data directories, skippable with --offline)
3. verify --all (check all platforms)
4. generate_pack --all (build ZIP packs)
5. consistency check (verify counts == pack counts per platform)

Flags: --offline, --skip-packs, --include-archived, --include-extras.
Summary table shows OK/FAILED per step with total elapsed time.
2026-03-19 09:33:50 +01:00
Abdessamad Derraz
6e421f6d84 fix: verify uses list_platforms for --all, add --include-archived
verify.py now uses the same platform listing as generate_pack.py:
--all shows active platforms, --include-archived adds archived ones.
Before, verify --all listed all .yml files without filtering.
2026-03-19 09:23:33 +01:00
Abdessamad Derraz
cfe2b1ff3d refactor: group identical platforms in verify output
Platforms sharing the same pack (same files + base_destination)
are grouped on one line: "Lakka / RetroArch: 449/449 files OK".
RetroPie stays separate (different base_destination BIOS/ vs system/).
Archived platforms (RetroPie) excluded from --all, available via
--platform retropie. Grouping matches generate_pack behavior.
2026-03-19 09:15:29 +01:00
Abdessamad Derraz
a88a452469 refactor: clear, consistent output for verify and generate_pack
Both tools now count by unique destination (what the user sees on
disk), not by YAML entry or internal check. Same file shared by
multiple systems = counted once. Same file checked for multiple
inner ROMs = counted once with worst-case status.

Output format:
  verify:  "Platform: X/Y files OK, N wrong hash, M missing [mode]"
  pack:    "pack.zip: P files packed, X/Y files OK, N wrong hash [mode]"

X/Y is the same number in both tools for the same platform.
"files packed" differs from "files OK" when data_directories or
EmuDeck MD5-only entries are involved — this is expected and clear
from the numbers (e.g. 34 packed but 161 verified for EmuDeck).
2026-03-19 09:06:00 +01:00
Abdessamad Derraz
866ee40209 feat: harmonize verify and pack output format
Both tools now report: X files, Y/Z checks verified (N duplicate/inner
checks), with the same check counts for the same platform. The
duplicate/inner detail explains why checks > files (multiple YAML
entries per ZIP for inner ROM verification, EmuDeck MD5 whitelists).

File counts differ legitimately (verify counts resolved files on disk,
pack counts files in the ZIP including data_directories).
2026-03-19 08:57:45 +01:00
Abdessamad Derraz
0f84bc2417 feat: harmonize check counts between verify and generate_pack
generate_pack now reports both file count and verification check
count, matching verify.py's accounting. All YAML entries are counted
as checks, including duplicate destinations (verified but not packed
twice) and EmuDeck-style no-filename entries (verified by MD5 in DB).

Before: verify 679/680 vs pack 358/359 (confusing discrepancy)
After:  verify 679/680 vs pack 679/680 checks (consistent)
2026-03-19 08:52:58 +01:00
Abdessamad Derraz
8a9dea91c2 fix: verify and pack report consistent untested counts
generate_pack.py skipped duplicate destination entries before
running verification, hiding untested files that verify.py caught.
Now all entries are verified even when the file is already packed,
ensuring both tools report the same untested count.

Batocera: verify 679/680 (1 untested), pack 358/359 (1 untested).
Both report sc3000.zip as the single untested file.
2026-03-19 08:43:07 +01:00
Abdessamad Derraz
6f82b5520d fix: zipped_file hash_mismatch handling in pack generation
resolve_local_file returns hash_mismatch for zipped_file entries
because container MD5 differs from inner ROM MD5. This is expected.

Reverted the flawed deferral approach in common.py that resolved
to wrong ZIPs via zip_contents flat index (electron64.zip instead
of bbcb.zip when inner ROMs share the same MD5).

Fixed generate_pack.py to verify inner ZIP content via
check_inside_zip before marking as untested, matching verify.py
behavior. pc6001/bbcb/fm7 ZIPs now correctly verified.

verify.py: 679/680 Batocera (1 untested: sc3000 true mismatch)
generate_pack.py: 359/359 Batocera (0 untested)
2026-03-19 08:30:03 +01:00
Abdessamad Derraz
f3db61162c feat: aliases support in resolve and db generation
generate_db.py now reads aliases from emulator YAMLs and indexes
them in database.json by_name. resolve_local_file in common.py
tries all alias names when the primary name fails to match.

beetle_psx alt_names renamed to aliases (was not indexed before).
snes9x BS-X.bios, np2kai FONT.ROM/ide.rom/pci.rom fallbacks,
all now formally declared as aliases and indexed.

verify --all and generate_pack --all pass with 0 regressions.
2026-03-19 08:15:13 +01:00
Abdessamad Derraz
86dbdf28e5 feat: core profiles, data_dirs buildbot, cross_ref fix
profiles: amiberry (new), amiarcadia, atari800, azahar, b2,
bk, blastem, bluemsx, freeintv updated with source refs,
upstream field, mode field, data_directories.

_data_dirs.yml: buildbot source for retroarch platforms,
strip_components for nested ZIPs, freeintv-overlays fixed.

cross_reference.py: data_directories-aware gap analysis,
suppresses false gaps when emulator+platform share refs.

refresh_data_dirs.py: ZIP strip_components support,
for_platforms filter, ETag freshness for buildbot.

scraper: bluemsx single ref, freeintv overlays injection.
generate_pack.py: warning on missing data directory cache.
2026-03-18 21:20:02 +01:00
Abdessamad Derraz
846640dd7c feat: emulator mode field, archive ZX81 standalone ROMs
emulator profiles support mode: standalone | libretro | both.
cross_reference.py skips standalone-only files for libretro platforms.
81.yml: type standalone + libretro, upstream ref added, files listed
with mode: standalone and source_refs to both codebases.
bios/Sinclair/ZX 81/: zx81.rom (8K) and dkchr.rom (4K) archived.
2026-03-18 17:37:01 +01:00
Abdessamad Derraz
9b537492c0 feat: scraper injects data_directories refs into retroarch.yml 2026-03-18 16:06:56 +01:00
Abdessamad Derraz
c9e2bf8d33 feat: generate_pack.py integrates data directory refresh and packing 2026-03-18 16:04:36 +01:00
Abdessamad Derraz
976e5fbd41 feat: add refresh_data_dirs.py for upstream data sync 2026-03-18 16:01:28 +01:00
Abdessamad Derraz
bb307aa250 feat: archive full dolphin-emu/Sys, add DSP/font/IPL paths to pack
dolphin-emu/Sys/ folder (2562 files) from libretro/dolphin Data/Sys.
retroarch.yml: DSP firmware (dsp_coef.bin, dsp_rom.bin), fonts
(font_western.bin, font_japanese.bin) at dolphin-emu/Sys/GC/ paths.
ref: DolphinLibretro/Boot.cpp:72-73, HW/DSPLLE/DSPHost.cpp,
HW/EXI/EXI_DeviceIPL.cpp. pack: 452 files, 0 missing.
2026-03-18 15:16:20 +01:00
Abdessamad Derraz
e5681c4ae8 feat: dolphin IPL.bin paths, scraper dedup by (name, destination)
dolphin: gc-ntsc-12.bin mapped to dolphin-emu/Sys/GC/<region>/IPL.bin
ref: DolphinLibretro/Boot.cpp:72-73, CommonPaths.h:139
scraper EXTRA_SYSTEM_FILES dedup now by (name, destination) to allow
same source file at multiple destinations.
retroarch pack: 448 files, 0 missing.
2026-03-18 15:08:26 +01:00
Abdessamad Derraz
4bffc23ab5 feat: 0 HIGH issues, xrick system, np2kai FONT.ROM, coleco.rom alias
verified against source: fuse flat (not fuse/), ep128emu/roms/ (not rom/).
added xrick system, np2kai FONT.ROM uppercase variant, coleco.rom alias.
quasi88 alt naming verified in quasi88-libretro/src/LIBRETRO/libretro.c:108-117.
61 systems, 445 files, 0 missing on all platforms.
2026-03-18 15:01:52 +01:00
Abdessamad Derraz
71e506e708 fix: ep128emu roms/ path (not rom/), fuse flat verified in source
ep128emu: corrected to ep128emu/roms/ per core.cpp:56,59.
fuse: verified in src/compat/paths.c — core searches flat, not fuse/.
docs are wrong on fuse/ prefix. removed from retroarch shared groups.
all refs updated to exact source lines.
added quasi88 alt names, vircon32, MacII.ROM casing.
2026-03-18 14:55:13 +01:00
Abdessamad Derraz
c09abe0179 feat: complete medium audit fixes, add vircon32, quasi88 alt names
scraper: add MacII.ROM casing fix, Vircon32 system, QUASI88 alt
naming convention (n88_0/1/2/3.rom alongside N88EXT*.ROM).
retroarch pack: 445 -> 450 files, 60 systems, 0 missing.
all choices traced to original emulator source code.
2026-03-18 14:50:07 +01:00
Abdessamad Derraz
1d6a0b9ebc feat: complete retroarch.yml conformance with libretro docs
scraper adds FBNeo subsystem BIOS (channelf, coleco, neocdz, ngp,
spectrum, spec128, spec1282a, fdsbios, aes), DSi files, SGB alt
names, JollyCV BIOS, Saturn ST-V, PSX alt BIOS. all traced to
original emulator source code. 0 missing across all platforms.
retroarch pack: 398 -> 445 files.
2026-03-18 14:46:32 +01:00