Commit Graph

266 Commits

Author SHA1 Message Date
Abdessamad Derraz
4b09205bc9 fix: zero warnings on mkdocs build, update actions to v4/v5 2026-03-25 16:30:08 +01:00
Abdessamad Derraz
c5b267a6fb fix: anchor mismatches in platform and emulator index pages 2026-03-25 16:23:25 +01:00
Abdessamad Derraz
851f53ba7f refactor: extract wiki to source files, use deploy-pages action 2026-03-25 16:02:11 +01:00
Abdessamad Derraz
a6150a43bd feat: group emulators by classification, slim mkdocs nav, add pymdownx extensions 2026-03-25 15:29:58 +01:00
Abdessamad Derraz
0196fff8c7 feat: improve site UX (quick start, system summary, collapsible sections, wiki index, actionable gaps) 2026-03-25 15:24:38 +01:00
Abdessamad Derraz
904edd65e4 docs: document CI workflows, inheritance, MAME clones, tests, scrapers 2026-03-25 15:14:29 +01:00
Abdessamad Derraz
02a7c58fca docs: complete wiki coverage, document all scripts and edge cases 2026-03-25 15:02:23 +01:00
Abdessamad Derraz
f8a325260f feat: add wiki pages (architecture, tools, profiling, data model) 2026-03-25 14:56:37 +01:00
Abdessamad Derraz
313637663a docs: soften tone, explain methodology without dismissing other sources 2026-03-25 14:51:57 +01:00
Abdessamad Derraz
593466b655 feat: add methodology and ground truth narrative to readme and site 2026-03-25 14:50:09 +01:00
Abdessamad Derraz
23d76d54fd fix: correct rendering of complex YAML fields in site pages 2026-03-25 14:38:40 +01:00
Abdessamad Derraz
1cd43c3224 feat: exploit 100% of emulator YAML fields in site generation 2026-03-25 14:32:48 +01:00
Abdessamad Derraz
b3c1462a5e feat: exploit all emulator YAML fields in site generation 2026-03-25 14:28:02 +01:00
Abdessamad Derraz
75bfd04687 feat: full cross-linking web between all site pages 2026-03-25 14:17:10 +01:00
Abdessamad Derraz
3d2762bbc3 feat: cross-reference platform -> core -> systems -> upstream 2026-03-25 14:03:36 +01:00
Abdessamad Derraz
0f4fed2f47 feat: enrich site with full YAML data, cross-references, classification stats 2026-03-25 13:56:28 +01:00
Abdessamad Derraz
dbc26b11c1 refactor: move fetch_large_file to common, auto-download on db rebuild 2026-03-25 13:19:12 +01:00
Abdessamad Derraz
910428c6f1 fix: resolve large files from cache in database paths 2026-03-25 12:52:20 +01:00
Abdessamad Derraz
21465effff feat: add readme and site generation to pipeline 2026-03-25 12:34:03 +01:00
Abdessamad Derraz
47e6174ed4 fix: pack naming, large file preservation, discrepancy reporting 2026-03-25 12:23:40 +01:00
Abdessamad Derraz
ebb55a445b feat: re-profile 40 emulators, harden CI workflows
profile emulators pd777 through tic80, add frozen snapshots
(puae2021, snes9x2002/2005/2010, stella2014/2023).

CI: replace github-script with gh CLI, add test execution,
job-level permissions, propagate changed output, pin jsonschema.
2026-03-25 07:00:17 +01:00
Abdessamad Derraz
0543165ed2 feat: re-profile 22 emulators, refactor validation to common.py
batch re-profiled nekop2 through pokemini. mupen64plus renamed to
mupen64plus_next. new profiles: nes, mupen64plus_next.
validation functions (_build_validation_index, check_file_validation)
consolidated in common.py — single source of truth for verify.py
and generate_pack.py. pipeline 100% consistent on all 6 platforms.
2026-03-24 22:31:22 +01:00
Abdessamad Derraz
94000bdaef fix: align verify and pack validation, pipeline 100% consistent
generate_pack.py now applies emulator-level validation (crc32, sha1,
adler32) matching verify.py behavior. existence mode: validation is
informational (file present = OK). md5 mode: validation downgrades
to UNTESTED. clone resolution moved to common.py resolve_local_file.
all 6 platforms pass consistency check.
2026-03-24 22:21:47 +01:00
Abdessamad Derraz
ae4846550f fix: clone resolution in common.py, move clone map to root
moved _mame_clones.json out of bios/ (was indexed by generate_db.py
as BIOS file). clone resolution now in common.py resolve_local_file
so all tools (verify, pack, cross_reference) resolve clones
transparently. removed duplicate clone code from generate_pack.py.
added error handling on os.remove in dedup.py. consistency check
now passes for Batocera/EmuDeck/Lakka/RetroArch (4/6 platforms).
2026-03-24 21:57:49 +01:00
Abdessamad Derraz
85308edd73 fix: dedup edge cases — preserve non-ZIP different-name files
non-ZIP files with different names but same content (64DD_IPL_US.n64
vs IPL_USA.n64) are now preserved — each name may be needed by a
different emulator. only same-name duplicates and MAME ZIP clones
are removed. added empty directory cleanup post-dedup.
2026-03-24 21:39:25 +01:00
Abdessamad Derraz
fb1007496d chore: deduplicate bios/ — remove 427 files, save 227 MB
true duplicates (same file in multiple dirs): removed copies, kept
canonical. MAME device clones (different names, identical content):
removed copies, created _mame_clones.json mapping for pack-time
assembly via deterministic ZIP rebuild. generate_pack.py resolves
clones transparently. 95 canonical ZIPs serve 392 clone names.
2026-03-24 21:35:50 +01:00
Abdessamad Derraz
8fcb86ba35 feat: deterministic MAME ZIP assembly in packs
all ZIP files (neogeo.zip, pgm.zip, etc.) are rebuilt with fixed
metadata before packing: sorted filenames, epoch timestamps, fixed
permissions, deflate level 9. same ROM atoms = same ZIP hash, always.
115 internal ZIPs verified identical across two independent builds.
enables version-agnostic ZIP assembly from ROM atoms indexed by CRC32.
2026-03-24 15:17:12 +01:00
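
The rebuild described in 8fcb86ba35 can be sketched roughly as follows. This is a minimal illustration, not the repository's code: the function name, the 1980-01-01 timestamp (the earliest date the ZIP format can store) and the 0o644 mode are assumptions.

```python
import io
import zipfile

# Assumed fixed timestamp: the earliest date a ZIP entry can represent.
FIXED_DATE = (1980, 1, 1, 0, 0, 0)


def build_deterministic_zip(members: dict[str, bytes]) -> bytes:
    """Return ZIP bytes that depend only on member names and contents."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name in sorted(members):              # sorted filenames
            info = zipfile.ZipInfo(name, date_time=FIXED_DATE)
            info.compress_type = zipfile.ZIP_DEFLATED
            info.create_system = 3                # pin "Unix" so the host OS does not leak in
            info.external_attr = 0o644 << 16      # fixed permissions (assumed value)
            archive.writestr(info, members[name], compresslevel=9)
    return buffer.getvalue()
```

Two calls with the same members in a different insertion order yield byte-identical archives on the same Python and zlib build, which is the property the commit relies on for stable ZIP hashes.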
Abdessamad Derraz
34e4c36f1c feat: pack integrity verification, manifests, SHA256SUMS
post-generation verification: reopen each ZIP, hash every file,
check against database.json. inject manifest.json inside each pack
(self-documenting: path, sha1, md5, size, status per file).
generate SHA256SUMS.txt alongside packs for download verification.

validation index now uses sets for hashes and sizes to support
multiple valid ROM versions (MT-32 v1.04-v2.07, CM-32L variants).
69 tests pass, pipeline complete.
2026-03-24 14:56:02 +01:00
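
A checksum file like the one mentioned in 34e4c36f1c could be produced along these lines. The function name, the `*.zip` pattern and the two-space `sha256sum`-style line layout are assumptions made for illustration; the commit only states that SHA256SUMS.txt is generated alongside the packs.

```python
import hashlib
from pathlib import Path


def write_sha256sums(pack_dir: Path, pattern: str = "*.zip") -> Path:
    """Write '<sha256>  <filename>' lines for every pack in pack_dir."""
    out_path = pack_dir / "SHA256SUMS.txt"
    with out_path.open("w", encoding="utf-8") as out:
        for pack in sorted(pack_dir.glob(pattern)):
            digest = hashlib.sha256(pack.read_bytes()).hexdigest()
            out.write(f"{digest}  {pack.name}\n")
    return out_path
```

With this layout, users can check a download with `sha256sum -c SHA256SUMS.txt` from the same directory. Reading each pack fully into memory keeps the sketch short; large packs would call for chunked hashing.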
Abdessamad Derraz
11db9892bf feat: add sha256 validation support to verify.py 2026-03-24 11:49:58 +01:00
Abdessamad Derraz
685713a7e6 feat: add sect233r1 ECDSA verification for 3DS OTP cert
pure python GF(2^233) field arithmetic, binary curve point operations,
and ECDSA-SHA256 on sect233r1. verifies OTP CTCert against nintendo
root CA public key. zero dependencies. sign+verify round-trip tested,
n*G=O verified, wrong key/message rejection confirmed.
2026-03-24 11:45:16 +01:00
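
For readers unfamiliar with binary-field arithmetic, the core operation behind 685713a7e6 is multiplication in GF(2^233). The sketch below is not taken from the repository; it assumes field elements are stored as Python integers below 2^233 and uses the reduction polynomial x^233 + x^74 + 1 that SEC 2 specifies for sect233r1.

```python
M = 233
# Reduction polynomial f(x) = x^233 + x^74 + 1, one bit per coefficient.
REDUCTION_POLY = (1 << 233) | (1 << 74) | 1


def gf233_mul(a: int, b: int) -> int:
    """Multiply two GF(2^233) elements (ints < 2**233) by shift-and-add."""
    result = 0
    while b:
        if b & 1:
            result ^= a           # addition in GF(2^m) is XOR
        b >>= 1
        a <<= 1                   # multiply a by x
        if a >> M:                # degree reached 233: reduce modulo f(x)
            a ^= REDUCTION_POLY
    return result
```

Point addition and doubling on the curve are then built from this multiplication, XOR for addition, and a field inversion, which is why the commit can run with zero dependencies.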
Abdessamad Derraz
d4849681a7 feat: add 3DS signature/crypto verification to verify.py
pure python RSA-2048 PKCS1v15 SHA256 for SecureInfo_A,
LocalFriendCodeSeed_B, movable.sed. AES-128-CBC + SHA256 for otp.bin.
keys extracted from azahar default_keys.h, added RSA/ECC sections
to aes_keys.txt. sect233r1 ECC not reproducible (binary field curve).
2026-03-24 11:36:29 +01:00
Abdessamad Derraz
8141a34faa feat: full ground truth validation in verify.py
adler32 hash via zlib.adler32(), min_size/max_size range checks,
signature/crypto tracked as non-reproducible (console-specific keys).
compute_hashes now returns adler32. 69 tests pass including 3 new
tests for adler32, size ranges, and crypto tracking.
2026-03-24 11:11:38 +01:00
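
A hash helper in the spirit of 8141a34faa might look like this. The name `compute_hashes_sketch`, the in-memory `bytes` argument and the 8-digit lowercase hex formatting are assumptions; the commit only confirms that adler32 comes from `zlib.adler32()`.

```python
import hashlib
import zlib


def compute_hashes_sketch(data: bytes) -> dict[str, str]:
    """Return the digests used for validation as lowercase hex strings."""
    return {
        "md5": hashlib.md5(data).hexdigest(),
        "sha1": hashlib.sha1(data).hexdigest(),
        # Mask to 32 bits and zero-pad so values compare as fixed-width strings.
        "crc32": f"{zlib.crc32(data) & 0xFFFFFFFF:08x}",
        "adler32": f"{zlib.adler32(data) & 0xFFFFFFFF:08x}",
    }
```

Zero-padding matters for short inputs: the adler32 of empty data is 1, which this helper formats as `00000001`.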
Abdessamad Derraz
470bb6ceb9 feat: support min_size/max_size validation in verify.py
reproduces ground truth size checks from emulator profiles: exact
size, min_size lower bound, max_size upper bound. all 66 tests pass.
2026-03-24 10:53:01 +01:00
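
The three size rules from 470bb6ceb9 (exact size, lower bound, upper bound) can be expressed as one small check. The function below is a hypothetical sketch; treating both bounds as inclusive is an assumption, since the commit message does not say.

```python
from typing import Optional


def size_is_valid(
    actual: int,
    size: Optional[int] = None,
    min_size: Optional[int] = None,
    max_size: Optional[int] = None,
) -> bool:
    """Check a file size against whichever constraints a profile declares."""
    if size is not None and actual != size:
        return False                      # exact size mismatch
    if min_size is not None and actual < min_size:
        return False                      # below the lower bound
    if max_size is not None and actual > max_size:
        return False                      # above the upper bound
    return True                           # no declared constraint was violated
```

A file with no declared size constraints passes trivially, so the check can be applied to every entry without special-casing.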
Abdessamad Derraz
1d350f0578 feat: add emulator/system pack generation, validation checks, path resolution
add --emulator, --system, --standalone, --list-emulators, --list-systems
to verify.py and generate_pack.py. packs are RTU with data directories,
regional BIOS variants, and archive support.

validation: field per file (size, crc32, md5, sha1) with conflict
detection. by_path_suffix index in database.json for regional variant
resolution via dest_hint. restructure GameCube IPL to regional subdirs.

66 E2E tests, full pipeline verified.
2026-03-22 14:02:20 +01:00
Abdessamad Derraz
74f17694c2 feat: add category field to emulator profiles, source missing BIOS
Add category: game_data to sdlpal, nxengine, opentyrian, easyrpg,
mkxp_z profiles. verify.py separates game_data from bios in core
gap metrics for cleaner coverage numbers.

New BIOS files: Cemu fonts (4), QEMU bios-256k + vgabios-stdvga,
GAM4980 ROMs (2), SC-3000 Export variant.
2026-03-21 07:37:22 +01:00
Abdessamad Derraz
27df5c8fb5 fix: resolve case collisions on case-insensitive filesystems
Move 17 case-colliding BIOS variants to .variants/ so Windows
and macOS clones no longer lose files. Existence-based platforms
keep the primary, MD5-based platforms resolve from .variants/.

Also fix resolve_local_file zipped_file resolution: when multiple
ZIPs share a name, verify the inner ROM exists before accepting
a candidate. Fixes adam_fdc.zip resolving to the wrong archive.
2026-03-20 20:02:42 +01:00
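
Case collisions of the kind fixed in 27df5c8fb5 can be detected by grouping paths on a case-folded key. This is an illustrative sketch with a hypothetical function name, not the check the repository uses.

```python
from collections import defaultdict
from typing import Iterable


def find_case_collisions(paths: Iterable[str]) -> dict[str, list[str]]:
    """Group paths that differ only by letter case."""
    groups: dict[str, set[str]] = defaultdict(set)
    for path in paths:
        groups[path.casefold()].add(path)
    # Only keys with more than one distinct spelling are real collisions.
    return {key: sorted(names) for key, names in groups.items() if len(names) > 1}
```

Any group returned here would overwrite itself when the repository is cloned onto a case-insensitive filesystem, so all but one member would need to move elsewhere, as the commit does with `.variants/`.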
Abdessamad Derraz
21bc225cac fix: resolve truncated md5 in batocera scraper
Batocera upstream has a truncated 29-char MD5 for zx48.rom.
The scraper now resolves truncated hashes via prefix match
against database.json, preventing schema validation failures.
2026-03-19 23:52:25 +01:00
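
The prefix match from 21bc225cac can be sketched as below. The function name is hypothetical, and returning `None` when the prefix is ambiguous or unknown is an assumed policy; the commit only states that truncated hashes are resolved by prefix against database.json.

```python
from typing import Iterable, Optional


def resolve_truncated_md5(value: str, known_md5s: Iterable[str]) -> Optional[str]:
    """Expand a truncated MD5 to the unique known hash that starts with it."""
    value = value.lower()
    if len(value) == 32:
        return value                      # already a full MD5, nothing to resolve
    matches = {h.lower() for h in known_md5s if h.lower().startswith(value)}
    # Accept only an unambiguous match; otherwise leave it unresolved.
    return next(iter(matches)) if len(matches) == 1 else None
```

Requiring exactly one match avoids silently picking the wrong file when a short prefix happens to fit several database entries.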
Abdessamad Derraz
6ee162f8fb chore: add MAME and RetroDECK ROM sets 2026-03-19 23:26:49 +01:00
monster-penguin
1fcb948a00 Add RetroDECK Platform Support (#36)
* Add files via upload

* Add files via upload

* Update _registry.yml
2026-03-19 17:10:37 +01:00
Abdessamad Derraz
6a21a99c22 feat: platform-core registry for exact pack generation
resolve_platform_cores() links platforms to their cores via
three strategies: all_libretro, explicit list, system ID
fallback. Pack generation always includes core requirements
beyond platform baseline. Case-insensitive dedup prevents
conflicts on Windows/macOS. Data dir strip_components fixes
doubled paths for Dolphin and PPSSPP caches.
2026-03-19 16:10:43 +01:00
Abdessamad Derraz
257ec1a527 fix: round 2 audit fixes, updated emulator profiles
Scripts:
- fix generate_site nav regex destroying mkdocs.yml content
- fix auto_fetch comma-separated MD5 in find_missing
- fix verify print_platform_result conflating untested/missing
- fix validate_pr path traversal and symlink check
- fix batocera_scraper brace counting and escaped quotes in strings
- fix emudeck_scraper hash search crossing function boundaries
- fix pipeline.py cwd to repo root via Path(__file__)
- normalize SHA1 comparison to lowercase in generate_pack

Emulator profiles:
- emux_gb/nes/sms: reclassify from alias to standalone profiles
- ep128emu: remove .info-only files not referenced in source
- fbalpha2012 variants: full source-verified profiles
- fbneo_cps12: add new profile
2026-03-19 15:00:18 +01:00
Abdessamad Derraz
38d605c7d5 fix: audit fixes across verify, pack, security, and performance
- fix KeyError in compute_coverage (generate_readme, generate_site)
- fix comma-separated MD5 handling in generate_pack check_inside_zip
- fix _verify_file_hash to handle multi-MD5 for large files
- fix external downloads not tracked in seen_destinations/file_status
- fix tar path traversal in _is_safe_tar_member (refresh_data_dirs)
- fix predictable tmp path in download.py
- fix _sanitize_path to filter "." components
- remove blanket data_dir suppression in find_undeclared_files
- remove blanket data_dir suppression in cross_reference
- add status_counts to verify_platform return value
- add md5_composite cache for repeated ZIP hashing
2026-03-19 14:04:34 +01:00
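
The tar path traversal fix in 38d605c7d5 concerns members that would extract outside the destination directory. One common way to guard against that is sketched here; it is not the repository's `_is_safe_tar_member`, and rejecting every symlink and hard link is an assumed, deliberately strict policy.

```python
import os
import tarfile


def is_safe_tar_member(member: tarfile.TarInfo, dest_dir: str) -> bool:
    """Return True only if the member would extract inside dest_dir."""
    if member.issym() or member.islnk():
        return False                      # links can point outside the tree
    dest_root = os.path.realpath(dest_dir)
    # An absolute member name replaces dest_root in join(), so it fails below.
    target = os.path.realpath(os.path.join(dest_root, member.name))
    try:
        return os.path.commonpath([dest_root, target]) == dest_root
    except ValueError:
        return False                      # e.g. different drives on Windows
```

A caller would filter `archive.getmembers()` through this check and pass only the surviving members to `extractall(members=...)`.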
Abdessamad Derraz
e1410ef4a6 fix: exclusion reasons from YAML, not hardcoded in Python
Added exclusion_note field to emulator profiles. verify.py reads
this field instead of parsing notes text with fragile keywords.

desmume2015: explains .info vs code discrepancy
dolphin_launcher: explains standalone BIOS management

All exclusion messages now come from YAML data, not Python strings.
2026-03-19 13:17:55 +01:00
Abdessamad Derraz
114732dc6d feat: intentional exclusion notes in verify report
New section "Intentional exclusions" explains why certain emulator
files are NOT in the pack:
- [frozen_snapshot]: code doesn't load .info firmware (desmume2015)
- [launcher]: BIOS managed by standalone emulator (dolphin_launcher)
- [standalone_only]: files for standalone mode, not libretro

Makes it clear that omissions are by design, not bugs.
2026-03-19 13:15:26 +01:00
Abdessamad Derraz
2509c61ffe feat: detailed core gap categories in verify report 2026-03-19 13:12:14 +01:00
Abdessamad Derraz
316d2467eb refactor: replace emulator extras with cross-reference discovery
_collect_emulator_extras() now uses find_undeclared_files() from
verify.py instead of manual emulator name lists. This gives:
- System-overlap matching (automatic, no manual config needed)
- mode: standalone filtering (no standalone files in libretro packs)
- type: launcher filtering (no launcher BIOS in system_dir)
- data_directories coverage (no false gaps)
- hle_fallback propagation
- Works for ANY platform (same logic for RetroArch, Batocera, etc.)

RetroArch --include-extras now discovers 91 extra files from
emulator profiles automatically.
2026-03-19 13:10:36 +01:00
Abdessamad Derraz
d5daf98e5e feat: hle_fallback field + launcher filtering in verify
Added hle_fallback: true/false per file in emulator profiles.
When a core has HLE and the file is missing, severity downgrades
to INFO instead of CRITICAL — core works without it.

verify.py builds an HLE index from emulator profiles and applies
it during severity computation. Cross-reference now skips launcher
profiles (type: launcher) and includes hle_fallback in undeclared
file reports.

33 E2E tests (4 new: HLE severity, HLE index, launcher skip,
cross-ref HLE). 0 regressions.

Based on source code analysis:
- RetroArch core_info.c:2233 — existence check only, no blocking
- PCSX ReARMed psxbios.c:28 — full HLE BIOS replacement
- Dolphin CommonPaths.h — all files optional with HLE
- snes9x — DSP HLE built-in, coprocessor files optional
2026-03-19 12:51:52 +01:00
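
The severity downgrade from d5daf98e5e can be modeled in a few lines. The profile shape used here (a `files` list whose entries carry `name` and `hle_fallback`) and both function names are assumptions for illustration; only the INFO-instead-of-CRITICAL rule comes from the commit message.

```python
from typing import Iterable, Mapping


def build_hle_index(profiles: Iterable[Mapping]) -> set[str]:
    """Collect names of files that some profile marks as hle_fallback: true."""
    index: set[str] = set()
    for profile in profiles:
        for entry in profile.get("files", []):
            if entry.get("hle_fallback"):
                index.add(entry["name"])
    return index


def missing_file_severity(filename: str, hle_index: set[str]) -> str:
    """A missing file is CRITICAL unless the core can fall back to HLE."""
    return "INFO" if filename in hle_index else "CRITICAL"
```

Building the index once and reusing it across platforms mirrors the caching concern raised in 6d9edc5110, where profiles were previously reloaded per platform.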
Abdessamad Derraz
6d9edc5110 fix: review findings — hoist constants, cache emu profiles, renumber steps
- Hoist sev_order/sev_prio dicts to module-level constants (was rebuilt
  every loop iteration)
- Cache emulator profiles across platforms in verify main() (was loading
  260 YAMLs per platform, now loaded once)
- Renumber resolve_local_file steps 1-5 (was 1,2,3,5,6 after removal)
- Pass emu_profiles through verify_platform → find_undeclared_files
2026-03-19 11:22:58 +01:00
Abdessamad Derraz
b9cdda07ee refactor: DRY consolidation + 83 unit tests
Moved shared functions to common.py (single source of truth):
- check_inside_zip (was in verify.py, imported by generate_pack)
- build_zip_contents_index (was duplicated in verify + generate_pack)
- load_emulator_profiles (was in verify, cross_reference, generate_site)
- group_identical_platforms (was in verify + generate_pack)

Added tests/ with 83 unit tests covering:
- resolve_local_file: SHA1, MD5, name, alias, truncated, zip_contents
- verify: existence, md5, zipped_file, multi-hash, severity mapping
- aliases: field parsing, by_name indexing, beetle_psx field rename
- pack: dedup, file_status, zipped_file inner check, EmuDeck entries
- severity: all 12 combinations, platform-native behavior

0 regressions: pipeline.py --all produces identical results.
2026-03-19 11:19:50 +01:00
Abdessamad Derraz
e240c70126 feat: complete platform-native verification with cross-reference
verify.py output now uses platform-native terminology:
- md5 platforms: X/Y OK, N untested, M missing
- existence platforms: X/Y present, M missing

Each problem shows (required/optional) from platform YAML.

Core gaps section summarizes undeclared files by severity:
- required NOT in repo: critical gaps needing sourcing
- required in repo: can be added to platform config
- optional: informational

Consistency check in pipeline.py updated to match new format.
All 7 platforms verified, consistency OK across verify and pack.
2026-03-19 10:44:17 +01:00