19 Commits

Author SHA1 Message Date
Abdessamad Derraz
73ccb216f5 feat: align gap analysis coherence, add 7 BIOS files, unsourceable field
cross_reference.py: add bios_mode/agnostic/load_from filters, archive
grouping, unsourceable field support. verify.py: case-insensitive
by_name lookup, storage:release in_repo, unsourceable skip, trailing
slash fix. generate_site.py: enriched all_declared, platform-relevant
profile filtering, proper in_repo resolution on emulator pages,
acknowledged gaps section.

New BIOS: delta2.rom (XRoar), tilekey.dat + sprites.sif (NXEngine),
Gram Kracker.ctg + cf7+.ctg + ti-pcard.ctg (ti99sim), desc.dat
(SDLPAL). Profiles: hle_fallback on tilekey.dat/key.txt, unsourceable
on 7 files with source-verified reasons.
2026-04-02 15:35:24 +02:00
Abdessamad Derraz
91925120c9 feat: unify gap analysis with verify results and source provenance
Single source of truth for gap page: verification status from
verify.py (verified/untested/missing/mismatch), file provenance
from cross_reference (bios/data/large_file/missing).

cross_reference.py: _find_in_repo -> _resolve_source returning
source category, stop skipping storage: release/large_file,
add by_path_suffix lookup, all_declared param for global check.

generate_site.py: gap page now shows verification by platform,
18 hash mismatches, and core complement with provenance breakdown.
2026-04-01 22:33:37 +02:00
Abdessamad Derraz
0a272dc4e9 chore: lint and format entire codebase
Run ruff check --fix: remove unused imports (F401), fix f-strings
without placeholders (F541), remove unused variables (F841), fix
duplicate dict key (F601).

Run isort --profile black: normalize import ordering across all files.

Run ruff format: apply consistent formatting (black-compatible) to
all 58 Python files.

3 intentional E402 remain (imports after require_yaml() must execute
after yaml is available).
2026-04-01 13:17:55 +02:00
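The E402 note above follows from the guard pattern: the check has to run before anything that imports yaml at module level. A minimal sketch of such a guard, assuming a hypothetical generic helper `require_module` (the real `require_yaml()` is not shown in this log, and its message and behavior may differ):

```python
import importlib
import sys


def require_module(name, hint):
    """Import and return a module, or exit with a readable message.

    Hypothetical stand-in for a require_yaml()-style guard; the exit
    message format is an assumption.
    """
    try:
        return importlib.import_module(name)
    except ImportError:
        sys.exit(f"error: the '{name}' module is required ({hint})")


# The guard runs first; imports that depend on the module come after it,
# which is why a linter reports them as E402 (import not at top of file).
json_mod = require_module("json", "part of the standard library")
```

Centralizing the guard keeps the error message in one place, at the cost of a few late imports that the linter has to be told to tolerate.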
Abdessamad Derraz
b4c5d77e4b refactor: deduplicate yaml import pattern via require_yaml()
2026-03-29 17:07:27 +02:00
Abdessamad Derraz
7492777b47 fix: skip storage: release files in cross-reference
Files with storage: release are in GitHub release assets,
not in bios/. Eliminates donpachi/sfz3mix/twotiger false
positives. 149/149 tests pass. Cross-ref: 10 -> 7.
2026-03-29 07:28:12 +02:00
Abdessamad Derraz
a369defc15 fix: skip path: null entries in cross-reference
Files with explicit path: null are UI-imported (Dolphin NAND,
Hatari cartridge) and not resolvable by pack placement. Skip
them in find_undeclared_files and cross_reference. Also add
desc.dat (SDLPAL fan-made descriptions) to data/. 149/149 OK.
2026-03-29 07:26:40 +02:00
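The skip described above depends on telling an explicit `path: null` apart from a missing `path` key, since only the former marks a UI-imported file. A minimal sketch, assuming entries are plain dicts as produced by a YAML loader; `is_ui_imported` and the sample entries are hypothetical:

```python
def is_ui_imported(entry):
    """True only when the entry declares path explicitly as null.

    A YAML loader turns `path: null` into a key whose value is None,
    while an entry with no path field has no such key at all.
    """
    return "path" in entry and entry["path"] is None


entries = [
    {"name": "explicit-null.bin", "path": None},   # skipped
    {"name": "no-path-field.bin"},                 # kept
    {"name": "placed.bin", "path": "placed.bin"},  # kept
]
to_check = [e for e in entries if not is_ui_imported(e)]
```

Using `entry.get("path") is None` instead would also skip entries that simply omit the field, which is not what the commit describes.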
Abdessamad Derraz
ddf2937f41 fix: eliminate cross-reference false positives
Skip placeholder names (<bios>.bin), resolve by MD5/SHA1 hash
match for alias files, fix directory basename extraction for
trailing slash entries, index bios/ directory names for
directory-type file entries. 1011 -> 113 true missing.
149/149 tests pass.
2026-03-28 19:24:16 +01:00
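Two of the fixes above are easy to illustrate: placeholder names and trailing-slash directory entries. The sketch below is an assumption about how such checks could look, not the code in cross_reference.py; `is_placeholder` and `entry_basename` are hypothetical names:

```python
import posixpath
import re

# Matches an angle-bracket template segment such as "<bios>".
PLACEHOLDER = re.compile(r"<[^<>]+>")


def is_placeholder(name):
    """True for template names like "<bios>.bin" that no real file can match."""
    return bool(PLACEHOLDER.search(name))


def entry_basename(path):
    """Basename of a file or directory entry, tolerating a trailing slash."""
    # posixpath.basename("some/dir/") returns "", so strip the slash first.
    return posixpath.basename(path.rstrip("/"))
```

Without the `rstrip("/")`, every directory-type entry would resolve to an empty name and be reported as missing.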
Abdessamad Derraz
42f2cc5617 fix: cross-reference resolves by path: field as fallback
Many emulator profiles use descriptive names (e.g., "SeaBIOS
(128 KB)") while files exist under their path: field basename
(e.g., "bios.bin"). Try path: when name: fails. Eliminates
206 false positives. True missing: 448 -> 242.
2026-03-28 18:06:04 +01:00
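The fallback above can be sketched as a two-step lookup: try `name:` first, then the basename of `path:`. This is an illustration under assumptions; `resolve_entry`, the index layout, and the repository path in the example are hypothetical, and the lowercase keys assume the case-insensitive lookup mentioned in 7653d5d108:

```python
import posixpath


def resolve_entry(entry, index):
    """Return the indexed location for a profile entry, or None.

    `index` maps lowercase file names to repository locations.
    """
    name = entry.get("name")
    if name and name.lower() in index:
        return index[name.lower()]
    path = entry.get("path")
    if path:
        base = posixpath.basename(path.rstrip("/")).lower()
        if base in index:
            return index[base]
    return None


index = {"bios.bin": "bios/example/bios.bin"}  # hypothetical location
entry = {"name": "SeaBIOS (128 KB)", "path": "bios.bin"}
location = resolve_entry(entry, index)
```

The descriptive name misses the index, so the entry resolves through its `path:` basename instead of being counted as missing.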
Abdessamad Derraz
76fe7dd76f fix: cross-reference checks inside ZIP archives
_build_supplemental_index scans both data/ directories and
contents of bios/ ZIP files. Eliminates 197 false positives
where files existed inside archive ZIPs (neogeo.zip, pgm.zip,
stvbios.zip, etc.) but were counted as missing. True missing
drops from 645 to 448.
2026-03-28 18:00:11 +01:00
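Looking inside archives, as described above, amounts to indexing ZIP member names alongside loose files. A minimal sketch using the standard `zipfile` module; `index_zip_members` and the demo archive are hypothetical and do not reproduce `_build_supplemental_index`:

```python
import io
import zipfile


def index_zip_members(zip_source):
    """Map lowercase member basenames to their full names inside a ZIP."""
    members = {}
    with zipfile.ZipFile(zip_source) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            base = info.filename.rsplit("/", 1)[-1].lower()
            # Keep the first occurrence if two members share a basename.
            members.setdefault(base, info.filename)
    return members


# Build a small in-memory archive so the example needs no files on disk.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("sub/Example.rom", b"demo")
buf.seek(0)
demo_index = index_zip_members(buf)
```

A file that exists only as `sub/Example.rom` inside the archive is now found under the key `example.rom` rather than reported as missing.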
Abdessamad Derraz
3092d73122 fix: cross-reference checks data/ directories for false positives
_find_in_repo and _name_in_index now scan data/ in addition to
bios/ via database.json. Eliminates 129 false positives from
game data migrated to data/ (OpenTyrian, ScummVM, SDLPAL, Cave
Story, Syobon Action). True missing: 782 -> 653.
2026-03-28 17:31:22 +01:00
Abdessamad Derraz
de58f3f28e feat: add --platform and --target to cross_reference.py
2026-03-26 08:48:41 +01:00
Abdessamad Derraz
d2cc9b8f29 feat: add doom engine wad files, emulatorjs base config
2026-03-25 23:12:53 +01:00
Abdessamad Derraz
38d605c7d5 fix: audit fixes across verify, pack, security, and performance
- fix KeyError in compute_coverage (generate_readme, generate_site)
- fix comma-separated MD5 handling in generate_pack check_inside_zip
- fix _verify_file_hash to handle multi-MD5 for large files
- fix external downloads not tracked in seen_destinations/file_status
- fix tar path traversal in _is_safe_tar_member (refresh_data_dirs)
- fix predictable tmp path in download.py
- fix _sanitize_path to filter "." components
- remove blanket data_dir suppression in find_undeclared_files
- remove blanket data_dir suppression in cross_reference
- add status_counts to verify_platform return value
- add md5_composite cache for repeated ZIP hashing
2026-03-19 14:04:34 +01:00
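The comma-separated MD5 fixes in the list above come down to treating the expected value as a set of accepted hashes. A minimal sketch, assuming the field is a single comma-separated string; `md5_matches` is a hypothetical helper, not the signature of `_verify_file_hash`:

```python
import hashlib


def md5_matches(data, expected):
    """True if the MD5 of `data` equals any hash in a comma-separated list."""
    accepted = {h.strip().lower() for h in expected.split(",") if h.strip()}
    return hashlib.md5(data).hexdigest() in accepted


# The MD5 of empty input is d41d8cd98f00b204e9800998ecf8427e.
multi = "00000000000000000000000000000000, D41D8CD98F00B204E9800998ECF8427E"
ok = md5_matches(b"", multi)
```

Comparing the digest against the raw string would fail as soon as a second hash is listed, which is the class of bug those bullets describe.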
Abdessamad Derraz
b9cdda07ee refactor: DRY consolidation + 83 unit tests
Moved shared functions to common.py (single source of truth):
- check_inside_zip (was in verify.py, imported by generate_pack)
- build_zip_contents_index (was duplicated in verify + generate_pack)
- load_emulator_profiles (was in verify, cross_reference, generate_site)
- group_identical_platforms (was in verify + generate_pack)

Added tests/ with 83 unit tests covering:
- resolve_local_file: SHA1, MD5, name, alias, truncated, zip_contents
- verify: existence, md5, zipped_file, multi-hash, severity mapping
- aliases: field parsing, by_name indexing, beetle_psx field rename
- pack: dedup, file_status, zipped_file inner check, EmuDeck entries
- severity: all 12 combinations, platform-native behavior

0 regressions: pipeline.py --all produces identical results.
2026-03-19 11:19:50 +01:00
Abdessamad Derraz
86dbdf28e5 feat: core profiles, data_dirs buildbot, cross_ref fix
profiles: amiberry (new), amiarcadia, atari800, azahar, b2,
bk, blastem, bluemsx, freeintv updated with source refs,
upstream field, mode field, data_directories.

_data_dirs.yml: buildbot source for retroarch platforms,
strip_components for nested ZIPs, freeintv-overlays fixed.

cross_reference.py: data_directories-aware gap analysis,
suppresses false gaps when emulator+platform share refs.

refresh_data_dirs.py: ZIP strip_components support,
for_platforms filter, ETag freshness for buildbot.

scraper: bluemsx single ref, freeintv overlays injection.
generate_pack.py: warning on missing data directory cache.
2026-03-18 21:20:02 +01:00
Abdessamad Derraz
846640dd7c feat: emulator mode field, archive ZX81 standalone ROMs
emulator profiles support mode: standalone | libretro | both.
cross_reference.py skips standalone-only files for libretro platforms.
81.yml: type standalone + libretro, upstream ref added, files listed
with mode: standalone and source_refs to both codebases.
bios/Sinclair/ZX 81/: zx81.rom (8K) and dkchr.rom (4K) archived.
2026-03-18 17:37:01 +01:00
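The mode filter above can be sketched as a simple predicate over file entries. This assumes each entry may carry its own `mode` and that a missing value means `both`; that default, the helper name, and the second sample entry are assumptions for illustration:

```python
def files_for_libretro(files):
    """Drop entries that only the standalone emulator loads."""
    return [f for f in files if f.get("mode", "both") != "standalone"]


profile_files = [
    {"name": "zx81.rom", "mode": "standalone"},
    {"name": "shared.rom"},  # hypothetical entry with no mode field
]
libretro_files = files_for_libretro(profile_files)
```

With this filter, a libretro platform is not flagged for missing files that only the standalone build reads.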
Abdessamad Derraz
7653d5d108 feat: add 19 BIOS files, fix cross_reference resolution
New files: OpenTyrian data (11), Cave Story (2), SeaBIOS,
VGA BIOS, OpenSBI, Cromwell, xbox_hdd, Sega CD Model 2 (3),
NGP Color BIOS, Pentagon 128p-1.rom, X1 font, BK TERAK.
cross_reference.py: basename + case-insensitive lookup.
2026-03-18 12:50:55 +01:00
Abdessamad Derraz
08f68e792d refactor: centralize hash logic, fix circular imports and perf bottlenecks
2026-03-18 11:51:12 +01:00
Abdessamad Derraz
9052a6b750 feat: add emulator profiles and cross-reference engine (tier 2)
New two-tier architecture:
- Tier 1: Platform configs (what the UI checks) - unchanged
- Tier 2: Emulator profiles (what the code actually loads)

11 emulator profiles from source code analysis:
  cemu, citra, dolphin, duckstation, flycast,
  melonds, pcsx2, ppsspp, rpcs3, vita3k, xemu

Each profile documents every file the emulator loads with
source code references (file:line), hashes, and notes.

New scripts/cross_reference.py computes gaps between what
platforms declare and what emulators need.

Current gap: 200 undeclared files, 24 already in repo.
DuckStation alone recognizes 105 PS1/PS2 BIOS variants.

generate_pack.py gains --include-extras flag (future use).
_registry.yml maps platforms to their emulators.
2026-03-17 20:08:27 +01:00
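The gap between the two tiers described above is, at its simplest, a set difference. A minimal sketch under that assumption; `compute_gap` and the file names are hypothetical, and the real cross_reference.py resolves names in more ways than exact matching:

```python
def compute_gap(declared, needed, in_repo):
    """Compare what platforms declare with what emulators load.

    Returns the files emulators need that no platform declares, and the
    subset of those that the repository already contains.
    """
    undeclared = set(needed) - set(declared)
    return {
        "undeclared": sorted(undeclared),
        "already_in_repo": sorted(undeclared & set(in_repo)),
    }


gap = compute_gap(
    declared=["a.bin", "b.bin"],
    needed=["a.bin", "c.bin", "d.bin"],
    in_repo=["a.bin", "c.bin"],
)
```

Here `c.bin` and `d.bin` are undeclared, and `c.bin` is already in the repository, mirroring the "undeclared" and "already in repo" counts reported in the commit.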