Replace grep-based restore with SHA1 matching via database.json.
The old grep heuristic failed for assets with renamed basenames
(dsi_nand_batocera42.bin) or special characters (MAME dots vs
spaces), and only restored to the first .gitignore match when
multiple paths shared a basename.
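A minimal sketch of restoring by SHA1 rather than by basename. The function name `restore_targets` and the `sha1_to_paths` mapping are hypothetical; the real database.json layout is not shown here, so the index shape is an assumption.

```python
import hashlib


def restore_targets(asset_bytes: bytes, sha1_to_paths: dict[str, list[str]]) -> list[str]:
    """Return every repo path whose recorded SHA1 matches the asset.

    sha1_to_paths is an assumed index (hex SHA1 -> list of paths); it is
    not the actual database.json schema.
    """
    digest = hashlib.sha1(asset_bytes).hexdigest()
    # Matching on content means renamed basenames and special characters
    # in filenames no longer matter, and all matching paths are returned.
    return sha1_to_paths.get(digest, [])
```

Because the lookup returns a list, an asset shared by several paths can be restored to all of them instead of only the first match.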
Fix 3 broken data directory sources:
- opentyrian: buildbot URL 404, use release asset
- syobonaction: invalid git_subtree URL, use GitHub archive
- stonesoup: same fix, adds 532 game data files
generate_site.py resolves files on disk for gap analysis.
Without large files and data directories, the deployed site
showed 148 missing platform files and 207 unsourced core
complement files.
Add and reorder BIOS path entries in the site generator (BizHawk, EmuDeck, RetroPie, RomM). Update the add-platform pipeline steps and CI workflow notes. Document verification behavior changes: FirmwareDatabase index now includes sha256; RomM uses MD5 verification (verify.py checks MD5 only); BizHawk uses SHA1; severity label for GREEN adjusted to WARNING. Clarify troubleshooting/verify output semantics (UNTESTED and mismatch reporting), add profiling fields (core_classification option and adler32), fix several path and link typos (RetroDECK path, README/CONTRIBUTING links), and other small docs polishing.
Add an "Add a new platform" section to CONTRIBUTING.md (instructions to write a scraper in scripts/scraper/, create platform YAML in platforms/, register in platforms/_registry.yml, and submit a PR) and note contributor crediting. Update README: bump verified files from 7,296 to 7,302, add RomM and RetroDECK contributor credits with PR links, and refresh the auto-generated timestamp. Add sdlpal to the mkdocs.yml navigation.
Add MSX2J.rom (sha1: 0081ea0d25bc5cd8d70b60ad8cfdc7307812c0fd, size: 32768) to multiple install manifests and the RetroDECK bios list. Update generated timestamps and adjust total_files/total_size counts in batocera, lakka, recalbox, retroarch, retrobat, and retrodeck manifests. Also bump README verified file count and regenerate the auto-generated timestamp to reflect the new entry.
SwanStation accepts PS1 (512KB), PS2 (4MB), and PS3 (0x3E66F0)
BIOS sizes but only uses the first 512KB. MD5 validates the
extracted content, not the full file. List all accepted sizes
to eliminate the false size mismatch discrepancy.
validation.py: support size as list in emulator profiles.
generate_site.py: handle list sizes in emulator page display.
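The list-valued size check can be sketched as follows. `size_matches` is a hypothetical helper, not the actual function in validation.py; the three sizes are the ones listed above for SwanStation.

```python
def size_matches(actual: int, expected) -> bool:
    """Accept a single expected size, a list of sizes, or no constraint.

    Hypothetical helper: the real validation.py signature is not shown
    in the commit message.
    """
    if expected is None:
        return True  # profile declares no size constraint
    accepted = expected if isinstance(expected, list) else [expected]
    return actual in accepted


# PS1 (512KB), PS2 (4MB) and PS3 (0x3E66F0) BIOS sizes from the message above.
SWANSTATION_SIZES = [512 * 1024, 4 * 1024 * 1024, 0x3E66F0]
```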
All 18 original hash mismatches are now resolved: 0 discrepancies.
scph3000.bin v2.1J and scph3500.bin v2.2J already existed under
different primary names (scph3500.bin and scph5000.bin respectively).
Add .variants/ entries so by_name resolves both filenames.
verify_single_emulator now calls _find_best_variant on hash mismatch,
matching the platform-level verification path.
Source: Subtixx/RetroStation MSX2J.rom
SHA256 0c672d86 matches ares desktop-ui/emulator/msx2.cpp:15.
Resolves last MSX2.ROM discrepancy across all platforms.
generate_db: add by_sha256 index for O(1) variant lookup.
verify: _find_best_variant uses indexed sha256 instead of O(n) scan.
validation: check_file_validation returns (reason, emulators) tuple,
attributing mismatch only to emulators whose check actually failed.
beetle_psx: remove incorrect size field for ps1_rom.bin (code does
not validate size, swanstation is sole size authority).
Dolphin computes adler32 on byte-swapped (16-bit) data, not raw
file bytes. Add adler32_byteswap flag to dolphin/primehack/ishiiruka
profiles and support it in validation.py.
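A sketch of what an adler32 over 16-bit byte-swapped data could look like. The function name is hypothetical, and leaving a trailing odd byte in place is an assumption; the commit message does not say how odd lengths are handled.

```python
import zlib


def adler32_byteswapped(data: bytes) -> int:
    """Compute adler32 after swapping the two bytes of each 16-bit word.

    Assumption: a trailing odd byte is left unswapped.
    """
    swapped = bytearray(data)
    even = len(data) - (len(data) % 2)
    # Exchange even-indexed and odd-indexed bytes pairwise.
    swapped[0:even:2], swapped[1:even:2] = data[1:even:2], data[0:even:2]
    return zlib.adler32(bytes(swapped)) & 0xFFFFFFFF
```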
Reduces hash mismatch discrepancies from 18 to 2.
_find_best_variant now searches by hash (md5, sha1, crc32, sha256)
across the entire database instead of only by filename. Finds
variants stored under different names (e.g. eu_mcd2_9306.bin for
bios_CD_E.bin, scph1001_v20.bin for scph1001.bin).
verify_entry_existence now also calls _find_best_variant to
suppress discrepancies when a matching variant exists in the repo.
Reduces false discrepancies from 22 to 11 (4 unique files where
the variant genuinely does not exist in the repo).
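The hash-based variant search can be illustrated with a small index per hash type. This is a sketch only: `build_hash_indexes`, `find_variant`, and the `files` mapping shape are hypothetical and do not reproduce _find_best_variant or the database.json schema.

```python
from typing import Optional

HASH_KINDS = ("md5", "sha1", "crc32", "sha256")


def build_hash_indexes(files: dict[str, dict]) -> dict[str, dict[str, list[str]]]:
    """Build one lookup table per hash kind: hash value -> list of paths."""
    indexes: dict[str, dict[str, list[str]]] = {kind: {} for kind in HASH_KINDS}
    for path, meta in files.items():
        for kind in HASH_KINDS:
            value = meta.get(kind)
            if value:
                indexes[kind].setdefault(value.lower(), []).append(path)
    return indexes


def find_variant(indexes: dict, kind: str, expected: str) -> Optional[str]:
    """Return the first path whose hash matches, regardless of filename."""
    matches = indexes.get(kind, {}).get(expected.lower(), [])
    return matches[0] if matches else None
```

Since the lookup key is the hash, a file stored under a different name is still found, which is the behavior the message describes for eu_mcd2_9306.bin and scph1001_v20.bin.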
Single source of truth for gap page: verification status from
verify.py (verified/untested/missing/mismatch), file provenance
from cross_reference (bios/data/large_file/missing).
cross_reference.py: _find_in_repo -> _resolve_source returning
source category, stop skipping storage: release/large_file,
add by_path_suffix lookup, all_declared param for global check.
generate_site.py: gap page now shows verification by platform,
18 hash mismatches, and core complement with provenance breakdown.
find_undeclared_files was enriching declared_names with DB aliases,
which filtered out core extras that Phase 1 never packed under that
name. Pass strict YAML names to _collect_emulator_extras so that
alias-only files (dc_bios.bin, amiga-os-310-a1200.rom, scph102.bin,
etc.) get packed at the emulator's expected path. Also fix the truth
mode output message and the --all-variants --verify-packs quick-exit
bypass.
Run ruff check --fix: remove unused imports (F401), fix f-strings
without placeholders (F541), remove unused variables (F841), fix
duplicate dict key (F601).
Run isort --profile black: normalize import ordering across all files.
Run ruff format: apply consistent formatting (black-compatible) to
all 58 Python files.
3 intentional E402 remain (imports after require_yaml() must execute
after yaml is available).
Update wiki source files (the single source of truth for the site):
- tools.md: renumber pipeline steps 1-8, add step 6 (pack integrity),
add missing CLI flags for cross_reference.py and refresh_data_dirs.py
- architecture.md: update mermaid diagram with pack integrity step,
fix test file count (5 files, 249 tests)
- testing-guide.md: add test_pack_integrity section, add step 5 to
verification discipline checklist
Remove 4 unused functions from generate_site.py (generate_wiki_index,
generate_wiki_architecture, generate_wiki_tools, generate_wiki_profiling)
that contained stale data. Wiki pages are sourced from wiki/ directory.
Update generate_site.py contributing section with correct test counts
(249 total, 186 E2E, 8 pack integrity) and pack integrity documentation.
Move verification logic to generate_pack.py --verify-packs (single
source of truth). test_pack_integrity.py is now a thin wrapper that
calls the CLI. Pipeline step 6/8 uses the same CLI entry point.
Renumber all pipeline steps 1-8 (was skipping from 5 to 8/9).
Update generate_site.py with pack integrity test documentation.
Extract each platform ZIP to tmp/ (real filesystem, not /tmp tmpfs)
and verify every declared file exists at the correct path with the
correct hash per the platform's native verification mode.
Handles ZIP inner content verification (checkInsideZip, md5_composite,
inner ROM MD5) and path collision deduplication.
Integrated as pipeline step 6/8. Renumber all pipeline steps to be
sequential (was skipping from 5 to 8).
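The core of the pack integrity check (every declared file present at its path with the expected hash) can be sketched as below. This simplified version reads the ZIP in memory and checks MD5 only; the step described above extracts to tmp/ and follows each platform's native verification mode. `check_pack` and the `declared` mapping are hypothetical.

```python
import hashlib
import io
import zipfile


def check_pack(zip_bytes: bytes, declared: dict[str, str]) -> dict[str, str]:
    """Map each declared path to 'ok', 'mismatch' or 'missing'.

    declared is an assumed mapping of in-pack path -> expected MD5 hex.
    """
    results: dict[str, str] = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        names = set(archive.namelist())
        for path, expected_md5 in declared.items():
            if path not in names:
                results[path] = "missing"
                continue
            actual = hashlib.md5(archive.read(path)).hexdigest()
            results[path] = "ok" if actual == expected_md5.lower() else "mismatch"
    return results
```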
RetroDECK: core extras with subdirectory paths (e.g. vice/C64/,
fbneo/, dc/) were placed outside bios/ because the prefix was only
inferred for bare filenames. Add _detect_extras_prefix() to infer
the dominant BIOS prefix from YAML destinations.
RomM: core extras landed flat at bios/{file} instead of the required
bios/{platform_slug}/{file}. Add _detect_slug_structure() to detect
per-system slug layouts and _map_emulator_to_slug() to route each
extra to the correct slug subfolder.
Also skip manifest writes when only the generated timestamp changed,
preventing unnecessary diffs in install/*.json.
Closes #43
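One way to infer a dominant prefix from YAML destinations, as described for the RetroDECK fix, is to count the first path component. This is a sketch under that assumption; `detect_extras_prefix` and `place_extra` here are illustrative and are not the code of _detect_extras_prefix().

```python
from collections import Counter


def detect_extras_prefix(destinations: list[str]) -> str:
    """Return the most common top-level directory (with trailing slash)."""
    counts = Counter(d.split("/", 1)[0] for d in destinations if "/" in d)
    if not counts:
        return ""  # only bare filenames: no prefix can be inferred
    return counts.most_common(1)[0][0] + "/"


def place_extra(extra_path: str, prefix: str) -> str:
    """Prepend the prefix unless the extra already lives under it."""
    return extra_path if extra_path.startswith(prefix) else prefix + extra_path
```

With this, an extra such as vice/C64/kernal is placed under the prefix just like a bare filename would be.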
FBNeo and Kronos expect BIOS archives in core-specific subdirectories
(system/fbneo/, system/kronos/). RetroArch firmware check uses .info
paths which include these prefixes, so files at root show as Missing.
Add archive_prefix field to emulator profiles. The pack code now places
archive copies in the prefixed subdirectory while keeping root copies
for cores that expect them there (e.g. Geolith for neogeo.zip).
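The placement rule for archive_prefix can be sketched like this. The helper name and the `keep_root` parameter are hypothetical; paths are relative to the system directory.

```python
from typing import Optional


def archive_destinations(archive: str, archive_prefix: Optional[str],
                         keep_root: bool = True) -> list[str]:
    """List where an archive copy should be placed inside the pack.

    Hypothetical helper: keeps the root copy (for cores that expect it
    there) and adds a copy under the core-specific subdirectory.
    """
    destinations = []
    if keep_root or not archive_prefix:
        destinations.append(archive)
    if archive_prefix:
        destinations.append(f"{archive_prefix.strip('/')}/{archive}")
    return destinations
```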
9 new wiki pages: getting-started, faq, troubleshooting,
advanced-usage, verification-modes, adding-a-platform,
adding-a-scraper, testing-guide, release-process.
Updated architecture.md with mermaid diagrams, tools.md with
full pipeline and target/exporter sections, profiling.md with
missing fields, index.md with glossary and nav links.
Expanded CONTRIBUTING.md from stub to full contributor guide.
Filter .old.yml from load_emulator_profiles, generate_db alias
collection, and generate_readme counts. Fix BizHawk sha1 mode
in tools.md, fix RetroPie path, fix export_truth.py typos.
auto-fetched from mamedev/mame 0.287 and finalburnneo/FBNeo v1.0.0.2.
mame: +20 new BIOS root sets, 96 entries enriched with contents.
mamearcade: 47 entries enriched with contents.
mamemess: 20 entries enriched with contents.
fbneo: +13 new ROM entries from upstream BIOS sets.
replace yaml.dump with surgical text edits for contents/source_ref.
preserves comments, block scalars, quoting, indentation.
fix FBNeo new entry detection using parsed keys instead of text search.
sparse clone upstream repos, parse BIOS root sets from C source,
cache as JSON, merge into emulator profiles with backup.
covers macro expansion, version detection, subset profile protection.
write_if_changed in common.py compares content after stripping
timestamps (generated_at, Auto-generated on, Generated on).
applied to generate_db, generate_readme, generate_site.
eliminates timestamp-only diffs in database.json, README.md,
mkdocs.yml, and 423 docs pages.
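A sketch of a timestamp-insensitive write, assuming the three markers named above appear in the forms matched by the patterns below. The regular expressions and the function signature are assumptions, not the actual code in common.py.

```python
import re
from pathlib import Path

# Assumed textual forms of the three timestamp markers.
_TIMESTAMP_PATTERNS = [
    re.compile(r'"generated_at":\s*"[^"]*"'),
    re.compile(r"Auto-generated on [^\n]*"),
    re.compile(r"Generated on [^\n]*"),
]


def strip_timestamps(text: str) -> str:
    """Remove timestamp fragments so two versions can be compared."""
    for pattern in _TIMESTAMP_PATTERNS:
        text = pattern.sub("", text)
    return text


def write_if_changed(path: Path, content: str) -> bool:
    """Write content unless only the timestamps differ. Returns True if written."""
    if path.exists():
        existing = path.read_text(encoding="utf-8")
        if strip_timestamps(existing) == strip_timestamps(content):
            return False
    path.write_text(content, encoding="utf-8")
    return True
```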
bios_mode: agnostic (profile) and agnostic: true (file) for
emulators that accept any valid BIOS without specific filename.
find_undeclared_files skips agnostic entries, pack extras scan
includes all matching DB files by path prefix + size criteria,
resolve_local_file has agnostic fallback with rename README.
applied to pcsx2, lrps2 (bios_mode), melonds dsi_nand (file).
aliases must be same-SHA1 alternative names, not distinct files.
pcsx2: 164 different BIOS dumps are separate DB entries, not aliases.
melonds: 6 regional NAND dumps are separate DB entries, not aliases.
also cleans pcsx2 non-standard fields, fixes display_name.
expand_platform_declared_names resolves platform file MD5s
through the database to recover canonical names and aliases,
eliminating false positive undeclared files when a platform
renames a file (e.g. Batocera ROM1 vs gsplus ROM).
Bump database.json generated_at timestamp and add a new emulator metadata file emulators/ti99sim.yml. The new YAML registers the TI-99/Sim standalone emulator (ti99sim) with system info, core version, notes on ROM handling, and multiple required/optional ROM entries (with SHA1s, sizes and validation notes) for TI-99/4A support.
Add Atari 800 OS Rev B NTSC (CRC32 0e86d61d, canonical sysrom.c
match) and National FS-5500 disk controller ROM for openMSX.
Remove ROM_400/800_CUSTOM from retrodeck.yml (config slot key with
forward slash in name, not a real file).
Add many MAME/MESS BIOS entries (TRS-80 family, Bandai RX-78, Sega AI) and update docs/navigation counts (README, mkdocs). Remove empty supplemental file references from database.json and update generated timestamps and totals. Harden and refactor tooling: add MAX_RESPONSE_SIZE limited reader in base_scraper, make target scrapers an abstract base, narrow exception handling in the Batocera targets parser, and switch generate_pack.py and verify.py to use build_target_cores_cache (simplifies target config loading and error handling). verify.py also loads supplemental cross-reference names and accepts them through verify_platform. Update tests to import from updated modules (validation/truth). Misc: small bugfix for case-insensitive path conflict check.
Add Casio PV-2000 BIOS entry (pv2000.zip) to MAME and MESS profiles and update system lists/counts. Add Funtech Super A'Can BIOS entries (supracan.zip and umc6650.zip) with ROM contents to mamemess. Simplify and condense Vita3K emulator profile (rename fields, update profiled_date, add PSVUPDAT.PUP and optional PSP2UPDAT.PUP file entries, and clarify install/partition behavior). Bump database generated_at timestamp and add a system alias mapping "psvita" -> "sony-playstation-vita" in scripts/common.py.
Regenerate database.json and update README counts/timestamps; add and normalize numerous BIOS entries and hashes. Key changes: update generated_at timestamp and system count (355→357) in README; add OpenBIOS / HLE fallback and additional aliases to beetle_psx, include beetle_psx core name and profiled_date update; add laseractive to ares systems; adjust atari800 systems and source_ref line numbers; mark dinothawr as a system and expand its note; update gsplus upstream/profile date, add apple-iie system and tweak source_refs; add pcsx2 core to lrps2; refresh mame profiled_date and add multiple systems and BIOS root sets. Miscellaneous script changes and other JSON normalization to reflect newly discovered/merged BIOS files.
Add support for Coleco Adam, Entex Adventure Vision and APF M-1000 BIOS/ROM sets in MAME and MESS metadata (multiple Adam device MCU ROMs and optional FDC/SPI variants, Advision and APF BIOS entries). Update generated metadata across the repo: README coverage numbers and per-platform coverage rows, database generated timestamp and totals (total_files 7245), and various install manifests (notably batocera.json) with new timestamps, adjusted file counts/sizes, SHA1s, repo_path fixes and an added adam_fdc_320kb.zip entry. Also update notes to reflect the new system ROM sets in the emulators entries.
resolve_local_file step 2 (pure MD5 lookup) now verifies that the
found file's name matches the requested name or is a .variants/
derivative. Prevents serving wrong files when an unrelated file
shares the same MD5 in the index (e.g. spi.zip returned for
a7ports.zip because RetroDECK expected an MD5 we don't have).
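The name guard on a pure MD5 hit might look like the following. The variant naming convention (a file under a .variants directory whose name starts with the requested name) is an assumption, as are the function name and the example paths.

```python
import posixpath


def md5_hit_is_acceptable(found_path: str, requested_name: str) -> bool:
    """Accept an MD5 hit only if the name matches or it is a variant.

    Assumption: variants live in a '.variants' directory and their
    filename starts with the requested name.
    """
    found_name = posixpath.basename(found_path).lower()
    wanted = posixpath.basename(requested_name).lower()
    if found_name == wanted:
        return True
    in_variants_dir = ".variants" in found_path.split("/")
    return in_variants_dir and found_name.startswith(wanted)
```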
verify_pack now verifies files against data directory caches and
validates rebuilt MAME ZIPs by comparing inner ROM CRC32s against
source. Reduces false untracked count from 6242 to 0 for RetroArch.
Replaces mode: standalone hack with load_from: save_dir on Panda3DS
files. The load_from field documents which libretro directory callback
provides the base path (system_dir default, save_dir, content_dir).
Pack generator and cross-reference skip files not targeting system_dir.
Files verified by MD5 to be identical to their buildbot-fetched
copies in data/. resolve_local_file data directory fallback ensures
they remain resolvable for verify and pack generation.
240 file-level entries used notes: instead of the canonical note:
field. verify.py and cross_reference.py only read note:, so these
were silently ignored.
49 libretro cores had type: game/utility/test instead of type: libretro,
breaking the all_libretro filtering in resolve_platform_cores and
excluding them from platform packs (e.g. cannonball missing from
RetroArch). core_classification already carries the descriptive role.
9 profiles with subdirectory-loading cores (cannonball/, nxengine/,
Citra/sysdata/, mame2003/, mame2003-plus/, mame2010/) now have path:
fields so cross-reference places files at the correct destination.
resolve_local_file now tries basename when name contains a path
separator (e.g. res/tilemap.bin -> tilemap.bin), fixing resolution
of files with subdirectory names.
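The basename fallback amounts to a second lookup when the first one fails. A minimal sketch, assuming a simple name-to-path index (`by_name` and `resolve_name` are hypothetical):

```python
import posixpath
from typing import Optional


def resolve_name(name: str, by_name: dict[str, str]) -> Optional[str]:
    """Look up the full name first, then the basename if name has a path."""
    if name in by_name:
        return by_name[name]
    if "/" in name:
        # e.g. res/tilemap.bin -> tilemap.bin
        return by_name.get(posixpath.basename(name))
    return None
```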
RetroDECK had stale buildbot hash a17e0e01 for scummvm.zip
(from old 9.5MB build, current is 79MB) copied to cpc464.rom.
RomM had same stale hash. Updated to current verified values.
All 8 platforms now 100% OK, 0 untested.
disksys.rom non-Rev1 variant (SHA256 fdc1a76e, ares-compatible)
from Myrient No-Intro. GameCube dsp_rom.bin + dsp_coef.bin real
hardware dumps (adler32 match Dolphin) from Redump collection.
All placed as .variants/ — primaries unchanged.
Files with storage: release are in GitHub release assets,
not in bios/. Eliminates donpachi/sfz3mix/twotiger false
positives. 149/149 tests pass. Cross-ref: 10 -> 7.
Files with explicit path: null are UI-imported (Dolphin NAND,
Hatari cartridge) and not resolvable by pack placement. Skip
them in find_undeclared_files and cross_reference. Also add
desc.dat (SDLPAL fan-made descriptions) to data/. 149/149 OK.
squirreljme-0.3.0-test.jar and squirreljme-test.jar compiled
from SquirrelJME trunk using Gradle + JDK 17. No prebuilt
artifacts exist (v0.3.0 never released). Built with
romTestSpringCoatRelease target.
mips_bios.bin and mipsel_bios.bin cross-compiled from U-Boot
v2024.10 (malta/maltael_defconfig). u-boot.bin for LEON3 SPARC
cross-compiled from U-Boot v2016.11 (grsim_defconfig) using
Gaisler sparc-gaisler-linux5.10 toolchain. All built in Docker.
4MB barebox bootloader from frantony/barebox GitHub repo.
Only publicly available ROM for QEMU canon-a1100 machine.
Matches FLASH_K8P3215UQB_SIZE in hw/arm/digic_boards.c.
Cores that load files from system_dir subdirectories (same_cdi/bios/,
neocd/, cannonball/, Citra/sysdata/, mame2003/, etc.) need path: on
each file entry so cross-reference and pack generation place files at
the correct destination. Also fixes neocd.yml using non-standard dest:
field instead of path:.
Closes #41
Agent placed Tyrian 2000 files in bios/Game Engines/ instead
of data/opentyrian/. Synced and removed duplicate. OpenTyrian
data lives in data/ via _data_dirs.yml, not in bios/.
FNT1616.X1 (306KB) from NeoKobe X1turboZ BIOS via HTTP range
request on 934MB archive. font2/font3.rom split from M88 6KB
font.rom. pci.rom (32KB) from MAME 0.278 pc9821cx3.zip. All
previously declared "impossible" by multiple agents.
Remove 30 phantom shapes[2-v].dat entries. shapeFile[34] in
lvlmast.c maps to newsh?.shp (enemy sprites), NOT shapes?.dat
(level tilesets). Only 5 shapes exist: ), W, X, Y, Z — the
characters actually referenced by level data. Verified against
trapexit/libretro-opentyrian source. Cross-ref: 54 -> 24.
cd32fmv.rom (v40.030, SHA1 03ca81c7) extracted from Amiga
Forever 11.0.22 ROM pack on archive.org. Verified against
WinUAE rommgr.cpp ROMTYPE_CD32CART entry. Previously
declared "impossible" by 4 separate agents.
satar4mp.bin (256KB) extracted from existing ar_bios.zip
MAME ROM. virtex-ml507.dtb compiled from Linux 4.19 DTS
with dtc. Both resolve long-standing optional missing files.
CLK os.rom (Acorn Electron MOS), exos10.bin and basic10.bin
(Enterprise EXOS/BASIC 1.0) from TOSEC/Myrient — zero
required missing across all platforms. Also add Enterprise
ROMs, ZX Spectrum 128/+2/+3, MSX CLK variants, FBA2012
samples from progettosnaps.
PC-98 ide.rom and scsi.rom from archive.org np2kai_bios pack.
Tyrian 2000 Episode 5 data + shapes + sprites moved to data/
opentyrian/ (53 files from archive.org msdos_Tyrian_2000).
Built from individual ROMs (FMT_SYS, FMT_DOS, FMT_FNT,
FMT_F20, FMT_DIC, MYTOWNS, MAR_EX0-3) using the tsugaru
structured container format (12-byte name + length + data).
Wii SSL certs (clientca/clientcakey/rootca.pem) from archived
Nintendo HTTPS certs. SYSCONF and setting.txt generated with
correct format. Konami GameMaster 1+2 cartridge ROMs for fMSX.
cdos20.rom (Cumana DOS 2.0), yados.rom (YA-DOS 0.5a),
dragonfly-1.3.rom (Ikon Ultra Drive v1.3) from World of
Dragon Archive and CCHDD GitHub. delta-premier.rom variant.
mameinfo.dat for MAME 2000/2003/2003-Plus game info.
8 Xbox BIOS variants (mcpx-1.1, xbox-3944 through 5838)
from Myrient MAME bios-devices. PC-98 SASI controller ROM
for NP2kai/nekop2. Cross-reference: 146 true missing.
resolve_local_file now tries the path: field basename when name:
lookup fails. Fixes 139 PCem false positives where descriptive
names ("MDA font ROM") didn't match actual filenames (mda.rom).
Also add 3 QEMU firmware files (MacROM.bin, bios_loongson3, pmon_2e).
Cross-reference path: fallback already added. 149/149 tests pass.
Many emulator profiles use descriptive names (e.g., "SeaBIOS
(128 KB)") while files exist under their path: field basename
(e.g., "bios.bin"). Try path: when name: fails. Eliminates
206 false positives. True missing: 448 -> 242.
_build_supplemental_index scans both data/ directories and
contents of bios/ ZIP files. Eliminates 197 false positives
where files existed inside archive ZIPs (neogeo.zip, pgm.zip,
stvbios.zip, etc.) but were counted as missing. True missing
drops from 645 to 448.
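Indexing the members of archive ZIPs can be sketched as below. It covers only the ZIP-contents half of what _build_supplemental_index is described as doing, works on in-memory bytes for brevity, and uses a hypothetical function name and index shape.

```python
import io
import posixpath
import zipfile


def index_zip_members(archives: dict[str, bytes]) -> dict[str, list[str]]:
    """Map each inner file name (lowercased) to the archives containing it."""
    index: dict[str, list[str]] = {}
    for archive_name, payload in archives.items():
        with zipfile.ZipFile(io.BytesIO(payload)) as archive:
            for info in archive.infolist():
                if info.is_dir():
                    continue  # skip directory entries
                member = posixpath.basename(info.filename).lower()
                index.setdefault(member, []).append(archive_name)
    return index
```

A file that only exists inside an archive then resolves through this index instead of being counted as missing.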
Previous commit failed to fully revert fuse.yml (agent-added
MD5/SHA1 still present). Also revert qemu.yml and higan_sfc.yml.
All profiles now match their pre-collection state (2666ebd).
Emulator profiles are source-verified ground truth — agents
must never modify them without code verification.
Agents added MD5/SHA1 hashes to fuse.yml computed from downloaded
files, not from source code (ground truth violation: validation
is [size] only). Agents added firmware entries to qemu.yml without
source code verification. Profiles must reflect code, not files.
_find_in_repo and _name_in_index now scan data/ in addition to
bios/ via database.json. Eliminates 129 false positives from
game data migrated to data/ (OpenTyrian, ScummVM, SDLPAL, Cave
Story, Syobon Action). True missing: 782 -> 653.
Group archived files by archive unit in find_undeclared_files instead
of reporting individual ROMs. Add path-based fallback for descriptive
names (e.g. "SeaBIOS (128 KB)" resolves via path: bios.bin). Update
_collect_extras to use archive name for pack resolution. Regenerate
database with new bios files. 6 new E2E tests covering archive
in_repo, missing archives, descriptive names, and pack extras.
Add BIOS files for CLK (Apple II, Mac, Atari ST, Enterprise, MSX,
Commodore, Thomson, PC, Acorn), FBNeo (CPS3 redearthn/sfiii2h),
and QEMU variants. Sources: Asimov mirror, MAME chip extraction,
86Box, Theodore, XRoar upstream, official QEMU repo.
Fix cross-reference in verify.py to group archived files by archive
unit instead of reporting individual ROMs as missing. Add path-based
fallback for descriptive names. Update generate_pack.py extras to
use archive name for resolution. 6 new E2E tests.
MISSING required: 321 -> 6 (zero false positives).
VICE: basic, chargen, kernal for x64sdl libretro core.
QEMU: SeaBIOS 1.16.3 + VGA BIOS from system package.
PCem: MDA font ROM from PCem v17 ROM set.
Freedoom: freedoom1.wad + freedoom2.wad for prboom.
Commander X16: rom.bin R49 from x16-rom releases.
Apple IIGS: ROM01 from asimov mirrors.
DSi: NDS_Bios7i, NDS_Bios9i, DSi_Firmware, SGB_(JU).
- **7,302 files** verified with MD5, SHA1, CRC32 checksums
- **8765 MB** total collection size
## Supported systems
NES, SNES, Nintendo 64, GameCube, Wii, Game Boy, Game Boy Advance, Nintendo DS, Nintendo 3DS, Switch, PlayStation, PlayStation 2, PlayStation 3, PSP, PS Vita, Mega Drive, Saturn, Dreamcast, Game Gear, Master System, Neo Geo, Atari 2600, Atari 7800, Atari Lynx, Atari ST, MSX, PC Engine, TurboGrafx-16, ColecoVision, Intellivision, Commodore 64, Amiga, ZX Spectrum, Arcade (MAME), and 318+ more.
NES, SNES, Nintendo 64, GameCube, Wii, Game Boy, Game Boy Advance, Nintendo DS, Nintendo 3DS, Switch, PlayStation, PlayStation 2, PlayStation 3, PSP, PS Vita, Mega Drive, Saturn, Dreamcast, Game Gear, Master System, Neo Geo, Atari 2600, Atari 7800, Atari Lynx, Atari ST, MSX, PC Engine, TurboGrafx-16, ColecoVision, Intellivision, Commodore 64, Amiga, ZX Spectrum, Arcade (MAME), and 362+ more.
Full list with per-file details: **[https://abdess.github.io/retrobios/](https://abdess.github.io/retrobios/)**