WIP: Software Hashes
Branch: feature/software-hashes
Started: 2026-02-17
Status: In Progress
Plan
Implements docs/plans/software-hashes.md — a derived software_hashes table storing MD5, CRC32 and size for tape-image contents extracted from download zips.
Tasks
- Create data/zxdb/ directory (for JSON snapshot)
- Add software_hashes Drizzle schema model
- Create bin/update-software-hashes.mjs, the main pipeline script:
  - DB query for tape-image downloads (filetype_id IN (8, 22))
  - Resolve local zip path via CDN mapping
  - Extract _CONTENTS (skip if exists)
  - Find tape file (.tap/.tzx/.pzx/.csw) with priority order
  - Compute MD5, CRC32, size_bytes
  - Upsert into software_hashes
  - State file for resume support
  - JSON export after bulk update (atomic write)
- Update bin/import_mysql.sh to reimport snapshot on DB wipe
- Add pnpm script entries
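The resume-state and atomic-write tasks could look roughly like the sketch below. It is not the planned implementation: the helper names (writeJsonAtomic, loadState) and the state shape (lastDownloadId) are hypothetical, and it uses require so that it runs as a standalone script, whereas the .mjs pipeline would use import.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Write JSON atomically: write a sibling temp file, then rename it over the
// target. A rename within one filesystem replaces the target in one step, so
// a reader sees either the old file or the new one, never a partial write.
function writeJsonAtomic(targetPath, data) {
  const tmpPath = `${targetPath}.${process.pid}.tmp`;
  fs.writeFileSync(tmpPath, JSON.stringify(data, null, 2) + '\n');
  fs.renameSync(tmpPath, targetPath);
}

// Load the resume state. A missing file is treated as a fresh run.
// The { lastDownloadId } shape is an assumption made for this sketch.
function loadState(statePath) {
  try {
    return JSON.parse(fs.readFileSync(statePath, 'utf8'));
  } catch (err) {
    if (err.code === 'ENOENT') return { lastDownloadId: 0 };
    throw err;
  }
}

// Demo in a throwaway temp directory.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), 'software-hashes-'));
const demoState = path.join(demoDir, 'state.json');
console.log(loadState(demoState));
writeJsonAtomic(demoState, { lastDownloadId: 42 });
console.log(loadState(demoState));
```

The same writeJsonAtomic helper would serve both the state file and the JSON snapshot, since both need to survive an interrupted run.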
Progress Log
2026-02-17T16:00Z
- Started work. Branch created from main at b361201.
- Explored codebase: understood DB schema, CDN mapping, import pipeline.
- Key findings:
  - filetype_id 8 = "Tape image" (33,427 rows), 22 = "BUGFIX tape image" (98 rows)
  - CDN_CACHE = /Volumes/McFiver/CDN, paths: SC/ (zxdb) and WoS/ (pub)
  - _CONTENTS dirs exist in WoS but not yet in SC
  - data/zxdb/ directory needs creation
  - import_mysql.sh needs software_hashes reimport step
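As a rough illustration of the CDN mapping finding, the sketch below resolves a download link to a local zip path. It assumes that links start with /zxdb/ or /pub/ and that those prefixes map to SC/ and WoS/ respectively; the function name resolveLocalZip and the example link are hypothetical, and the real mapping lives in the existing sync code.

```javascript
const path = require('node:path');

const CDN_CACHE = '/Volumes/McFiver/CDN';

// Assumed mapping from the first link segment to a cache subdirectory.
const PREFIX_TO_DIR = new Map([
  ['zxdb', 'SC'],
  ['pub', 'WoS'],
]);

// Returns the local zip path, or null when the link has an unknown prefix,
// no file part, or a ".." segment that could escape the cache root.
function resolveLocalZip(fileLink, cdnRoot = CDN_CACHE) {
  const [prefix, ...rest] = fileLink.split('/').filter(Boolean);
  const subdir = PREFIX_TO_DIR.get(prefix);
  if (!subdir || rest.length === 0 || rest.includes('..')) return null;
  return path.posix.join(cdnRoot, subdir, ...rest);
}

// Hypothetical link, for illustration only.
console.log(resolveLocalZip('/zxdb/sinclair/games/example.zip'));
```

Returning null rather than throwing lets the pipeline log and skip rows it cannot map, which suits a long bulk run.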
Decisions & Notes
- Target filetype IDs: 8 and 22 (tape image + bugfix tape image).
- Tape file priority: .tap > .tzx > .pzx > .csw (most common first).
- CDN_CACHE hard-coded to /Volumes/McFiver/CDN (same as sync-downloads).
- JSON snapshot at data/zxdb/software_hashes.json.
- Use Node.js built-in crypto for MD5; compute CRC32 with a buffer-based calculation.
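The priority and hashing decisions above can be sketched as follows. This is a minimal illustration under stated assumptions: pickTapeFile, crc32 and hashBuffer are hypothetical names, the CRC32 is the standard zip/IEEE variant computed with a lookup table, and storing it as an 8-digit hex string is a guess about the column format. It uses require so that it runs standalone; the .mjs script would use import.

```javascript
const crypto = require('node:crypto');
const path = require('node:path');

// Extension priority from the decisions list: .tap > .tzx > .pzx > .csw.
const TAPE_PRIORITY = ['.tap', '.tzx', '.pzx', '.csw'];

// Return the first file name matching the highest-priority extension, or null.
function pickTapeFile(fileNames) {
  for (const ext of TAPE_PRIORITY) {
    const match = fileNames.find((name) => path.extname(name).toLowerCase() === ext);
    if (match) return match;
  }
  return null;
}

// Lookup table for CRC-32 with the reflected polynomial 0xEDB88320.
const CRC_TABLE = new Uint32Array(256);
for (let n = 0; n < 256; n++) {
  let c = n;
  for (let k = 0; k < 8; k++) c = c & 1 ? 0xedb88320 ^ (c >>> 1) : c >>> 1;
  CRC_TABLE[n] = c >>> 0;
}

// Buffer-based CRC32; returns an unsigned 32-bit integer.
function crc32(buf) {
  let crc = 0xffffffff;
  for (const byte of buf) crc = CRC_TABLE[(crc ^ byte) & 0xff] ^ (crc >>> 8);
  return (crc ^ 0xffffffff) >>> 0;
}

// Compute the three values stored per tape file.
function hashBuffer(buf) {
  return {
    md5: crypto.createHash('md5').update(buf).digest('hex'),
    crc32: crc32(buf).toString(16).padStart(8, '0'), // hex format is an assumption
    size_bytes: buf.length,
  };
}

console.log(pickTapeFile(['readme.txt', 'Game.TZX', 'game.tap']));
console.log(hashBuffer(Buffer.from('123456789')));
```

Matching extensions case-insensitively is a choice made here because archive contents often mix case; the plan does not state it.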
Blockers
None currently.
Commits
b361201 - Ready to start adding hashes