Skip to content

Conversation

@prosdev
Copy link
Contributor

@prosdev prosdev commented Dec 7, 2025

Summary

Wire CLI extract command to storage layer, completing the core pipeline:

User Input → AI Extraction → SQLite Persistence

Changes

Storage Schema

  • pathpathHash (SHA256, unique, indexed) + filename (display)
  • No PII stored (usernames, absolute paths)
  • Upsert by pathHash: re-extracting same file updates, not duplicates

CLI

  • Add --dry-run flag to skip persistence
  • Print Saved: filename (ID: ...) on success
  • Add @doc-agent/storage dependency

Infrastructure

  • Lazy singleton for storage (no noise in tests/CI)
  • Add storage to TypeScript paths config

Testing

  • 35 tests, all passing
  • New tests for pathHash computation and upsert behavior

Closes

Closes #4

Wire CLI extract command to storage layer, completing the core pipeline:
User Input → AI Extraction → SQLite Persistence

Implementation:
- Add pathHash (SHA256) + filename columns, replacing raw path (PII-safe)
- Upsert on pathHash: re-extracting same file updates, not duplicates
- Add --dry-run flag to skip persistence
- Add @doc-agent/storage dependency to CLI
- Lazy singleton initialization (no noise in tests/CI)

Schema changes:
- path → pathHash (unique, indexed)
- path → filename (display only)
- hash → contentHash (renamed for clarity)

Security:
- No PII (usernames, paths) stored in database
- Added Security & Privacy section to AGENTS.md

Testing:
- 35 tests, all passing
- New tests for pathHash computation and upsert behavior

Closes #4
@prosdev prosdev merged commit 35eb09f into main Dec 7, 2025
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Integration: Persist Extraction Results

2 participants