feat(adr-102): add kg admin verify-backup — validate a backup without restoring#493

Merged
aaronsb merged 2 commits into main from feat/adr-102-backup-verify-command
Jun 2, 2026

@aaronsb aaronsb commented Jun 2, 2026

What

A first-class CLI command to validate a kg-backup/2 file without restoring it, backed by a new server-side endpoint. Implements the maintainer-chosen architecture (Option A): the validation rules live once, in Python (scripts/development/lint/lint_backup.py); the CLI does zero validation logic — it uploads the file and renders the server's report. No cross-language drift.

$ kg admin verify-backup backup_full_20260601_231935.tar.gz
🔎 Verify Backup
  Format:   kg-backup/2
  Contents: 3994 concepts, 313 sources, 7108 instances, 8705 relationships, 68 vocabulary
  ✓ Valid backup (0 warning(s), 0 notice(s))
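
Since the CLI only renders the server's report, the rendering step can be pictured roughly as below. This is a minimal Python sketch, not the project's CLI code (which is not shown in this PR). The report keys ok, format_version, errors, warnings, notices and issues come from the description; treating errors/warnings/notices as counts, the `statistics` key, and the per-issue fields `severity`, `code`, `path` and `message` are assumptions made for illustration.

```python
from typing import Tuple


def render_report(report: dict) -> Tuple[str, int]:
    """Turn a verify report into printable text plus a process exit code."""
    lines = [
        "Verify Backup",
        f"  Format:   {report.get('format_version', 'unknown')}",
    ]

    # ASSUMPTION: record counts arrive under a "statistics" mapping.
    stats = report.get("statistics") or {}
    if stats:
        contents = ", ".join(f"{count} {kind}" for kind, count in stats.items())
        lines.append(f"  Contents: {contents}")

    # ASSUMPTION: each issue carries severity, code, JSON-path location, message.
    for issue in report.get("issues", []):
        lines.append(
            f"  [{issue['severity']}] {issue['code']} at {issue['path']}: "
            f"{issue['message']}"
        )

    # ASSUMPTION: errors/warnings/notices are integer counts.
    errors = report.get("errors", 0)
    warnings = report.get("warnings", 0)
    notices = report.get("notices", 0)

    if report.get("ok"):
        lines.append(f"  ✓ Valid backup ({warnings} warning(s), {notices} notice(s))")
        return "\n".join(lines), 0

    lines.append(f"  ✗ Invalid backup ({errors} error(s), {warnings} warning(s))")
    return "\n".join(lines), 1  # nonzero exit on errors
```

The function returns the exit code instead of calling `sys.exit` so that the verdict logic stays testable; a caller would print the text and exit with the code.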

Pieces

| Layer | Change |
| --- | --- |
| Image | api/Dockerfile + Dockerfile.rocm-host: COPY scripts/development/lint/ into the image. The oracle was only present in dev via the repo mount; this ships it so the endpoint works in production. Oracle stays stdlib-only / standalone. |
| Adapter | api/app/lib/backup_oracle.py (new): loads lint_backup by path (cached importlib) and exposes validate_backup_object(), which returns {ok, format_version, errors, warnings, notices, issues}. |
| Route | POST /admin/backup/verify: accepts a .tar.gz/.json upload (same containers as /restore), runs the oracle + best-effort de-interned statistics, returns the report. Read-only: no graph access, nothing queued. |
| CLI | client.verifyBackup() + kg admin verify-backup [file] (positional path / --file from backup dir / interactive pick). Renders issues with codes + JSON-path locations, record counts, pass/fail verdict; exits nonzero on errors. |
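
The adapter's "load by path, cached" idea can be sketched as follows. This is an illustration under stated assumptions rather than the contents of backup_oracle.py: `importlib.util.spec_from_file_location` is the standard-library call named in the commit message, but the oracle's entry point (`lint`), the issue fields, the `format` key in the backup object, and the stand-in oracle source are all hypothetical.

```python
import importlib.util
import tempfile
from functools import lru_cache
from pathlib import Path
from types import ModuleType

# Hypothetical stand-in for lint_backup.py so the sketch runs on its own.
STAND_IN_ORACLE = '''
def lint(backup):
    issues = []
    if backup.get("format") != "kg-backup/2":
        issues.append({"severity": "error", "code": "E_EXAMPLE_FORMAT",
                       "path": "$.format", "message": "unsupported format"})
    return issues
'''


@lru_cache(maxsize=None)
def load_module_by_path(path: str, name: str = "lint_backup") -> ModuleType:
    """Import a standalone .py file by filesystem path; cached per argument set."""
    spec = importlib.util.spec_from_file_location(name, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load {name} from {path}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


def validate_backup_object(oracle: ModuleType, backup: dict) -> dict:
    """Wrap the oracle's findings in the report shape named in the PR."""
    # ASSUMPTION: the oracle exposes lint(backup) -> list of issue dicts.
    issues = oracle.lint(backup)

    def count(severity: str) -> int:
        return sum(1 for issue in issues if issue["severity"] == severity)

    errors = count("error")
    return {
        "ok": errors == 0,
        "format_version": backup.get("format"),  # ASSUMPTION: key name
        "errors": errors,
        "warnings": count("warning"),
        "notices": count("notice"),
        "issues": issues,
    }


def demo() -> dict:
    """Write the stand-in oracle to a temp dir, load it by path, validate once."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "lint_backup.py"
        path.write_text(STAND_IN_ORACLE)
        oracle = load_module_by_path(str(path))
        return validate_backup_object(oracle, {"format": "kg-backup/2"})
```

Loading by path keeps the oracle a plain script outside the API package, which fits the "stdlib-only / standalone" constraint: nothing in the image has to treat the lint directory as an importable package.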

Permissions

Gated by backups:read — admin-default (migration 028/037 grant it to admin + platform_admin), and an admin can grant backups:read to any other role to delegate verification. Verify is read-only and strictly less privileged than restore, so it intentionally does not reuse backups:restore. (If you'd prefer a distinct backups:verify action, that's a small migration — flagged, not done.)

Naming

verify-backup is a clean sibling of backup/restore/backups (which are flat under admin). Converting backup into a subcommand group would have risked the existing leaf command, so a sibling was the low-risk choice closest to the requested kg admin backup verify.

Tests

tests/api/test_backup_verify.py (5): valid → ok; dimension-mismatch surfaced (E_CONCEPT_EMBEDDING_DIM); bad extension → 400; invalid JSON → 400; legacy format refused (E_LOWER_MAJOR). Live end-to-end verified against a real 16 MB archive. CLI builds clean.

⚠️ Deployment note

The COPY scripts/development/lint/ line is new, so the API image must be rebuilt/republished for verify to work in production (./publish.sh images). Not published from here — needs maintainer approval.

Follow-on (parked)

"Pick one ontology to restore out of a full backup" — a filtered restore that builds on this inspect/verify flow. Deserves its own design/ADR.

🤖 Generated with Claude Code

aaronsb added 2 commits June 1, 2026 20:21
…ut restoring

Exposes the offline backup-object oracle as a first-class CLI command, backed by a
new server-side endpoint — single source of truth, no cross-language drift.

Architecture (per maintainer decision): the validation rules live once, in Python
(scripts/development/lint/lint_backup.py). The CLI does ZERO validation logic; it
uploads the file and renders the server's report.

- api/Dockerfile + Dockerfile.rocm-host: COPY scripts/development/lint/ into the API
  image. The oracle was only present in dev via the repo mount; this ships it so the
  endpoint works in production too. Oracle stays stdlib-only / standalone.
- api/app/lib/backup_oracle.py (new): thin adapter that loads lint_backup by path
  (importlib spec_from_file_location, cached) and exposes validate_backup_object()
  returning a JSON report {ok, format_version, errors, warnings, notices, issues}.
- POST /admin/backup/verify (new route): accepts a .tar.gz or .json upload (same
  containers as /restore), runs the oracle + best-effort de-interned statistics,
  returns the report. Read-only — no graph access, nothing queued. Gated by
  backups:read (admin-default; grant to another role to delegate verification).
- CLI: client.verifyBackup() + `kg admin verify-backup [file]` (positional path,
  --file from the backup dir, or interactive pick). Renders errors/warnings/notices
  with codes + JSON-path locations, record counts, and a pass/fail verdict; exits
  nonzero on errors.

Why a sibling command (verify-backup) rather than `backup verify`: backup/restore/
backups are flat siblings under admin; converting backup into a group risked the
existing leaf command, so verify-backup is a clean, discoverable sibling.

Tests: tests/api/test_backup_verify.py (5) — valid ok, dimension-mismatch surfaced,
bad extension 400, invalid JSON 400, legacy format refused. Live end-to-end verified
against a real 16 MB archive (3994 concepts) → "Valid backup".

NOTE: the API image must be rebuilt/republished for verify to work in production
(the COPY line is new). Do not publish without maintainer approval.
- Clean up the saved .tar.gz in the finally block: extraction failures (corrupt
  gzip / missing manifest) previously leaked the uploaded archive in /tmp — exactly
  the malformed-archive case verify exists to catch. (The same pre-existing pattern
  in /restore is left for a separate fix.)
- Guard file.filename None -> treat as bad extension (400) instead of AttributeError 500.
- Set external_deps=0 in the stats-fallback branch for response-shape symmetry.
- Dockerfiles: COPY just lint_backup.py (not the whole lint dir) to avoid shipping
  __pycache__ and unrelated lint scripts into the API image.

5/5 verify route tests still pass.
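
The two route fixes above (a None filename treated as a bad extension, and the saved archive removed in a finally block) might look roughly like this. It is a framework-free sketch, not the route's actual code: `BadUpload` stands in for an HTTP 400 response, and the `manifest.json` member name and the `tmp_dir` parameter are assumptions made for illustration.

```python
import json
import os
import tarfile
import tempfile
from typing import Optional

ALLOWED_SUFFIXES = (".tar.gz", ".json")


class BadUpload(Exception):
    """Stand-in for an HTTP 400 response in a real route."""


def load_backup_upload(
    filename: Optional[str], payload: bytes, tmp_dir: Optional[str] = None
) -> dict:
    """Parse an uploaded backup, never leaving the saved archive behind."""
    # A missing filename is rejected here instead of crashing on None.endswith().
    if not filename or not filename.endswith(ALLOWED_SUFFIXES):
        raise BadUpload("expected a .tar.gz or .json upload")

    if filename.endswith(".json"):
        try:
            return json.loads(payload)
        except ValueError as exc:  # covers JSONDecodeError and bad encodings
            raise BadUpload(f"invalid JSON: {exc}") from exc

    fd, saved = tempfile.mkstemp(suffix=".tar.gz", dir=tmp_dir)
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(payload)
        try:
            with tarfile.open(saved, "r:gz") as archive:
                # ASSUMPTION: the backup object lives in a member called manifest.json.
                member = archive.extractfile("manifest.json")
                if member is None:
                    raise BadUpload("manifest.json is not a regular file")
                return json.loads(member.read())
        except (tarfile.TarError, OSError, EOFError, KeyError, ValueError) as exc:
            # Corrupt gzip, missing manifest, or unparsable manifest.
            raise BadUpload(f"unreadable archive: {exc}") from exc
    finally:
        # Runs on success and on every failure path, so nothing leaks in /tmp.
        if os.path.exists(saved):
            os.remove(saved)
```

Putting the removal in `finally` rather than after a successful parse is the point of the fix: the failure paths are exactly the ones a verify endpoint is expected to hit.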
@aaronsb aaronsb merged commit c934873 into main Jun 2, 2026
4 checks passed
@aaronsb aaronsb deleted the feat/adr-102-backup-verify-command branch June 2, 2026 01:26