Merged
18 changes: 15 additions & 3 deletions api/app/lib/backup_integrity.py
@@ -9,9 +9,21 @@
- External dependencies (concept references not contained in this backup — partial backups)
- Statistics (record counts surfaced for logging)

-This is the *runtime* checker. The exhaustive *offline* oracle is
-``scripts/development/lint/lint_backup.py`` (used in tests/linting). Consolidating
-the two v2 validators is tracked for ADR-102 P6.
+This is the *runtime* checker, deliberately distinct from the offline oracle
+``scripts/development/lint/lint_backup.py``. They validate DIFFERENT layers and are
+intentionally NOT merged (evaluated and decided in ADR-102 P6 — do not "consolidate"
+them):
+- THIS module checks the DE-INTERNED logical graph via :class:`KgBackupV2Reader`
+(referential integrity, external deps, statistics) — the cheap runtime gate run
+before a backup is streamed / archived / restored.
+- ``lint_backup`` checks the RAW interned on-disk structure (dictionary index
+ranges, interning integrity, §6 derived-product exclusions, format negotiation)
+and carries NO api-package dependency by design (ADR-102 Track D), so the
+pytest oracle can load it standalone by file path.
+
+A shared core would force one layer's representation onto the other and couple the
+standalone oracle to the api package; the narrow overlap (reference / external-dep
+checks) does not justify the cost.
"""

from typing import Dict, List, Optional, Any, Set
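
The two layers named in the docstring above can be illustrated with a minimal sketch. Everything here is hypothetical: the function names, the list-of-strings dictionary, and the tuple-shaped edges are assumptions made for illustration, not the actual kg-backup/2 layout or the real `KgBackupV2Reader` API. The first function stands in for a raw-layer check (do interned indices resolve?), the second for a logical-layer check (do references point at concepts contained in this backup?).

```python
from typing import Dict, List, Set, Tuple


def check_interned_indices(dictionary: List[str],
                           edges: List[Tuple[int, int]]) -> List[str]:
    """Raw-layer check: every stored index must resolve inside the dictionary."""
    issues = []
    for pos, endpoints in enumerate(edges):
        for idx in endpoints:
            if not 0 <= idx < len(dictionary):
                issues.append(f"edge {pos}: index {idx} out of range")
    return issues


def check_logical_references(concept_ids: Set[str],
                             edges: List[Tuple[str, str]]) -> Dict[str, object]:
    """Logical-layer check: report referenced concepts missing from this backup."""
    # Any endpoint that is not a known concept is treated as an external dependency.
    external = {cid for edge in edges for cid in edge if cid not in concept_ids}
    return {
        "external_deps": sorted(external),
        "stats": {"concepts": len(concept_ids), "edges": len(edges)},
    }
```

The inputs differ in kind (integer indices into a dictionary versus already-resolved identifiers), which is one way to read the docstring's point that a shared core would force one layer's representation onto the other.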
13 changes: 12 additions & 1 deletion scripts/development/lint/lint_backup.py
@@ -16,6 +16,17 @@
wrapper that reads a path, calls the core, prints issues, and exits non-zero if
any ERROR was found (CI convention, matching the sibling lint tools).

+Relationship to the runtime checker (ADR-102 P6 — intentionally NOT merged):
+this is the *independent* verification tool — a deliberately standalone oracle
+with NO api-package dependency, which is what makes it a convenient CI / test /
+build gate and an outside check on the serializer. It validates the RAW interned
+on-disk structure (dictionary index resolution, interning integrity, §6 exclusions,
+format negotiation). Its runtime counterpart, ``api/app/lib/backup_integrity.py``,
+validates the DE-INTERNED logical graph via :class:`KgBackupV2Reader` as the cheap
+gate before stream/archive/restore. The two check different layers for different
+consumers; consolidating them would couple this oracle to the api package and
+strip its interning-layer coverage, so they are kept separate by design.
+
Checks implemented (see ``CHECK_CODES`` for the stable code registry):
- HEADER presence / well-formedness and format_version family negotiation (§7)
- kg-backup/2 required header fields (§3.1)
@@ -34,7 +45,7 @@
python3 scripts/development/lint/lint_backup.py <path-to-archive.tar.gz>
python3 scripts/development/lint/lint_backup.py --selftest

-@verified cffa180b
+@verified b832d59d
"""

import argparse
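
For readers unfamiliar with the "load it standalone by file path" pattern and the non-zero-exit CI convention described in these docstrings, here is a rough sketch using only the standard library. The helper names, the default module name, and the `(severity, message)` issue shape are assumptions for illustration; the real `lint_backup.py` interface may differ.

```python
import importlib.util
from types import ModuleType
from typing import List, Tuple


def load_module_from_path(path: str, name: str = "standalone_oracle") -> ModuleType:
    """Import a single .py file by path, without requiring its package on sys.path."""
    spec = importlib.util.spec_from_file_location(name, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load module from {path}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # runs the file's top-level code
    return module


def exit_code_for(issues: List[Tuple[str, str]]) -> int:
    """CI convention: non-zero exit status if any issue has severity ERROR."""
    return 1 if any(severity == "ERROR" for severity, _ in issues) else 0
```

Loading by path only works cleanly when the target file imports nothing from the surrounding application package, which is consistent with the stated reason for keeping the oracle free of any api-package dependency.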