refactor(adr-102): split serialization.py into a serialization/ package (P6d)#489

Merged: aaronsb merged 1 commit into main from chore/adr-102-p6d-split-serialization on Jun 1, 2026

@aaronsb (Owner) commented Jun 1, 2026

ADR-102 P6d — split the 1573-line serialization.py

Part of P6 (last ADR-102 spine phase). Follows P6a (#487) and P6b (#488).

api/lib/serialization.py was 1573 lines and repeatedly flagged by the >800-line
quality gate. It held four distinct responsibilities that map cleanly to modules,
so this splits it at the natural class seams into a package — zero caller changes.

Layout

| Module | Contents | Lines |
|---|---|---|
| serialization/format.py | BackupFormat + KgBackupV2Reader + KG_BACKUP_FORMAT_VERSION (schema probe + pure read path) | 196 |
| serialization/exporter.py | DataExporter + _parse_nullable_int (export path) | 703 |
| serialization/importer.py | DataImporter + _execute_with_age_retry/_progress/_run_parallel + epoch primitives (clone/merge writer) | 698 |
| serialization/__init__.py | re-exports the public surface | 19 |

Import-compat (the whole risk)

All existing from api.lib.serialization import ... callers keep working via the
__init__ re-export. Verified consumers: backup_archive, backup_streaming,
gexf_exporter, backup_integrity, restore_worker, restore_modes, plus 7 test
modules. The re-export covers every symbol any of them import
(DataExporter, KgBackupV2Reader, DataImporter, KG_BACKUP_FORMAT_VERSION; BackupFormat also exported).
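
A minimal sketch of the re-export pattern described above, not the actual `__init__.py` from this PR. It builds a throwaway package named `serialization_demo` (a hypothetical name) with stub classes in a temp directory, so the module bodies and the `"stub"` version value are assumptions; only the five public names come from the PR.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

# Stub sources: only the public names are taken from the PR description.
# The class bodies and the version value are placeholders.
STUBS = {
    "format.py": """
        KG_BACKUP_FORMAT_VERSION = "stub"
        class BackupFormat: ...
        class KgBackupV2Reader: ...
    """,
    "exporter.py": "class DataExporter: ...\n",
    "importer.py": "class DataImporter: ...\n",
    "__init__.py": """
        from .format import BackupFormat, KgBackupV2Reader, KG_BACKUP_FORMAT_VERSION
        from .exporter import DataExporter
        from .importer import DataImporter

        __all__ = [
            "BackupFormat", "KgBackupV2Reader", "KG_BACKUP_FORMAT_VERSION",
            "DataExporter", "DataImporter",
        ]
    """,
}


def build_demo_package(root: Path, name: str = "serialization_demo") -> None:
    """Write the stub modules into root/name."""
    pkg = root / name
    pkg.mkdir()
    for filename, source in STUBS.items():
        (pkg / filename).write_text(textwrap.dedent(source))


def load_demo_surface() -> dict[str, str]:
    """Import the stub package and report which module defines each public name."""
    with tempfile.TemporaryDirectory() as tmp:
        build_demo_package(Path(tmp))
        sys.path.insert(0, tmp)  # temporary, for the demo only
        try:
            importlib.invalidate_caches()
            pkg = importlib.import_module("serialization_demo")
            surface = {}
            for name in pkg.__all__:
                obj = getattr(pkg, name)
                # Classes remember their defining module; constants do not.
                surface[name] = obj.__module__ if isinstance(obj, type) else pkg.__name__
            return surface
        finally:
            sys.path.remove(tmp)


surface = load_demo_surface()
print(surface["DataImporter"])  # serialization_demo.importer
```

Callers import from the package name only, so they never see which submodule a class lives in; that is what lets the split land with zero caller changes.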

No cross-class references exist between the three modules (verified by grep), so no
circular imports. format.py imports nothing from exporter/importer.
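
The PR verified the absence of cross-module references with grep. As an alternative illustration, the same property could be checked structurally with `ast`; the sketch below is coarse (it matches on the last component of an imported module path and on imported names), and the two source strings are made-up inputs, not the real module contents.

```python
import ast


def sibling_imports(source: str, siblings: set[str]) -> set[str]:
    """Return the sibling module names that `source` imports."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ImportFrom):
            # `from .exporter import X` -> module == "exporter"
            tail = (node.module or "").rsplit(".", 1)[-1]
            if tail in siblings:
                found.add(tail)
            # `from . import exporter` -> the sibling appears as an imported name
            found.update(a.name for a in node.names if a.name in siblings)
        elif isinstance(node, ast.Import):
            # `import pkg.exporter` -> check the last dotted component
            for alias in node.names:
                tail = alias.name.rsplit(".", 1)[-1]
                if tail in siblings:
                    found.add(tail)
    return found


# Hypothetical inputs used only to exercise the check.
clean = "import json\nfrom typing import Any\n"
offending = "from .exporter import DataExporter\n"

print(sibling_imports(clean, {"exporter", "importer"}))      # set()
print(sibling_imports(offending, {"exporter", "importer"}))  # {'exporter'}
```

Running such a check over `format.py` with `{"exporter", "importer"}` as siblings would be expected to return an empty set if the claim above holds.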

Incidental cleanup

Dropped the vestigial sys.path.insert(0, str(Path(__file__).parent.parent)) hack
and the now-unused sys / Path / Colors imports. The hack put api/ on sys.path,
which does not even serve the absolute from api.app.lib.age_client import AGEClient
import (that resolves via the real PYTHONPATH). The test run confirmed that removing it is harmless.
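
To illustrate why the removed hack could not have served the absolute import, here is a small sketch using pure paths. It assumes the old file sat at `api/lib/serialization.py` as stated earlier in this description and uses a hypothetical checkout root of `/repo`; the helper `serves_top_level` is illustrative, not part of the codebase.

```python
from pathlib import PurePosixPath

# Hypothetical checkout root; the relative location comes from the PR text.
repo_root = PurePosixPath("/repo")
module_file = repo_root / "api" / "lib" / "serialization.py"

# What Path(__file__).parent.parent evaluated to inside the old module.
hack_entry = module_file.parent.parent
print(hack_entry)  # /repo/api


def serves_top_level(path_entry: PurePosixPath, package_dir: PurePosixPath) -> bool:
    """A sys.path entry resolves a top-level package only if it is that package's parent."""
    return path_entry == package_dir.parent


api_package = repo_root / "api"
print(serves_top_level(hack_entry, api_package))  # False
print(serves_top_level(repo_root, api_package))   # True
```

An import that starts with `api.` needs the directory containing `api/` on the path, which the hack never added.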

Why exporter/importer are still ~700 lines

Each is a single cohesive class (DataExporter / DataImporter). The three-class
boundary is the natural seam; splitting a single class further would be artificial.
Both are now under the 800-line priority threshold.
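
As a quick arithmetic check of the threshold claim, the sketch below compares the line counts reported in this PR against an 800-line limit. How the real quality gate counts lines is not described here, so `over_threshold` is only an assumed stand-in for it.

```python
PRIORITY_THRESHOLD = 800  # the ">800-line" gate mentioned in this PR


def over_threshold(line_counts: dict[str, int], threshold: int = PRIORITY_THRESHOLD) -> list[str]:
    """Return the files whose line count exceeds the threshold, sorted by name."""
    return sorted(name for name, count in line_counts.items() if count > threshold)


# Line counts as reported in the Layout section.
before = {"serialization.py": 1573}
after = {
    "serialization/format.py": 196,
    "serialization/exporter.py": 703,
    "serialization/importer.py": 698,
    "serialization/__init__.py": 19,
}

print(over_threshold(before))  # ['serialization.py']
print(over_threshold(after))   # []
```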

Tests

  • ADR-102 + backup slice (kg_backup_v2 / reader / id_remap / restore_modes / restore_worker_epoch / kg_backup_v2_restore / epoch_reconciliation / backup_integrity / backup_streaming) — 93 passed, 1 skipped
  • import api.app.main + public-surface hasattr assertions — OK
  • All 4 modules ast.parse clean
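
The last two checks above could look roughly like the following. This is a sketch of the idea, not the commands that were actually run: `missing_names` and `parses_cleanly` are hypothetical helpers, and a `SimpleNamespace` stands in for the real `api.lib.serialization` package so the example runs on its own.

```python
import ast
from types import SimpleNamespace

# Public names listed in the PR description.
EXPECTED = (
    "BackupFormat",
    "KgBackupV2Reader",
    "DataExporter",
    "DataImporter",
    "KG_BACKUP_FORMAT_VERSION",
)


def missing_names(module, expected=EXPECTED) -> list[str]:
    """Return the expected public names the module does not expose."""
    return [name for name in expected if not hasattr(module, name)]


def parses_cleanly(source: str) -> bool:
    """True if the source is syntactically valid Python."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


# Stand-in objects; the real check would pass the imported package instead.
complete = SimpleNamespace(**{name: object() for name in EXPECTED})
partial = SimpleNamespace(DataExporter=object())

print(missing_names(complete))              # []
print(len(missing_names(partial)))          # 4
print(parses_cleanly("class A: ...\n"))     # True
print(parses_cleanly("class A(:\n"))        # False
```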

🤖 Generated with Claude Code

…ge (P6d)

The 1573-line api/lib/serialization.py held four distinct responsibilities and
was flagged repeatedly by the >800-line quality gate. Split at the natural class
seams into a package with zero caller changes:

- serialization/format.py   — BackupFormat + KgBackupV2Reader + KG_BACKUP_FORMAT_VERSION
                              (schema probe + pure read path; 196 lines)
- serialization/exporter.py — DataExporter + _parse_nullable_int (export path; 703 lines)
- serialization/importer.py — DataImporter + _execute_with_age_retry/_progress/_run_parallel
                              + epoch primitives (clone/merge writer; 698 lines)
- serialization/__init__.py — re-exports the public surface (BackupFormat, KgBackupV2Reader,
                              DataExporter, DataImporter, KG_BACKUP_FORMAT_VERSION)

All existing 'from api.lib.serialization import ...' callers (8 modules + tests)
keep working unchanged via the __init__ re-export. No cross-class references
between the three modules, so no circular imports.

Incidental cleanup: dropped the vestigial 'sys.path.insert(...)' hack and the
now-unused sys/Path/Colors imports — the absolute 'from api.app.lib.age_client
import AGEClient' resolves via the real PYTHONPATH, not the hack (which inserted
the wrong dir anyway). Behavior-preserving: full ADR-102 + backup test slice
(93 passed, 1 skipped) green; app import + public-surface assertions pass.

exporter/importer remain ~700 lines each but are single cohesive classes — the
three-class seam is the natural split; no further non-artificial seam exists.
@aaronsb aaronsb merged commit b832d59 into main Jun 1, 2026
4 checks passed
@aaronsb aaronsb deleted the chore/adr-102-p6d-split-serialization branch June 1, 2026 22:52