refactor(adr-102): split serialization.py into a serialization/ package (P6d) #489
Merged
The 1573-line api/lib/serialization.py held four distinct responsibilities and
was flagged repeatedly by the >800-line quality gate. Split at the natural class
seams into a package with zero caller changes:
- serialization/format.py — BackupFormat + KgBackupV2Reader + KG_BACKUP_FORMAT_VERSION
(schema probe + pure read path; 196 lines)
- serialization/exporter.py — DataExporter + _parse_nullable_int (export path; 703 lines)
- serialization/importer.py — DataImporter + _execute_with_age_retry/_progress/_run_parallel
+ epoch primitives (clone/merge writer; 698 lines)
- serialization/__init__.py — re-exports the public surface (BackupFormat, KgBackupV2Reader,
DataExporter, DataImporter, KG_BACKUP_FORMAT_VERSION)
All existing 'from api.lib.serialization import ...' callers (8 modules + tests)
keep working unchanged via the __init__ re-export. No cross-class references
between the three modules, so no circular imports.
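A minimal sketch of the re-export pattern described above, built as a throwaway package in a temporary directory so it can run anywhere. The package name demo_serialization, the stub class bodies, and the placeholder version value are assumptions for illustration; only the module and symbol names come from this change.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Stub module bodies. The real classes are far larger; the version value
# below is a placeholder, not the project's actual constant.
FILES = {
    "format.py": (
        "KG_BACKUP_FORMAT_VERSION = 'placeholder'\n"
        "class BackupFormat: ...\n"
        "class KgBackupV2Reader: ...\n"
    ),
    "exporter.py": "class DataExporter: ...\n",
    "importer.py": "class DataImporter: ...\n",
    # The package __init__ re-exports the public surface so that
    # 'from <package> import DataExporter' keeps working after the split.
    "__init__.py": (
        "from .format import BackupFormat, KgBackupV2Reader, KG_BACKUP_FORMAT_VERSION\n"
        "from .exporter import DataExporter\n"
        "from .importer import DataImporter\n"
        "__all__ = ['BackupFormat', 'KgBackupV2Reader', 'DataExporter',\n"
        "           'DataImporter', 'KG_BACKUP_FORMAT_VERSION']\n"
    ),
}

PUBLIC = ["BackupFormat", "KgBackupV2Reader", "DataExporter",
          "DataImporter", "KG_BACKUP_FORMAT_VERSION"]


def check_reexports() -> list:
    """Build the demo package, import it, and return any missing public names."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg_dir = Path(tmp) / "demo_serialization"
        pkg_dir.mkdir()
        for name, body in FILES.items():
            (pkg_dir / name).write_text(body, encoding="utf-8")
        sys.path.insert(0, tmp)  # demo only: make the temp package importable
        try:
            pkg = importlib.import_module("demo_serialization")
            return [name for name in PUBLIC if not hasattr(pkg, name)]
        finally:
            sys.path.remove(tmp)
            # Drop cached modules so the demo leaves no trace behind.
            for mod in [m for m in sys.modules if m.startswith("demo_serialization")]:
                del sys.modules[mod]


missing = check_reexports()
print(missing)  # [] when every public name is re-exported
```

The same hasattr loop, pointed at the real package instead of the demo one, is one way to express the public-surface assertions mentioned in this change.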
Incidental cleanup: dropped the vestigial 'sys.path.insert(...)' hack and the
now-unused sys/Path/Colors imports — the absolute 'from api.app.lib.age_client
import AGEClient' resolves via the real PYTHONPATH, not the hack (which inserted
the wrong dir anyway). Behavior-preserving: full ADR-102 + backup test slice
(93 passed, 1 skipped) green; app import + public-surface assertions pass.
exporter/importer remain ~700 lines each but are single cohesive classes — the
three-class seam is the natural split; no further non-artificial seam exists.
ADR-102 P6d: split the 1573-line serialization.py
Part of P6 (last ADR-102 spine phase). Follows P6a (#487) and P6b (#488).
api/lib/serialization.py was 1573 lines and repeatedly flagged by the >800-line quality gate. It held four distinct responsibilities that map cleanly to modules, so this splits it at the natural class seams into a package, with zero caller changes.
Layout
- serialization/format.py: BackupFormat + KgBackupV2Reader + KG_BACKUP_FORMAT_VERSION (schema probe + pure read path)
- serialization/exporter.py: DataExporter + _parse_nullable_int (export path)
- serialization/importer.py: DataImporter + _execute_with_age_retry/_progress/_run_parallel + epoch primitives (clone/merge writer)
- serialization/__init__.py: re-exports the public surface
Import-compat (the whole risk)
All existing 'from api.lib.serialization import ...' callers keep working via the __init__ re-export. Verified consumers: backup_archive, backup_streaming, gexf_exporter, backup_integrity, restore_worker, restore_modes, plus 7 test modules. The re-export covers every symbol any of them imports (DataExporter, KgBackupV2Reader, DataImporter, KG_BACKUP_FORMAT_VERSION; BackupFormat is also exported).
No cross-class references exist between the three modules (verified by grep), so there are no circular imports. format.py imports nothing from exporter/importer.
Incidental cleanup
Dropped the vestigial 'sys.path.insert(0, str(Path(__file__).parent.parent))' hack and the now-unused sys/Path/Colors imports. The hack inserted api/, which doesn't even serve the absolute 'from api.app.lib.age_client import AGEClient' import (that resolves via the real PYTHONPATH). Confirmed harmless by the test run.
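To make the "wrong dir" point concrete, here is a small sketch of which directory the removed expression evaluates to, compared with the directory an absolute 'api....' import needs on the path. The checkout root /repo is hypothetical; only the relative layout api/lib/serialization.py comes from the text above.

```python
from pathlib import PurePosixPath

# Hypothetical checkout location; only the relative layout is from the PR text.
module_file = PurePosixPath("/repo/api/lib/serialization.py")

# What the removed hack put on sys.path: Path(__file__).parent.parent
inserted_dir = module_file.parent.parent

# What an absolute 'import api...' needs on sys.path:
# the directory that CONTAINS the top-level 'api' package.
needed_dir = PurePosixPath("/repo")

print(inserted_dir)                # /repo/api
print(inserted_dir == needed_dir)  # False
```

With api/ on the path, Python would look for api/api/..., so the hack could not have been what made the absolute import resolve.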
Why exporter/importer are still ~700 lines
Each is a single cohesive class (DataExporter/DataImporter). The three-class boundary is the natural seam; splitting a single class further would be artificial. Both are now under the 800-line priority threshold.
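For illustration, a line-count check of this kind could look like the sketch below. The project's actual quality gate is not shown in this PR, so the function name flag_long_files and the way files are counted are assumptions; the 800-line threshold and the four line counts are taken from the text.

```python
import tempfile
from pathlib import Path

THRESHOLD = 800  # threshold quoted in the PR; the real gate may count differently


def count_lines(path) -> int:
    """Count physical lines in a text file."""
    with open(path, encoding="utf-8") as fh:
        return sum(1 for _ in fh)


def flag_long_files(paths, threshold: int = THRESHOLD) -> dict:
    """Return {file name: line count} for files longer than the threshold."""
    flagged = {}
    for path in paths:
        lines = count_lines(path)
        if lines > threshold:
            flagged[Path(path).name] = lines
    return flagged


# Synthetic files with the line counts reported in this change.
SIZES = {
    "serialization.py": 1573,  # before the split
    "format.py": 196,
    "exporter.py": 703,
    "importer.py": 698,
}

with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for name, n_lines in SIZES.items():
        path = Path(tmp) / name
        path.write_text("pass\n" * n_lines, encoding="utf-8")
        paths.append(path)
    flagged = flag_long_files(paths)

print(flagged)  # {'serialization.py': 1573}
```

Only the original monolith exceeds the threshold; the three new modules all fall below it.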
Tests
- import api.app.main + public-surface hasattr assertions: OK
- ast.parse clean
🤖 Generated with Claude Code