[#18] xbrain diff between snapshots #28
Merged
Adds `xbrain diff <snapshot-a> [snapshot-b]` (B defaults to live `data/`). Reports:

- primary_topic reassignments with top transitions
- per-topic membership shifts with growth flags
- topic-overview drift via pure-Python TF cosine similarity
- vocab slug changes

Output format is selected with `--format text|json`.

The diff module is pure I/O: `diff_snapshots(a_dir, b_dir) -> DiffReport`. Snapshot resolution stays in the CLI; the module diffs two data-shaped directories, so the live `data/` is a first-class B side without a special case.

No new dependencies: TF cosine fits in ~40 lines over collections.Counter. Embeddings / LLM-judged similarity are explicit follow-ups gated on WS3 (#8). Rename detection is out of scope for v1.

27 new tests (unit + CLI integration); full suite 326/326. Quality gate all-green: ruff, ruff format, mypy, bandit, vulture, interrogate, deptry, detect-secrets, pytest, 89% coverage.

Refs: #18
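For readers unfamiliar with the technique, here is a minimal sketch of plain term-frequency cosine similarity over `collections.Counter`. It is not the code from `diff.py`: the tokenizer, the function names (`tokenize`, `tf_cosine`, `classify`) and the `similar_threshold` value are all assumptions made for illustration, and mapping the score to the four labels named in the Summary is a guess at how they might be derived.

```python
import math
import re
from collections import Counter
from typing import Optional

# Assumption: a simple lowercase word tokenizer; the real module may differ.
_TOKEN_RE = re.compile(r"\w+")


def tokenize(text: str) -> list[str]:
    """Split text into lowercase word tokens."""
    return _TOKEN_RE.findall(text.lower())


def tf_cosine(a: str, b: str) -> Optional[float]:
    """Cosine similarity of raw term-frequency vectors.

    Returns None when either text has no tokens, since the cosine is
    undefined for a zero vector.
    """
    tf_a, tf_b = Counter(tokenize(a)), Counter(tokenize(b))
    if not tf_a or not tf_b:
        return None
    # Counter returns 0 for missing keys, so only shared terms contribute.
    dot = sum(count * tf_b[term] for term, count in tf_a.items())
    norm_a = math.sqrt(sum(c * c for c in tf_a.values()))
    norm_b = math.sqrt(sum(c * c for c in tf_b.values()))
    return dot / (norm_a * norm_b)


def classify(a: str, b: str, similar_threshold: float = 0.8) -> str:
    """Map a TF cosine score to a label (threshold is hypothetical)."""
    score = tf_cosine(a, b)
    if score is None:
        return "not_comparable"
    # Note: a score of 1.0 means proportional term counts, not equal text.
    if math.isclose(score, 1.0):
        return "identical"
    return "similar" if score >= similar_threshold else "different"
```

With only two documents there is no useful corpus to compute IDF over, which is why a sketch like this stays at raw term counts.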
Addresses every HIGH/MEDIUM finding from the 6-reviewer panel on PR #28 (code-reviewer + python-code-reviewer + spec-compliance + test-analyzer APPROVED; silent-failure-hunter and simplifier flagged actionable items):

silent-failure-hunter MEDIUM #1: silent empty-diff on missing dirs.
- `diff_snapshots` now validates both directories exist on disk; raises FileNotFoundError with the missing path if not.
- Validates that at least one artifact exists on either side; if both are fully empty, raises FileNotFoundError naming both dirs (guards against the "data/ deleted out-of-band" scenario where diff would otherwise silently report 'everything was removed').

silent-failure-hunter MEDIUM #2: corrupt-file errors lacked context.
- Each loader call inside `diff_snapshots` is wrapped to add the path to the ValueError message ("failed to load <path>: <orig msg>"), so a malformed items.json / vocab.yaml / topics.json surfaces with the specific file rather than a bare pydantic / json traceback.

simplifier #1: `_tfidf_cosine` renamed to `_tf_cosine`.
- The function uses plain TF cosine, not TF-IDF (with only 2 documents, IDF degenerates). Docstring already explained this; renaming the symbol stops the name from lying. Module docstring + every call site + import updated.

simplifier #2: `VocabDiff.unchanged: list[str]` was only ever consumed as `len(...)`. Replaced with `unchanged_count: int`. JSON output is slightly smaller on large vocabs and the data shape stops promising information the consumers don't read.

spec-compliance follow-up: `diff.py` added to the "Where things live" tree in ARCHITECTURE.md.

Tests:
- Three new tests for the validation: missing dir → FileNotFoundError, both-empty → FileNotFoundError, corrupt items.json → ValueError with path in message.
- Existing tests updated for the `unchanged_count` rename.
Skipped (out of scope for this round, documented in task #88):
- test-analyzer polish (French tokenizer test, JSON schema-stability deeper assertions, secondary-topic-no-reassign test): improvements, not blockers.
- simplifier #3, #4 (drop TopicChange.unchanged, drop DiffSummary): borderline derivability vs. JSON consumer ergonomics.
- simplifier #5, #6 (Literal["text","json"] dispatch, drop diff_snapshots threshold kwargs): internal-only style.
- code-reviewer/python-code-reviewer naming nit (`reassigned_pct` carrying a fraction): purely cosmetic, internal consistency intact.

Total: 329 tests (up from 326), coverage 89%, `uv run poe check` all-green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
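The two silent-failure fixes in this commit (directory validation and path-annotated load errors) could look roughly like the sketch below. It is an illustration under assumptions, not the code in `diff.py`: `validate_dirs`, `load_with_context`, `load_json` and the `ARTIFACTS` tuple are hypothetical names, and the assumption that artifacts sit directly inside each directory is mine. Only the file names, the exception types and the "failed to load <path>: <orig msg>" message format come from the commit message.

```python
import json
from pathlib import Path
from typing import Any, Callable

# File names are taken from the commit message; the flat layout is assumed.
ARTIFACTS = ("items.json", "vocab.yaml", "topics.json")


def validate_dirs(a_dir: Path, b_dir: Path) -> None:
    """Fail loudly instead of producing a silently empty diff."""
    for directory in (a_dir, b_dir):
        if not directory.is_dir():
            raise FileNotFoundError(f"snapshot directory not found: {directory}")
    has_artifact = any(
        (directory / name).exists()
        for directory in (a_dir, b_dir)
        for name in ARTIFACTS
    )
    if not has_artifact:
        # Both sides empty: refuse rather than report "everything was removed".
        raise FileNotFoundError(f"no artifacts found in {a_dir} or {b_dir}")


def load_with_context(path: Path, loader: Callable[[Path], Any]) -> Any:
    """Run a loader and re-raise ValueError with the offending path attached."""
    try:
        return loader(path)
    except ValueError as exc:  # json.JSONDecodeError is a ValueError subclass
        raise ValueError(f"failed to load {path}: {exc}") from exc


def load_json(path: Path) -> Any:
    """Hypothetical loader used only to exercise the wrapper."""
    return json.loads(path.read_text(encoding="utf-8"))
```

Chaining with `from exc` keeps the original traceback available while the message itself names the file.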
This was referenced May 22, 2026
Summary
Adds `xbrain diff <snapshot-a> [snapshot-b]`: the missing tool to answer "what changed between two states of `data/`?" after a re-enrichment or `vocab rebuild`. Builds on the snapshot system from #17.

- `snapshot-b` defaults to the live `data/`, so the common case is one short command: `xbrain diff <pre-snapshot>` after a destructive run.
- … identical / similar / different / not_comparable.
- `--format json` emits the pydantic model as JSON; consumers can `jq` or build CI thresholds.

What ships
- `src/xbrain/diff.py`: pure I/O, no CLI side-effects. Pydantic models (`DiffReport`, `ItemsDiff`, `TopicChange`, etc.), pure-Python TF cosine helper, text + JSON renderers.
- … `diff` in `src/xbrain/cli.py`.
- `tests/test_diff.py`: 27 tests (unit module + CLI integration via `CliRunner`).
- … `Commands` table row and a `Snapshots & safety` paragraph + example invocations.

Spec deviations
None of substance. Implementation matches the PRD on every acceptance criterion.
- … scikit-learn, no embeddings, no API call). Decision documented in PRD §6. Embeddings / LLM-judged similarity remain follow-ups gated on WS3 (enrichment evaluation harness, #8).
- `--judge` flag: intentionally not stubbed; a stub would add dead code that would have to be redesigned once WS3 (#8) lands.

Test plan
- `tests/test_diff.py`: 27 new tests, all green.
- `uv run poe check`: all-green:
  - … `validate_judgment`, `_render_index`, `_archive_tweet_to_item`); every new function in `diff.py` is A/B
  - … `diff.py` at 95%)
- `uv run xbrain diff --help` renders correctly.
- Manual round-trip with two snapshots produces the expected report.

Links
- zz-support-files/docs/prds/2026-05-22-xbrain-18-diff-snapshots.md
- zz-support-files/docs/implementation-plans/2026-05-22-xbrain-18-diff-snapshots.md
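As an illustration of the "build CI thresholds" use of `--format json` mentioned in the Summary, a consumer might parse the report and fail a job when too many items were reassigned. This is a sketch only: the JSON shape is not documented in this PR, so the `items.reassigned_pct` key path is an assumption (the field name `reassigned_pct`, holding a fraction, is the only part taken from the review notes), and `check_report` and its default threshold are hypothetical.

```python
import json


def check_report(report_json: str, max_reassigned: float = 0.25) -> list[str]:
    """Return human-readable threshold violations for a diff report.

    Assumption: the report exposes the reassigned fraction at
    report["items"]["reassigned_pct"]; adjust to the real schema.
    """
    report = json.loads(report_json)
    failures: list[str] = []
    fraction = report.get("items", {}).get("reassigned_pct")
    if fraction is not None and fraction > max_reassigned:
        failures.append(
            f"reassigned fraction {fraction:.2f} exceeds limit {max_reassigned:.2f}"
        )
    return failures
```

A CI step could feed it the output of `xbrain diff <pre-snapshot> --format json` and exit non-zero when the returned list is non-empty.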