Skip to content

Wave: #19 transient retry + #13 hashtags + #18 xbrain diff + #20 tagged unions#30

Merged
VGonPa merged 12 commits into
mainfrom
develop
May 22, 2026
Merged

Wave: #19 transient retry + #13 hashtags + #18 xbrain diff + #20 tagged unions#30
VGonPa merged 12 commits into
mainfrom
develop

Conversation

@VGonPa
Copy link
Copy Markdown
Owner

@VGonPa VGonPa commented May 22, 2026

Promotes four completed issues from develop to main.

Issues included

Combined stats

  • Tests: 245 (pre-wave) → 348 (now). +103.
  • Coverage: 87% → 89%.
  • uv run poe check all-green on develop tip.

Reviewer panels

Each PR ran a sized review pipeline (3-6 reviewers). HIGH/CRITICAL findings on PR #29's tagged-unions refactor (silent regression of #19's transient-retry semantics for uncategorised failures) were caught and fixed before merge — new unknown_error reason added as a transient bucket so an uncaught extractor exception still self-heals.

Test plan

🤖 Generated with Claude Code

VGonPa and others added 12 commits May 22, 2026 18:43
Today `xbrain fetch` skips every item whose `content` field is non-empty,
regardless of whether the recorded sources succeeded. To retry a single
network blip the user has to use `--force`, which re-fetches *everything*
(wasteful, rate-limit risk).

This change adds an automatic retry path for items whose only recorded
failures are TRANSIENT (timeout, dns_error). Items with success or with
TERMINAL failures (not_found, paywall, forbidden, js_required, empty_content)
stay skipped until `--force` is set.

Implementation:
- New constant `_TRANSIENT_FAILURES: frozenset[FailureReason] =
  frozenset({"timeout", "dns_error"})` in `xbrain.fetch`.
- New pure helper `_should_refetch(content, force) -> bool` that
  encapsulates the skip decision (truth-table-driven).
- One-line wiring in `fetch_pending`: the `if item.content is not None
  and not force: continue` line is replaced with
  `if not _should_refetch(item.content, force): continue`.

Scope:
- Only `xbrain.fetch` (external_article links). `fetch_x.py` (x.com
  articles + threads) has a different failure-reason set and its own
  skip logic — out of scope here, separate issue if needed.
- No new CLI flag — the new behaviour falls out of the existing
  `xbrain fetch` invocation.
- `--force` semantics UNCHANGED (still refetches everything).

Spec decisions documented in the PRD:
- `dns_error` classified as transient (DNS provider blips happen; dead
  domains will cost one extra HTTP probe per run — acceptable).
- `empty_content` classified as terminal (Trafilatura returned no body;
  if extraction improves, the user can `--force`).

Tests (18 new):
- 13 truth-table tests on `_should_refetch` covering every cell from the
  PRD §5 matrix (content=None, force ON/OFF, ok, transient, terminal,
  mixed transient+terminal, mixed transient+success, only-x sources).
- 5 integration tests on `fetch_pending`:
  - timeout item retried without `--force`.
  - not_found item skipped without `--force`.
  - not_found item retried with `--force`.
  - successful fetch skipped without `--force`.
  - timeout item outside `since`/`until` still skipped (filter precedence).

Total: 296 tests (up from 278), coverage 88%, `uv run poe check`
all-green.

Docs:
- README Commands table: `fetch` row gains a sentence describing the
  transient/terminal distinction.
- ARCHITECTURE step-by-step `fetch` card: new "Transient retries" bullet.

PRD:  vault/zz-support-files/docs/prds/2026-05-22-xbrain-19-retry-transient-fetches.md
Plan: vault/zz-support-files/docs/implementation-plans/2026-05-22-xbrain-19-retry-transient-fetches.md

Closes #19.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Addresses every HIGH/MEDIUM finding from the 4-reviewer panel on PR #26
(code-reviewer + spec-compliance APPROVED; silent-failure-hunter and
pr-test-analyzer flagged the same issue plus minor cosmetics):

silent-failure-hunter HIGH (also pr-test-analyzer M1 + code-reviewer nit):
- ContentSource(ok=False, failure_reason=None) was silently classified
  as terminal. Pre-Fase-2 records (where failure_reason defaulted None)
  and any future code path that records a failure without categorising
  it (e.g. an uncaught extractor exception via `error="..."`) would stay
  invisibly stuck — exactly the silent-failure shape the spec invariant
  tries to prevent.
- Fix: treat failure_reason=None as transient. The predicate becomes
  `failure_reason is None or failure_reason in _TRANSIENT_FAILURES`.
  Rationale: an `ok=False` without a categorised reason is anomalous;
  re-fetching gives it a chance to land on a categorised result rather
  than getting permanently stuck.
- Docstring updated to explain the policy.

pr-test-analyzer M2: missing test for mixed external_article + x_article.
- Added `test_should_refetch_external_transient_alongside_xcom_source`:
  a transient-failed external_article + an x_article (any state) must
  retry on the external (x.com is fetch_x's job, must not block).

pr-test-analyzer M3: missing test for "re-fetch replaces, not appends".
- Added `test_fetch_pending_replaces_sources_does_not_append`: a
  transient-failed item with 1 source stays at 1 source after retry,
  not 2.

Also added `test_should_refetch_uncategorized_failure_treated_as_transient`
to pin the new failure_reason=None behaviour explicitly.

Cosmetic cleanup (pr-test-analyzer + code-reviewer):
- Hoisted `from xbrain.models import Content, ContentSource` and
  `from xbrain.fetch import _should_refetch` to the file top — dropped
  4 inline imports inside test bodies + 1 redundant local wrapper.

spec-compliance noted (NO change needed): the x-only + force case where
the helper returns True but fetch_pending filters x-only items before
consulting _should_refetch. User-visible behaviour matches the spec;
helper alone returning True is a curiosity, not a bug.

code-reviewer APPROVED with no blockers; radon B(8) on _should_refetch
is unchanged from the helper's original shape.

Total: 299 tests (up from 296), coverage 88% unchanged, all-green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
[#19] Retry transiently-failed fetches without --force
Add `[output] topic_style = "wikilink" | "hashtag"` config flag (default
`wikilink`, backwards-compatible). When set to `hashtag`, the in-body
`**Topics:**` line is emitted as `#ai-coding #software-engineering`
instead of `[[ai-coding]] · [[software-engineering]]` — so the line is a
tappable Obsidian tag in addition to the frontmatter `tags:` entries.

Scope is the in-body line only. Frontmatter `tags:`, `_index.md`'s `##
Topics` ranking and topic-page post lists keep wikilinks — those are
navigational by design.

The flag is validated at config-load time and re-validated at the
`generate()` boundary so programmatic callers (tests, notebooks) fail
fast on a typo.

PRD: docs/prds/2026-05-22-xbrain-13-hashtag-topics.md (in vault)
Plan: docs/implementation-plans/2026-05-22-xbrain-13-hashtag-topics.md

Tests:
- 4 new in test_generate.py: wikilink default, hashtag mode, hashtag
  mode does not bleed into the index, unknown style raises.
- 3 new in test_config.py: defaults to wikilink, hashtag round-trips,
  unknown style rejected.

Quality gate: all critical checks pass · 306 tests · 88% coverage.
Adds `xbrain diff <snapshot-a> [snapshot-b]` (B defaults to live `data/`).
Reports primary_topic reassignments with top transitions, per-topic
membership shifts with growth flags, topic-overview drift via pure-Python
TF cosine similarity, and vocab slug changes. `--format text|json`.

The diff module is pure I/O: `diff_snapshots(a_dir, b_dir) -> DiffReport`.
Snapshot resolution stays in the CLI; the module diffs two data-shaped
directories, so the live `data/` is a first-class B side without a special
case.

No new dependencies — TF cosine fits in ~40 lines over collections.Counter.
Embeddings / LLM-judged similarity are explicit follow-ups gated on WS3 (#8).
Rename detection is out of scope for v1.

27 new tests (unit + CLI integration); full suite 326/326. Quality gate
all-green: ruff, ruff format, mypy, bandit, vulture, interrogate, deptry,
detect-secrets, pytest, 89% coverage.

Refs: #18
[#13] Optional: render topics as Obsidian #hashtags
Addresses every HIGH/MEDIUM finding from the 6-reviewer panel on PR #28
(code-reviewer + python-code-reviewer + spec-compliance + test-analyzer
APPROVED; silent-failure-hunter and simplifier flagged actionable items):

silent-failure-hunter MEDIUM #1: silent empty-diff on missing dirs.
- `diff_snapshots` now validates both directories exist on disk; raises
  FileNotFoundError with the missing path if not.
- Validates that at least one artifact exists on either side; if both are
  fully empty, raises FileNotFoundError naming both dirs (guards against
  the "data/ deleted out-of-band" scenario where diff would otherwise
  silently report 'everything was removed').

silent-failure-hunter MEDIUM #2: corrupt-file errors lacked context.
- Each loader call inside `diff_snapshots` is wrapped to add the path to
  the ValueError message ("failed to load <path>: <orig msg>"), so a
  malformed items.json / vocab.yaml / topics.json surfaces with the
  specific file rather than a bare pydantic / json traceback.

simplifier #1: `_tfidf_cosine` renamed to `_tf_cosine`.
- The function uses plain TF cosine, not TF-IDF (with only 2 documents,
  IDF degenerates). Docstring already explained this; renaming the
  symbol stops the name from lying. Module docstring + every call site
  + import updated.

simplifier #2: `VocabDiff.unchanged: list[str]` was only ever consumed
as `len(...)`. Replaced with `unchanged_count: int`. JSON output is
slightly smaller on large vocabs and the data shape stops promising
information the consumers don't read.

spec-compliance follow-up: `diff.py` added to the "Where things live"
tree in ARCHITECTURE.md.

Tests:
- Three new tests for the validation: missing dir → FileNotFoundError,
  both-empty → FileNotFoundError, corrupt items.json → ValueError with
  path in message.
- Existing tests updated for the `unchanged_count` rename.

Skipped (out of scope for this round, documented in task #88):
- test-analyzer polish (French tokenizer test, JSON schema-stability
  deeper assertions, secondary-topic-no-reassign test) — improvements
  not blockers.
- simplifier #3, #4 (drop TopicChange.unchanged, drop DiffSummary) —
  borderline derivability vs. JSON consumer ergonomics.
- simplifier #5, #6 (Literal["text","json"] dispatch, drop
  diff_snapshots threshold kwargs) — internal-only style.
- code-reviewer/python-code-reviewer naming nit (`reassigned_pct`
  carrying a fraction) — purely cosmetic, internal consistency intact.

Total: 329 tests (up from 326), coverage 89%, `uv run poe check`
all-green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
[#18] xbrain diff between snapshots
Split ContentSource into ContentSourceSuccess and ContentSourceFailure
with an `outcome` discriminator. Same for the internal FetchResult,
now FetchSuccess | FetchFailure. The type system enforces that
success-only fields (text, title) and failure-only fields
(failure_reason, error) cannot coexist — illegal states are
unrepresentable.

Backward-compatible deserialisation: legacy items.json records
(pre-#20, with `ok: bool`) are normalised on read by a BeforeValidator
on the union. Uncategorised legacy failures (failure_reason=None)
bucket to "timeout" so #19's auto-retry gives them one chance to
land on a proper category.

mypy now catches missing required fields and reading the wrong
field off a narrowed variant — verified by a probe at
tests/type_probes/illegal_states.py and a regression test
that runs mypy on it.

Quality gate: 288 tests, 88% coverage, all critical checks pass.
Addresses the HIGH findings from the silent-failure-hunter on PR #29
(silent regression of #19's invariant). The other 5 reviewers approved
or flagged the same area as "doc-only"; this commit closes the real
behavioural gap.

The bug: `_content_source_from` and the bare-except in `fetch_item`
bucketed an uncategorised failure to `"empty_content"` (a TERMINAL
failure_reason). #19 guaranteed that a failure without a categorised
reason was retry-worthy ("anomalous failures get one auto-retry");
post-#20 those records became permanently stuck without `--force`. A
silent regression: tests pass, the wiki renders, but real-world
uncaught extractor exceptions self-heal no more.

Fix: new `FailureReason = "unknown_error"` Literal value, added to
`_TRANSIENT_FAILURES`. The two collapse sites and the legacy migrator
all bucket uncategorised failures here.

Changes:
- `src/xbrain/models.py`: `FailureReason` gains `"unknown_error"`.
  `_normalise_legacy_content_source` now buckets legacy `ok=False,
  failure_reason=None` records to `"unknown_error"` (was `"timeout"` —
  misleading because the actual cause was unknown, not a timeout) and
  emits a `logger.warning` per migrated record (silent-failure HIGH 2:
  the legacy bucketing was invisible). The "missing both discriminators"
  ValueError now includes the offending URL (silent-failure HIGH 1
  partial fix: better error context; per-item resilience is a separate
  larger change deferred to a follow-up issue).
- `src/xbrain/fetch.py`: `_TRANSIENT_FAILURES` gains `"unknown_error"`;
  the bare-except in `fetch_item` and the `_content_source_from`
  fallback both target `"unknown_error"` instead of `"empty_content"`;
  `FetchFailure` docstring updated.
- `src/xbrain/generate.py`: `_FAILURE_ES` gains the Spanish translation
  `"error desconocido"`.
- `tests/test_models.py`, `tests/test_fetch.py`: legacy-migration test
  updated to expect `unknown_error` instead of `timeout`. TWO new
  regression tests pin the new invariant:
  - `test_extractor_exception_persists_as_transient_failure`: bare-except
    persists `unknown_error` AND `_should_refetch` returns True.
  - `test_content_source_from_uncategorised_failure_is_transient`: the
    `FetchFailure → ContentSourceFailure` mapping uses `unknown_error`.

Skipped (low-priority, follow-up):
- Per-item resilience in load_store (HIGH 1 full fix): structural
  change, separate issue if real.
- M4 (generate.py silent elision of thread/quoted_tweet failures):
  pre-existing pattern, out of scope.
- python-code-reviewer M1/M2/M3: doc-tier polish.
- simplifier suggestions (drop pop() defensive, hoist local import in
  notes_io): doc/style polish.
- test-analyzer polish (3 model_copy / JSON round-trip assertions):
  improvements, not blockers.

Total: 348 tests (up from 346), coverage 89%, quality gate all-green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
[#20] refactor: ContentSource/FetchResult as tagged unions
@VGonPa VGonPa merged commit bd0a4e7 into main May 22, 2026
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant