with_synopsis: LLM-derived synopsis surfaces as collapsed-tree routing fuel (#48) #55
Merged
… surfaces (#48)

The document-summary-index / collapsed-tree fuel (report 12, ADR #43): run a summarizer over each artifact at build time, index it as a 'synopsis' surface, and let the collapsed-tree policy (#47) route a synopsis match to the artifact's chunks. Build-time cost, ~free at query time, incremental.

ir/synopsis.py:
- with_synopsis(strategy, *, synthesize=None, synthesizer_id=None) wraps any IndexingStrategy and PREPENDS one synopsis surface (so it is the first summary surface -> the router). Empty synopsis is dropped.
- make_llm_synthesizer (Artifact -> str) mirrors make_llm_formulator/selector: injectable summarize double, lazy oa on first synthesis (import ir stays offline), errors/empty -> '' (never a fabricated summary).
- Synthesizer type exported.

Staleness reuses the ledger mechanism: the wrapper exposes scalar synthesizer_id + synopsis_kind and holds the inner strategy; index._strategy_id now RECURSES into nested strategies, so a model/prompt change OR an inner-strategy-param change re-synthesizes exactly the affected artifacts. No new bookkeeping.

Routing needs no edges: collapsed-tree descends synopsis->chunks within an artifact via records_for_artifact (surface grain), distinct from the #46 links view (cross-artifact edges).

Promote strategy._text_of -> public text_of (now a cross-module SSOT; _text_of kept as alias).

10 hermetic tests, 358 total.
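The wrapper and the recursive strategy id described in this commit can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in ir/synopsis.py: `Surface`, the toy `Chunked`, `SynopsisStrategy`, and `strategy_id` are stand-in names assumed for the example, and the real `IndexingStrategy` interface and ledger format are not shown in this PR text.

```python
from dataclasses import dataclass
from typing import Any, Callable, Mapping

# Stand-in types: the real Artifact / IndexingStrategy definitions are assumptions here.
Artifact = Mapping[str, Any]
Synthesizer = Callable[[Artifact], str]


@dataclass(frozen=True)
class Surface:
    kind: str
    text: str


@dataclass
class Chunked:
    """Toy inner strategy: fixed-size character chunks of one text field."""
    size: int = 40
    text_key: str = "text"

    def decompose(self, artifact: Artifact) -> list[Surface]:
        text = str(artifact.get(self.text_key, ""))
        return [Surface("chunk", text[i:i + self.size])
                for i in range(0, len(text), self.size)]


@dataclass
class SynopsisStrategy:
    """Hypothetical wrapper: holds the inner strategy and prepends one synopsis."""
    inner: Any
    synthesize: Synthesizer
    synthesizer_id: str
    synopsis_kind: str = "synopsis"

    def decompose(self, artifact: Artifact) -> list[Surface]:
        surfaces = self.inner.decompose(artifact)
        synopsis = self.synthesize(artifact).strip()
        if not synopsis:
            # An empty synopsis is dropped rather than replaced with filler.
            return surfaces
        # Prepend, so the synopsis is the first summary surface.
        return [Surface(self.synopsis_kind, synopsis), *surfaces]


def strategy_id(strategy: Any) -> str:
    """Build an id from scalar fields, recursing into nested strategies."""
    parts = []
    for name, value in sorted(vars(strategy).items()):
        if isinstance(value, (str, int, float, bool, type(None))):
            parts.append(f"{name}={value!r}")
        elif hasattr(value, "decompose"):
            # Nested strategy: its parameters become part of the outer id.
            parts.append(f"{name}={strategy_id(value)}")
        # Callables are skipped; the scalar synthesizer_id stands in for them.
    return f"{type(strategy).__name__}({', '.join(parts)})"


doc = {"text": "Alpha beta gamma. " * 5}
wrapped = SynopsisStrategy(Chunked(), lambda a: "about greek letters", "fake-v1")
kinds = [s.kind for s in wrapped.decompose(doc)]
```

Under these assumptions, changing either `synthesizer_id` or an inner parameter such as `size` changes `strategy_id(wrapped)`, which is the signal a ledger could use to re-synthesize only the affected artifacts.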
…esizer (review)

Adversarial review of PR #55 confirmed two should-fixes (both contradicted docstrings shipped in this PR) and four test-hardening nits.

should-fix 1 (text_key alignment): the default synthesizer extracted text via text_of(raw) with NO text_key, so with_synopsis(Chunked(text_key='body')) + the default synthesizer summarized a different field than the strategy indexed (silent wrong-field synopsis). Thread the inner strategy's text_key into the default: make_llm_synthesizer gains text_key=; _SynopsisStrategy passes getattr(strategy, 'text_key', None). Restores the SSOT-alignment promise.

should-fix 2 (non-identifiable synthesizer): an unnamed lambda / local closure falls through to a '<lambda>'/'<locals>' qualname that distinct callables share, so swapping one for another left strategy_id unchanged -> silent staleness, contradicting 'no silent staleness'. Now warn (UserWarning) and use a 'custom' sentinel, surfacing the lost guarantee at construction; named functions and explicit/stamped ids still track.

Tests (+7): default-synth text_key threading; lambda-warns + named-tracks; with_synopsis(Package()) synopsis-over-description router precedence; mutation-resistant offline (poison oa); default-id content-stability; non-str synthesize at decompose; file-backed incremental round-trip. 365 total.
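The id derivation behind should-fix 2 might look something like the sketch below. It is an illustration under assumptions, not the shipped code: the function name `synthesizer_id_of`, the idea that a "stamped" id is a `synthesizer_id` attribute on the callable, and the exact warning text are all hypothetical. Only the behaviour (warn with UserWarning, fall back to a 'custom' sentinel for lambdas and local closures) comes from the commit message.

```python
import warnings
from typing import Any, Callable, Optional

CUSTOM_SENTINEL = "custom"


def synthesizer_id_of(synthesize: Callable[..., Any],
                      explicit_id: Optional[str] = None) -> str:
    """Derive a stable id for a synthesizer, warning when none can be derived."""
    # 1. An explicit id always wins.
    if explicit_id is not None:
        return explicit_id
    # 2. A stamped id (assumed here to be an attribute on the callable).
    stamped = getattr(synthesize, "synthesizer_id", None)
    if isinstance(stamped, str):
        return stamped
    # 3. A named, module-level function is identified by module + qualname.
    qualname = getattr(synthesize, "__qualname__", "")
    if not qualname or "<lambda>" in qualname or "<locals>" in qualname:
        # Distinct lambdas/closures share these qualnames, so swapping one
        # for another could not change the id. Surface that at construction.
        warnings.warn(
            "synthesizer has no stable identity; pass synthesizer_id= to keep "
            "staleness tracking (falling back to 'custom')",
            UserWarning,
            stacklevel=2,
        )
        return CUSTOM_SENTINEL
    module = getattr(synthesize, "__module__", "?")
    return f"{module}.{qualname}"
```

The design point is that the fallback is loud: a caller who passes a lambda still gets a working index, but learns immediately that swapping the lambda later will not invalidate existing synopses unless an explicit id is supplied.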
Closes #48.
Adds ir.with_synopsis(strategy, *, synthesize=None): a strategy wrapper that adds one LLM-derived synopsis surface per artifact at build time, the document-summary-index / collapsed-tree fuel (report 12, ADR #43). A query matching the synopsis routes (via ir.traverse + collapsed_tree_policy) down to that artifact's chunks; build-time cost, ≈free at query time, fully incremental.

What landed
- ir/synopsis.py: with_synopsis (prepends the synopsis so it is the first summary surface → the router; empty synopsis dropped), make_llm_synthesizer (Artifact -> str, mirrors make_llm_formulator / make_llm_selector: injectable summarize double, lazy oa on first synthesis so import ir stays offline, errors/empty → "", never a fabricated summary), and the Synthesizer type.
- Staleness reuses the ledger mechanism: the wrapper exposes scalar synthesizer_id + synopsis_kind and holds the inner strategy; index._strategy_id now recurses into nested strategies, so a model/prompt change OR an inner-strategy-param change re-synthesizes exactly the affected artifacts. No new bookkeeping.
- Promote strategy._text_of → public text_of (now a cross-module SSOT for artifact-text extraction; _text_of kept as an alias).

Design decisions (posted on #48 before coding)
- … description.
- … records_for_artifact descent), so no links-view edges are written; that view is for cross-artifact edges (different grain). Refines the issue body's wording.
- with_synopsis + edge_extractor= re-runs synthesis every build (eager edge ingest calls decompose for all artifacts). The common path (no edge_extractor) stays fully incremental.

Acceptance criteria
- build(strategy=with_synopsis(Chunked())) with an injected fake synthesizer indexes a synopsis surface per artifact, hermetically.
- traverse routes to the right chunks end-to-end (gold routed; trap whose synopsis didn't match excluded, though its chunk matches the query).
- … oa at import time.

10 hermetic tests (light embedder + memory store + injected synthesizers); 358 total green, lint + format clean.
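The gold/trap routing criterion above can be illustrated with a toy, dependency-free sketch. Everything here is an assumption made for the example: `Record`, the word-overlap `matches` function (standing in for an embedder), and `collapsed_tree_route` are hypothetical and do not reproduce ir.traverse or collapsed_tree_policy. Only the name records_for_artifact and the within-artifact descent idea come from the PR text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    artifact_id: str
    kind: str   # "synopsis" or "chunk"
    text: str


def matches(query: str, text: str) -> bool:
    """Toy lexical match: every query word appears in the text."""
    return set(query.lower().split()) <= set(text.lower().split())


def records_for_artifact(index: list[Record], artifact_id: str) -> list[Record]:
    """All surfaces of one artifact (surface grain, no cross-artifact edges)."""
    return [r for r in index if r.artifact_id == artifact_id]


def flat_search(index: list[Record], query: str) -> list[Record]:
    """Baseline: match chunks directly, ignoring synopses."""
    return [r for r in index if r.kind == "chunk" and matches(query, r.text)]


def collapsed_tree_route(index: list[Record], query: str) -> list[Record]:
    """Match synopses first, then descend to matching chunks of those artifacts."""
    routed = [r.artifact_id for r in index
              if r.kind == "synopsis" and matches(query, r.text)]
    hits: list[Record] = []
    for artifact_id in routed:
        hits.extend(r for r in records_for_artifact(index, artifact_id)
                    if r.kind == "chunk" and matches(query, r.text))
    return hits


index = [
    Record("gold", "synopsis", "how to rotate api keys"),
    Record("gold", "chunk", "rotate keys with the admin cli"),
    Record("gold", "chunk", "billing is monthly"),
    Record("trap", "synopsis", "office lunch menu"),
    Record("trap", "chunk", "rotate keys on the pantry lock"),
]
routed_hits = collapsed_tree_route(index, "rotate keys")
flat_hits = flat_search(index, "rotate keys")
```

In this toy index the flat search returns both the gold and the trap chunk, while the routed search returns only the gold chunk, because the trap's synopsis never matched and its chunks were never visited.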