feat: chapter-based narrative + drop log/encyclopedia (ADR-015, PR 1)#58

Open
ditvor wants to merge 1 commit into develop from claude/flamboyant-chaplygin-6763f9


@ditvor ditvor commented May 27, 2026

Summary

PR 1 of the ADR-015 rollout — migrates the narrative data model from the flat paragraphs + selected_photo_indices shape to a six-chapter envelope so the four upcoming Trailpath layouts (Zine / Sunday / Postcard / Album) can consume their native input.

  • BREAKING: NarrativeOutput.chapters: list[Chapter] (exactly 6) replaces flat paragraphs and selected_photo_indices as canonical state. schema_version bumps from 3 to 4. Each chapter binds one photo via photo_index and carries its own id + time + place + lat/lon + tri-lingual title + sentence-leveled body. paragraphs_as_localized() becomes a computed view over chapters[*].body; selected_photo_indices survives as a read-only @property.
  • Style.log and Style.encyclopedia removed — never surfaced in the picker, never shipped to users. v0 product decision is to ship the five Trailpath styles (Letter + the four new ones in subsequent renderer PRs).
  • narrative_max_tokens default raised 4096 → 8192 to fit the new chapter JSON envelope (case 04 was truncating at 4096).
  • Writer prompt rewritten with the chapter envelope; previous Phase 4 skeleton preserved as a dated comment.
  • Eval rubric refit (drops indices_valid, adds chapter_count_is_six + chapter_photo_binding_valid, loosens paragraph-count check to 3-6 to fit six chapter bodies).
  • All four eval cases refreshed via make eval-update-golden (paid); follow-up make eval-live confirms judge non-regression (threshold 1.0) across every axis on every case.

Letter (editorial) template walks chapter bodies with no visible-output change vs the PR-57 baseline. Phase 0 of ADR-015 docs.
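
The chapter envelope described in the summary can be pictured roughly as follows. This is a minimal sketch using stdlib dataclasses, not the code in `trailstory/models.py`: the `Localized` helper, the field types, and the way sentences are joined into paragraphs are all assumptions made for illustration. Only the names `Chapter`, `NarrativeOutput`, `chapters`, `photo_index`, `selected_photo_indices`, `paragraphs_as_localized`, `CHAPTER_COUNT`, and `schema_version` come from this PR.

```python
from dataclasses import dataclass

CHAPTER_COUNT = 6  # name taken from the commit message


@dataclass(frozen=True)
class Localized:
    """Hypothetical helper: one string per language (en / ru / de)."""
    en: str
    ru: str
    de: str


@dataclass(frozen=True)
class Chapter:
    id: str
    time: str            # assumed to be a plain string here
    place: str
    lat: float
    lon: float
    photo_index: int     # the single photo this chapter binds
    title: Localized
    body: list[Localized]  # assumed: one entry per sentence


@dataclass(frozen=True)
class NarrativeOutput:
    chapters: list[Chapter]
    schema_version: int = 4

    def __post_init__(self) -> None:
        # Enforce the "exactly 6" contract at construction time.
        if len(self.chapters) != CHAPTER_COUNT:
            raise ValueError(
                f"expected {CHAPTER_COUNT} chapters, got {len(self.chapters)}"
            )

    @property
    def selected_photo_indices(self) -> list[int]:
        # Read-only view derived from the chapters, in chapter order.
        return [c.photo_index for c in self.chapters]

    def paragraphs_as_localized(self) -> list[Localized]:
        # Computed view: one paragraph per chapter, built from its sentences.
        return [
            Localized(
                en=" ".join(s.en for s in c.body),
                ru=" ".join(s.ru for s in c.body),
                de=" ".join(s.de for s in c.body),
            )
            for c in self.chapters
        ]
```

Deriving both legacy views from `chapters` keeps a single source of truth, so a template that still reads paragraphs cannot drift out of sync with the chapter data.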

Eval score tables

All four cases pass the rubric and judge non-regression against the fresh ADR-015 goldens (threshold 1.0). The score deltas reflect single-run model noise rather than real regression: they are measured against goldens written one minute earlier.

Rubric (every check on every case)

```
schema_validates PASS
chapter_count_is_six PASS (6 chapters)
chapter_photo_binding_valid PASS (6 unique indices, all in range)
paragraph_count_3_to_5_each_lang PASS (en=6, ru=6, de=6)
russian_actually_cyrillic PASS
word_count_ratio_en_ru_in_0_7_to_1_4 PASS (~0.79-0.85)
word_count_ratio_en_de_in_0_7_to_1_4 PASS (~0.92-1.03)
title_under_60_chars PASS
subtitle_under_90_chars PASS
milestone_under_30_chars PASS
pull_quote_drawn_from_body PASS (overlap >= 0.82)
```
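
The two new rubric checks could look something like the sketch below. It is inferred from the PASS messages above ("6 chapters", "6 unique indices, all in range") and is not the code in `tests/eval/rubric.py`. The function signatures and the `photo_count` parameter are assumptions.

```python
def chapter_count_is_six(photo_indices: list[int]) -> bool:
    """Pass when the narrative carries exactly six chapter bindings."""
    return len(photo_indices) == 6


def chapter_photo_binding_valid(photo_indices: list[int], photo_count: int) -> bool:
    """Pass when every chapter binds a distinct photo that actually exists.

    `photo_count` (assumed parameter) is the number of photos in the hike input.
    """
    all_unique = len(set(photo_indices)) == len(photo_indices)
    all_in_range = all(0 <= i < photo_count for i in photo_indices)
    return all_unique and all_in_range
```

For example, `chapter_photo_binding_valid([0, 2, 4, 6, 8, 10], photo_count=12)` passes, while a repeated index or an index of 12 would fail.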

Judge (eval-live vs the freshly-written goldens; 0-5; Δ from same-run noise)

| Case | warmth | narrative_arc | russian_fidelity | photo_selection | faithfulness |
| --- | --- | --- | --- | --- | --- |
| 01-fixture-baseline | 4.50 (+0.50) | 4.50 (0.00) | 4.50 (0.00) | 3.50 (0.00) | 2.08 (-0.24) |
| 02-joyful-summit | 4.00 (-0.50) | 4.00 (-0.50) | 4.50 (0.00) | 3.00 (-0.50) | 2.67 (-0.16) |
| 03-exhausted-foggy | 4.50 (0.00) | 4.50 (0.00) | 4.50 (0.00) | 3.00 (-0.50) | 3.03 (+0.24) |
| 04-bad-tolz-family | 4.00 (0.00) | 4.00 (0.00) | 4.50 (0.00) | 4.00 (-0.50) | 3.42 (-0.44) |

All cases: judge non-regressing (threshold 1.00).
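
One plausible reading of "non-regressing (threshold 1.00)" is that no axis may drop by more than the threshold relative to its golden score. The sketch below encodes that reading. The per-axis comparison and the non-strict inequality are assumptions, not the eval harness's confirmed logic. The deltas are copied from the 02-joyful-summit row above.

```python
THRESHOLD = 1.0  # from the PR description


def is_non_regressing(deltas: dict[str, float], threshold: float = THRESHOLD) -> bool:
    """Assumed rule: every axis delta must be no lower than -threshold."""
    return all(delta >= -threshold for delta in deltas.values())


# Deltas for case 02-joyful-summit, taken from the judge table.
case_02 = {
    "warmth": -0.50,
    "narrative_arc": -0.50,
    "russian_fidelity": 0.00,
    "photo_selection": -0.50,
    "faithfulness": -0.16,
}

print(is_non_regressing(case_02))  # True: the largest drop is 0.50
```

Under this reading, the largest drop anywhere in the table is 0.50, which leaves half a point of headroom before any case would be flagged.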

The judge consistently notes that the writer's photo binding is "evenly-spaced and mechanical" (e.g. [0,2,4,6,8,10]) — a follow-up tuning opportunity for ADR-006's Option C (per-style prompt suffix) or a chapter-aware photo-selection rubric. Not a PR 1 gate.
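
If a chapter-aware photo-selection rubric is pursued, a first signal could be a check that flags perfectly uniform bindings such as `[0,2,4,6,8,10]`. The function below is a hypothetical sketch and not part of this PR. Its name and the "one constant step" definition of "mechanical" are assumptions.

```python
def is_evenly_spaced(indices: list[int]) -> bool:
    """Return True when the sorted indices advance by one constant step."""
    if len(indices) < 3:
        return False  # too few points to call a pattern
    ordered = sorted(indices)
    steps = {later - earlier for earlier, later in zip(ordered, ordered[1:])}
    return len(steps) == 1


print(is_evenly_spaced([0, 2, 4, 6, 8, 10]))  # True
print(is_evenly_spaced([0, 1, 4, 6, 9, 10]))  # False
```

A check like this would only be a warning signal: an evenly spaced selection can still be the right one when the photos themselves were taken at regular intervals.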

Test plan

  • `make ci` clean: ruff + ruff format + mypy + pytest (353 pass, 92% coverage)
  • Editorial visible output unchanged: `tests/golden/test-render-editorial.html` regenerated; structural assertions (`@font-face`, oklch tokens, display+eyebrow+margin+quote+elevation classes, all three `data-lang` buttons) hold
  • Carousel still works for the new fixed-6 binding (8 slides: 1 title + 6 photos + 1 quote)
  • Paid eval refresh: `make eval-update-golden` (writer × 4 + judge × 4) + `make eval-live` (writer × 4 + judge × 4), both pass
  • Reviewer: skim ADR-015 — the load-bearing decisions (A2 chapters canonical, A-writer single pass, six chapters exact, no rev-geocoder for `place`) are documented there

Out of scope for this PR

  • The four new style templates (Zine / Sunday / Postcard / Album) — separate PRs per ADR-015 rollout sequence.
  • Reverse geocoding for chapter `place` — v0 falls back to `HikeInput.location_name`. Follow-up ADR.
  • Builder-side chapter edit mode (per-chapter accept/edit/remove) — Phase 4.1 / 4.2 of the faithfulness initiative.
  • Per-style register tuning (ADR-006 Option C escape hatch) — wait for real usage signal first.
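
The v0 `place` fallback mentioned in the list above might be sketched as follows. The `resolve_place` helper is hypothetical, and so is the assumption that a chapter's `place` can arrive empty and then falls back. Only `HikeInput.location_name` is named in this PR.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HikeInput:
    location_name: str  # the only field this sketch needs


def resolve_place(chapter_place: Optional[str], hike: HikeInput) -> str:
    """Use the chapter's own place when present, else the hike-level name."""
    if chapter_place and chapter_place.strip():
        return chapter_place.strip()
    return hike.location_name  # v0: no reverse geocoding
```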

🤖 Generated with Claude Code

…lopedia (ADR-015)

BREAKING: NarrativeOutput.paragraphs + selected_photo_indices are
replaced by a six-Chapter envelope. Each chapter binds one photo and
carries its own id, time, place, lat/lon, tri-lingual title, and
sentence-leveled body. Schema bumps 3->4; every prior cache entry
invalidates. The four upcoming Trailpath layouts (Zine, Sunday,
Postcard, Album) consume this shape natively in subsequent renderer
PRs; the Letter (editorial) template walks chapter bodies with no
visible-output change vs PR-57.

The legacy Style.log and Style.encyclopedia values are removed:
they never surfaced in the picker and the v0 product decision is to
ship the five Trailpath styles only.

- trailstory/models.py: new Chapter; NarrativeOutput.chapters
  (exactly CHAPTER_COUNT=6) replaces flat paragraphs +
  selected_photo_indices. paragraphs_as_localized() becomes a
  computed view over chapters[*].body; selected_photo_indices
  survives as a read-only @property. Style enum collapses to
  editorial only. schema_version=4.
- trailstory/llm/prompts.py: writer JSON skeleton rewritten for
  the six-chapter contract; previous (Phase 4) skeleton preserved
  as a dated comment.
- trailstory/llm/narrative.py: verifier walks chapters[*].body.
- trailstory/config.py: narrative_max_tokens 4096->8192 (longer
  output for the new shape).
- tests/eval/rubric.py: drops indices_valid; adds
  chapter_count_is_six + chapter_photo_binding_valid; loosens
  paragraph_count_3_to_5_each_lang to 3-6 to fit the computed view.
- tests/eval/golden/*.json: refreshed under the new shape via
  make eval-update-golden; eval-live confirms judge non-regression
  (threshold 1.0) across every axis on every case.
- templates/styles/editorial.html.j2: walks narrative.chapters and
  emits per-chapter body with sentence-level provenance spans.
- templates/styles/log.html.j2 + encyclopedia.html.j2: deleted.
- tests/golden/test-render-editorial.html: regenerated.
- web/copy.py, web/pipeline.py, web/routes.py, web/dev.py,
  Makefile: drop log + encyclopedia references; web/dev.py fake
  narrative rebuilt as six chapter envelopes.
- CHANGELOG.md, CLAUDE.md: BREAKING entry + decision register 15.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>