docs: publish benchmark baselines #27

Merged
flyingrobots merged 2 commits into main from docs/tr-003-benchmark-baselines
Mar 30, 2026

Conversation

@flyingrobots
Member

@flyingrobots flyingrobots commented Mar 30, 2026

Summary

  • publish the first maintained benchmark baseline in docs/BENCHMARKS.md
  • land TR-003 as a Truth cycle and archive the consumed backlog card
  • add TR-011 for streaming encrypted restore follow-up work

Verification

  • CI=1 npx vitest bench --run test/benchmark/chunking.bench.js
  • npx prettier --check CHANGELOG.md docs/BENCHMARKS.md docs/BACKLOG/README.md docs/archive/BACKLOG/README.md docs/archive/BACKLOG/TR-003-benchmark-baselines.md docs/design/README.md docs/design/TR-003-benchmark-baselines.md docs/legends/TR-truth.md
  • git diff --check
  • pre-commit eslint
  • pre-push eslint
  • pre-push npm test

Summary by CodeRabbit

  • Documentation
    • Published benchmark baselines documentation with chunking performance results and guidance for development
    • Updated project documentation to reflect completed design investigation and added streaming encrypted restore work to backlog

@coderabbitai

coderabbitai bot commented Mar 30, 2026

📝 Walkthrough


This PR closes the TR-003 Benchmark Baselines task by publishing chunking performance baselines to docs/BENCHMARKS.md and archiving the associated design documentation. It simultaneously introduces TR-011 Streaming Encrypted Restore as a new backlog item, and updates relevant indices and legends to reflect these changes.

Changes

  • Changelog & Index Updates (CHANGELOG.md, docs/design/README.md, docs/legends/TR-truth.md): Added entries documenting TR-003 completion and TR-011 initiation; promoted TR-003 from backlog to the design doc surface; added TR-011 to the active backlog listing.
  • Active Backlog Management (docs/BACKLOG/README.md, docs/BACKLOG/TR-011-streaming-encrypted-restore.md): Removed TR-003 from the live backlog; added the new TR-011 design investigation focused on lower-memory streaming encrypted restore paths and buffering constraints.
  • Published Benchmarks (docs/BENCHMARKS.md): New baseline document recording chunking tradeoff results (fixed-size vs. CDC) with throughput data and reuse percentages across variable input sizes, environment specifics, and refresh guidance.
  • Design & Archive Records (docs/design/TR-003-benchmark-baselines.md, docs/archive/BACKLOG/README.md, docs/archive/BACKLOG/TR-003-benchmark-baselines.md): Formalized the TR-003 truth cycle with scope, non-goals, decisions, implementation sequence, and verification steps; archived the backlog entry with a linked reference; corrected relative paths in the archived design doc.

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

Poem

🐰 A rabbit hops through docs with glee,
TR-003 benchmarks published—wild and free!
Archives tucked, new TR-011 begun,
Indices aligned, the routing's done. 📊✨

🚥 Pre-merge checks | ✅ 3 passed

  • Description Check — ✅ Passed: Check skipped; CodeRabbit's high-level summary is enabled.
  • Title Check — ✅ Passed: The title 'docs: publish benchmark baselines' directly and clearly summarizes the main change — publishing benchmark baselines documentation in docs/BENCHMARKS.md and related documentation updates.
  • Docstring Coverage — ✅ Passed: No functions found in the changed files to evaluate; docstring coverage check skipped.




@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (2)
docs/design/TR-003-benchmark-baselines.md (2)

71-74: Optional: Consider varying sentence structure for Human Playback questions.

The static analysis tool flags three consecutive questions beginning with "Does." While the parallel structure works well for a checklist format and aids consistency, you could optionally vary the phrasing if you prefer:

  • "Does the published doc explain both throughput cost and dedupe benefit?"
  • "Is the machine and runtime environment documented?"
  • "Does it avoid pretending local measurements are universal truth?"

However, the current structure is clear and appropriate for validation questions.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/design/TR-003-benchmark-baselines.md` around lines 71 - 74, The three
consecutive checklist items beginning with "Does" can be optionally rephrased
for variety: keep the first as-is ("Does the published doc explain both
throughput cost and dedupe benefit?"), change the second to a neutral phrasing
like "Is the machine and runtime environment documented?" (replace the line that
currently begins "Does it say what machine and runtime produced the numbers?"),
and reword the third to "Does it avoid presenting local measurements as
universal truth?" (replace the line that begins "Does it avoid pretending local
measurements are universal truth?"); update those three checklist lines in
TR-003-benchmark-baselines.md accordingly while preserving their validation
intent and checklist formatting.

77-80: Optional: Consider varying sentence structure for Agent Playback questions.

Similar to the Human Playback section, three consecutive questions begin with "Can." The parallel structure is appropriate for a checklist, but you could optionally vary the phrasing:

  • "Can an agent tell which benchmark results are published versus merely possible to derive from the harness?"
  • "Is the distinction between fixed-chunk speed and CDC edit-shift reuse benefits clear?"
  • "Can it tell how to refresh the baseline later without inventing a new method?"

The current structure is effective for validation purposes.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/design/TR-003-benchmark-baselines.md` around lines 77 - 80, Update the
"Agent Playback" question block by rephrasing the second question to vary
sentence structure while preserving meaning: keep the first question as "Can an
agent tell which benchmark results are published versus merely possible to
derive from the harness?", change the second to "Is the distinction between
fixed-chunk speed and CDC edit-shift reuse benefits clear?", and keep the third
as "Can it tell how to refresh the baseline later without inventing a new
method?" so the three lines still form a checklist but avoid repetitive "Can"
openings.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d3104acb-c689-46a6-bfc1-ca843d3853ca

📥 Commits

Reviewing files that changed from the base of the PR and between 849762d and 764ac2c.

📒 Files selected for processing (9)
  • CHANGELOG.md
  • docs/BACKLOG/README.md
  • docs/BACKLOG/TR-011-streaming-encrypted-restore.md
  • docs/BENCHMARKS.md
  • docs/archive/BACKLOG/README.md
  • docs/archive/BACKLOG/TR-003-benchmark-baselines.md
  • docs/design/README.md
  • docs/design/TR-003-benchmark-baselines.md
  • docs/legends/TR-truth.md

@flyingrobots flyingrobots merged commit 60b9dd0 into main Mar 30, 2026
6 checks passed
@flyingrobots flyingrobots deleted the docs/tr-003-benchmark-baselines branch March 30, 2026 17:23