fix(health): fresh-session guard against stale-state false-rot [GET-36] #60

Merged
DevanshuNEU merged 2 commits into OpenCodeIntel:main from DevanshuNEU:feat/get-36-fresh-session-rot-guard
May 4, 2026

@DevanshuNEU DevanshuNEU commented May 4, 2026

Contributor

Summary

Adds a Rule 0 guard at the top of `computeHealthScore` that returns Healthy when `turnCount <= 2 AND contextPct < 30`, before any per-model or secondary classifier runs. This blocks stale derived signals (growthRate from a prior conversation, leaked isDetailHeavy, and any future projection wrapper) from escalating a session that has no real history yet. The guard is defensive: the bug as originally filed referenced `escalateForProjection` which does not exist on main today, so the guard primarily protects against a class of leak paths and pre-emptively covers the orphan PR #56 commits when they land.

Type of Change

  • `fix` — Bug fix
  • `feat` — New feature
  • `refactor` — Code restructure, no behavior change
  • `test` — Tests only
  • `chore` — Build, CI, tooling, dependencies
  • `docs` — Documentation only

What Was Changed

`lib/health-score.ts`

  • Added two exported constants: `FRESH_SESSION_TURN_CEIL = 2` and `FRESH_SESSION_CONTEXT_CEIL = 30`. The turn ceiling is inclusive, the context ceiling is exclusive (any conversation that has reached 30% deserves the full classifier pass since 30% is below every model's warn threshold).
  • Added Rule 0 at the top of `computeHealthScore`. Returns Healthy with model-aware coaching from `getRotCoaching` when both conditions are met.
  • Updated the file header comment to document Rule 0 and its rationale, including how it composes with future projection wrappers.
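
The guard described above can be sketched roughly as follows. This is a minimal illustration, not the code from `lib/health-score.ts`: the two constants and their values come from this description, while `HealthInput`, `HealthScore`, `coachingSketch`, `classifySketch`, and the classifier thresholds are hypothetical stand-ins, since the real types, `getRotCoaching` wording, and per-model thresholds are not shown in this PR.

```typescript
// Constants named in this PR (exported in the real module).
const FRESH_SESSION_TURN_CEIL = 2;     // inclusive ceiling
const FRESH_SESSION_CONTEXT_CEIL = 30; // exclusive ceiling, in percent

// Hypothetical shapes: the real input and HealthScore types are not shown here.
type HealthLevel = "healthy" | "degrading" | "critical";

interface HealthInput {
  turnCount: number;
  contextPct: number;
  growthRate?: number;
  isDetailHeavy?: boolean;
}

interface HealthScore {
  level: HealthLevel;
  coaching: string;
}

// Stand-in for getRotCoaching; the real wording is not reproduced.
function coachingSketch(model: string, contextPct: number): string {
  return `${model}: context at ${contextPct}%`;
}

// Stand-in for the existing classifier. Thresholds are invented for illustration.
function classifySketch(input: HealthInput): HealthLevel {
  if (input.contextPct >= 90) return "critical";
  if (
    input.contextPct >= 60 ||
    (input.growthRate ?? 0) > 5 ||
    input.isDetailHeavy === true
  ) {
    return "degrading";
  }
  return "healthy";
}

function computeHealthScoreSketch(input: HealthInput, model: string): HealthScore {
  const coaching = coachingSketch(model, input.contextPct);

  // Rule 0: a session with no real history is Healthy, whatever stale
  // secondary signals were carried over from a previous conversation.
  if (
    input.turnCount <= FRESH_SESSION_TURN_CEIL &&
    input.contextPct < FRESH_SESSION_CONTEXT_CEIL
  ) {
    return { level: "healthy", coaching };
  }

  return { level: classifySketch(input), coaching };
}
```

In this sketch, a two-turn session at 29% context with a stale `growthRate` stays Healthy, while the same stale signal on turn 3, or at exactly 30% context, falls through to the classifier.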

`tests/unit/health-score.test.ts`

  • 12 new tests in `describe('fresh-session guard (GET-36)')`:
    • Positive path (5): empty input, upper boundary, stale growthRate, stale isDetailHeavy, negative contextPct
    • Boundary (3): turnCount=3 just past ceiling, contextPct=30 exactly, contextPct >= absolute critical floor with low turns
    • Does-NOT-mask (4): high context with low turns (untracked old chat), warn-boundary first turn, detail-heavy first turn, model-aware coaching string
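
As a rough, table-driven illustration of the boundary cases listed above (not the actual test file), the guard predicate can be checked in isolation. `guardFires` and the concrete input values are assumptions chosen to match the case names; in particular, "empty input" is assumed here to mean zero turns and zero context.

```typescript
const FRESH_SESSION_TURN_CEIL = 2;     // inclusive
const FRESH_SESSION_CONTEXT_CEIL = 30; // exclusive

// Hypothetical helper: the real code inlines this check in computeHealthScore.
function guardFires(turnCount: number, contextPct: number): boolean {
  return turnCount <= FRESH_SESSION_TURN_CEIL && contextPct < FRESH_SESSION_CONTEXT_CEIL;
}

interface GuardCase {
  name: string;
  turnCount: number;
  contextPct: number;
  fires: boolean;
}

// Input values are illustrative, picked to sit on each side of the ceilings.
const guardCases: GuardCase[] = [
  { name: "empty input", turnCount: 0, contextPct: 0, fires: true },
  { name: "upper boundary", turnCount: 2, contextPct: 29.9, fires: true },
  { name: "negative contextPct", turnCount: 1, contextPct: -5, fires: true },
  { name: "turnCount just past ceiling", turnCount: 3, contextPct: 10, fires: false },
  { name: "contextPct exactly at ceiling", turnCount: 1, contextPct: 30, fires: false },
  { name: "high context with low turns", turnCount: 1, contextPct: 85, fires: false },
];

// Collect the names of any cases where the predicate disagrees with the table.
const guardFailures: string[] = guardCases
  .filter((c) => guardFires(c.turnCount, c.contextPct) !== c.fires)
  .map((c) => c.name);
```

An empty `guardFailures` array means every row agrees with the inclusive turn ceiling and the exclusive context ceiling.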

How to Test

  1. `bun run compile` — clean
  2. `bun run test` — 1808 passing (1796 → 1808, +12 new)
  3. `bun run build` — clean, content script 75.82 KB (under 100 KB ceiling)
  4. Manual on claude.ai:
    • Open a long existing chat. Indicator shows Critical or Degrading.
    • Click "New chat". Overlay clears.
    • Type a short first message and send. After STREAM_COMPLETE, indicator shows Healthy.
    • Repeat from a fresh tab open on `/new` — first turn shows Healthy.
    • Open a moderately-long pre-existing chat that LCO never tracked. After your first reply, if contextPct is genuinely high (>= 30%), indicator shows Degrading or Critical. The guard does not mask real high context.

Acceptance Criteria

  • Health indicator shows Healthy on any conversation with turnCount <= 2 AND contextPct < 30 AND no draft, regardless of prior tab/conversation history. Covered by tests in the new `describe` block.
  • Unit test: `computeHealthScore` with fresh input never escalates above Healthy via projection unless a real draft is active. The guard runs inside `computeHealthScore` before zone classification; future `escalateForProjection` wrappers run on the returned HealthScore and can still escalate when a real draft is active.
  • Manual: open new chat after a long previous conversation in the same tab — overlay resets to Healthy. Already satisfied by the SPA-nav reset path in `entrypoints/claude-ai.content.ts:1003-1080` (added in PR #29, "feat(storage): account isolation via organization UUID"). `INITIAL_STATE.health = null` so the overlay shows no indicator until the first STREAM_COMPLETE; the guard then returns Healthy on that first turn.

Checklist

  • Tests pass locally (`bun run test`)
  • TypeScript compiles clean (`bun run compile`)
  • Extension builds without errors (`bun run build`)
  • No secrets, hardcoded URLs, or sensitive tokens exposed
  • Comments are professional and clear — no emojis, no AI-generated filler
  • Commit messages follow conventional commits

Related Issues

Closes GET-36 (https://linear.app/getsaar/issue/GET-36/health-indicator-context-rot-signal-fires-on-fresh-sessions)

Notes for Reviewer

  • Diagnosis pivot. The issue body suspected `escalateForProjection` leaking via `projectedContextPctHigh`. That mechanism does not exist on `main` today (it lives in the four orphan commits from PR #56, "feat(attachments): pre-submit cost preview for images and PDFs [GET-24]", which were not pushed). On main, fresh-session input already returns Healthy by default. The guard is therefore primarily defensive: it makes the "fresh = Healthy" invariant code-visible and pre-emptively covers the orphan commits when they land.
  • Composition with future wrappers. When the orphan commits land, `escalateForProjection` will wrap `computeHealthScore` and bump the result based on `projectedContextPctHigh`. The Rule 0 guard sits inside `computeHealthScore`, so the wrapper still gets to run on the returned HealthScore. Fresh + no draft stays Healthy; fresh + huge real draft can be escalated by the wrapper.
  • Scenario E judgment call (kept the inclusive Healthy). A user who opens an old (untracked) claude.ai conversation with low context fill (~20%) and 30 prior turns will see Healthy on their first observed turn, even though attention-drift research suggests rot at high turn counts. This is honest: Saar can't see what it didn't observe, and the 20% fill is genuinely fine. Override in a follow-up if we want to bias toward over-warning untracked chats.
  • No orchestrator change. The SPA-nav reset path in `claude-ai.content.ts` already clears state, convState, dismissed, lastDetailHeavy, deltaHistory, etc. `INITIAL_STATE.health = null` means the overlay correctly shows no health indicator between SPA nav and the first STREAM_COMPLETE.
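
The composition described in the notes above might look roughly like this once the wrapper lands. Everything here is an assumption for illustration: `escalateForProjection` does not exist on `main`, so its signature, the `null` convention for "no draft", and the `PROJECTED_ESCALATION_PCT` cutoff are invented, and the inner function is reduced to the guard plus a made-up fallback.

```typescript
type HealthLevel = "healthy" | "degrading" | "critical";

interface HealthScore {
  level: HealthLevel;
}

const FRESH_SESSION_TURN_CEIL = 2;
const FRESH_SESSION_CONTEXT_CEIL = 30;

// Assumed cutoff: the real wrapper's threshold is not stated in this PR.
const PROJECTED_ESCALATION_PCT = 80;

// Inner function, reduced to Rule 0 plus an invented fallback classifier.
function innerScoreSketch(turnCount: number, contextPct: number): HealthScore {
  if (turnCount <= FRESH_SESSION_TURN_CEIL && contextPct < FRESH_SESSION_CONTEXT_CEIL) {
    return { level: "healthy" };
  }
  return { level: contextPct >= 60 ? "degrading" : "healthy" };
}

// Hypothetical wrapper. It runs on the returned HealthScore, so Rule 0 never
// prevents it from seeing a real draft. null is assumed to mean "no draft".
function escalateForProjectionSketch(
  score: HealthScore,
  projectedContextPctHigh: number | null,
): HealthScore {
  if (projectedContextPctHigh === null) return score;
  if (score.level === "healthy" && projectedContextPctHigh >= PROJECTED_ESCALATION_PCT) {
    return { level: "degrading" };
  }
  return score;
}

// Fresh session, no draft: stays Healthy.
const freshNoDraft = escalateForProjectionSketch(innerScoreSketch(1, 5), null);
// Fresh session, very large real draft: the wrapper can still escalate.
const freshHugeDraft = escalateForProjectionSketch(innerScoreSketch(1, 5), 95);
```

Keeping the guard inside the inner function and the projection logic outside it is what lets "fresh + no draft" and "fresh + huge real draft" produce different results.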

Summary by CodeRabbit

Release Notes

  • New Features
    • Improved health assessment for new conversations: fresh sessions with minimal interaction and context are now automatically marked as healthy, with guidance tailored to early conversation stages.

…-36]

Rule 0 returns Healthy when turnCount <= 2 AND contextPct < 30,
before the per-model classifier runs. Blocks stale growthRate,
isDetailHeavy, and any future projection wrapper from escalating
a session that has no real history yet.

Acceptance criteria:
- Healthy on any conversation with turnCount <= 2 AND contextPct < 30
  regardless of prior tab/conversation state
- Wrappers like escalateForProjection still run on the returned
  HealthScore so a real draft can escalate after the guard
- AC #3 (overlay resets on new chat) already satisfied by existing
  SPA-nav reset path in claude-ai.content.ts (PR #29)

12 new tests: positive path (5), boundary (3), does-not-mask (4).

vercel Bot commented May 4, 2026

@DevanshuNEU is attempting to deploy a commit to the Dev's projects Team on Vercel.

A member of the Team first needs to authorize it.

coderabbitai Bot commented May 4, 2026

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 7fbb0988-0609-48a3-8a7b-5df670c2eb7e

📥 Commits

Reviewing files that changed from the base of the PR and between e08672d and a43fe4b.

📒 Files selected for processing (1)
  • tests/unit/health-score.test.ts

Walkthrough

A fresh-session guard is added to computeHealthScore that returns healthy status immediately for conversations with few turns and low context, before running classifier logic. Two new threshold constants define this behavior, with comprehensive test coverage validating guard boundaries and interactions with existing rules.

Changes

Fresh-Session Guard Implementation

| Layer / File(s) | Summary |
| --- | --- |
| Constants Definition: `lib/health-score.ts` | Two new exported constants define the fresh-session guard cutoffs: `FRESH_SESSION_TURN_CEIL = 2` and `FRESH_SESSION_CONTEXT_CEIL = 30`. |
| Core Logic: `lib/health-score.ts` | Rule 0 is prepended to `computeHealthScore`, short-circuiting to Healthy with `getRotCoaching` when `turnCount <= 2` and `contextPct < 30`, before primary/secondary classifier logic runs. |
| Tests: `tests/unit/health-score.test.ts` | New test suite validates the guard fires for fresh sessions with noisy signals, does not fire outside guard boundaries, respects higher-priority rules (e.g., critical floor), and returns model-aware coaching containing "fresh". |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 Fresh sessions bloom bright,
No noisy signals tonight!
Guard gates the eager score,
Healthy dreams before more.
Turn-count whispers low and clear—
Two turns, thirty percent: appear!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly and specifically summarizes the main change: adding a fresh-session guard to prevent false rot signals, with reference to the issue identifier. |
| Description check | ✅ Passed | The description comprehensively covers all required sections: summary, type of change, what was changed, how to test, acceptance criteria, and notes for reviewer. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/unit/health-score.test.ts`:
- Around line 242-258: The test comment misleadingly claims it verifies the
guard path vs primary classifier, but both branches call getRotCoaching(model,
contextPct, isDetailHeavy) so the coaching check cannot distinguish paths;
update the test comment in the it(...) block for computeHealthScore (using
SONNET_45 and FRESH_SESSION_CONTEXT_CEIL) to remove or rewrite the paragraph
that asserts "specifically verify the guard did not produce this result" and
instead state that the test only verifies the exclusive boundary behavior (guard
does not fire at ceiling) and that the coaching string is expected but not proof
of which branch produced it.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 48e2d854-2742-49db-816d-beb52a6e1149

📥 Commits

Reviewing files that changed from the base of the PR and between 9c1d6af and e08672d.

📒 Files selected for processing (2)
  • lib/health-score.ts
  • tests/unit/health-score.test.ts

Comment thread tests/unit/health-score.test.ts
…ch claim [GET-36]

The 'guard ceiling is exclusive' test asserted the coaching string cites
the model and previously claimed this distinguished the guard branch from
the primary fall-through. Both branches call getRotCoaching with identical
arguments, so the string is byte-identical from either path. Comment now
states honestly that the level assertion is the boundary check and the
coaching match is a non-empty/shape sanity check, not proof of branch.
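
A small sketch of why the coaching string cannot identify the branch: if both paths build the string from the same helper with the same arguments, the outputs are necessarily equal. `coachingSketch`, `guardBranchSketch`, and `primaryBranchSketch` are hypothetical stand-ins; the real `getRotCoaching` wording and the classifier body are not reproduced here.

```typescript
interface HealthScore {
  level: "healthy" | "degrading" | "critical";
  coaching: string;
}

// Stand-in for getRotCoaching; the real wording is not reproduced.
function coachingSketch(model: string, contextPct: number, isDetailHeavy: boolean): string {
  const detail = isDetailHeavy ? " (detail-heavy)" : "";
  return `${model}: context at ${contextPct}%${detail}`;
}

// What the Rule 0 branch would return.
function guardBranchSketch(model: string, contextPct: number, isDetailHeavy: boolean): HealthScore {
  return { level: "healthy", coaching: coachingSketch(model, contextPct, isDetailHeavy) };
}

// What a primary fall-through that also lands on Healthy would return.
function primaryBranchSketch(model: string, contextPct: number, isDetailHeavy: boolean): HealthScore {
  return { level: "healthy", coaching: coachingSketch(model, contextPct, isDetailHeavy) };
}

const fromGuard = guardBranchSketch("model-x", 30, false);
const fromPrimary = primaryBranchSketch("model-x", 30, false);

// Same helper, same arguments: the strings match, so asserting on the
// coaching text says nothing about which branch produced the result.
const coachingIsIdentical: boolean = fromGuard.coaching === fromPrimary.coaching;
```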

vercel Bot commented May 4, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| getsaar | Ready | Preview, Comment | May 4, 2026 4:54am |

@DevanshuNEU DevanshuNEU merged commit 8a923e9 into OpenCodeIntel:main May 4, 2026
6 of 7 checks passed