feat: add per-session LLM token budget with hard cap and soft warn #934

Draft
harrykamboj1 wants to merge 1 commit into rohitg00:main from harrykamboj1:feat/per-session-token-budget

@harrykamboj1

Summary

  • Per-session token budget in KV (mem:session-budget), default cap 100k via AGENTMEMORY_SESSION_TOKEN_CAP
  • Enforcement at ResilientProvider: block when exhausted, record estimated tokens in finally
  • Soft warn at 80% (event::mem::budget::soft-warned); hard cap (event::mem::budget::exhausted)
  • Compress → synthetic fallback; summarize → truncated partial summary
  • GET /agentmemory/session/budget; agentmemory status shows active / near-cap / exhausted
  • OTEL histogram session.tokens_used on meter agentmemory
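The enforcement flow in the bullets above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `SessionBudgetTracker`, `callWithBudget`, and `estimateTokens` are hypothetical names, an in-memory `Map` stands in for the `mem:session-budget` KV scope, and mapping the "near-cap" status to the 80% soft-warn threshold is an assumption. Only the 100k default cap, the 80% threshold, the two event names, and the three status labels come from the summary.

```typescript
// Illustrative sketch only. Class, method, and field names are hypothetical.
type BudgetEvent =
  | "event::mem::budget::soft-warned"
  | "event::mem::budget::exhausted";
type BudgetStatus = "active" | "near-cap" | "exhausted";

interface SessionBudget {
  tokensUsed: number;
  softWarned: boolean;
}

class SessionBudgetTracker {
  // Events are collected in an array here; a real system would publish them.
  readonly events: Array<{ sessionId: string; event: BudgetEvent }> = [];
  private budgets = new Map<string, SessionBudget>();
  private cap: number;
  private softRatio: number;

  constructor(cap: number = 100_000, softRatio: number = 0.8) {
    this.cap = cap;
    this.softRatio = softRatio;
  }

  private entry(sessionId: string): SessionBudget {
    let budget = this.budgets.get(sessionId);
    if (!budget) {
      budget = { tokensUsed: 0, softWarned: false };
      this.budgets.set(sessionId, budget);
    }
    return budget;
  }

  status(sessionId: string): BudgetStatus {
    const used = this.entry(sessionId).tokensUsed;
    if (used >= this.cap) return "exhausted";
    return used >= this.cap * this.softRatio ? "near-cap" : "active";
  }

  // Hard cap: refuse the call once the session has spent its budget.
  assertAvailable(sessionId: string): void {
    if (this.status(sessionId) === "exhausted") {
      this.events.push({ sessionId, event: "event::mem::budget::exhausted" });
      throw new Error(`session ${sessionId} has exhausted its token budget`);
    }
  }

  // Soft warn: emitted once, the first time usage crosses the threshold.
  record(sessionId: string, tokens: number): void {
    const budget = this.entry(sessionId);
    budget.tokensUsed += tokens;
    if (!budget.softWarned && budget.tokensUsed >= this.cap * this.softRatio) {
      budget.softWarned = true;
      this.events.push({ sessionId, event: "event::mem::budget::soft-warned" });
    }
  }
}

// Crude estimate of about 4 characters per token; an assumption, not the PR's estimator.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Check before the call, then record in `finally` so that a failed call is
// still billed for the prompt it sent.
async function callWithBudget(
  tracker: SessionBudgetTracker,
  sessionId: string,
  prompt: string,
  call: (prompt: string) => Promise<string>,
): Promise<string> {
  tracker.assertAvailable(sessionId);
  let output = "";
  try {
    output = await call(prompt);
    return output;
  } finally {
    tracker.record(sessionId, estimateTokens(prompt) + estimateTokens(output));
  }
}
```

Because the check happens before the call and the recording after it, a single call can push a session past its cap; in this sketch the block only takes effect on the next call.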

Closes #767

Test plan

  • npm test -- test/session-budget.test.ts (15 tests)
  • npm test -- test/consistency.test.ts
  • Manual: AGENTMEMORY_AUTO_COMPRESS=true, low cap, verify synthetic fallback + status line

Spec alignment / open items

| Gap | Status | Notes |
| --- | --- | --- |
| KV shape (5 vs 11 fields) | Intentional superset | Spec minimum + ops fields (inputTokens, callCount, timestamps) |
| Function count (2 vs 4) | Open | record/reap are implementation; get duplicates REST, can drop if preferred |
| Reaper: cron vs setInterval | Follows repo pattern | Matches recent-searches-sweep; no cron triggers in repo today |
| ALS on observe / consolidate-pipeline | Deferred | Observe doesn't call provider; consolidate bills __system__ sentinel |
| Sentinel unknown vs __system__ | Open | Used __system__ for clarity; rename if spec is strict |
| Atomic increment via kv.update | Kept keyed lock | No increment op used elsewhere in codebase |
| Cost at display time vs record | Deferred | Flat costEstimate at record for v1 |
| OTEL export name | Verify | Instrument session.tokens_used on meter agentmemory |
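
For readers unfamiliar with the "keyed lock" option in the atomic-increment row, the general idea is to serialize read-modify-write cycles per session key so that concurrent increments cannot overwrite each other. The sketch below is an assumption about what such a lock can look like, not the lock used in this repo: `KeyedLock` and `addTokens` are hypothetical names, and an in-memory `Map` stands in for the KV store.

```typescript
// Hypothetical helper: runs async tasks for the same key one at a time by
// chaining each task behind the previous task's completion promise.
class KeyedLock {
  private tails = new Map<string, Promise<void>>();

  async run<T>(key: string, fn: () => Promise<T>): Promise<T> {
    const prev = this.tails.get(key) ?? Promise.resolve();
    let release!: () => void;
    const gate = new Promise<void>((resolve) => {
      release = resolve;
    });
    // The new tail settles only after every earlier task and this one finish.
    const tail = prev.then(() => gate);
    this.tails.set(key, tail);
    await prev;
    try {
      return await fn();
    } finally {
      release();
      // Drop the entry when no later task has queued behind this one.
      if (this.tails.get(key) === tail) this.tails.delete(key);
    }
  }
}

// Usage sketch: an in-memory Map stands in for the KV store.
const store = new Map<string, number>();
const lock = new KeyedLock();

async function addTokens(sessionId: string, tokens: number): Promise<void> {
  await lock.run(sessionId, async () => {
    const current = store.get(sessionId) ?? 0;
    // Simulated KV latency between the read and the write.
    await new Promise((resolve) => setTimeout(resolve, 1));
    store.set(sessionId, current + tokens);
  });
}
```

Without the lock, two overlapping `addTokens` calls for the same session could both read the same starting value and one increment would be lost. A native atomic increment in the KV layer would avoid the lock entirely, which is the trade-off the table row leaves open.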

Signed-off-by: harrykamboj1 <singhharnoor116@gmail.com>

vercel Bot commented Jun 14, 2026

@harrykamboj1 is attempting to deploy a commit to the rohitg00's projects Team on Vercel.

A member of the Team first needs to authorize it.

coderabbitai Bot commented Jun 14, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ca44b4e9-ebf9-4404-8419-a0e5fb577843

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Development

Successfully merging this pull request may close these issues.

feat(budget): per-session token cap with soft warn + hard block