[upstream PR 934] feat: add per-session LLM token budget with hard cap and soft warn #380

@wbugitlab1

Description

Source: pull request 934 in rohitg00/agentmemory (URL omitted to avoid GitHub cross-reference)
Title: feat: add per-session LLM token budget with hard cap and soft warn
Author: harrykamboj1
State: open
Draft: yes
Merged: no
Head: harrykamboj1/agentmemory:feat/per-session-token-budget @ ea5b85d
Base: main @ f6f9e3c
Labels: (none)
Changed files: 0
Commits: 0
Created: 2026-06-14T10:35:06Z
Updated: 2026-06-14T10:35:15Z
Closed: (not closed)
Merged at: (not merged)

Original PR body:

Summary

  • Per-session token budget in KV (mem:session-budget), default cap 100k via AGENTMEMORY_SESSION_TOKEN_CAP
  • Enforcement at ResilientProvider: block when exhausted, record estimated tokens in finally
  • Soft warn at 80% (event::mem::budget::soft-warned); hard cap (event::mem::budget::exhausted)
  • Compress → synthetic fallback; summarize → truncated partial summary
  • GET /agentmemory/session/budget; agentmemory status shows active / near-cap / exhausted
  • OTEL histogram session.tokens_used on meter agentmemory
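
The threshold behaviour listed above can be sketched as follows. This is a minimal TypeScript illustration, not the PR's code: `SessionBudget`, `statusOf`, `recordTokens`, and `guardedCall` are hypothetical names, the provider call is synchronous for brevity, and firing each event only once per session is an assumption. Only the 100k default cap, the 80% soft-warn threshold, the two event names, and the active / near-cap / exhausted statuses come from the summary.

```typescript
// Hypothetical sketch: names and shapes are assumptions, not the PR's API.
type BudgetStatus = "active" | "near-cap" | "exhausted";

interface SessionBudget {
  cap: number;         // hard cap in tokens
  tokensUsed: number;  // running estimate of tokens consumed
  softWarned: boolean; // true once the soft-warn event has fired
}

const DEFAULT_CAP = 100_000; // default from the summary
const SOFT_WARN_RATIO = 0.8; // soft warn at 80% of the cap

function newBudget(cap: number = DEFAULT_CAP): SessionBudget {
  return { cap, tokensUsed: 0, softWarned: false };
}

function statusOf(b: SessionBudget): BudgetStatus {
  if (b.tokensUsed >= b.cap) return "exhausted";
  if (b.tokensUsed >= b.cap * SOFT_WARN_RATIO) return "near-cap";
  return "active";
}

// Adds the estimated tokens and returns the events that should be emitted.
function recordTokens(b: SessionBudget, tokens: number): string[] {
  const events: string[] = [];
  const wasExhausted = statusOf(b) === "exhausted";
  b.tokensUsed += tokens;
  const status = statusOf(b);
  if (status === "near-cap" && !b.softWarned) {
    b.softWarned = true;
    events.push("event::mem::budget::soft-warned");
  }
  if (status === "exhausted" && !wasExhausted) {
    events.push("event::mem::budget::exhausted");
  }
  return events;
}

// Blocks the call when the budget is exhausted; otherwise runs it and
// records the estimate in `finally`, so failed calls are billed as well.
function guardedCall<T>(
  b: SessionBudget,
  estimatedTokens: number,
  call: () => T,
  fallback: () => T,
  emit: (event: string) => void,
): T {
  if (statusOf(b) === "exhausted") return fallback();
  try {
    return call();
  } finally {
    for (const e of recordTokens(b, estimatedTokens)) emit(e);
  }
}
```

In this sketch the `fallback` argument stands in for the synthetic compress result or the truncated partial summary; how the PR actually builds those is not shown in the summary.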

Closes #767

Test plan

  • npm test -- test/session-budget.test.ts (15 tests)
  • npm test -- test/consistency.test.ts
  • Manual: AGENTMEMORY_AUTO_COMPRESS=true, low cap, verify synthetic fallback + status line

Spec alignment / open items

| Gap | Status | Notes |
| --- | --- | --- |
| KV shape (5 vs 11 fields) | Intentional superset | Spec minimum + ops fields (inputTokens, callCount, timestamps) |
| Function count (2 vs 4) | Open | record/reap are implementation; get duplicates REST, can drop if preferred |
| Reaper: cron vs setInterval | Follows repo pattern | Matches recent-searches-sweep; no cron triggers in repo today |
| ALS on observe / consolidate-pipeline | Deferred | Observe doesn't call provider; consolidate bills __system__ sentinel |
| Sentinel unknown vs __system__ | Open | Used __system__ for clarity; rename if spec is strict |
| Atomic increment via kv.update | Kept keyed lock | No increment op used elsewhere in codebase |
| Cost at display time vs record | Deferred | Flat costEstimate at record for v1 |
| OTEL export name | Verify | Instrument session.tokens_used on meter agentmemory |
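
For the "Atomic increment via kv.update" row, a keyed lock can be sketched as below. This is an assumed illustration of the general technique, not the repository's lock: `KeyedLock`, `addTokens`, and the in-memory `Map` standing in for the KV store are all hypothetical. The idea is that read-modify-write updates for the same session run one at a time, while different sessions do not block each other.

```typescript
// Hypothetical per-key async mutex; not taken from the agentmemory codebase.
class KeyedLock {
  // For each key, the promise that resolves when the latest holder releases.
  private tails = new Map<string, Promise<void>>();

  async run<T>(key: string, fn: () => Promise<T>): Promise<T> {
    const prev = this.tails.get(key) ?? Promise.resolve();
    let release!: () => void;
    const gate = new Promise<void>((resolve) => {
      release = resolve;
    });
    // The next caller for this key waits on `tail`, which resolves only
    // after every earlier holder has released.
    const tail = prev.then(() => gate);
    this.tails.set(key, tail);
    await prev;
    try {
      return await fn();
    } finally {
      release();
      // Drop the entry if nobody queued behind us, so the map stays small.
      if (this.tails.get(key) === tail) this.tails.delete(key);
    }
  }
}

// In-memory stand-in for the KV store (assumption for the example).
const kv = new Map<string, number>();
const lock = new KeyedLock();

async function addTokens(sessionId: string, tokens: number): Promise<number> {
  return lock.run(sessionId, async () => {
    const current = kv.get(sessionId) ?? 0;
    await Promise.resolve(); // yield, as a real asynchronous KV read would
    const next = current + tokens;
    kv.set(sessionId, next);
    return next;
  });
}
```

Without the lock, concurrent `addTokens` calls for one session would all read the same starting value and the last write would win; with it, the increments are applied in order.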

Local branch:
Fork PR:
Fork decision:
Verification:
Notes:

Metadata

Assignees

No one assigned

Labels

  • decision-candidate: Fork decision has not been made
  • upstream-draft: Upstream pull request is a draft
  • upstream-open: Upstream pull request is open
  • upstream-pr: Tracks an upstream pull request

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests
