[upstream PR 934] feat: add per-session LLM token budget with hard cap and soft warn

Source: pull request number 934 in rohitg00/agentmemory (URL omitted to avoid GitHub cross-reference)
Title: feat: add per-session LLM token budget with hard cap and soft warn
Author: harrykamboj1
State: open
Draft: yes
Merged: no
Head: harrykamboj1/agentmemory:feat/per-session-token-budget @ ea5b85d8546cd89d39b843ac1ae45fe1ca2309d2
Base: main @ f6f9e3cb1385da31f48036868dc3c7fe342b67ba
Labels: (none)
Changed files: 0
Commits: 0
Created: 2026-06-14T10:35:06Z
Updated: 2026-06-14T10:35:15Z
Closed: (not closed)
Merged at: (not merged)

Original PR body:

## Summary
- Per-session token budget in KV (`mem:session-budget`), default cap 100k via `AGENTMEMORY_SESSION_TOKEN_CAP`
- Enforcement at `ResilientProvider`: block when exhausted, record estimated tokens in `finally`
- Soft warn at 80% (`event::mem::budget::soft-warned`); hard cap (`event::mem::budget::exhausted`)
- Compress → synthetic fallback; summarize → truncated partial summary
- `GET /agentmemory/session/budget`; `agentmemory status` shows active / near-cap / exhausted
- OTEL histogram `session.tokens_used` on meter `agentmemory`

Closes #767
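
For reviewers, the soft-warn / hard-cap flow described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: only the 100k default cap, the 80% soft threshold, the two event names, and the `active` / `near-cap` / `exhausted` labels come from the summary. `SessionBudget`, `classify`, `callWithBudget`, the 4-characters-per-token estimate, and the in-memory budget object are hypothetical stand-ins for the KV record and `ResilientProvider`.

```typescript
// Hypothetical names throughout; see the note above for what is from the PR.
type BudgetStatus = "active" | "near-cap" | "exhausted";

interface SessionBudget {
  sessionId: string;
  tokensUsed: number;
  cap: number;
  softWarned: boolean;
}

const DEFAULT_CAP = 100_000; // default named in the summary
const SOFT_WARN_RATIO = 0.8; // soft warn at 80%

function newBudget(sessionId: string, cap: number = DEFAULT_CAP): SessionBudget {
  return { sessionId, tokensUsed: 0, cap, softWarned: false };
}

function classify(b: SessionBudget): BudgetStatus {
  if (b.tokensUsed >= b.cap) return "exhausted";
  if (b.tokensUsed >= b.cap * SOFT_WARN_RATIO) return "near-cap";
  return "active";
}

class BudgetExhaustedError extends Error {
  constructor(sessionId: string) {
    super(`token budget exhausted for session ${sessionId}`);
    this.name = "BudgetExhaustedError";
  }
}

// Assumption: a crude estimate of about 4 characters per token.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

async function callWithBudget(
  budget: SessionBudget,
  prompt: string,
  call: (prompt: string) => Promise<string>,
  emit: (event: string) => void = () => {},
): Promise<string> {
  // Hard cap: block before the provider is called.
  if (classify(budget) === "exhausted") {
    emit("event::mem::budget::exhausted");
    throw new BudgetExhaustedError(budget.sessionId);
  }
  let output = "";
  try {
    output = await call(prompt);
    return output;
  } finally {
    // Record in `finally` so a failed call is still billed for its prompt
    // (assumed here to count against the budget).
    budget.tokensUsed += estimateTokens(prompt) + estimateTokens(output);
    // Soft warn once, the first time usage reaches the 80% threshold.
    if (!budget.softWarned && classify(budget) !== "active") {
      budget.softWarned = true;
      emit("event::mem::budget::soft-warned");
    }
  }
}
```

In this sketch a caller that catches `BudgetExhaustedError` would be the place to substitute the synthetic compress fallback or the truncated partial summary. Whether a single call that jumps straight past 100% should also emit the soft-warn event is a design choice the sketch makes arbitrarily.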

## Test plan
- [x] `npm test -- test/session-budget.test.ts` (15 tests)
- [x] `npm test -- test/consistency.test.ts`
- [x] Manual: `AGENTMEMORY_AUTO_COMPRESS=true`, low cap, verify synthetic fallback + status line

## Spec alignment / open items
| Gap | Status | Notes |
|-----|--------|-------|
| KV shape (5 vs 11 fields) | Intentional superset | Spec minimum + ops fields (`inputTokens`, `callCount`, timestamps) |
| Function count (2 vs 4) | Open | `record`/`reap` are implementation; `get` duplicates REST — can drop if preferred |
| Reaper: cron vs setInterval | Follows repo pattern | Matches `recent-searches-sweep`; no cron triggers in repo today |
| ALS on observe / consolidate-pipeline | Deferred | Observe doesn't call provider; consolidate bills `__system__` sentinel |
| Sentinel `unknown` vs `__system__` | Open | Used `__system__` for clarity — rename if spec is strict |
| Atomic increment via kv.update | Kept keyed lock | No increment op used elsewhere in codebase |
| Cost at display time vs record | Deferred | Flat `costEstimate` at record for v1 |
| OTEL export name | Verify | Instrument `session.tokens_used` on meter `agentmemory` |
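
To make the "kept keyed lock" row concrete, the sketch below shows one common way to serialize a read-modify-write per session key by chaining promises. It is an assumption about the general technique, not the lock used in this PR: `KeyedLock`, `recordTokens`, and the `Map` standing in for the KV store are all hypothetical.

```typescript
// Hypothetical helper: runs async work for the same key one at a time.
class KeyedLock {
  private tails = new Map<string, Promise<void>>();

  async run<T>(key: string, fn: () => Promise<T>): Promise<T> {
    const prev = this.tails.get(key) ?? Promise.resolve();
    let release!: () => void;
    const gate = new Promise<void>((resolve) => {
      release = resolve;
    });
    // The next caller for this key waits until our gate opens.
    const tail = prev.then(() => gate);
    this.tails.set(key, tail);
    await prev;
    try {
      return await fn();
    } finally {
      release();
      // Drop the entry if nobody queued behind us.
      if (this.tails.get(key) === tail) this.tails.delete(key);
    }
  }
}

// In-memory stand-in for the KV store (assumption for illustration).
const kv = new Map<string, number>();
const lock = new KeyedLock();

async function recordTokens(sessionId: string, tokens: number): Promise<number> {
  return lock.run(sessionId, async () => {
    const current = kv.get(sessionId) ?? 0;
    await Promise.resolve(); // stands in for an async KV round trip
    const next = current + tokens;
    kv.set(sessionId, next);
    return next;
  });
}
```

Without the lock, concurrent callers could all read the same `current` value across the `await` and overwrite each other's increments. A native atomic increment would remove the need for the lock, which is the trade-off the table row leaves open.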



Local branch:
Fork PR:
Fork decision:
Verification:
Notes:
