Skip to content

fix(triage): defer instead of error on budget-exceeded (OPENHUMAN-TAURI-X)#1558

Open
CodeGhost21 wants to merge 3 commits into
tinyhumansai:mainfrom
CodeGhost21:fix/triage-budget-exceeded-defer
Open

fix(triage): defer instead of error on budget-exceeded (OPENHUMAN-TAURI-X)#1558
CodeGhost21 wants to merge 3 commits into
tinyhumansai:mainfrom
CodeGhost21:fix/triage-budget-exceeded-defer

Conversation

@CodeGhost21
Copy link
Copy Markdown
Contributor

@CodeGhost21 CodeGhost21 commented May 12, 2026

Summary

  • Fixes Sentry OPENHUMAN-TAURI-X[composio][triage] run_triage failed paging on every Composio trigger fire when the user is out of inference credits (108 occurrences over 5 days).
  • Adds an ArmError::BudgetExhausted variant so the triage evaluator treats budget/top-up/out-of-credits responses as user-actionable instead of fatal: skip the cloud retry (same wall), fall through to local, and if local is unavailable or also fails return TriageOutcome::Deferred (Ok, not Err). The composio bus subscriber therefore logs warn, never error — no Sentry page.
  • Pattern set mirrors is_inference_budget_exceeded_error in channels/providers/web.rs (budget exceeded / top up / add credits / out of credits / no remaining credits) and is evaluated inline to avoid a cross-domain dep on the web-channel module.

Root cause

The backend rejected the inference call with 400 Bad Request — Budget exceeded — add credits to continue. classify_error only recognises 429 and 5xx/timeout/connection patterns as retryable, so a 400 fell through to ArmError::Fatal. run_triage returned Err, and composio/bus.rs:307 logged it with tracing::error!, which Sentry's tracing layer ships as an error event. Every webhook fire that arrived while the user was out of credits paged.

Test plan

  • cargo test openhuman::agent::triage::evaluator — 18 passed (15 existing + 3 new)
  • classify_string_recognises_budget_exceeded_as_budget_exhausted — exact Sentry payload classifies as BudgetExhausted, not Fatal
  • classify_string_recognises_top_up_and_out_of_credits_as_budget_exhausted — sibling phrasings covered
  • cloud_budget_exhausted_skips_retry_and_falls_to_local — only 2 dispatches (no cloud retry); local produces the decision
  • cloud_budget_exhausted_without_local_returns_deferred_not_err — defers with budget-named reason; no cloud retry
  • cargo check clean; cargo fmt --check clean

Summary by CodeRabbit

  • Bug Fixes

    • Better handling of inference budget exhaustion: skips unnecessary cloud retries, records budget-exhausted causes, surfaces clearer deferral reasons (distinguishing cloud-budget + local unavailable vs other failures), and includes budget context in failure logs.
    • Budget-exceeded cases are now classified more accurately via a conservative matching heuristic.
  • Tests

    • Added regression and unit tests covering budget-exhaustion classification, retry behavior, and fallback/defer outcomes.

Review Change Stack

…RI-X)

When the OpenHuman backend rejected an inference call with
"400 Bad Request — Budget exceeded — add credits to continue",
classify_error fell through to ArmError::Fatal (it's neither 429
nor 5xx/transient). run_triage then returned Err, which the
composio bus subscriber logged with tracing::error!, paging Sentry
on every Composio trigger fire (108 occurrences in 5 days).

Budget exhaustion is user-actionable, not a code bug. Route it
through a new ArmError::BudgetExhausted variant that skips the
cloud retry (same wall) and falls through to the local arm; if
local is unavailable or also fails, return TriageOutcome::Deferred
(Ok, not Err) with a budget-named reason so the bus logs warn,
not error.
@CodeGhost21 CodeGhost21 requested a review from a team May 12, 2026 13:22
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 12, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 26202256-e030-4d96-9425-dffc16166cbc

📥 Commits

Reviewing files that changed from the base of the PR and between 6f7aab6 and 12802bd.

📒 Files selected for processing (2)
  • src/openhuman/agent/triage/evaluator.rs
  • src/openhuman/agent/triage/evaluator_tests.rs
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/openhuman/agent/triage/evaluator_tests.rs
  • src/openhuman/agent/triage/evaluator.rs

📝 Walkthrough

Walkthrough

Adds detection and an ArmError::BudgetExhausted variant, maps budget-related cloud errors to it, and updates run_triage_with_arms to skip cloud retry on budget exhaustion and fall back to local or produce budget-specific deferred reasons. Tests cover classification and triage flows.

Changes

Budget-Aware Triage Fallback

Layer / File(s) Summary
Budget Error Type and Detection
src/openhuman/agent/triage/evaluator.rs, src/openhuman/agent/triage/evaluator_tests.rs
ArmError::BudgetExhausted variant is introduced. classify_error now detects budget-related messages via is_inference_budget_exceeded and maps them to the new variant. Unit tests verify detection for a real Sentry payload and multiple "top up / out of credits" phrasing variants.
Triage Flow Integration with Budget Handling
src/openhuman/agent/triage/evaluator.rs, src/openhuman/agent/triage/evaluator_tests.rs
run_triage_with_arms tracks cloud_budget_exhausted, skips cloud retry when BudgetExhausted is seen, falls back to local immediately, and composes deferred reasons that include the budget cause when applicable. Integration tests assert cloud→local fallback, retry-then-budget cases, and budget-with-no-local deferral.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • tinyhumansai/openhuman#1367: Extended the tiered triage pipeline (cloud→retry→local→defer) that this PR builds upon with budget-aware flow control and BudgetExhausted handling.

Poem

🐰 When cloud credits dwindle low and sigh,
I sniff the message, give a little cry.
No retry hop — I bound to local ground,
A tidy fallback, nimble and sound.
Hooray for triage that knows when to try!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title accurately summarizes the main change: handling budget-exhausted errors by deferring instead of erroring, with the issue reference for context.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
src/openhuman/agent/triage/evaluator_tests.rs (1)

367-450: ⚡ Quick win

Cover the retry-then-budget branch too.

These additions cover budget exhaustion on the first cloud call and with no local arm, but the separate branch where attempt #1 is retryable and attempt #2 returns BudgetExhausted is still unexercised. A small counter-based regression test for that sequence would lock down the new CloudAfterRetry control flow.

As per coding guidelines, achieve ≥ 80% coverage on changed lines; PRs must pass diff-cover check via cargo-llvm-cov.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/openhuman/agent/triage/evaluator_tests.rs` around lines 367 - 450, Add a
new tokio test (e.g., cloud_budget_exhausted_after_retry_skips_local) that uses
mock_agent_run_turn with a counter: on first call return a retryable/transient
Err (so the triage will schedule a cloud retry), on second call return the
Budget exceeded error string used elsewhere; then call
run_triage_with_arms(cloud_arm(), Some(local_arm()), &envelope()) and assert the
outcome is a Decision with TriageResolutionPath::LocalFallback and
run.used_local true, and assert the counter equals 2 to ensure we hit cloud
twice (first transient then budget) but do not perform a third cloud retry;
reference the same helpers used in existing tests (mock_agent_run_turn,
run_triage_with_arms, cloud_arm, local_arm, envelope,
TriageResolutionPath::LocalFallback) and mirror the assertion style from
cloud_budget_exhausted_skips_retry_and_falls_to_local.
src/openhuman/agent/triage/evaluator.rs (1)

175-208: ⚡ Quick win

Add correlation fields to the new budget/defer warnings.

These warn paths are now the main breadcrumb for budget-exhausted triage, but they only emit the message text. Please include stable identifiers from envelope plus the active arm/path here so an operator can tie the warning back to the triggering event without correlating adjacent logs manually.

As per coding guidelines, use log / tracing with stable grep-friendly prefixes and correlation fields.

Also applies to: 253-257

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/openhuman/agent/triage/evaluator.rs` around lines 175 - 208, Update the
two tracing::warn! calls that handle Err(ArmError::BudgetExhausted(...)) (the
one at the initial cloud attempt and the one inside the retry match) to include
stable correlation fields from the envelope and the active resolution path; for
example add envelope identifiers (e.g. envelope.id or envelope.request_id — use
the actual stable id field name on the Envelope struct) and the path string
(e.g. TriageResolutionPath::Cloud and TriageResolutionPath::CloudAfterRetry or a
derived &str) as structured fields to the warn! macro so operators can grep/
correlate these budget-exhausted warnings to the triggering event and path.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/openhuman/agent/triage/evaluator_tests.rs`:
- Around line 113-121: The test is moving the non-Copy ArmError twice by using
matches!(err, ...) and then matching err in the panic message; update the
assertion to avoid moving err by matching on a reference (e.g., use
matches!(&err, ArmError::BudgetExhausted(_)) and adjust the panic match to match
&err with patterns like ArmError::Retryable { .. } => "Retryable", etc.), or
alternatively compute a label string from err first (let label = match &err {
... }) and then call assert!(matches!(&err, ArmError::BudgetExhausted(_)),
"...", label = label) so err is not moved twice; apply this change around the
classify_error call and the assert block.

---

Nitpick comments:
In `@src/openhuman/agent/triage/evaluator_tests.rs`:
- Around line 367-450: Add a new tokio test (e.g.,
cloud_budget_exhausted_after_retry_skips_local) that uses mock_agent_run_turn
with a counter: on first call return a retryable/transient Err (so the triage
will schedule a cloud retry), on second call return the Budget exceeded error
string used elsewhere; then call run_triage_with_arms(cloud_arm(),
Some(local_arm()), &envelope()) and assert the outcome is a Decision with
TriageResolutionPath::LocalFallback and run.used_local true, and assert the
counter equals 2 to ensure we hit cloud twice (first transient then budget) but
do not perform a third cloud retry; reference the same helpers used in existing
tests (mock_agent_run_turn, run_triage_with_arms, cloud_arm, local_arm,
envelope, TriageResolutionPath::LocalFallback) and mirror the assertion style
from cloud_budget_exhausted_skips_retry_and_falls_to_local.

In `@src/openhuman/agent/triage/evaluator.rs`:
- Around line 175-208: Update the two tracing::warn! calls that handle
Err(ArmError::BudgetExhausted(...)) (the one at the initial cloud attempt and
the one inside the retry match) to include stable correlation fields from the
envelope and the active resolution path; for example add envelope identifiers
(e.g. envelope.id or envelope.request_id — use the actual stable id field name
on the Envelope struct) and the path string (e.g. TriageResolutionPath::Cloud
and TriageResolutionPath::CloudAfterRetry or a derived &str) as structured
fields to the warn! macro so operators can grep/ correlate these
budget-exhausted warnings to the triggering event and path.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 300215d0-23b7-4fac-9958-b8081e96a775

📥 Commits

Reviewing files that changed from the base of the PR and between 15c7442 and 293182b.

📒 Files selected for processing (2)
  • src/openhuman/agent/triage/evaluator.rs
  • src/openhuman/agent/triage/evaluator_tests.rs

Comment thread src/openhuman/agent/triage/evaluator_tests.rs Outdated
- Avoid double-move of `err` in the top_up/out_of_credits classify
  test by matching on `&err` and pre-computing a label string.
- Add `cloud_budget_exhausted_on_retry_falls_through_to_local`
  to cover the path where cloud is transient on attempt tinyhumansai#1 and
  budget-exceeded on the retry — must skip a third cloud call
  and dispatch to local.
- Carry envelope correlation fields (source / display_label /
  external_id) and the active resolution path on the two
  budget-exhausted warn! sites so operators can grep these
  back to the triggering event.
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/openhuman/agent/triage/evaluator.rs`:
- Around line 233-240: The Deferred branch currently injects raw upstream error
text into TriageOutcome::Deferred.reason (see TriageOutcome::Deferred, variable
reason and match on cloud_budget_exhausted), violating the safe-to-log contract;
change reason to a stable, scrubbed message (e.g. "cloud retry exhausted; local
arm unavailable" or "cloud budget exhausted; local arm unavailable") without
interpolating err/cloud_err, and emit the original error details separately at
debug/trace level (e.g., tracing::debug! or log::debug!) with a fixed prefix
like "[rpc]" or "[triage]" and any correlation IDs; apply the same fix to the
other similar branch around the 255–268 region so no raw upstream error text
flows into Deferred.reason while operator details remain available in debug
logs.
- Around line 463-480: The phrase matching in is_inference_budget_exceeded
currently uses normalized.contains(...) which can match substrings inside words
(e.g., "stop updating" matching "top up"); change it to operate on token
windows: after computing normalized and splitting into Vec<String> words, for
each needle (split into needle_tokens) slide a window of the same length over
the normalized words and require an exact token-by-token match (contiguous
words) before returning true. Update the logic that iterates the needles (the
array of strings like "budget exceeded", "top up", etc.) to perform this
token-window containment check instead of using contains so only
whole-word/whole-phrase matches trigger BudgetExhausted.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: e6b61012-47ce-4e85-ac65-6cad406c2e9f

📥 Commits

Reviewing files that changed from the base of the PR and between 293182b and 6f7aab6.

📒 Files selected for processing (2)
  • src/openhuman/agent/triage/evaluator.rs
  • src/openhuman/agent/triage/evaluator_tests.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/openhuman/agent/triage/evaluator_tests.rs

Comment thread src/openhuman/agent/triage/evaluator.rs Outdated
Comment thread src/openhuman/agent/triage/evaluator.rs Outdated
Copy link
Copy Markdown
Contributor

@graycyrus graycyrus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review

Solid fix for a real production issue — the BudgetExhausted variant is well-designed, the skip-retry logic is correct, and the test coverage is thorough (5 new tests covering all budget paths including the retry-then-budget sequence). Correlation fields on the warn sites are a nice touch.

One actionable item below on the Deferred.reason field. The contains() matching in is_inference_budget_exceeded is actually safe — the normalization + whitespace collapse makes it effectively word-boundary matching (CodeRabbit's "stop updating" example doesn't actually match "top up" after normalization).

let reason = match cloud_budget_exhausted {
Some(err) => format!("cloud budget exhausted; local arm unavailable: {err}"),
None => "cloud retry exhausted; local arm unavailable".to_string(),
};
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[major] Raw upstream error text leaks into Deferred.reason

This field gets logged via warn! on line 261 and may be surfaced in RPC responses. The pre-PR pattern uses stable, scrubbed copy ("cloud retry exhausted; local arm unavailable"). Splicing raw err in breaks the safe-to-log contract and could carry PII or sensitive upstream text.

Same issue at lines 255-258 where cloud_err and err are interpolated.

Suggestion — keep reason scrubbed, log the raw error separately at debug:

let reason = match cloud_budget_exhausted {
    Some(_) => "cloud budget exhausted; local arm unavailable".to_string(),
    None => "cloud retry exhausted; local arm unavailable".to_string(),
};

And for the local-also-failed path:

let reason = match cloud_budget_exhausted {
    Some(_) => "cloud budget exhausted; local arm failed".to_string(),
    None => "cloud + local both failed".to_string(),
};
tracing::debug!(
    cloud_err = ?cloud_budget_exhausted,
    local_err = %err,
    "[triage::evaluator] deferred — raw error details"
);

- Deferred.reason is documented as scrubbed/safe-to-log but the
  no-local and both-failed branches were interpolating raw upstream
  `err` / `cloud_err` text. Replace with stable strings ("cloud budget
  exhausted; local arm unavailable", etc.) and route the original
  errors to debug/warn logs with the [triage::evaluator] target plus
  envelope correlation fields.
- is_inference_budget_exceeded used String::contains, which fires on
  cross-word substrings (e.g. "stop updating" matches "top up").
  Switch to token-window matching against the normalized word list so
  only whole-word/whole-phrase needles trigger BudgetExhausted.
- Add regression test covering the substring false-positive.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants