Skip to content

Enforce single-agent tenancy to prevent LRU session-state cross-contamination#176

Merged
therandomsecurityguy merged 10 commits into
mainfrom
dac/lru-single-tenant-fix
Jun 27, 2026
Merged

Enforce single-agent tenancy to prevent LRU session-state cross-contamination#176
therandomsecurityguy merged 10 commits into
mainfrom
dac/lru-single-tenant-fix

Conversation

@therandomsecurityguy

@therandomsecurityguy therandomsecurityguy commented Jun 19, 2026

Copy link
Copy Markdown
Contributor

Summary

Enforces single-agent tenancy in code to prevent session-state cross-contamination in the LRU cache.

Changes

  1. Added config section with as default
  2. **Enforce single-agent tenancy in **: track first observed , deny requests with different
  3. Added deny reason to enum in firma-core
  4. Updated all test files to pass parameter

Behavior

  • On first validated request, the from_the token is stored
  • Any subsequent request with a different is denied at Stage 1 with
  • This prevents LRU session-state cross-contamination if someone configures a shared long-lived Sidecar serving multiple agents

Testing

All 516 firma-sidecar tests pass, plus full workspace test suite passes.

@therandomsecurityguy therandomsecurityguy requested a review from a team June 19, 2026 11:26
revocation: Arc<dyn RevocationStore + Send + Sync>,
verifier: Box<dyn TokenVerifier + Send + Sync>,
tenancy_mode: TenancyMode,
first_agent_id: Mutex<Option<AgentId>>,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is a "set once, never change" value, let's use https://doc.rust-lang.org/std/sync/struct.OnceLock.html rather than a Mutex.

@LukeMathWalker LukeMathWalker Jun 21, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could also store the OnceLock inside TenancyMode::SingleAgent, if we prefer.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to OnceLock

Comment on lines +145 to +146
envelope: None,
identity: None,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason we don't capture the envelope (envelope: Some(envelope.clone()))? Same for the identity (identity: Some(DenyIdentity::from_claims(&claims)) should do it, no?).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The TenantMismatch deny now carries envelope: Some(envelope.clone()) and identity: Some(DenyIdentity::from_claims(&claims)), matching the convention used in the rest of the enforcement code.

Comment on lines 75 to 81
pub fn new(
capability_map: CapabilityMap,
verifier: Box<dyn TokenVerifier + Send + Sync>,
revocation: Arc<dyn RevocationStore + Send + Sync>,
clock_skew_tolerance: Duration,
tenancy_mode: TenancyMode,
) -> Self {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we have de-facto only one tenancy mode at the moment, we could also consider renaming this from new to single_tenant. The explicit parameter doesn't buy us much anyway.

(Feel free to keep it as is, not a problem)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll keep it for now because of some of the refactor but updated it when we go to multi.

let claims = self.validate(&entry.raw_token)?;

// Step 3: Enforce single-agent tenancy (V1 ADR §2)
if self.tenancy_mode == TenancyMode::SingleAgent {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we add a test for this scenario?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated

/// # Errors
///
/// Returns an error if the configuration is invalid.
pub fn validate(&self) -> Result<(), String> {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we add this later?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed.

Comment thread crates/firma-sidecar/src/pipeline.rs Outdated
/// Build a pipeline where Stage 1 denies every request: the
/// `CapabilityMap` is empty, so token selection fails before any
/// runtime-state bookkeeping.
#[expect(dead_code, reason = "test helper not currently used")]

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder why we cannot run this test.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The helper was written for the "Stage 1 DENY short-circuits before session counter" invariant. The test that motivates it was never written. Fixed it.

/// serves one agent, preventing session-state cross-contamination in the LRU.
#[derive(Debug, Clone, Default, Deserialize, PartialEq, Eq)]
#[serde(rename_all = "snake_case")]
pub enum TenancyMode {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a blocker from my side, but should we wait until we have the other variants before introducing this configuration? We could add the enforcement now, but avoid making it configurable until we have at least a second option.

If we decide to expose it already, should we also update the documentation to explain the purpose and expected behavior of this new field for users?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the rustdoc

@codecov

codecov Bot commented Jun 27, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 97.70992% with 3 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
...a-sidecar/src/enforcement/capability_validation.rs 97.91% 2 Missing ⚠️
crates/firma-sidecar/src/startup/pipeline.rs 0.00% 1 Missing ⚠️

📢 Thoughts on this report? Let us know!

…mination

- Add [tenancy] mode = "single_agent" as default config
- Track first observed agent_id in CapabilityValidator
- Deny requests with different agent_id with TenantMismatch reason
- Add TenantMismatch to DenyReason enum in firma-core
- Update all test files to pass TenancyMode parameter
Replaces Mutex<Option<AgentId>> with OnceLock<AgentId> since the value
is set on first observation and never changes. get_or_init combines
initialization and comparison in a single atomic operation, removing
the lock/unwrap and the associated poison-panic surface.
The deny branch had envelope: None and identity: None, inconsistent with
the convention across constraint_enforcement.rs and pipeline.rs where
both are attached when available. Here both are in scope:

- envelope: &NormalizedEnvelope is a parameter to enforce()
- claims is the freshly verified CapabilityClaims from validate()

Attaching identity is especially important for audit: TenantMismatch
denies must be filterable via 'firma monitor --agent <id>' (FIR-208),
and the presenting agent is what should be attributable.
Adds MutableVerifier — a clone-able TokenVerifier whose returned claims
can be swapped mid-test. This lets a single CapabilityValidator instance
be exercised across multiple enforce() calls, which is required to
observe the OnceLock behavior (first call establishes the agent_id,
subsequent calls compare against it).

The test asserts:
- first call with agent_first populates the OnceLock and succeeds
- second call with agent_second on the same validator denies with
  DenyReason::TenantMismatch at the TokenValidation stage
- the deny carries the verified identity (agent_id == agent_second)
  and the envelope, matching the convention in constraint_enforcement.rs
The method only returns Ok(()) and is the only such stub among the
config types. Other configs (audit, local_exec, etc.) have real
validation; tenancy currently has nothing to check, so the call site
in Config::validate just adds noise. Re-add when a future tenancy
mode actually requires validation.
The previous rustdoc was a single sentence that referenced a
non-existent 'V1 ADR §2' — the only explanation for why this field
exists, what it does at runtime, or how to configure it lived in the
pipeline.rs comment about the LRU counter.

Expands the TenancyMode rustdoc to cover:
- what 'tenancy' means in this context (one-sidecar-one-agent)
- the concrete threat it prevents (agent A's LRU counters visible to
  agent B's Cedar evaluation)
- runtime behavior: first request binds, mismatches deny with
  TenantMismatch at TokenValidation, restart resets the bound
- TOML configuration example
- removes the dangling ADR reference

Also adds a module-level overview so the file is self-explanatory.
@therandomsecurityguy therandomsecurityguy force-pushed the dac/lru-single-tenant-fix branch from 3566bdf to bcd37f3 Compare June 27, 2026 12:13
@therandomsecurityguy therandomsecurityguy merged commit 1763bcb into main Jun 27, 2026
15 checks passed
@therandomsecurityguy therandomsecurityguy deleted the dac/lru-single-tenant-fix branch June 27, 2026 12:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants