Merged
98 changes: 98 additions & 0 deletions docs/design/tool-qualification-dossier.md
@@ -0,0 +1,98 @@
# Tool qualification dossier — rivet (ISO 26262-8 §11.4.7)

**Status:** draft (TCL workstream A4)
**Audience:** safety leads, certification authorities, OEM tool-qualification reviewers
**Last revised:** 2026-05-16
**Companion:** `docs/design/tool-confidence-level.md` (the *why*), `safety/stpa/tool-qualification.yaml` (the hazard analysis), `safety/tool-qualification/rivet-tool-confidence.yaml` (the typed claim)

This dossier collects the qualification argument for **rivet** as a development tool used in safety-critical projects. It is the prose layer of a three-part artefact set:

- The **STPA** identifies tool-qualification hazards.
- The typed **`tool-confidence`** artifact (`TQ-CONF-RIVET`) carries the machine-readable claim.
- This **dossier** explains the reasoning behind that claim and lists the evidence the auditor should expect to see.

## 1. Claim summary

Rivet self-claims **TCL1** under ISO 26262-8 (Tool Impact = TI2, Tool error Detection = TD1). The same claim re-expressed in adjacent regimes:

| Regime | Claim | Notes |
|---|---|---|
| ISO 26262 (automotive) | **TCL1** (TI2 / TD1) | Numbering per Table 3 — TCL1 = lowest demand, TCL3 = highest. |
| DO-330 / DO-178C (aviation) | **TQL-4** with TQL-3 path | Tool Type 2 (automates verification). |
| EN 50128 (rail) | **T2** | Tool used to produce evidence; review compensates. |
| IEC 61508 / IEC 62304 | offline support tool, T3 | The detection-machinery argument carries across regimes. |

The claim is **self-claimed** today. Independent assessment is in scope for v1.0.
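
The ISO 26262 row follows from the standard's TI/TD lookup. A minimal sketch of that lookup, assuming the usual reading of ISO 26262-8 Table 3 (TI1 always yields TCL1; under TI2 the TCL number tracks the TD number). The type and function names are illustrative and are not part of rivet:

```rust
/// Tool impact class (illustrative type, not a rivet API).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ToolImpact {
    Ti1,
    Ti2,
}

/// Tool error detection class (illustrative type, not a rivet API).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ToolErrorDetection {
    Td1,
    Td2,
    Td3,
}

/// Tool confidence level; TCL1 is the lowest demand, TCL3 the highest.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Tcl {
    Tcl1,
    Tcl2,
    Tcl3,
}

/// Look up the TCL for a (TI, TD) pair.
fn tool_confidence_level(ti: ToolImpact, td: ToolErrorDetection) -> Tcl {
    match (ti, td) {
        // A tool with no impact on the safety case needs no further confidence.
        (ToolImpact::Ti1, _) => Tcl::Tcl1,
        (ToolImpact::Ti2, ToolErrorDetection::Td1) => Tcl::Tcl1,
        (ToolImpact::Ti2, ToolErrorDetection::Td2) => Tcl::Tcl2,
        (ToolImpact::Ti2, ToolErrorDetection::Td3) => Tcl::Tcl3,
    }
}
```

Calling `tool_confidence_level(ToolImpact::Ti2, ToolErrorDetection::Td1)` reproduces the TCL1 claim in the table above.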

## 2. Tool impact (TI) — why TI2

Rivet **can** affect the safety case: it emits PASS/FAIL on traceability invariants that an auditor reads as compliance evidence. A false PASS would not necessarily violate a safety requirement, but it can *prevent the detection* of a violation. Per ISO 26262-8 §11.4.5.3, that places rivet at **TI2**.

The STPA (`safety/stpa/tool-qualification.yaml`) enumerates the seven concrete hazards that the TI2 classification covers: false PASS on traceability, evidence divergence on export, migration corruption, s-expression bugs, variant scoping bugs, hook bypass, quantifier scope errors.

## 3. Tool error Detection (TD) — why TD1

TD1 means **high confidence** that errors in the tool are prevented or caught. Rivet's TD argument rests on five mechanically-enforced layers, every one of which is exercised in CI on every push:

1. **Validate** — link-graph and traceability-rule checks. This layer is pre-existing and mature, with bounded model-checking proofs (Kani) for the soundness side and proptest property suites for the completeness side.
2. **Oracles** — declarative `agent-pipelines:` blocks gated by mechanical checks (cited-source freshness, schema-conformance, docs-check invariants).
3. **Mutation testing** — mutation runs (reported under `mutants.out/`) expose tests in rivet-core and rivet-cli that still pass when a fault is injected.
4. **Formal proofs** — Kani (bounded model checking, 2000+ proofs across rivet-core), Verus (selected modules), Rocq (verification theorems in `proofs/rocq/`).
5. **`ai-found-defect` triage loop** — every defect caught by the above layers (and especially every defect introduced by AI authoring) becomes a typed artefact, links back to its `ai-session`, and gates release on triage state. This is the layer that compensates for the eroded human-review assumption when the upstream author is an AI assistant.

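The five layers target different defect classes. To the extent that their misses are statistically independent, the residual-error probability is the *product* of the individual miss rates, not the sum, so it is no larger than the miss rate of the strongest single layer. That is the operational basis for the TD1 claim.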
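
The product argument can be made concrete with a small calculation. This is a sketch under the independence assumption stated above; the function name and the miss rates used with it are purely illustrative and are not measured values for rivet:

```rust
/// Probability that a defect escapes every detection layer, assuming each
/// layer misses independently with the given probability.
///
/// Each entry of `miss_rates` is the chance (0.0..=1.0) that one layer fails
/// to catch a defect. An empty slice yields 1.0 (no layers, nothing caught).
fn residual_miss_probability(miss_rates: &[f64]) -> f64 {
    miss_rates.iter().product()
}
```

With five hypothetical layers that each miss 10% of defects, `residual_miss_probability(&[0.1; 5])` is about `1e-5`. If the layers were correlated (for example, all blind to the same defect class), the true residual would be higher than this product.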

## 4. Scope of the claim

**In scope (qualified for compliance use):**

- `rivet validate` — link-graph, traceability rules, schema-field checks, s-expression evaluator.
- `rivet commits` — git-trailer audit (CI gate).
- `rivet coverage` — single-org and supplier 3-state coverage (#286).
- `rivet supplier list`, `rivet supplier check` — read-only boundary reporting.
- `rivet stats --qualification` — configuration baseline manifest emission.
- `rivet --qualification-mode` — disables features outside the qualified set.

**Out of scope (NOT qualified by this claim):**

- `rivet sync`, `rivet supplier pull` (Phase 2 federation — qualified separately when shipped).
- `rivet migrate` (importers — pre-Phase 2, semantic distortion possible).
- `rivet serve` (read-only web UI — not part of the toolchain output).
- MCP write tools that bypass validate (`--qualification-mode` disables these).

The scope split lives in the typed artifact `TQ-CONF-RIVET.fields.scope` and is machine-readable by `rivet stats --qualification`.
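
As an illustration of what a machine-readable scope split allows, the sketch below hard-codes the two lists from this section and classifies a subcommand name. The real split lives in `TQ-CONF-RIVET.fields.scope`, whose format is not shown here; the `Scope` enum, the constants, and `classify` are hypothetical names:

```rust
/// Qualification scope of a rivet subcommand, per the lists in this section.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Scope {
    InScope,
    OutOfScope,
    /// Not mentioned in either list.
    Unclassified,
}

// Hypothetical hard-coded copies of the in-scope and out-of-scope lists.
const IN_SCOPE: &[&str] = &[
    "validate",
    "commits",
    "coverage",
    "supplier list",
    "supplier check",
    "stats --qualification",
];
const OUT_OF_SCOPE: &[&str] = &["sync", "supplier pull", "migrate", "serve"];

/// Classify a subcommand by exact name match against the two lists.
fn classify(subcommand: &str) -> Scope {
    if IN_SCOPE.contains(&subcommand) {
        Scope::InScope
    } else if OUT_OF_SCOPE.contains(&subcommand) {
        Scope::OutOfScope
    } else {
        Scope::Unclassified
    }
}
```

Keeping `Unclassified` as a distinct outcome makes a command that appears in neither list visible instead of silently defaulting to one side.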

## 5. Evidence index

The auditor should expect to see all of the following on a release-tagged commit:

| Layer | Evidence file / command | What it shows |
|---|---|---|
| TI / TD claim | `safety/tool-qualification/rivet-tool-confidence.yaml` | The typed claim. |
| STPA | `safety/stpa/tool-qualification.yaml` | Hazard analysis behind the claim. |
| Validate proofs | `rivet-core/src/proofs.rs` (Kani) | Bounded-MC soundness for link-graph and validate. |
| Property tests | `rivet-core/tests/proptest_*.rs` | Completeness side of validate. |
| Mutation tests | `mutants.out/` artefact in CI | Coverage gap finder. |
| Oracle definitions | `agent-pipelines:` in each schema | Cited-source freshness, docs-check, etc. |
| AI-defect log | All `ai-found-defect` artifacts in the project | The TD1 loop. |
| Configuration baseline | `rivet stats --qualification` output | Binary version, schemas in use, `tool-confidence` claims, `ai-found-defect` triage summary. |

## 6. AI-in-the-loop angle

This dossier exists because rivet's intended deployment is **as part of an AI-authored development workflow**. When a human writes code, the conventional TD argument relies on human review as the detection layer. When an AI writes code, that layer is weakened. The five-layer mechanical detection above replaces the human-review-only argument with mechanical evidence that survives the erosion. The `ai-found-defect` artifact is the operational primitive that makes "AI made an error, the tool reflected it back" auditable rather than aspirational.

This is also why rivet's scope-out list is conservative: any feature that doesn't have a TD layer covering it (`rivet migrate`, MCP write tools) is excluded from the claim until it does.
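
The release gate on triage state can be sketched as a filter over defect records. This is a simplified model, not rivet's implementation: the `Defect` struct and the defect IDs are hypothetical, and treating only the literal status `open` as blocking is an assumption:

```rust
/// Simplified stand-in for an `ai-found-defect` artifact (hypothetical type).
struct Defect<'a> {
    id: &'a str,
    /// Value of the `triage-status` field, if the artifact sets one.
    triage_status: Option<&'a str>,
}

/// Return the IDs of defects that would block a release.
///
/// Assumption: only `triage-status: open` blocks; a missing status does not.
fn blocking_defects<'a>(defects: &[Defect<'a>]) -> Vec<&'a str> {
    defects
        .iter()
        .filter(|d| d.triage_status == Some("open"))
        .map(|d| d.id)
        .collect()
}
```

A release script would fail when the returned list is non-empty. A stricter policy could also treat a missing `triage-status` as blocking; that choice is left open here.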

## 7. Known limitations

- **Self-claimed status.** No independent assessment yet. The next milestone is an external review against a customer pilot.
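- **Numbering convention spillover.** ISO 26262 numbers TCLs so that TCL1 is the lowest demand, whereas DO-330 numbers TQLs so that TQL-1 is the most rigorous. The DO-330 cross-walk is documented, but every dossier consumer should re-check the regime field on the typed artifact before quoting a number externally.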
- **Coverage of `rivet supplier pull` (Phase 2).** Not yet shipped, not yet qualified. The Phase 2 work in #288 will add federation provenance and extend the dossier accordingly.

## 8. Renewal

This dossier is re-evaluated on every release. Specifically: the `tool-confidence` artifact's `claim-status` must be re-confirmed; any new top-level command added in the release window must either land in the in-scope list or be explicitly added to the out-of-scope list with rationale.
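
The renewal rule for new commands amounts to a completeness check: every top-level command must appear in one of the two scope lists. A minimal sketch, assuming the command names and both lists are available as string slices; the function name is hypothetical and nothing here is rivet's actual gate:

```rust
use std::collections::BTreeSet;

/// Return the commands that appear in neither scope list, in input order.
fn unclassified_commands<'a>(
    all_commands: &[&'a str],
    in_scope: &[&str],
    out_of_scope: &[&str],
) -> Vec<&'a str> {
    // Union of both lists: everything that already has a scope decision.
    let mut classified: BTreeSet<&str> = BTreeSet::new();
    classified.extend(in_scope.iter().copied());
    classified.extend(out_of_scope.iter().copied());

    all_commands
        .iter()
        .copied()
        .filter(|c| !classified.contains(c))
        .collect()
}
```

A release check would treat a non-empty result as a failure until each listed command is given a scope decision and a rationale.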

---

Refs: ISO 26262-8:2018 §11.4.5, §11.4.7 Table 3; DO-330:2011 §1.4, §11; `docs/design/tool-confidence-level.md`; `docs/design/iso26262-artifact-mapping.md` §C row 32.
9 changes: 9 additions & 0 deletions rivet-cli/src/docs.rs
@@ -1,3 +1,3 @@
//! `rivet docs` — built-in searchable documentation.
//!
//! All documentation is embedded in the binary. Topics are searchable
@@ -278,8 +278,17 @@
        category: "Schemas",
        content: embedded::SCHEMA_VV_COVERAGE,
    },
    DocTopic {
        slug: "tool-qualification",
        title: "Tool qualification dossier — rivet (ISO 26262-8 §11.4.7)",
        category: "Reference",
        content: TOOL_QUALIFICATION_DOC,
    },
];

const TOOL_QUALIFICATION_DOC: &str =
    include_str!("../../docs/design/tool-qualification-dossier.md");

/// Return all registered topic slugs in declaration order.
///
/// Used by the subcommand-coverage gate to cross-reference clap subcommand
141 changes: 140 additions & 1 deletion rivet-cli/src/main.rs
@@ -205,6 +205,18 @@ struct Cli {
    #[arg(short, long, action = clap::ArgAction::Count)]
    verbose: u8,

    /// Restrict rivet to the qualified-for-safety-use feature set
    /// (TCL design A5).
    ///
    /// When set (or when `RIVET_QUALIFICATION_MODE=1`), rivet refuses
    /// to run subcommands that are out-of-scope for the typed
    /// `tool-confidence` claim — see `rivet docs tool-qualification`
    /// for the in-scope / out-of-scope split. Read-only commands are
    /// always allowed. The flag applies to the current invocation
    /// only; it does NOT alter on-disk config.
    #[arg(long)]
    qualification_mode: bool,

    #[command(subcommand)]
    command: Command,
}
@@ -388,6 +400,16 @@ enum Command {
        /// Scope statistics to a named baseline (cumulative)
        #[arg(long)]
        baseline: Option<String>,

        /// Emit a tool-qualification baseline manifest (TCL design A5).
        ///
        /// JSON-only. Lists the rivet version, every `tool-confidence`
        /// artifact in the project, the names of the schemas in use,
        /// and a summary of the `ai-found-defect` artifacts by severity
        /// and triage status. The output is the snapshot a customer's
        /// safety manager pastes into the dossier.
        #[arg(long)]
        qualification: bool,
    },

    /// Show traceability coverage report
@@ -1706,6 +1728,21 @@ fn run(cli: Cli) -> Result<bool> {
        return cmd_mcp(&cli, *list_tools, *probe, format);
    }

    // --qualification-mode gate (TCL design A5). Refuse top-level
    // subcommands that are not in the typed tool-confidence claim's
    // in-scope set (see docs/design/tool-qualification-dossier.md §4).
    // Read-only commands and the qualification stats command stay
    // allowed; this list deliberately starts narrow.
    if cli.qualification_mode {
        if let Command::Sync { .. } = &cli.command {
            anyhow::bail!(
                "--qualification-mode: refusing `sync` — Phase 2 federation \
                 not yet qualified. See `rivet docs tool-qualification` for \
                 the in/out-of-scope split."
            );
        }
    }

    match &cli.command {
        Command::Init { .. }
        | Command::Docs { .. }
@@ -1763,7 +1800,14 @@ fn run(cli: Cli) -> Result<bool> {
             filter,
             format,
             baseline,
-        } => cmd_stats(&cli, filter.as_deref(), format, baseline.as_deref()),
+            qualification,
+        } => {
+            if *qualification {
+                cmd_stats_qualification(&cli)
+            } else {
+                cmd_stats(&cli, filter.as_deref(), format, baseline.as_deref())
+            }
+        }
         Command::Coverage {
             filter,
             format,
@@ -5333,6 +5377,99 @@ fn compute_stats(store: &Store, graph: &LinkGraph) -> StatsResult {
    }
}

/// `rivet stats --qualification` — emit a configuration baseline
/// manifest for the tool-qualification dossier (TCL design A5).
///
/// JSON-only. The output is what a customer's safety manager pastes
/// into the dossier evidence section. It captures:
/// - The rivet binary version (so the qualified-version stamp is
///   explicit).
/// - Every `tool-confidence` artifact in the project, with its claim
///   fields.
/// - Every `ai-found-defect` artifact, summarised as severity and
///   triage-status counts plus the IDs of open defects (the
///   operational TD1 evidence).
/// - The names of the schemas in use, as configured for the project.
///   Schema content hashes are not part of this output.
fn cmd_stats_qualification(cli: &Cli) -> Result<bool> {
    let ctx = ProjectContext::load(cli)?;

    let tool_confidence: Vec<serde_json::Value> = ctx
        .store
        .iter()
        .filter(|a| a.artifact_type == "tool-confidence")
        .map(|a| {
            serde_json::json!({
                "id": a.id,
                "title": a.title,
                "status": a.status,
                "tool_id": a.fields.get("tool-id").and_then(|v| v.as_str()),
                "ti": a.fields.get("ti").and_then(|v| v.as_str()),
                "td": a.fields.get("td").and_then(|v| v.as_str()),
                "tcl": a.fields.get("tcl").and_then(|v| v.as_str()),
                "regime": a.fields.get("regime").and_then(|v| v.as_str()),
                "claim_status": a.fields.get("claim-status").and_then(|v| v.as_str()),
            })
        })
        .collect();

    // ai-found-defect summary — by severity and by triage-status.
    // Artifacts missing either field are counted under "unknown".
    let defects: Vec<_> = ctx
        .store
        .iter()
        .filter(|a| a.artifact_type == "ai-found-defect")
        .collect();
    let mut by_severity: std::collections::BTreeMap<&str, usize> = Default::default();
    let mut by_triage: std::collections::BTreeMap<&str, usize> = Default::default();
    for a in &defects {
        let sev = a
            .fields
            .get("severity")
            .and_then(|v| v.as_str())
            .unwrap_or("unknown");
        let tri = a
            .fields
            .get("triage-status")
            .and_then(|v| v.as_str())
            .unwrap_or("unknown");
        *by_severity.entry(sev).or_default() += 1;
        *by_triage.entry(tri).or_default() += 1;
    }
    // IDs of defects whose triage-status is exactly "open".
    let open_defects: Vec<&str> = defects
        .iter()
        .filter(|a| {
            a.fields
                .get("triage-status")
                .and_then(|v| v.as_str())
                .is_some_and(|s| s == "open")
        })
        .map(|a| a.id.as_str())
        .collect();

    let schemas_in_use: Vec<&str> = ctx
        .config
        .project
        .schemas
        .iter()
        .map(String::as_str)
        .collect();

    let output = serde_json::json!({
        "command": "stats --qualification",
        "rivet_version": env!("CARGO_PKG_VERSION"),
        "qualification_mode": cli.qualification_mode,
        "schemas_in_use": schemas_in_use,
        "tool_confidence": tool_confidence,
        "ai_found_defects": {
            "total": defects.len(),
            "by_severity": by_severity,
            "by_triage_status": by_triage,
            "open_ids": open_defects,
        },
    });
    // Propagate a serialization failure instead of panicking.
    println!("{}", serde_json::to_string_pretty(&output)?);
    Ok(true)
}

/// Show traceability coverage report.
fn cmd_coverage(
    cli: &Cli,
@@ -7276,6 +7413,7 @@ fn cmd_diff(
            project: bp.to_path_buf(),
            schemas: cli.schemas.clone(),
            verbose: cli.verbose,
            qualification_mode: cli.qualification_mode,
            command: Command::Validate {
                format: "text".to_string(),
                direct: false,
@@ -7295,6 +7433,7 @@ fn cmd_diff(
            project: hp.to_path_buf(),
            schemas: cli.schemas.clone(),
            verbose: cli.verbose,
            qualification_mode: cli.qualification_mode,
            command: Command::Validate {
                format: "text".to_string(),
                direct: false,
57 changes: 57 additions & 0 deletions rivet-cli/tests/cli_commands.rs
@@ -3121,6 +3121,63 @@ fn bundle_invalid_format_fails() {
    );
}

// ── rivet stats --qualification / --qualification-mode (TCL A5) ─────────

#[test]
fn stats_qualification_emits_baseline_manifest_for_dogfood() {
    // The rivet repo dogfoods its own tool-confidence claim
    // (safety/tool-qualification/rivet-tool-confidence.yaml). The
    // baseline manifest must surface it as TQ-CONF-RIVET at TCL1.
    let output = Command::new(rivet_bin())
        .args([
            "--project",
            project_root().to_str().unwrap(),
            "stats",
            "--qualification",
        ])
        .output()
        .expect("run rivet stats --qualification");
    assert!(
        output.status.success(),
        "stats --qualification must exit 0. stderr: {}",
        String::from_utf8_lossy(&output.stderr)
    );
    let stdout = String::from_utf8_lossy(&output.stdout);
    let value: serde_json::Value = serde_json::from_str(&stdout).expect("valid JSON");
    assert_eq!(value["command"], "stats --qualification");
    let confs = value["tool_confidence"].as_array().expect("array");
    let rivet_claim = confs
        .iter()
        .find(|c| c["id"] == "TQ-CONF-RIVET")
        .expect("TQ-CONF-RIVET present");
    assert_eq!(rivet_claim["tcl"], "TCL1");
    assert_eq!(rivet_claim["regime"], "iso-26262");
}

#[test]
fn qualification_mode_blocks_sync() {
    // --qualification-mode refuses sync (out-of-scope per the dossier).
    let output = Command::new(rivet_bin())
        .args([
            "--project",
            project_root().to_str().unwrap(),
            "--qualification-mode",
            "sync",
            "--local",
        ])
        .output()
        .expect("run rivet --qualification-mode sync");
    assert!(
        !output.status.success(),
        "sync must be refused under --qualification-mode"
    );
    let stderr = String::from_utf8_lossy(&output.stderr);
    assert!(
        stderr.contains("qualification-mode") && stderr.contains("sync"),
        "stderr must mention qualification-mode + sync, got: {stderr}"
    );
}

// ── rivet supplier (#253 MVP) ───────────────────────────────────────────

/// Build a minimal project with one `external-anchor` artifact and a
3 changes: 3 additions & 0 deletions rivet.yaml
@@ -8,6 +8,7 @@ project:
- stpa
- stpa-sec
- eu-ai-act
- iso-26262

sources:
- path: artifacts
@@ -16,6 +17,8 @@ sources:
format: stpa-yaml
- path: safety/stpa-sec
format: generic-yaml
- path: safety/tool-qualification
format: generic-yaml

docs:
- docs