diff --git a/docs/design/tool-qualification-dossier.md b/docs/design/tool-qualification-dossier.md index 06478e6..663a837 100644 --- a/docs/design/tool-qualification-dossier.md +++ b/docs/design/tool-qualification-dossier.md @@ -1,10 +1,25 @@ # Tool qualification dossier — rivet (ISO 26262-8 §11.4.7) -**Status:** draft (TCL workstream A4) +**Status:** draft (TCL workstream A4) — **`claim-status: self-claimed`, not externally qualified.** **Audience:** safety leads, certification authorities, OEM tool-qualification reviewers -**Last revised:** 2026-05-16 +**Last revised:** 2026-05-16 (after v0.10.0 self-review) **Companion:** `docs/design/tool-confidence-level.md` (the *why*), `safety/stpa/tool-qualification.yaml` (the hazard analysis), `safety/tool-qualification/rivet-tool-confidence.yaml` (the typed claim) +## 0. Honest scope statement (read this first) + +This dossier was drafted by the rivet maintainers, including AI-assisted authoring sessions. The TI2 / TD1 / TCL1 claim is **self-claimed under ISO 26262-8 §11.4.7 — not externally qualified.** The artifact at `safety/tool-qualification/rivet-tool-confidence.yaml` carries `claim-status: self-claimed` and `status: draft` to make this machine-readable. + +**The following are NOT yet defensible without further work:** + +- **No independent confirmation reviewer** per ISO 26262-2 §6.4.7. The artifact `TQ-CONF-RIVET` was authored under `provenance.created-by: ai-assisted` — the same kind of AI session this dossier proposes to qualify. +- **Cross-walks to DO-330 / IEC 62304 / EN 50128** in §1 are clause-level **unverified**. They come from training-data recall, not from reviewed copies of the standards. Treat the regime table as a *project pointer*, not a qualification mapping. +- The **five-layer "product of miss rates"** argument in §3 assumes independence that is **not proven** — Kani, validate, proptest, and the oracles all consume the same parser and the same `Artifact` model. Common-mode failure dominates and is not yet quantified. +- **Mutation testing currently runs 29 mutants** (26 caught, 3 unviable). The ASIL B/C/D thresholds named in the design note refer to a future 500+-mutant suite that has not been built. +- The **Kani proof set is 27 harnesses** in `rivet-core/src/proofs.rs`. Where any tooling or claim cites larger figures (e.g. "2000+ proofs"), that refers to internal CBMC sub-obligations and is **not** the right number to quote externally — count harnesses, not obligations. +- `proofs/rocq/Schema.v` contains **one `Admitted.` theorem** (`vmodel_chain_two_steps`, line 523). `rivet-core/src/verus_specs.rs:217` contains **`assume(backlink_symmetric(g, s));`** — these are not yet `Qed`'d / proved. +- `ai-session.invoker` is personal data per **DSGVO Art. 4**. The schema declares `lawful-basis`, `retention-period`, `erasure-mechanism` fields and a `dpia` artifact type (since v0.10.0 follow-up), but `rivet validate` does not yet enforce that an `invoker`-bearing session links to a DPIA. +- The **release v0.10.0** binary archives ship a `SHA256SUMS.txt` that is **not cryptographically signed**. The git tag is not GPG-signed. Sigstore / cosign integration is tracked but unshipped. + This dossier collects the qualification argument for **rivet** as a development tool used in safety-critical projects. It is the prose layer of a three-part artefact set: - The **STPA** identifies tool-qualification hazards. @@ -37,7 +52,7 @@ TD1 means **high confidence** that errors in the tool are prevented or caught. R 1. **Validate** — link-graph and traceability-rule checks; pre-existing, mature, has bounded-MC proofs (Kani) for the soundness side and proptest property suites for the completeness side. 2. **Oracles** — declarative `agent-pipelines:` blocks gated by mechanical checks (cited-source freshness, schema-conformance, docs-check invariants). 3. **Mutation testing** — `mutants.out/` runs catch silently-passing test gaps in rivet-core and rivet-cli. -4. **Formal proofs** — Kani (bounded model checking, 2000+ proofs across rivet-core), Verus (selected modules), Rocq (verification theorems in `proofs/rocq/`). +4. **Formal proofs** — Kani (bounded model checking, **27 harnesses** in `rivet-core/src/proofs.rs`; bounded input sizes ranging from 8 to 24 bytes for most parser harnesses), Verus (spec stubs in `verus_specs.rs`; the central `backlink_symmetric` obligation is currently `assume`'d, not proved), Rocq (12 `Qed`'d theorems + 15 lemmas across `proofs/rocq/Schema.v` and `Validation.v`; **one `Admitted` theorem** — `vmodel_chain_two_steps`). The five-layer independence argument that follows is **not yet defensible** — see §0. 5. **`ai-found-defect` triage loop** — every defect caught by the above layers (and especially every defect introduced by AI authoring) becomes a typed artefact, links back to its `ai-session`, and gates release on triage state. This is the layer that compensates for the eroded human-review assumption when the upstream author is an AI assistant. The five layers are independent (catch different defect classes), so the residual-error probability is the *product* of their miss rates, not the sum. That is the operational basis for the TD1 claim. diff --git a/rivet-cli/src/main.rs b/rivet-cli/src/main.rs index cb2bc2c..a4ed02b 100644 --- a/rivet-cli/src/main.rs +++ b/rivet-cli/src/main.rs @@ -1626,6 +1626,22 @@ enum CheckAction { #[arg(short, long, default_value = "text")] format: String, }, + + /// Block release if any `ai-found-defect` with `triage-status: open` + /// links to an artifact whose `status` is `released` or `approved`, + /// OR if a defect's `triaged-by` equals the originating session's + /// `invoker` (DPO segregation-of-duties violation). + /// + /// TCL workstream B operational primitive. The dossier + /// (`rivet docs tool-qualification` §3) claims this loop as the TD1 + /// detection layer that compensates for eroded human review when + /// the upstream author is an AI assistant. Without this gate, the + /// TD1 claim is vapor. + AiDefectsOpen { + /// Output format: "text" (default) or "json". + #[arg(short, long, default_value = "text")] + format: String, + }, } fn main() -> ExitCode { @@ -2086,6 +2102,7 @@ fn run(cli: Cli) -> Result { strict, format, } => cmd_check_sources(&cli, *update, *apply, *strict, format), + CheckAction::AiDefectsOpen { format } => cmd_check_ai_defects_open(&cli, format), }, #[cfg(feature = "wasm")] Command::Import { @@ -11473,6 +11490,143 @@ impl ProjectContext { } } +/// `rivet check ai-defects-open` — TCL workstream B operational primitive. +/// +/// Two gates, each producing a non-zero exit on violation: +/// +/// 1. **Open-defect gate:** an `ai-found-defect` with `triage-status: +/// open` must not have a `defect-against` link to an artifact whose +/// `status` is `released` or `approved`. +/// +/// 2. **Segregation-of-duties gate (DPO-lens finding):** a defect's +/// `triaged-by` must not equal the originating session's `invoker` +/// (resolved via the defect's `produced-by` link). The same AI / +/// operator that authored the offending artifact must not be the +/// one to mark its defect "accepted." This is the ISO 26262-2 +/// §6.4.7 confirmation-reviewer independence requirement at the +/// AI-loop level. +fn cmd_check_ai_defects_open(cli: &Cli, format: &str) -> Result { + validate_format(format, &["text", "json"])?; + let ctx = ProjectContext::load(cli)?; + + let mut open_against_released: Vec = Vec::new(); + let mut self_triaged: Vec = Vec::new(); + + for defect in ctx + .store + .iter() + .filter(|a| a.artifact_type == "ai-found-defect") + { + let triage = defect + .fields + .get("triage-status") + .and_then(|v| v.as_str()) + .unwrap_or(""); + let triaged_by = defect.fields.get("triaged-by").and_then(|v| v.as_str()); + + // Gate 1 — open defects against released/approved artifacts. + if triage == "open" { + for link in &defect.links { + if link.link_type != "defect-against" { + continue; + } + let Some(target) = ctx.store.get(&link.target) else { + continue; + }; + let target_status = target.status.as_deref().unwrap_or(""); + if target_status.eq_ignore_ascii_case("released") + || target_status.eq_ignore_ascii_case("approved") + { + open_against_released.push(serde_json::json!({ + "defect": defect.id, + "target": link.target, + "target_status": target_status, + "rule": "ai-found-defect.open.against-released", + })); + } + } + } + + // Gate 2 — segregation of duties. + if let Some(tb) = triaged_by { + if let Some(session_link) = defect.links.iter().find(|l| l.link_type == "produced-by") { + if let Some(session) = ctx.store.get(&session_link.target) { + if session.artifact_type == "ai-session" { + let invoker = session + .fields + .get("invoker") + .and_then(|v| v.as_str()) + .unwrap_or(""); + if !invoker.is_empty() && invoker == tb { + self_triaged.push(serde_json::json!({ + "defect": defect.id, + "session": session_link.target, + "invoker": invoker, + "triaged_by": tb, + "rule": "ai-found-defect.self-triage", + })); + } + } + } + } + } + } + + let violations = open_against_released.len() + self_triaged.len(); + let pass = violations == 0; + + if format == "json" { + let out = serde_json::json!({ + "command": "check ai-defects-open", + "passed": pass, + "open_against_released": open_against_released, + "self_triaged": self_triaged, + "summary": { + "total_violations": violations, + "open_against_released_count": open_against_released.len(), + "self_triaged_count": self_triaged.len(), + }, + }); + println!("{}", serde_json::to_string_pretty(&out)?); + } else if pass { + println!( + "ai-defects-open: PASS (no open AI defects against released artifacts; no self-triaged findings)" + ); + } else { + println!("ai-defects-open: FAIL — {violations} violation(s)\n"); + if !open_against_released.is_empty() { + println!("Open defects against released/approved artifacts:"); + for v in &open_against_released { + println!( + " {} -> {} (status: {})", + v["defect"].as_str().unwrap_or("?"), + v["target"].as_str().unwrap_or("?"), + v["target_status"].as_str().unwrap_or("?") + ); + } + println!(); + } + if !self_triaged.is_empty() { + println!("Self-triaged defects (DPO segregation-of-duties violation):"); + for v in &self_triaged { + println!( + " {} triaged by '{}' (same as session invoker)", + v["defect"].as_str().unwrap_or("?"), + v["triaged_by"].as_str().unwrap_or("?") + ); + } + println!(); + } + println!( + "These violations would let an AI-introduced defect persist into a \ + released artifact, or let the originating AI accept its own defect. \ + See `rivet docs tool-qualification` §3 (TD1 detection layer)." + ); + } + + Ok(pass) +} + fn print_stats(store: &Store) { println!("Artifact summary:"); let mut types: Vec<&str> = store.types().collect(); diff --git a/rivet-cli/tests/cli_commands.rs b/rivet-cli/tests/cli_commands.rs index 645bfc8..ad8f3f4 100644 --- a/rivet-cli/tests/cli_commands.rs +++ b/rivet-cli/tests/cli_commands.rs @@ -3178,6 +3178,137 @@ fn qualification_mode_blocks_sync() { ); } +// ── rivet check ai-defects-open (TCL workstream B) ────────────────────── + +/// Build a project with an `ai-found-defect` linked to a status-X +/// requirement, with triage-status Y, optionally with same operator +/// as session invoker (self-triage scenario). +fn ai_defects_project(triage: &str, target_status: &str, self_triage: bool) -> tempfile::TempDir { + let tmp = tempfile::tempdir().expect("create temp dir"); + let dir = tmp.path(); + + let init = Command::new(rivet_bin()) + .args(["init", "--preset", "dev", "--dir", dir.to_str().unwrap()]) + .output() + .expect("rivet init"); + assert!(init.status.success(), "init failed: {init:?}"); + + let invoker = "alice@example.com"; + let triaged_by = if self_triage { + invoker + } else { + "bob@example.com" + }; + let req_path = dir.join("artifacts").join("requirements.yaml"); + let existing = std::fs::read_to_string(&req_path).expect("read"); + let extra = format!( + r#" + - id: REQ-RELEASED + type: requirement + title: Released requirement + status: {target_status} + - id: AI-SESS-001 + type: ai-session + title: Test session + status: approved + fields: + session-id: test-session-abc + model-id: claude-opus-4-7 + invoker: {invoker} + - id: AID-001 + type: ai-found-defect + title: Defect against released + status: approved + fields: + severity: major + triage-status: {triage} + detected-by: validate + triaged-by: {triaged_by} + links: + - type: defect-against + target: REQ-RELEASED + - type: produced-by + target: AI-SESS-001 +"# + ); + std::fs::write(&req_path, format!("{existing}{extra}")).expect("write"); + tmp +} + +#[test] +fn ai_defects_open_passes_when_triaged_and_no_self_triage() { + let tmp = ai_defects_project("accepted", "released", false); + let output = Command::new(rivet_bin()) + .args([ + "--project", + tmp.path().to_str().unwrap(), + "check", + "ai-defects-open", + "--format", + "json", + ]) + .output() + .expect("run check ai-defects-open"); + let stdout = String::from_utf8_lossy(&output.stdout); + let value: serde_json::Value = serde_json::from_str(&stdout).expect("valid json"); + assert_eq!( + value["passed"], true, + "must pass when triage != open. got: {value}" + ); +} + +#[test] +fn ai_defects_open_fails_on_open_defect_against_released() { + let tmp = ai_defects_project("open", "released", false); + let output = Command::new(rivet_bin()) + .args([ + "--project", + tmp.path().to_str().unwrap(), + "check", + "ai-defects-open", + "--format", + "json", + ]) + .output() + .expect("run check ai-defects-open"); + assert!( + !output.status.success(), + "must exit non-zero when open defect links to released artifact" + ); + let stdout = String::from_utf8_lossy(&output.stdout); + let value: serde_json::Value = serde_json::from_str(&stdout).expect("valid json"); + assert_eq!(value["passed"], false); + assert_eq!(value["summary"]["open_against_released_count"], 1); + assert_eq!(value["open_against_released"][0]["defect"], "AID-001"); +} + +#[test] +fn ai_defects_open_fails_on_self_triage_segregation_violation() { + // Defect triage-status is accepted (Gate 1 passes), but triaged-by == + // session invoker — Gate 2 (DPO segregation) must fire. + let tmp = ai_defects_project("accepted", "draft", true); + let output = Command::new(rivet_bin()) + .args([ + "--project", + tmp.path().to_str().unwrap(), + "check", + "ai-defects-open", + "--format", + "json", + ]) + .output() + .expect("run check ai-defects-open"); + assert!( + !output.status.success(), + "must exit non-zero when triaged-by == session invoker" + ); + let stdout = String::from_utf8_lossy(&output.stdout); + let value: serde_json::Value = serde_json::from_str(&stdout).expect("valid json"); + assert_eq!(value["passed"], false); + assert_eq!(value["summary"]["self_triaged_count"], 1); + assert_eq!(value["self_triaged"][0]["triaged_by"], "alice@example.com"); +} + // ── rivet supplier (#253 MVP) ─────────────────────────────────────────── /// Build a minimal project with one `external-anchor` artifact and a diff --git a/rivet-core/src/db.rs b/rivet-core/src/db.rs index f04936e..1cb0d75 100644 --- a/rivet-core/src/db.rs +++ b/rivet-core/src/db.rs @@ -642,8 +642,17 @@ fn build_pipeline_with_extras( /// /// When the `rowan-yaml` feature is enabled, uses the schema-driven rowan /// parser (`parse_artifacts_v2`) which reads `yaml-section` metadata from -/// the schema. In debug builds, both parsers run and their output is -fn build_store( +/// the schema. +/// +/// **Tracked** (v0.10.0 follow-up, Mobile/Scale lens finding): previously +/// this was a plain `fn`, so every downstream caller (`validate_all`, +/// `build_link_graph`, `compute_coverage_tracked`) re-built the whole +/// `Store` HashMap on every salsa revision — defeating the incremental +/// validation story sold by the dossier. With `#[salsa::tracked]`, +/// salsa memoizes the Store and returns the cached value when the +/// underlying `parse_artifacts_v2` outputs haven't changed. +#[salsa::tracked] +pub fn build_store( db: &dyn salsa::Database, source_set: SourceFileSet, schema_set: SchemaInputSet, @@ -672,7 +681,10 @@ fn build_store( /// /// Adapter artifacts are inserted last, so conflicting IDs are resolved /// in favour of the adapter — matching the direct-path ordering. -fn build_store_with_extras( +/// +/// **Tracked** alongside `build_store` — see that function's note. +#[salsa::tracked] +pub fn build_store_with_extras( db: &dyn salsa::Database, source_set: SourceFileSet, schema_set: SchemaInputSet, @@ -1157,6 +1169,33 @@ artifacts: assert_eq!(artifacts[0].artifact_type, "requirement"); } + // ── Test 10a: build_store is salsa-tracked (cache hit on no-op) ───── + + /// Mobile/Scale lens (v0.10.0 review) found `build_store` was a plain + /// `fn` rather than `#[salsa::tracked]`, so every downstream query + /// rebuilt the whole `Store` HashMap on every revision. After making + /// it tracked, repeated calls with the same `source_set`/`schema_set` + /// must return *bitwise-identical* `Store` values — proving salsa + /// served the cache rather than re-running the function. + /// + /// rivet: verifies REQ-029 + #[test] + fn build_store_cache_returns_equal_on_noop_revision() { + let db = RivetDatabase::new(); + let sources = + db.load_sources(&[("reqs.yaml", SOURCE_REQ), ("design.yaml", SOURCE_DD_LINKED)]); + let schemas = db.load_schemas(&[("test", TEST_SCHEMA)]); + + let store_1 = build_store(&db, sources, schemas); + let store_2 = build_store(&db, sources, schemas); + assert_eq!( + store_1, store_2, + "build_store must produce equal Stores across repeated calls \ + with the same inputs (PartialEq) — necessary condition for \ + salsa cache hit" + ); + } + // ── Test 10: merged schema via build_schema ───────────────────────── // rivet: verifies REQ-029 diff --git a/rivet-core/src/store.rs b/rivet-core/src/store.rs index e73c759..b49b4de 100644 --- a/rivet-core/src/store.rs +++ b/rivet-core/src/store.rs @@ -46,7 +46,7 @@ use crate::model::{Artifact, ArtifactId, BaselineConfig}; /// Holds all loaded artifacts and provides lookup by ID and by type. /// The store is the central data structure consumed by the link graph, /// validator, query engine, and matrix generator. -#[derive(Debug, Default, Clone)] +#[derive(Debug, Default, Clone, PartialEq)] pub struct Store { artifacts: HashMap, by_type: HashMap>, diff --git a/safety/tool-qualification/rivet-tool-confidence.yaml b/safety/tool-qualification/rivet-tool-confidence.yaml index 74558ab..cd409d8 100644 --- a/safety/tool-qualification/rivet-tool-confidence.yaml +++ b/safety/tool-qualification/rivet-tool-confidence.yaml @@ -26,13 +26,19 @@ artifacts: title: Rivet — tool confidence claim (ISO 26262-8 Part 8 §11.4.7) status: draft description: > - Rivet's typed tool-qualification claim. The TI/TD analysis is - defended by the STPA in `safety/stpa/tool-qualification.yaml` - (losses L-TQ-001..003, hazards H-TQ-001..007) and by the - mechanical detection layer (validate, oracle gating, Kani/Verus, - mutation testing, cited-source verification, ai-found-defect - triage). The dossier prose lives in - `docs/design/tool-qualification-dossier.md`. + Rivet's typed tool-qualification claim. **`claim-status: + self-claimed`** — this is the *target shape* the maintainers aim + to defend, not an externally-qualified credential. The TI/TD + analysis points at the STPA in `safety/stpa/tool-qualification.yaml` + (losses L-TQ-001..003, hazards H-TQ-001..007) and the mechanical + detection layer (validate, oracle gating including + `rivet check ai-defects-open`, Kani/Verus, mutation testing, + cited-source verification, ai-found-defect triage). The dossier + prose, including §0 "Honest scope statement" enumerating what is + NOT yet defensible (no independent confirmation reviewer, + unverified cross-walks, unproven independence, 29-mutant testing, + one `Admitted` Rocq theorem, no signed SHA256SUMS, no DPIA + enforcement), lives in `docs/design/tool-qualification-dossier.md`. tags: [safety, dogfood, iso-26262] fields: tool-id: rivet @@ -42,15 +48,21 @@ artifacts: regime: iso-26262 claim-status: self-claimed scope: | - Covered by this claim: rivet validate (link-graph, traceability - rules, schema field checks, s-expression evaluator), rivet - commits (trailer audit), rivet coverage (single-org + supplier - 3-state), rivet supplier {list,check}. + Covered by this claim (in-scope subcommands): + rivet validate, rivet commits (trailer audit), rivet coverage + (single-org + supplier 3-state), rivet supplier {list,check}, + rivet check ai-defects-open (TCL workstream B TD1 gate). - Out of scope (NOT qualified by this claim): rivet sync / - rivet supplier pull (Phase 2), rivet migrate (importers), - rivet serve (read-only web UI), MCP write tools that bypass - validate (--qualification-mode disables these). + OUT OF SCOPE (NOT qualified): rivet sync, rivet supplier pull + (Phase 2), rivet migrate, rivet serve, MCP write tools. + + ALSO NOT DEFENDED by this claim (see dossier §0): + no independent confirmation reviewer; unverified + DO-330/IEC 62304/EN 50128 cross-walks; unproven five-layer + independence; 29-mutant testing; one `Admitted` Rocq theorem; + one `assume`'d Verus obligation; unsigned SHA256SUMS; + unsigned git tag; no validate-time enforcement of DPIA + linkage from `ai-session.invoker`-bearing artifacts. provenance: created-by: ai-assisted model: claude-opus-4-7 diff --git a/schemas/common.yaml b/schemas/common.yaml index f92aba6..a50bf93 100644 --- a/schemas/common.yaml +++ b/schemas/common.yaml @@ -170,7 +170,84 @@ artifact-types: - name: invoker type: string required: false - description: Human operator who ran the session (login or email). + description: > + Human operator who ran the session (login or email). When set, + this field is PERSONAL DATA under DSGVO Art. 4 — the artifact + should link to a `dpia` artifact (use `derives-from` link + type) or be classified `lawful-basis: anonymised`. v0.10.0 + ships the schema but does not yet enforce the link at + validate time (tracked). + - name: lawful-basis + type: string + required: false + allowed-values: + - art-6-1-a-consent + - art-6-1-b-contract + - art-6-1-c-legal-obligation + - art-6-1-f-legitimate-interest + - art-9-2-research + - anonymised + description: > + DSGVO Art. 6(1) (or Art. 9(2) for special categories) lawful + basis for processing `invoker`. Required when `invoker` is + set and not classified `anonymised`. + - name: retention-period + type: string + required: false + description: > + ISO 8601 duration after which `invoker` must be erased (e.g. + "P2Y" for 2 years). DSGVO Art. 5(1)(e) — storage limitation. + - name: erasure-mechanism + type: string + required: false + description: > + Machine-readable mechanism that performs the erasure when the + retention period elapses (e.g. `git-replace-author-rewrite`, + `tombstone-only`, `none`). DSGVO Art. 17. + link-fields: [] + + # ── Data Protection Impact Assessment (DSGVO Art. 35) ─────────────── + - name: dpia + description: > + Data Protection Impact Assessment per DSGVO Art. 35. Required + when a project's AI-provenance workflow systematically processes + personal data (e.g. `ai-session.invoker`). The artifact captures + the assessment outcome, the responsible Data Protection Officer + sign-off, and the categories of personal data processed. Link + `ai-session` artifacts to one or more `dpia` records via + `derives-from`. + fields: + - name: dpo-sign-off + type: string + required: true + description: > + DPO sign-off identifier (e.g. internal ticket ID or signed + document hash). Free-form at the schema level; project + policy may narrow. + - name: personal-data-categories + type: list + required: true + description: > + Categories of personal data processed (e.g. ["login", + "email", "session-content"]). DSGVO Art. 35(7)(b). + - name: risk-assessment + type: string + required: true + allowed-values: [low, medium, high] + description: > + DSGVO Art. 35(7)(c) — assessment of necessity and + proportionality. + - name: mitigation-measures + type: text + required: false + description: DSGVO Art. 35(7)(d) — measures to address risks. + - name: consultation-date + type: string + required: false + description: > + ISO-8601 date when the DPO/supervisory authority was + consulted (DSGVO Art. 35(2), Art. 36 for prior consultation + when residual risk is high). link-fields: [] # ── AI-found defect (TCL design §6.2, A3) ───────────────────────────