feat: structured decision observability (reason codes, metrics, logging)#191

Merged: dgenio merged 4 commits into main from claude/issue-triage-grouping-wkmnhz on Jun 21, 2026
@dgenio (Owner) commented Jun 20, 2026

Summary

Implements the "structured decision observability" group — five related issues that share one code area and implementation path — as a single coherent change. The unifying idea: make AgentFence's decisions machine-readable and operable, across the CLI (check) and both proxies, without compromising the local-first / no-telemetry posture.

What changed (by issue):

Related issue

Refs #136, #169, #101, #121, #163

How verified

  • make ci (fmt-check + vet + go test -race + coverage): green, total coverage 80.9%.
    • New packages: internal/metrics 89.1%, internal/oplog 71.7%.
  • New tests: engine reason-code table (every constraint family + default + rule-match + taint escalation), metrics counters/Prometheus/HTTP-handler/concurrency, oplog text+json+debug-gating, audit by_reason_code summarize, proxy metrics recording, and CLI --metrics / --log-format json (asserting stdout stays valid JSON and the default text path is unchanged).
  • Manual smoke: check --metrics summary on stderr; audit events emit schema_version:"4" + reason_code; proxy-http --metrics-listen serves /metrics in Prometheus format and emits JSON ops logs under --log-format json.
  • The audit-event schema drift-guard test (internal/audit/schema_test.go) is updated in lockstep with the new reason_code field and schema version.

Tradeoffs / risks

  • Audit schema bump to "4". reason_code is omitempty, so old readers ignore it and pre-taxonomy logs (no code) summarize fine. The maintainer asked, as project guidance, to disregard strict back-compat; the field is nonetheless additive.
  • Options.Logger type change (io.Writer → *slog.Logger) on both proxies. This is an internal API; the one test that set it was updated. Operational stderr strings changed shape (now structured), but check's text contract is preserved.
  • #101 ([Feature] OpenTelemetry metrics and structured decision export for the proxy) deviates from the issue's suggested OTel SDK (documented above) to keep the binary dependency-free; an OTLP exporter could be a focused follow-up if desired.
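
To illustrate why an omitempty field is additive, here is a minimal sketch. The `event` struct is a hypothetical, trimmed-down stand-in and not the actual audit Event from internal/audit; only the `schema_version` and `reason_code` field names come from this PR, and the `decision` field and sample values are assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// event is a hypothetical stand-in for the audit event; the real struct
// has more fields. ReasonCode uses omitempty, so it is left out of the
// JSON entirely when no code is set.
type event struct {
	SchemaVersion string `json:"schema_version"`
	Decision      string `json:"decision"`
	ReasonCode    string `json:"reason_code,omitempty"`
}

// encode marshals an event to a single JSON line.
func encode(e event) string {
	b, err := json.Marshal(e)
	if err != nil {
		panic(err) // not reachable for this plain struct
	}
	return string(b)
}

// decode parses a JSON line; a missing reason_code leaves the field empty.
func decode(line string) (event, error) {
	var e event
	err := json.Unmarshal([]byte(line), &e)
	return e, err
}

func main() {
	// "example_code" is a placeholder value, not a code from the real taxonomy.
	fmt.Println(encode(event{SchemaVersion: "4", Decision: "deny", ReasonCode: "example_code"}))
	fmt.Println(encode(event{SchemaVersion: "4", Decision: "allow"}))

	// A line written before the field existed still decodes cleanly.
	old, err := decode(`{"decision":"allow"}`)
	fmt.Printf("%q %v\n", old.ReasonCode, err)
}
```

Readers that decode into a struct without the field simply never see it, which is the sense in which the change stays additive.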

Scope notes (Mode B, one PR for all five issues as requested)

Checklist

  • Tests added or updated
  • Documentation updated if needed
  • CI passes (make ci green locally; race + 80.9% coverage)
  • Issue number included

🤖 Generated with Claude Code

https://claude.ai/code/session_01Y5TznmPR3zr2ZvppT2pMyH


Implements the "structured decision observability" group as one coherent
change across the engine, audit, proxies, and CLI:

- Typed reason codes (#136): every decision carries a stable, machine-readable
  policy.ReasonCode alongside the human-readable reason. Set at every engine
  decision site, taint escalation, and approval resolution; recorded on the
  audit Event (schema_version bumped to "4") and grouped by `audit summarize`.
  Free-text reasons are unchanged.

- Decision metrics (#169): new internal/metrics package with thread-safe
  Counters (by decision, tool, reason code; taint escalations, approval
  outcomes, eval latency, errors). `check --metrics` prints a summary to stderr
  on exit, dependency-free.

- Prometheus endpoint (#101): proxy/proxy-http expose the same counters at
  /metrics via opt-in `--metrics-listen <addr>`, rendered in Prometheus
  text-exposition format with no third-party dependency (kept off by default;
  local and operator-controlled, per the no-telemetry posture).

- Structured operational logging (#121, #163): new internal/oplog package wraps
  log/slog; `--log-format text|json` on check/proxy/proxy-http routes stderr
  diagnostics. text (default) is unchanged; json emits one record per line. The
  operational log stays distinct from the audit log and decision output.
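
The thread-safe counters described in the list above could look roughly like the following. This is a sketch under stated assumptions, not the internal/metrics implementation: the `counters` type, its methods, and the `key=count` summary format are all hypothetical.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// counters is a hypothetical, minimal stand-in for a thread-safe
// counter set keyed by reason code.
type counters struct {
	mu       sync.Mutex
	byReason map[string]uint64
}

func newCounters() *counters {
	return &counters{byReason: make(map[string]uint64)}
}

// record increments the counter for one reason code.
func (c *counters) record(code string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.byReason[code]++
}

// summary renders the counts sorted by key so the output is deterministic
// regardless of map iteration order.
func (c *counters) summary() string {
	c.mu.Lock()
	defer c.mu.Unlock()
	keys := make([]string, 0, len(c.byReason))
	for k := range c.byReason {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	out := ""
	for _, k := range keys {
		out += fmt.Sprintf("%s=%d\n", k, c.byReason[k])
	}
	return out
}

func main() {
	c := newCounters()
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.record("example_code") // placeholder code, not from the real taxonomy
		}()
	}
	wg.Wait()
	fmt.Print(c.summary())
}
```

A single mutex keeps the sketch short; per-counter atomics would be a reasonable alternative if contention ever mattered.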

Docs: new docs/observability.md; audit-event-schema, README status table, and
CHANGELOG updated. README demo blocks refreshed (schema_version "4" +
reason_code). make ci green (race + 80.9% coverage).

Refs #136, #169, #101, #121, #163

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Y5TznmPR3zr2ZvppT2pMyH
Copilot AI review requested due to automatic review settings June 20, 2026 14:15

Copilot AI left a comment


Pull request overview

This PR implements “structured decision observability” across check, proxy, and proxy-http by introducing stable decision reason codes, in-process decision/latency/error counters with optional Prometheus exposition, and structured operational logging to stderr—while keeping the project’s local-first/no-telemetry posture.

Changes:

  • Add policy.ReasonCode and propagate it through engine decisions, taint escalation, approval resolution, and audit events (audit schema bumped to "4"), plus audit summarization by reason code.
  • Introduce internal/metrics counters (CLI check --metrics stderr summary; proxy /metrics via --metrics-listen) and wire metrics recording into both proxies.
  • Introduce internal/oplog (slog wrapper) and wire --log-format text|json into CLI and proxies; update docs/README/CHANGELOG and add extensive tests.
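
As a rough illustration of the first item, a string-backed reason-code type and a by-code grouping might be sketched as below. The constant names and values, the `result` struct, and the choice to skip results without a code are assumptions made for this example; only the `ReasonCode` type name comes from the PR.

```go
package main

import (
	"fmt"
	"sort"
)

// ReasonCode is a string-backed type, so codes serialize as plain JSON
// strings while still being distinct from free-text reasons in Go code.
type ReasonCode string

// Hypothetical values; the real taxonomy lives in internal/policy/reasoncode.go.
const (
	CodeRuleMatch       ReasonCode = "rule_match"
	CodeDefault         ReasonCode = "default"
	CodeTaintEscalation ReasonCode = "taint_escalation"
)

// result is a hypothetical stand-in for an evaluation result: the
// human-readable Reason stays alongside the machine-readable Code.
type result struct {
	Decision string
	Reason   string
	Code     ReasonCode
}

// byReasonCode groups results by code. Skipping results without a code
// is an assumption of this sketch.
func byReasonCode(results []result) map[ReasonCode]int {
	counts := make(map[ReasonCode]int)
	for _, r := range results {
		if r.Code == "" {
			continue
		}
		counts[r.Code]++
	}
	return counts
}

func main() {
	counts := byReasonCode([]result{
		{Decision: "allow", Reason: "matched rule", Code: CodeRuleMatch},
		{Decision: "deny", Reason: "no rule matched", Code: CodeDefault},
		{Decision: "deny", Reason: "no rule matched", Code: CodeDefault},
	})
	codes := make([]string, 0, len(counts))
	for c := range counts {
		codes = append(codes, string(c))
	}
	sort.Strings(codes) // stable output order
	for _, c := range codes {
		fmt.Printf("%s %d\n", c, counts[ReasonCode(c)])
	}
}
```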

Reviewed changes

Copilot reviewed 25 out of 26 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| schema/agentfence-audit-event.schema.json | Bumps schema version examples to 4 and adds reason_code field to the JSON schema. |
| README.md | Updates status table and demo output examples to schema v4 + reason_code. |
| internal/proxy/proxy.go | Switches proxy diagnostics to *slog.Logger, adds metrics wiring (latency/errors/decision counts), and records approval reason codes. |
| internal/proxy/proxy_test.go | Updates test options to match new logger default/type. |
| internal/proxy/metrics_test.go | Adds proxy-side test verifying metrics recording from processAgentLine. |
| internal/policy/reasoncode.go | Introduces the reason-code taxonomy as a stable enum-like string type. |
| internal/policy/policy.go | Adds ReasonCode to EvaluationResult JSON output (omitempty). |
| internal/oplog/oplog.go | Adds operational logging package wrapping log/slog with text/json formats. |
| internal/oplog/oplog_test.go | Adds tests for format parsing, text rendering, JSON output validity, and debug gating. |
| internal/metrics/metrics.go | Adds thread-safe counters, snapshot formatting (text/Prometheus), and latency/error tracking. |
| internal/metrics/metrics_test.go | Adds tests for counting/snapshotting, determinism, Prometheus output, and concurrency. |
| internal/metrics/http.go | Adds HTTP handlers/mux for /metrics endpoint (Prometheus text exposition). |
| internal/metrics/http_test.go | Adds tests for handler method gating and mux routing. |
| internal/httpproxy/httpproxy.go | Switches HTTP proxy diagnostics to *slog.Logger, adds metrics wiring, and records approval reason codes. |
| internal/engine/reasoncode_test.go | Adds coverage ensuring engine decision paths set correct ReasonCode (including taint escalation). |
| internal/engine/engine.go | Propagates reason codes through constraint evaluation and taint escalation into results/events. |
| internal/audit/summarize.go | Adds ByReasonCode grouping and includes it in text summary output. |
| internal/audit/summarize_reasoncode_test.go | Adds test for summarize-by-reason-code behavior and text output section. |
| internal/audit/audit.go | Bumps audit schema version to 4; adds ReasonCode to Event and sets it in new events/error events. |
| internal/approval/approval.go | Adds reason-code mapping to approval outcomes (Outcome.Code). |
| docs/observability.md | New doc describing decision streams, operational logging, CLI metrics summary, and proxy /metrics. |
| docs/audit-event-schema.md | Updates schema version to 4 and documents the reason-code taxonomy. |
| cmd/agentfence/observability_test.go | Adds CLI tests for --metrics stderr summary and --log-format json stderr behavior without stdout pollution. |
| cmd/agentfence/main.go | Adds flags/wiring for --log-format, --metrics, --metrics-listen; starts/stops metrics server for proxies. |
| CHANGELOG.md | Documents the new observability feature group and the schema bump. |
| .gitignore | Ignores locally-built ./agentfence binary. |


Comment thread internal/metrics/metrics.go Outdated
Comment thread docs/observability.md
Comment thread internal/metrics/http.go Outdated
claude added 3 commits June 20, 2026 14:21
…rect docs

- WritePrometheus now emits agentfence_reason_codes_total{code=...} so the
  /metrics endpoint actually exposes the reason-code breakdown promised by the
  feature and docs; corrected the misleading agentfence_decisions_total HELP
  text ("by tool and decision") and the function doc comment.
- ServeMux doc comment fixed: it builds and returns a mux; it does not run a
  server (the caller owns the http.Server).
- docs/observability.md: narrowed the "text default is byte-stable" claim to
  check's stderr only; note the proxies' text diagnostics changed shape and
  recommend --log-format json for machine parsing.
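
For readers wondering why json is the safer choice for machine parsing, here is a minimal sketch of a text|json logger switch built on log/slog. It is an assumption-laden illustration: `newLogger` and `logLine` are hypothetical helpers, and the stdlib `slog.NewTextHandler` is swapped in for the text path, whereas internal/oplog uses its own text handler.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"log/slog"
	"os"
)

// newLogger maps a --log-format style value to a slog handler.
// The stdlib text handler stands in for the project's own text rendering.
func newLogger(w io.Writer, format string) (*slog.Logger, error) {
	switch format {
	case "", "text":
		return slog.New(slog.NewTextHandler(w, nil)), nil
	case "json":
		return slog.New(slog.NewJSONHandler(w, nil)), nil
	default:
		return nil, fmt.Errorf("unknown log format %q (want text or json)", format)
	}
}

// logLine renders one Info record in the given format and returns it,
// which makes the one-record-per-line shape easy to inspect.
func logLine(format, msg string) (string, error) {
	var buf bytes.Buffer
	logger, err := newLogger(&buf, format)
	if err != nil {
		return "", err
	}
	logger.Info(msg, "component", "example")
	return buf.String(), nil
}

func main() {
	logger, err := newLogger(os.Stderr, "json")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(2)
	}
	// Diagnostics go to stderr so stdout stays reserved for decision output.
	logger.Info("proxy started", "component", "example")
}
```

Each JSON record is a single self-delimiting object per line, so a consumer does not depend on the exact wording or layout of the text form.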

Refs #101, #121, #163

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Y5TznmPR3zr2ZvppT2pMyH
The /metrics example claimed HELP "by decision and reason code" but the
emitter labels agentfence_decisions_total by {tool,decision} and exports
reason codes as a separate agentfence_reason_codes_total series. Correct
the HELP text, add the reason_codes_total series to the example, and list
it in the metric-families table so the docs match WritePrometheus output.
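
A separate per-code series in Prometheus text-exposition format can be sketched as follows. The metric name and the `code` label come from the commit above; the HELP wording, the function names, and the sample codes are assumptions, and this is not the project's WritePrometheus.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"sort"
	"strings"
)

// labelEscaper applies the label-value escaping rules of the text
// exposition format: backslash, double quote, and newline.
var labelEscaper = strings.NewReplacer(`\`, `\\`, `"`, `\"`, "\n", `\n`)

// writeReasonCodes emits one counter family with a sample per reason code,
// sorted by code so scrapes are deterministic.
func writeReasonCodes(w io.Writer, counts map[string]uint64) error {
	// The HELP text below is illustrative only.
	header := "# HELP agentfence_reason_codes_total Decisions by reason code.\n" +
		"# TYPE agentfence_reason_codes_total counter\n"
	if _, err := io.WriteString(w, header); err != nil {
		return err
	}
	codes := make([]string, 0, len(counts))
	for code := range counts {
		codes = append(codes, code)
	}
	sort.Strings(codes)
	for _, code := range codes {
		_, err := fmt.Fprintf(w, "agentfence_reason_codes_total{code=\"%s\"} %d\n",
			labelEscaper.Replace(code), counts[code])
		if err != nil {
			return err
		}
	}
	return nil
}

// render is a convenience wrapper that returns the exposition as a string.
func render(counts map[string]uint64) string {
	var sb strings.Builder
	_ = writeReasonCodes(&sb, counts) // strings.Builder writes do not fail
	return sb.String()
}

func main() {
	// Placeholder codes, not values from the real taxonomy.
	_ = writeReasonCodes(os.Stdout, map[string]uint64{"example_a": 2, "example_b": 1})
}
```

Keeping reason codes in their own family, rather than adding a third label to the decisions series, avoids multiplying label combinations on one metric.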

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01RLq9ynrFfGmCRJQDBGYsWz
…doc error kinds

Address audit findings on the observability surface (no behavior change to
decisions or the audit chain):

- oplog.textHandler now serializes writes through a shared mutex (a pointer,
  shared across WithAttrs/WithGroup clones), matching slog's stdlib handlers.
  The stdio proxy logs from two relay goroutines, so without it records could
  interleave on stderr.
- serveMetrics binds the listener synchronously (net.Listen) so an unusable
  --metrics-listen address is reported at startup instead of silently in a
  goroutine; a bind failure disables the endpoint without taking down the proxy.
- docs/observability.md enumerates the actual agentfence_errors_total kinds
  (oversize, unparsed, batch, relay) emitted by the proxies.
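
The synchronous-bind pattern from the second item can be sketched like this. `startMetrics` is a hypothetical helper, not the project's serveMetrics; the point is only that `net.Listen` runs on the caller's goroutine, so a bad address surfaces as a returned error before any serving goroutine starts.

```go
package main

import (
	"fmt"
	"net"
	"net/http"
)

// startMetrics binds addr synchronously and only then serves in a
// goroutine. A bind failure is returned to the caller, who can log it
// and continue without the endpoint.
func startMetrics(addr string, h http.Handler) (*http.Server, net.Addr, error) {
	ln, err := net.Listen("tcp", addr)
	if err != nil {
		return nil, nil, fmt.Errorf("metrics listen %q: %w", addr, err)
	}
	srv := &http.Server{Handler: h}
	go func() {
		// Serve returns http.ErrServerClosed once the server is closed.
		_ = srv.Serve(ln)
	}()
	return srv, ln.Addr(), nil
}

func main() {
	mux := http.NewServeMux()
	mux.HandleFunc("/metrics", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "# metrics would be rendered here")
	})

	srv, addr, err := startMetrics("127.0.0.1:0", mux)
	if err != nil {
		// Report at startup and carry on; the main work is unaffected.
		fmt.Println("metrics disabled:", err)
		return
	}
	defer srv.Close()
	fmt.Println("metrics listening on", addr)
}
```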

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01RLq9ynrFfGmCRJQDBGYsWz