Merged
14 changes: 13 additions & 1 deletion CHANGELOG.md
@@ -9,7 +9,19 @@ This project follows [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [Unreleased]

-## [3.3.3] - 2026-05-07
+## [3.4.0] - 2026-05-07

### Added

- **Decision trace** — `automation.audit.evaluate_trace` (`full | sampled | off`, default `sampled`) records every rule evaluation decision to the audit log with condition-level breakdown, LLM call results, and throttle state. `evaluate_retention_days` bounds how long records are kept.
- **`rules trace-explain`** — inspect why a rule fired or was blocked. Reads trace records from the audit log; filter by rule name, fire ID, or time window. `--last` shows the most recent evaluation; `--json` outputs structured data for scripting.
- **LLM condition** (`llm:`) — gates rule execution on an AI yes/no judgement. Supports `provider: auto | openai | anthropic`, `cache_ttl` (skip redundant calls for identical context), `budget.max_calls_per_hour`, and `on_error: fail | pass | skip`. Cache key is content-addressed so equivalent prompts + context always hit the cache. `rules lint` flags misconfigurations (missing provider key, TTL too high for trigger frequency, budget zero).
- **`rules simulate`** — replay historical events from the audit log (or a `--against <file>` JSONL snapshot) against a rule and report would-fire / blocked-by-condition / throttled / error counts without starting the live engine. `--since <duration>` bounds the replay window.
- **MCP tools** — `rules_explain` (decision trace lookup) and `rules_simulate` (offline rule replay) exposed via the MCP server alongside existing rule tools.

### Changed

- Test suite: 1959 → 2204 tests (+245, covering trace, explain, LLM condition, and simulate modules).

### Added

60 changes: 52 additions & 8 deletions README.md
@@ -93,7 +93,7 @@ Under the hood every surface shares the same catalog, cache, and HMAC client —
- 🎨 **Dual output modes** — colorized tables by default; `--json` passthrough for `jq` and scripting
- 🔐 **Secure credentials** — HMAC-SHA256 signed requests; config file written with `0600`; env-var override for CI
- 🔍 **Dry-run mode** — preview every mutating request before it hits the API
-- 🧪 **Fully tested** — 1959 Vitest tests, mocked axios, zero network in CI
+- 🧪 **Fully tested** — 2204 Vitest tests, mocked axios, zero network in CI
- ⚡ **Shell completion** — Bash / Zsh / Fish / PowerShell

## Requirements
@@ -273,9 +273,9 @@ Five annotated starter files covering common setups live in
With a policy.yaml (v0.2) you can declare automations that the CLI
executes for you. Supported triggers: **MQTT** (device events),
**cron** (schedule-driven), and **webhook** (local HTTP POST).
-Supported conditions: `time_between` (quiet hours) and `device_state`
-(live API check with per-tick dedup). Every fire is recorded in
-`~/.switchbot/audit.log`. `rules run` is long-running; use
+Supported conditions: `time_between` (quiet hours), `device_state`
+(live API check with per-tick dedup), and `llm` (AI decision — see
+below). Every fire is recorded in `~/.switchbot/audit.log`. `rules run` is long-running; use
`daemon start` / `daemon reload` for the managed background mode.

**Actions** — each rule's `then` array accepts two action types:
@@ -296,6 +296,35 @@ then:
template: '{"rule":"{{ rule.name }}","fired":"{{ rule.fired_at }}"}'
```

**LLM condition** — add an AI judgement step before actions fire. The engine calls the
configured LLM provider, passes the prompt plus recent event context, and gates execution
on the model's yes/no answer:

```yaml
conditions:
- llm:
prompt: "Is the temperature above normal comfort range?"
provider: auto # auto | openai | anthropic
cache_ttl: 5m # skip redundant calls for identical context
budget:
max_calls_per_hour: 20
on_error: pass # fail | pass | skip
```

Set `OPENAI_API_KEY` or `ANTHROPIC_API_KEY` (provider `auto` tries Anthropic first).
`rules lint` flags misconfigured LLM conditions (no provider key, cache TTL too high for
the trigger frequency, budget zero). Evaluation decisions are recorded in the trace log.
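
The cache is keyed by content rather than by rule identity: equivalent prompts plus context always hit the cache. A minimal sketch of such a content-addressed key, assuming SHA-256 over a canonicalized prompt + context (the `llmCacheKey` and `canonicalize` names are illustrative, not the CLI's actual internals):

```typescript
import { createHash } from 'node:crypto';

// Canonicalize the context object so that key order does not affect
// the digest: {a:1, b:2} and {b:2, a:1} must produce the same string.
function canonicalize(value: unknown): string {
  if (value === null || typeof value !== 'object') return JSON.stringify(value);
  if (Array.isArray(value)) return `[${value.map(canonicalize).join(',')}]`;
  const entries = Object.entries(value as Record<string, unknown>)
    .sort(([a], [b]) => a.localeCompare(b))
    .map(([k, v]) => `${JSON.stringify(k)}:${canonicalize(v)}`);
  return `{${entries.join(',')}}`;
}

// Content-addressed cache key: identical prompt + context always yields
// the identical key, so a repeat evaluation can skip the LLM call.
function llmCacheKey(prompt: string, context: Record<string, unknown>): string {
  return createHash('sha256')
    .update(prompt)
    .update('\0') // separator so prompt/context boundaries cannot collide
    .update(canonicalize(context))
    .digest('hex');
}
```

Because the key depends only on content, a renamed rule with the same prompt and event context still reuses the cached verdict until `cache_ttl` expires.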

**Decision trace** — enable `automation.audit.evaluate_trace` in `policy.yaml` to record
every evaluation decision (why a rule fired or was blocked):

```yaml
automation:
audit:
evaluate_trace: sampled # full | sampled | off (default: sampled)
evaluate_retention_days: 7
```
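
The README does not spell out how `sampled` differs from `full`. One plausible gate, purely as a sketch (the `shouldTrace` helper and its 10% default sample rate are assumptions, not the engine's actual policy), is to always record fires and sample only the non-firing evaluations:

```typescript
type TraceMode = 'full' | 'sampled' | 'off';

// Hypothetical trace gate: 'full' records every evaluation, 'off' records
// none, and 'sampled' always keeps fires but only a fraction of non-fires.
// `rand` is injectable so the behavior is testable.
function shouldTrace(
  mode: TraceMode,
  fired: boolean,
  sampleRate = 0.1,
  rand: () => number = Math.random,
): boolean {
  if (mode === 'off') return false;
  if (mode === 'full' || fired) return true;
  return rand() < sampleRate;
}
```

Under a policy like this, `trace-explain` can always explain why a rule fired, while `full` is needed to reliably explain why it did not.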

```bash
# 1. Author rules under `automation.rules`. See examples/policies/automation.yaml
# for a walkthrough covering the three trigger sources.
@@ -333,6 +362,16 @@ switchbot rules suggest --intent "if door opens and temp below 20 turn on heater"
--llm auto # routes complex intents to LLM automatically
switchbot rules suggest --intent "..." --llm openai # explicit backend
# Set OPENAI_API_KEY or ANTHROPIC_API_KEY; auto mode falls back to heuristic on failure

# 9. Explain why a specific evaluation fired or was blocked (requires evaluate_trace).
switchbot rules trace-explain --rule "motion on" --last
switchbot rules trace-explain --rule "motion on" --since 1h --json
switchbot rules trace-explain <fireId> # single evaluation by ID

# 10. Simulate a rule against historical events without running the engine.
switchbot rules simulate "motion on" # replay last 24h from audit log
switchbot rules simulate "motion on" --since 7d --json
switchbot rules simulate policy.yaml --rule "night AC" --against events.jsonl
```
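
`rules simulate` condenses per-event outcomes into aggregate counts. A sketch of that aggregation (the `Outcome` union mirrors the outcomes listed above; the `SimulateReport` field names are assumptions, not the CLI's actual JSON shape):

```typescript
type Outcome = 'would-fire' | 'blocked-by-condition' | 'throttled' | 'error';

interface SimulateReport {
  total: number;
  counts: Record<Outcome, number>;
}

// Fold a list of per-event outcomes into the summary report a dry run
// would print. Every outcome bucket is initialized so absent outcomes
// show up as explicit zeros rather than missing keys.
function aggregate(outcomes: Outcome[]): SimulateReport {
  const counts: Record<Outcome, number> = {
    'would-fire': 0,
    'blocked-by-condition': 0,
    throttled: 0,
    error: 0,
  };
  for (const o of outcomes) counts[o] += 1;
  return { total: outcomes.length, counts };
}
```

Explicit zero buckets make the `--json` output stable for scripting: a consumer can always read `counts.throttled` without guarding against a missing field.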

`rules suggest` enforces several guardrails on LLM output so a model can't quietly arm
@@ -881,12 +920,16 @@ Exposes MCP tools (`list_devices`, `describe_device`, `get_device_status`,
`send_command`, `list_scenes`, `run_scene`, `search_catalog`,
`account_overview`, `plan_suggest`, `plan_run`, `audit_query`,
`audit_stats`, `policy_diff`, `policy_validate`, `policy_new`,
-`policy_migrate`, `rules_suggest`, `rule_notifications`) plus a
+`policy_migrate`, `rules_suggest`, `rule_notifications`,
+`rules_explain`, `rules_simulate`) plus a
`switchbot://events` resource for real-time shadow updates.
`rules_suggest` accepts an optional `llm` parameter (`openai | anthropic | auto`)
to generate YAML for complex intents via an LLM backend.
`rule_notifications` returns `rule-notify` audit entries, filterable by rule
name, time range, channel, and result.
`rules_explain` returns the decision trace for a specific evaluation (why a rule
fired or was blocked); `rules_simulate` replays historical events against a rule
and reports would-fire / blocked / throttled outcomes.
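An agent invokes these like any other MCP tool, via a `tools/call` JSON-RPC request. A sketch of the payload an MCP client would send for `rules_simulate` (the argument values are illustrative; transport framing is handled by the client library):

```typescript
// Hypothetical tools/call payload for rules_simulate. The argument
// names (rule_name, since) follow the tool's input schema; the values
// here are examples only.
const request = {
  jsonrpc: '2.0' as const,
  id: 1,
  method: 'tools/call',
  params: {
    name: 'rules_simulate',
    arguments: {
      rule_name: 'night AC',
      since: '7d',
    },
  },
};

console.log(JSON.stringify(request, null, 2));
```

The server replies with the `SimulateReport` in `structuredContent.report`, alongside a pretty-printed text copy for human inspection.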
See [`docs/agent-guide.md`](./docs/agent-guide.md) for the full tool reference and safety rules (destructive-command guard).

### `doctor` — self-check
@@ -1166,7 +1209,7 @@ npm install

npm run dev -- <args> # Run from TypeScript sources via tsx
npm run build # Compile to dist/
-npm test             # Run the Vitest suite (1959 tests)
+npm test             # Run the Vitest suite (2204 tests)
npm run test:watch # Watch mode
npm run test:coverage # Coverage report (v8, HTML + text)
```
@@ -1232,7 +1275,8 @@ src/
│ ├── install.ts # `switchbot install` / `uninstall`
│ ├── policy.ts # `policy validate/new/migrate/diff/add-rule/backup/restore`
│ ├── rules.ts # `rules suggest/lint/list/explain/run/reload/tail/replay/
-│ │ # conflicts/doctor/summary/last-fired/webhook-*`
+│ │ # conflicts/doctor/summary/last-fired/webhook-*/
+│ │ # trace-explain/simulate`
│ ├── scenes.ts
│ ├── health.ts # `health check/serve` — report + HTTP endpoints
│ ├── upgrade-check.ts # `upgrade-check` — npm registry version check
@@ -1256,7 +1300,7 @@
├── format.ts # renderRows / filterFields / output-format dispatch
├── audit.ts # JSONL audit log writer
└── quota.ts # Local daily-quota counter
-tests/ # Vitest suite (1959 tests, mocked axios, no network)
+tests/ # Vitest suite (2204 tests, mocked axios, no network)
```

### Release flow
4 changes: 2 additions & 2 deletions package-lock.json

Some generated files are not rendered by default.

2 changes: 1 addition & 1 deletion package.json
@@ -1,6 +1,6 @@
{
"name": "@switchbot/openapi-cli",
-  "version": "3.3.3",
+  "version": "3.4.0",
"description": "SwitchBot smart home CLI — control devices, run scenes, stream real-time events, and integrate AI agents via MCP. Full API v1.1 coverage.",
"keywords": [
"switchbot",
2 changes: 2 additions & 0 deletions src/commands/capabilities.ts
@@ -200,6 +200,8 @@ export const COMMAND_META: Record<string, CommandMeta> = {
'rules summary': READ_LOCAL,
'rules last-fired': READ_LOCAL,
'rules explain': READ_LOCAL,
'rules trace-explain': READ_LOCAL,
'rules simulate': READ_LOCAL,
'schema export': READ_LOCAL,
'scenes list': READ_REMOTE,
'scenes execute': ACTION_REMOTE,
8 changes: 5 additions & 3 deletions src/commands/doctor.ts
@@ -934,11 +934,13 @@ function checkNotifyConnectivity(): Check {
return { name: 'notify-connectivity', status: 'ok', detail: { present: false, message: 'policy file could not be loaded' } };
}

-const policy = loaded.data as { automation?: { rules?: Array<{ then?: Array<{ type?: string; channel?: string; to?: string }> }> } } | null;
-const rules = policy?.automation?.rules ?? [];
+const policy = loaded.data as { automation?: { rules?: unknown } } | null;
+const rawRules = policy?.automation?.rules;
+const rules = Array.isArray(rawRules) ? rawRules : [];
const webhookUrls: string[] = [];
for (const rule of rules) {
-for (const action of rule.then ?? []) {
+const then = (rule as { then?: unknown }).then;
+for (const action of (Array.isArray(then) ? then : []) as Array<{ type?: string; channel?: string; to?: string }>) {
if (action.type === 'notify' && (action.channel === 'webhook' || action.channel === 'openclaw') && action.to) {
webhookUrls.push(action.to);
}
152 changes: 152 additions & 0 deletions src/commands/mcp.ts
@@ -58,6 +58,12 @@ import { planMigration } from '../policy/migrate.js';
import { suggestPlan } from './plan.js';
import { suggestRule } from '../rules/suggest.js';
import { addRuleToPolicyFile, AddRuleError } from '../policy/add-rule.js';
import {
loadTraceRecords,
loadRelatedAudit,
formatExplainJson,
} from '../rules/explain.js';
import { simulateRule } from '../rules/simulate.js';
import { allowsDirectDestructiveExecution, destructiveExecutionHint } from '../lib/destructive-mode.js';
import { writeFileSync } from 'node:fs';
import { readAudit, type AuditEntry } from '../utils/audit.js';
@@ -1972,6 +1978,152 @@ API docs: https://github.com/OpenWonderLabs/SwitchBotAPI`,
},
);

// ---- rules_explain --------------------------------------------------------
server.registerTool(
'rules_explain',
{
title: 'Show why a rule evaluation fired or was blocked',
description:
'Read rule-evaluate trace records from the audit log and format them for inspection. ' +
'Pass fire_id to explain a specific evaluation; or pass rule_name with last:true for the ' +
'most recent evaluation; or pass rule_name + since for a window. ' +
'Returns trace records only when automation.audit.evaluate_trace is "sampled" or "full".',
_meta: { agentSafetyTier: 'read' },
inputSchema: z.object({
fire_id: z.string().optional().describe('Specific fireId to explain.'),
rule_name: z.string().optional().describe('Filter to this rule name.'),
since: z.string().optional().describe('Duration string (e.g. 1h, 7d) — show evaluations in this window.'),
last: z.boolean().optional().describe('Return only the most recent evaluation (requires rule_name).'),
audit_log: z.string().optional().describe(`Audit log path (default: ${pathJoin(os.homedir(), '.switchbot', 'audit.log')}).`),
}).strict(),
outputSchema: {
records: z.array(z.unknown()).describe('Array of trace + relatedAudit objects.'),
count: z.number().describe('Number of trace records returned.'),
},
},
async ({ fire_id, rule_name, since, last, audit_log }) => {
const DEFAULT_AUDIT_PATH = pathJoin(os.homedir(), '.switchbot', 'audit.log');
const auditFile = audit_log ?? DEFAULT_AUDIT_PATH;
const sinceIso = since
? new Date(Date.now() - (parseDurationToMs(since) ?? 0)).toISOString()
: undefined;

let records = loadTraceRecords(auditFile, {
fireId: fire_id,
ruleName: rule_name,
since: sinceIso,
});

if (records.length === 0) {
return {
content: [{ type: 'text' as const, text: 'No rule-evaluate trace records found. Check that automation.audit.evaluate_trace is "sampled" or "full".' }],
structuredContent: { records: [], count: 0 },
};
}

if (last) {
records = [records[records.length - 1]];
}

const output = records.map((record) => {
const related = loadRelatedAudit(auditFile, record.fireId);
return JSON.parse(formatExplainJson(record, related)) as unknown;
});

return {
content: [{ type: 'text' as const, text: JSON.stringify(output, null, 2) }],
structuredContent: { records: output, count: output.length },
};
},
);

// ---- rules_simulate -------------------------------------------------------
server.registerTool(
'rules_simulate',
{
title: 'Simulate a rule against historical events',
description:
'Replay historical events from the audit log or a JSONL file against a rule definition ' +
'and report would-fire / blocked-by-condition / throttled outcomes. ' +
'Useful for validating a new or modified rule before deployment. ' +
'Pass rule_yaml to test an unpublished rule, or rule_name + policy_path to test a deployed rule.',
_meta: { agentSafetyTier: 'read' },
inputSchema: z.object({
rule_yaml: z.string().optional().describe('Standalone rule YAML (takes precedence over policy_path + rule_name).'),
policy_path: z.string().optional().describe('Path to policy.yaml (defaults to ~/.switchbot/policy.yaml).'),
rule_name: z.string().optional().describe('Name of the rule in policy.yaml to simulate.'),
since: z.string().optional().describe('Replay events from this window (e.g. 7d, 24h).'),
against: z.string().optional().describe('JSONL file path of EngineEvent objects to replay.'),
live_llm: z.boolean().optional().describe('Allow live LLM calls for llm conditions (default: skip and report as would-call).'),
audit_log: z.string().optional().describe(`Audit log path (default: ${pathJoin(os.homedir(), '.switchbot', 'audit.log')}).`),
}).strict(),
outputSchema: {
report: z.unknown().describe('SimulateReport object.'),
},
},
async ({ rule_yaml, policy_path, rule_name, since, against, live_llm, audit_log }) => {
const DEFAULT_AUDIT_PATH = pathJoin(os.homedir(), '.switchbot', 'audit.log');
const auditFile = audit_log ?? DEFAULT_AUDIT_PATH;

let rule: Record<string, unknown> | undefined;

if (rule_yaml) {
try {
rule = yamlParse(rule_yaml) as Record<string, unknown>;
} catch (err) {
return {
content: [{ type: 'text' as const, text: `Failed to parse rule_yaml: ${String(err)}` }],
structuredContent: { report: null },
};
}
} else if (policy_path || rule_name) {
const { loadPolicyFile } = await import('../policy/load.js');
const policyFile = policy_path ?? pathJoin(os.homedir(), '.switchbot', 'policy.yaml');
try {
const policy = loadPolicyFile(policyFile);
const data = (policy.data ?? {}) as { automation?: { rules?: Array<{ name: string }> } };
const found = data.automation?.rules?.find((r) => r.name === rule_name);
if (!found) {
return {
content: [{ type: 'text' as const, text: `Rule "${rule_name}" not found in ${policyFile}.` }],
structuredContent: { report: null },
};
}
rule = found as unknown as Record<string, unknown>;
} catch (err) {
return {
content: [{ type: 'text' as const, text: `Failed to load policy: ${String(err)}` }],
structuredContent: { report: null },
};
}
} else {
return {
content: [{ type: 'text' as const, text: 'Provide rule_yaml or (policy_path + rule_name) to specify the rule to simulate.' }],
structuredContent: { report: null },
};
}

try {
const report = await simulateRule({
rule: rule as unknown as Parameters<typeof simulateRule>[0]['rule'],
since,
against,
auditLog: auditFile,
liveLlm: live_llm ?? false,
});
return {
content: [{ type: 'text' as const, text: JSON.stringify(report, null, 2) }],
structuredContent: { report },
};
} catch (err) {
return {
content: [{ type: 'text' as const, text: `Simulate error: ${String(err)}` }],
structuredContent: { report: null },
};
}
},
);

// ---- policy_add_rule ------------------------------------------------------
server.registerTool(
'policy_add_rule',