Merged
14 changes: 13 additions & 1 deletion CHANGELOG.md
@@ -9,7 +9,19 @@ This project follows [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [Unreleased]

-## [3.3.3] - 2026-05-07
+## [3.4.0] - 2026-05-07

### Added

- **Decision trace** — `automation.audit.evaluate_trace` (`full | sampled | off`, default `sampled`) records every rule evaluation decision to the audit log with condition-level breakdown, LLM call results, and throttle state. `evaluate_retention_days` bounds how long records are kept.
- **`rules trace-explain`** — inspect why a rule fired or was blocked. Reads trace records from the audit log; filter by rule name, fire ID, or time window. `--last` shows the most recent evaluation; `--json` outputs structured data for scripting.
- **LLM condition** (`llm:`) — gates rule execution on an AI yes/no judgement. Supports `provider: auto | openai | anthropic`, `cache_ttl` (skip redundant calls for identical context), `budget.max_calls_per_hour`, and `on_error: fail | pass | skip`. Cache key is content-addressed so equivalent prompts + context always hit the cache. `rules lint` flags misconfigurations (missing provider key, TTL too high for trigger frequency, budget zero).
- **`rules simulate`** — replay historical events from the audit log (or a `--against <file>` JSONL snapshot) against a rule and report would-fire / blocked-by-condition / throttled / error counts without starting the live engine. `--since <duration>` bounds the replay window.
- **MCP tools** — `rules_explain` (decision trace lookup) and `rules_simulate` (offline rule replay) exposed via the MCP server alongside existing rule tools.

### Changed

- Test suite: 1959 → 2204 tests (+245, covering trace, explain, LLM condition, and simulate modules).

### Added

60 changes: 52 additions & 8 deletions README.md
@@ -93,7 +93,7 @@ Under the hood every surface shares the same catalog, cache, and HMAC client —
- 🎨 **Dual output modes** — colorized tables by default; `--json` passthrough for `jq` and scripting
- 🔐 **Secure credentials** — HMAC-SHA256 signed requests; config file written with `0600`; env-var override for CI
- 🔍 **Dry-run mode** — preview every mutating request before it hits the API
-- 🧪 **Fully tested** — 1959 Vitest tests, mocked axios, zero network in CI
+- 🧪 **Fully tested** — 2204 Vitest tests, mocked axios, zero network in CI
- ⚡ **Shell completion** — Bash / Zsh / Fish / PowerShell

## Requirements
@@ -273,9 +273,9 @@ Five annotated starter files covering common setups live in
With a policy.yaml (v0.2) you can declare automations that the CLI
executes for you. Supported triggers: **MQTT** (device events),
**cron** (schedule-driven), and **webhook** (local HTTP POST).
-Supported conditions: `time_between` (quiet hours) and `device_state`
-(live API check with per-tick dedup). Every fire is recorded in
-`~/.switchbot/audit.log`. `rules run` is long-running; use
+Supported conditions: `time_between` (quiet hours), `device_state`
+(live API check with per-tick dedup), and `llm` (AI decision — see
+below). Every fire is recorded in `~/.switchbot/audit.log`. `rules run` is long-running; use
`daemon start` / `daemon reload` for the managed background mode.

**Actions** — each rule's `then` array accepts two action types:
@@ -296,6 +296,35 @@ then:
template: '{"rule":"{{ rule.name }}","fired":"{{ rule.fired_at }}"}'
```

**LLM condition** — add an AI judgement step before actions fire. The engine calls the
configured LLM provider, passes the prompt plus recent event context, and gates execution
on the model's yes/no answer:

```yaml
conditions:
- llm:
prompt: "Is the temperature above normal comfort range?"
provider: auto # auto | openai | anthropic
cache_ttl: 5m # skip redundant calls for identical context
budget:
max_calls_per_hour: 20
on_error: pass # fail | pass | skip
```

Set `OPENAI_API_KEY` or `ANTHROPIC_API_KEY` (provider `auto` tries Anthropic first).
`rules lint` flags misconfigured LLM conditions (no provider key, cache TTL too high for
the trigger frequency, budget zero). Evaluation decisions are recorded in the trace log.
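
The cache is keyed by content rather than by rule identity: equivalent prompts plus context always hit the cache. A minimal sketch of such a content-addressed key, assuming SHA-256 over a canonicalized prompt + context (the `llmCacheKey` and `canonicalize` names are illustrative, not the CLI's actual internals):

```typescript
import { createHash } from 'node:crypto';

// Canonicalize the context object so that key order does not affect
// the digest: {a:1, b:2} and {b:2, a:1} must produce the same string.
function canonicalize(value: unknown): string {
  if (value === null || typeof value !== 'object') return JSON.stringify(value);
  if (Array.isArray(value)) return `[${value.map(canonicalize).join(',')}]`;
  const entries = Object.entries(value as Record<string, unknown>)
    .sort(([a], [b]) => a.localeCompare(b))
    .map(([k, v]) => `${JSON.stringify(k)}:${canonicalize(v)}`);
  return `{${entries.join(',')}}`;
}

// Content-addressed cache key: identical prompt + context always yields
// the identical key, so a repeat evaluation can skip the LLM call.
function llmCacheKey(prompt: string, context: Record<string, unknown>): string {
  return createHash('sha256')
    .update(prompt)
    .update('\0') // separator so prompt/context boundaries cannot collide
    .update(canonicalize(context))
    .digest('hex');
}
```

Because the key depends only on content, a renamed rule with the same prompt and event context still reuses the cached verdict until `cache_ttl` expires.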

**Decision trace** — enable `automation.audit.evaluate_trace` in `policy.yaml` to record
every evaluation decision (why a rule fired or was blocked):

```yaml
automation:
audit:
evaluate_trace: sampled # full | sampled | off (default: sampled)
evaluate_retention_days: 7
```
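
The README does not spell out how `sampled` differs from `full`. One plausible gate, purely as a sketch (the `shouldTrace` helper and its 10% default sample rate are assumptions, not the engine's actual policy), is to always record fires and sample only the non-firing evaluations:

```typescript
type TraceMode = 'full' | 'sampled' | 'off';

// Hypothetical trace gate: 'full' records every evaluation, 'off' records
// none, and 'sampled' always keeps fires but only a fraction of non-fires.
// `rand` is injectable so the behavior is testable.
function shouldTrace(
  mode: TraceMode,
  fired: boolean,
  sampleRate = 0.1,
  rand: () => number = Math.random,
): boolean {
  if (mode === 'off') return false;
  if (mode === 'full' || fired) return true;
  return rand() < sampleRate;
}
```

Under a policy like this, `trace-explain` can always explain why a rule fired, while `full` is needed to reliably explain why it did not.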

```bash
# 1. Author rules under `automation.rules`. See examples/policies/automation.yaml
# for a walkthrough covering the three trigger sources.
@@ -333,6 +362,16 @@ switchbot rules suggest --intent "if door opens and temp below 20 turn on heater"
--llm auto # routes complex intents to LLM automatically
switchbot rules suggest --intent "..." --llm openai # explicit backend
# Set OPENAI_API_KEY or ANTHROPIC_API_KEY; auto mode falls back to heuristic on failure

# 9. Explain why a specific evaluation fired or was blocked (requires evaluate_trace).
switchbot rules trace-explain --rule "motion on" --last
switchbot rules trace-explain --rule "motion on" --since 1h --json
switchbot rules trace-explain <fireId> # single evaluation by ID

# 10. Simulate a rule against historical events without running the engine.
switchbot rules simulate "motion on" # replay last 24h from audit log
switchbot rules simulate "motion on" --since 7d --json
switchbot rules simulate policy.yaml --rule "night AC" --against events.jsonl
```
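
`rules simulate` condenses per-event outcomes into aggregate counts. A sketch of that aggregation (the `Outcome` union mirrors the outcomes listed above; the `SimulateReport` field names are assumptions, not the CLI's actual JSON shape):

```typescript
type Outcome = 'would-fire' | 'blocked-by-condition' | 'throttled' | 'error';

interface SimulateReport {
  total: number;
  counts: Record<Outcome, number>;
}

// Fold a list of per-event outcomes into the summary report a dry run
// would print. Every outcome bucket is initialized so absent outcomes
// show up as explicit zeros rather than missing keys.
function aggregate(outcomes: Outcome[]): SimulateReport {
  const counts: Record<Outcome, number> = {
    'would-fire': 0,
    'blocked-by-condition': 0,
    throttled: 0,
    error: 0,
  };
  for (const o of outcomes) counts[o] += 1;
  return { total: outcomes.length, counts };
}
```

Explicit zero buckets make the `--json` output stable for scripting: a consumer can always read `counts.throttled` without guarding against a missing field.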

`rules suggest` enforces several guardrails on LLM output so a model can't quietly arm
@@ -881,12 +920,16 @@ Exposes MCP tools (`list_devices`, `describe_device`, `get_device_status`,
`send_command`, `list_scenes`, `run_scene`, `search_catalog`,
`account_overview`, `plan_suggest`, `plan_run`, `audit_query`,
`audit_stats`, `policy_diff`, `policy_validate`, `policy_new`,
-`policy_migrate`, `rules_suggest`, `rule_notifications`) plus a
+`policy_migrate`, `rules_suggest`, `rule_notifications`,
+`rules_explain`, `rules_simulate`) plus a
`switchbot://events` resource for real-time shadow updates.
`rules_suggest` accepts an optional `llm` parameter (`openai | anthropic | auto`)
to generate YAML for complex intents via an LLM backend.
`rule_notifications` returns `rule-notify` audit entries, filterable by rule
name, time range, channel, and result.
`rules_explain` returns the decision trace for a specific evaluation (why a rule
fired or was blocked); `rules_simulate` replays historical events against a rule
and reports would-fire / blocked / throttled outcomes.
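An agent invokes these like any other MCP tool, via a `tools/call` JSON-RPC request. A sketch of the payload an MCP client would send for `rules_simulate` (the argument values are illustrative; transport framing is handled by the client library):

```typescript
// Hypothetical tools/call payload for rules_simulate. The argument
// names (rule_name, since) follow the tool's input schema; the values
// here are examples only.
const request = {
  jsonrpc: '2.0' as const,
  id: 1,
  method: 'tools/call',
  params: {
    name: 'rules_simulate',
    arguments: {
      rule_name: 'night AC',
      since: '7d',
    },
  },
};

console.log(JSON.stringify(request, null, 2));
```

The server replies with the `SimulateReport` in `structuredContent.report`, alongside a pretty-printed text copy for human inspection.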
See [`docs/agent-guide.md`](./docs/agent-guide.md) for the full tool reference and safety rules (destructive-command guard).

### `doctor` — self-check
@@ -1166,7 +1209,7 @@ npm install

npm run dev -- <args> # Run from TypeScript sources via tsx
npm run build # Compile to dist/
-npm test             # Run the Vitest suite (1959 tests)
+npm test             # Run the Vitest suite (2204 tests)
npm run test:watch # Watch mode
npm run test:coverage # Coverage report (v8, HTML + text)
```
@@ -1232,7 +1275,8 @@ src/
│ ├── install.ts # `switchbot install` / `uninstall`
│ ├── policy.ts # `policy validate/new/migrate/diff/add-rule/backup/restore`
│ ├── rules.ts # `rules suggest/lint/list/explain/run/reload/tail/replay/
-│ │ # conflicts/doctor/summary/last-fired/webhook-*`
+│ │ # conflicts/doctor/summary/last-fired/webhook-*/
+│ │ # trace-explain/simulate`
│ ├── scenes.ts
│ ├── health.ts # `health check/serve` — report + HTTP endpoints
│ ├── upgrade-check.ts # `upgrade-check` — npm registry version check
@@ -1256,7 +1300,7 @@
├── format.ts # renderRows / filterFields / output-format dispatch
├── audit.ts # JSONL audit log writer
└── quota.ts # Local daily-quota counter
-tests/ # Vitest suite (1959 tests, mocked axios, no network)
+tests/ # Vitest suite (2204 tests, mocked axios, no network)
```

### Release flow
4 changes: 2 additions & 2 deletions package-lock.json

Some generated files are not rendered by default.

2 changes: 1 addition & 1 deletion package.json
@@ -1,6 +1,6 @@
{
"name": "@switchbot/openapi-cli",
-  "version": "3.3.3",
+  "version": "3.4.0",
"description": "SwitchBot smart home CLI — control devices, run scenes, stream real-time events, and integrate AI agents via MCP. Full API v1.1 coverage.",
"keywords": [
"switchbot",
2 changes: 2 additions & 0 deletions src/commands/capabilities.ts
@@ -200,6 +200,8 @@ export const COMMAND_META: Record<string, CommandMeta> = {
'rules summary': READ_LOCAL,
'rules last-fired': READ_LOCAL,
'rules explain': READ_LOCAL,
'rules trace-explain': READ_LOCAL,
'rules simulate': READ_LOCAL,
'schema export': READ_LOCAL,
'scenes list': READ_REMOTE,
'scenes execute': ACTION_REMOTE,
8 changes: 5 additions & 3 deletions src/commands/doctor.ts
@@ -934,11 +934,13 @@ function checkNotifyConnectivity(): Check {
return { name: 'notify-connectivity', status: 'ok', detail: { present: false, message: 'policy file could not be loaded' } };
}

-const policy = loaded.data as { automation?: { rules?: Array<{ then?: Array<{ type?: string; channel?: string; to?: string }> }> } } | null;
-const rules = policy?.automation?.rules ?? [];
+const policy = loaded.data as { automation?: { rules?: unknown } } | null;
+const rawRules = policy?.automation?.rules;
+const rules = Array.isArray(rawRules) ? rawRules : [];
const webhookUrls: string[] = [];
for (const rule of rules) {
-for (const action of rule.then ?? []) {
+const then = (rule as { then?: unknown }).then;
+for (const action of (Array.isArray(then) ? then : []) as Array<{ type?: string; channel?: string; to?: string }>) {
if (action.type === 'notify' && (action.channel === 'webhook' || action.channel === 'openclaw') && action.to) {
webhookUrls.push(action.to);
}
152 changes: 152 additions & 0 deletions src/commands/mcp.ts
@@ -58,6 +58,12 @@ import { planMigration } from '../policy/migrate.js';
import { suggestPlan } from './plan.js';
import { suggestRule } from '../rules/suggest.js';
import { addRuleToPolicyFile, AddRuleError } from '../policy/add-rule.js';
import {
loadTraceRecords,
loadRelatedAudit,
formatExplainJson,
} from '../rules/explain.js';
import { simulateRule } from '../rules/simulate.js';
import { allowsDirectDestructiveExecution, destructiveExecutionHint } from '../lib/destructive-mode.js';
import { writeFileSync } from 'node:fs';
import { readAudit, type AuditEntry } from '../utils/audit.js';
@@ -1972,6 +1978,152 @@ API docs: https://github.com/OpenWonderLabs/SwitchBotAPI`,
},
);

// ---- rules_explain --------------------------------------------------------
server.registerTool(
'rules_explain',
{
title: 'Show why a rule evaluation fired or was blocked',
description:
'Read rule-evaluate trace records from the audit log and format them for inspection. ' +
'Pass fire_id to explain a specific evaluation; or pass rule_name with last:true for the ' +
'most recent evaluation; or pass rule_name + since for a window. ' +
'Returns trace records only when automation.audit.evaluate_trace is "sampled" or "full".',
_meta: { agentSafetyTier: 'read' },
inputSchema: z.object({
fire_id: z.string().optional().describe('Specific fireId to explain.'),
rule_name: z.string().optional().describe('Filter to this rule name.'),
since: z.string().optional().describe('Duration string (e.g. 1h, 7d) — show evaluations in this window.'),
last: z.boolean().optional().describe('Return only the most recent evaluation (requires rule_name).'),
audit_log: z.string().optional().describe(`Audit log path (default: ${pathJoin(os.homedir(), '.switchbot', 'audit.log')}).`),
}).strict(),
outputSchema: {
records: z.array(z.unknown()).describe('Array of trace + relatedAudit objects.'),
count: z.number().describe('Number of trace records returned.'),
},
},
async ({ fire_id, rule_name, since, last, audit_log }) => {
const DEFAULT_AUDIT_PATH = pathJoin(os.homedir(), '.switchbot', 'audit.log');
const auditFile = audit_log ?? DEFAULT_AUDIT_PATH;
const sinceIso = since
? new Date(Date.now() - (parseDurationToMs(since) ?? 0)).toISOString()
: undefined;

let records = loadTraceRecords(auditFile, {
fireId: fire_id,
ruleName: rule_name,
since: sinceIso,
});

if (records.length === 0) {
return {
content: [{ type: 'text' as const, text: 'No rule-evaluate trace records found. Check that automation.audit.evaluate_trace is "sampled" or "full".' }],
structuredContent: { records: [], count: 0 },
};
}

if (last) {
records = [records[records.length - 1]];
}

const output = records.map((record) => {
const related = loadRelatedAudit(auditFile, record.fireId);
return JSON.parse(formatExplainJson(record, related)) as unknown;
});

return {
content: [{ type: 'text' as const, text: JSON.stringify(output, null, 2) }],
structuredContent: { records: output, count: output.length },
};
},
);

// ---- rules_simulate -------------------------------------------------------
server.registerTool(
'rules_simulate',
{
title: 'Simulate a rule against historical events',
description:
'Replay historical events from the audit log or a JSONL file against a rule definition ' +
'and report would-fire / blocked-by-condition / throttled outcomes. ' +
'Useful for validating a new or modified rule before deployment. ' +
'Pass rule_yaml to test an unpublished rule, or rule_name + policy_path to test a deployed rule.',
_meta: { agentSafetyTier: 'read' },
inputSchema: z.object({
rule_yaml: z.string().optional().describe('Standalone rule YAML (takes precedence over policy_path + rule_name).'),
policy_path: z.string().optional().describe('Path to policy.yaml (defaults to ~/.switchbot/policy.yaml).'),
rule_name: z.string().optional().describe('Name of the rule in policy.yaml to simulate.'),
since: z.string().optional().describe('Replay events from this window (e.g. 7d, 24h).'),
against: z.string().optional().describe('JSONL file path of EngineEvent objects to replay.'),
live_llm: z.boolean().optional().describe('Allow live LLM calls for llm conditions (default: skip and report as would-call).'),
audit_log: z.string().optional().describe(`Audit log path (default: ${pathJoin(os.homedir(), '.switchbot', 'audit.log')}).`),
}).strict(),
outputSchema: {
report: z.unknown().describe('SimulateReport object.'),
},
},
async ({ rule_yaml, policy_path, rule_name, since, against, live_llm, audit_log }) => {
const DEFAULT_AUDIT_PATH = pathJoin(os.homedir(), '.switchbot', 'audit.log');
const auditFile = audit_log ?? DEFAULT_AUDIT_PATH;

let rule: Record<string, unknown> | undefined;

if (rule_yaml) {
try {
rule = yamlParse(rule_yaml) as Record<string, unknown>;
} catch (err) {
return {
content: [{ type: 'text' as const, text: `Failed to parse rule_yaml: ${String(err)}` }],
structuredContent: { report: null },
};
}
} else if (policy_path || rule_name) {
const { loadPolicyFile } = await import('../policy/load.js');
const policyFile = policy_path ?? pathJoin(os.homedir(), '.switchbot', 'policy.yaml');
try {
const policy = loadPolicyFile(policyFile);
const data = (policy.data ?? {}) as { automation?: { rules?: Array<{ name: string }> } };
const found = data.automation?.rules?.find((r) => r.name === rule_name);
if (!found) {
return {
content: [{ type: 'text' as const, text: `Rule "${rule_name}" not found in ${policyFile}.` }],
structuredContent: { report: null },
};
}
rule = found as unknown as Record<string, unknown>;
} catch (err) {
return {
content: [{ type: 'text' as const, text: `Failed to load policy: ${String(err)}` }],
structuredContent: { report: null },
};
}
} else {
return {
content: [{ type: 'text' as const, text: 'Provide rule_yaml or (policy_path + rule_name) to specify the rule to simulate.' }],
structuredContent: { report: null },
};
}

try {
const report = await simulateRule({
rule: rule as unknown as Parameters<typeof simulateRule>[0]['rule'],
since,
against,
auditLog: auditFile,
liveLlm: live_llm ?? false,
});
return {
content: [{ type: 'text' as const, text: JSON.stringify(report, null, 2) }],
structuredContent: { report },
};
} catch (err) {
return {
content: [{ type: 'text' as const, text: `Simulate error: ${String(err)}` }],
structuredContent: { report: null },
};
}
},
);

// ---- policy_add_rule ------------------------------------------------------
server.registerTool(
'policy_add_rule',