diff --git a/.agents/skills/building-dashboards/.meta/.gitkeep b/.agents/skills/building-dashboards/.meta/.gitkeep
new file mode 100644
index 00000000..e69de29b
diff --git a/.agents/skills/building-dashboards/README.md b/.agents/skills/building-dashboards/README.md
new file mode 100644
index 00000000..5cfd110e
--- /dev/null
+++ b/.agents/skills/building-dashboards/README.md
@@ -0,0 +1,91 @@
+# building-dashboards
+
+Designs and builds Axiom dashboards via API. Covers chart types, APL patterns, SmartFilters, layout, and configuration options.
+
+## What It Does
+
+- **Dashboard Design** - Blueprint structure: at-a-glance stats, trends, breakdowns, evidence
+- **Chart Types** - Statistic, TimeSeries, Table, Pie, LogStream, Heatmap, SmartFilter, Note
+- **APL + Metrics/MPL Patterns** - Golden signals, percentiles, error rates, and metrics chart queries via `query.apl`
+- **Layout Composition** - Grid-based layouts with section templates
+- **Deployment** - Scripts to validate, create, update, and manage dashboards
+
+## Installation
+
+```bash
+npx skills add axiomhq/skills
+```
+
+## Prerequisites
+
+- `axiom-sre` skill (for API access and schema discovery)
+- `query-metrics` skill (for metrics dataset/metric/tag discovery; also vendored locally in `scripts/metrics/`)
+- Tools: `jq`, `curl`
+
+The install command above includes all skill dependencies.
+
+## Configuration
+
+Create `~/.axiom.toml` with your Axiom deployment(s):
+
+```toml
+[deployments.prod]
+url = "https://api.axiom.co"
+token = "xaat-your-api-token"
+org_id = "your-org-id"
+```
+
+- **`org_id`** - The organization ID. Get it from Settings → Organization.
+- **`token`** - Use an advanced API token with minimal privileges.
+
+**Tip:** Run `scripts/setup` from the `axiom-sre` skill for interactive configuration.
+
+## Usage
+
+```bash
+# Setup and check requirements
+scripts/setup
+
+# Create dashboard from template
+scripts/dashboard-from-template service-overview "my-service" "my-dataset" ./dashboard.json
+
+# Validate dashboard JSON
+scripts/dashboard-validate ./dashboard.json
+
+# Deploy dashboard
+scripts/dashboard-create <deployment> ./dashboard.json
+
+# List, update, delete
+scripts/dashboard-list <deployment>
+scripts/dashboard-update <deployment> <id> <file>
+scripts/dashboard-chart-patch <deployment> <id> <chart-id> <patch-file> --version <version>
+scripts/dashboard-delete <deployment> <id>
+```
+
+## Scripts
+
+| Script | Purpose |
+|--------|---------|
+| `dashboard-create` | Deploy new dashboard |
+| `dashboard-validate` | Validate JSON structure |
+| `dashboard-list` | List all dashboards |
+| `dashboard-get` | Fetch dashboard JSON |
+| `dashboard-update` | Update existing dashboard |
+| `dashboard-chart-patch` | Patch one chart in an existing dashboard |
+| `dashboard-copy` | Clone a dashboard |
+| `dashboard-delete` | Delete with confirmation |
+| `dashboard-from-template` | Generate from template |
+
+## Templates
+
+Pre-built templates in `reference/templates/`:
+- `service-overview.json` - Single service oncall dashboard
+- `service-overview-with-filters.json` - With SmartFilter dropdowns
+- `api-health.json` - HTTP API health dashboard
+- `blank.json` - Minimal skeleton
+
+## Related Skills
+
+- `axiom-sre` - Schema discovery and query exploration
+- `query-metrics` - Discover metric names, tags, and tag values for MPL queries
+- `spl-to-apl` - Translate Splunk dashboards to Axiom
diff --git a/.agents/skills/building-dashboards/SKILL.md b/.agents/skills/building-dashboards/SKILL.md
new file mode 100644
index 00000000..83dbb32a
--- /dev/null
+++ b/.agents/skills/building-dashboards/SKILL.md
@@ -0,0 +1,655 @@
+---
+name: building-dashboards
+description: Designs and builds Axiom dashboards via API. Covers chart types, APL and metrics/MPL query patterns, SmartFilters, layout, and configuration options. Use when creating dashboards, migrating from Splunk, or configuring chart options.
+---
+
+# Building Dashboards
+
+You design dashboards that help humans make decisions quickly. Dashboards are products: audience, questions, and actions matter more than chart count.
+
+## Philosophy
+
+1. **Decisions first.** Every panel answers a question that leads to an action.
+2. **Overview → drilldown → evidence.** Start broad, narrow on click/filter, end with raw logs.
+3. **Rates and percentiles over averages.** Averages hide problems; p95/p99 expose them.
+4. **Simple beats dense.** One question per panel. No chart junk.
+5. **Validate with data.** Never guess fields—discover schema first.
+
+---
+
+## Entry Points
+
+Choose your starting point:
+
+| Starting from | Workflow |
+|---------------|----------|
+| **Vague description** | Intake → check dataset kind → design blueprint (APL or MPL) → queries per panel → deploy |
+| **Template** | Pick template → customize dataset/service/env → deploy |
+| **Splunk dashboard** | Extract SPL → translate via spl-to-apl → map to chart types → deploy |
+| **Exploration** | Use axiom-sre to discover schema/signals → productize into panels |
+
+---
+
+## Intake: What to Ask First
+
+Before designing, clarify:
+
+1. **Audience & decision**
+   - Oncall triage? (fast refresh, error-focused)
+   - Team health? (daily trends, SLO tracking)
+   - Exec reporting? (weekly summaries, high-level)
+
+2. **Scope**
+   - Service, environment, region, cluster, endpoint?
+   - Single service or cross-service view?
+
+3. **Dataset kind (mandatory first step)**
+   - Run `scripts/metrics/datasets <deploy>` to identify each dataset's `kind`
+   - **If `kind` is `otel:metrics:v1`** → this is a metrics dataset. Follow the **Metrics path** below.
+   - **Otherwise** → this is an events/logs dataset. Follow the **APL path** below.
+
+   > **⚠️ NEVER run `getschema` on a metrics dataset.** APL queries against `otel:metrics:v1` datasets return 0 rows without error — you will waste calls widening time ranges before realizing it's the wrong discovery method.
+
+   **APL path** (events/logs datasets):
+   - Discover fields with `getschema`:
+   ```apl
+   ['dataset'] | where _time between (ago(1h) .. now()) | getschema
+   ```
+   - Continue to steps 4–5 below.
+
+   **Metrics path** (`otel:metrics:v1` datasets):
+   - Run `scripts/metrics/metrics-spec <deploy> <dataset>` — **mandatory before composing any MPL query**
+   - Discover available metrics: `scripts/metrics/metrics-info <deploy> <dataset> metrics`
+   - Discover tags: `scripts/metrics/metrics-info <deploy> <dataset> tags`
+   - Explore tag values: `scripts/metrics/metrics-info <deploy> <dataset> tags <tag> values`
+   - If discovery returns empty results, retry with `--start` set to 7 days ago — sparse metrics (sensors, batch jobs, crons) may not have data in the default 24h window
+   - `find-metrics <value>` searches **tag values**, not metric names — use it only when you know a specific entity name (service, host, device) to find which metrics are associated with it
+   - Skip to the **Metrics/MPL Blueprint** below for panel design.
+
+4. **Golden signals** (APL path)
+   - Traffic: requests/sec, events/min
+   - Errors: error rate, 5xx count
+   - Latency: p50, p95, p99 duration
+   - Saturation: CPU, memory, queue depth, connections
+
+5. **Drilldown dimensions** (APL path)
+   - What do users filter/group by? (service, route, status, pod, customer_id)
+
+---
+
+## Dashboard Blueprint
+
+Choose the blueprint that matches your dataset kind (identified in Intake step 3).
+
+### APL Blueprint (events/logs datasets)
+
+#### 1. At-a-Glance (Statistic panels)
+Single numbers that answer "is it broken right now?"
+- Error rate (last 5m)
+- p95 latency (last 5m)
+- Request rate (last 5m)
+- Active alerts (if applicable)
+
+#### 2. Trends (TimeSeries panels)
+Time-based patterns that answer "what changed?"
+- Traffic over time
+- Error rate over time
+- Latency percentiles over time
+- Stacked by status/service for comparison
+
+#### 3. Breakdowns (Table/Pie panels)
+Top-N analysis that answers "where should I look?"
+- Top 10 failing routes
+- Top 10 error messages
+- Worst pods by error rate
+- Request distribution by status
+
+#### 4. Evidence (LogStream + SmartFilter)
+Raw events that answer "what exactly happened?"
+- LogStream filtered to errors
+- SmartFilter for service/env/route
+- Key fields projected for readability
+
+### Metrics/MPL Blueprint (metrics datasets)
+
+> **Prerequisite:** You MUST have run `scripts/metrics/metrics-spec` and `scripts/metrics/metrics-info` before designing panels. Never guess MPL syntax or metric/tag names.
+
+> **🚨 ALIGNMENT RULE — non-negotiable for dashboard panels:** Always align to the dashboard-supplied variable `$__interval`, not a fixed window. The dashboard runtime substitutes `$__interval` based on the active time range and panel width, so the same chart stays usable from a 5-minute to a 30-day view. Hard-coding `align to 1m` (or any constant) over-resolves long ranges and under-resolves short ones.
+>
+> ```mpl
+> | align to $__interval using avg   ✅ dashboard panels
+> | align to 1m using avg            ❌ fixed window — wrong granularity at most time ranges
+> ```
+>
+> **No `param` declaration needed in the chart `query.apl`** — the dashboard runtime injects `param $__interval: Duration;` automatically. (The Grafana datasource does the same via a preamble; the Axiom-native dashboard runtime behaves identically — verified against working production dashboards.)
+>
+> **Exceptions:** If you are pre-validating a query through `scripts/metrics/metrics-query` (which has no dashboard runtime), substitute a concrete duration for the test call only — do NOT commit that to the chart JSON. For genuinely sparse metrics where `$__interval` would round to an empty bucket (sensors, batch jobs, crons), a fixed wider window (e.g. `1h`) is acceptable; document why in the chart description.
+
+#### 1. At-a-Glance (Statistic panels)
+Current values for key metrics — answer "what's the state right now?"
+- Latest value of primary metrics (e.g., current temperature, power draw)
+- Use `group using avg` or `group using last` depending on metric type (gauge vs counter)
+
+#### 2. Trends (TimeSeries panels)
+Metric trends over time — answer "what changed?"
+- Primary metrics over time, grouped by key dimension
+- Use `align to $__interval using avg|sum|last` for proper time bucketing — `$__interval` is supplied by the dashboard runtime
+- Group by low-cardinality tags only (≤10 series per chart)
+
+#### 3. Breakdowns (TimeSeries or Table panels)
+Per-entity detail — answer "where should I look?"
+- Metrics broken down by entity (room, host, pod, service)
+- Filter by tag values to keep series count manageable
+- Use separate panels per dimension rather than one overloaded chart
+
+#### 4. Entity State (TimeSeries or Table panels)
+Boolean/state metrics — answer "what is on/off/active?"
+- Use `align to $__interval using last` for state metrics
+- Sparse metrics may need wider **fixed** align intervals (1h+) to show data — this is the documented exception to the `$__interval` rule
+
+---
+
+## Layout Auto-Normalization
+
+The console uses `react-grid-layout`, which requires `minH`, `minW`, `moved`, and `static` on every layout entry. The `dashboard-create` and `dashboard-update` scripts auto-fill these fields if they are omitted, so layout entries only need `i`, `x`, `y`, `w`, and `h`.
+
+---
+
+## Required Chart Structure
+
+**Every chart MUST have a unique `id` field.** Every layout entry's `i` field MUST reference a chart `id`. Missing or mismatched IDs will corrupt the dashboard in the UI (blank state, unable to save/revert).
+
+```json
+{
+  "charts": [
+    {
+      "id": "error-rate",
+      "name": "Error Rate",
+      "type": "Statistic",
+      "query": { "apl": "..." }
+    }
+  ],
+  "layout": [
+    {"i": "error-rate", "x": 0, "y": 0, "w": 3, "h": 2}
+  ]
+}
+```
+
+Use descriptive kebab-case IDs (e.g. `error-rate`, `p95-latency`, `traffic-rps`). The `dashboard-validate` and deploy scripts enforce this automatically.
+
+---
+
+## Metrics/MPL Chart Contract
+
+Metrics-backed charts require both `query.apl` (the MPL pipeline string) and `query.metricsDataset` (the dataset name). The `metricsDataset` field is what tells the backend to interpret `apl` as MPL rather than APL. If it is omitted, the backend parses the pipeline as APL, and the chart can render blank or show wrong values even when the pipeline string is well-formed.
+
+> **CRITICAL:** Run `scripts/metrics/metrics-spec <deployment> <dataset>` before composing your first MPL query in a session. NEVER guess MPL syntax.
+>
+> **API gotcha:** Set `query.metricsDataset` to the dataset name (e.g. `"otel-metrics"`). The create API rejects `query.mpl` even though GET responses for existing metrics dashboards may include it — put the MPL string in `query.apl` instead.
+
+```json
+{
+  "type": "TimeSeries",
+  "query": {
+    "apl": "`otel-metrics`:`http.server.duration`\n| where `service.name` == \"api\"\n| align to $__interval using avg\n| group by `service.name` using avg",
+    "metricsDataset": "otel-metrics"
+  }
+}
+```
+
+Validate queries with `scripts/metrics/metrics-query` before embedding in dashboard JSON.
+
+See `reference/metrics-mpl.md` for the full contract and discovery scripts.
+
+---
+
+## Chart Types
+
+**Note:** Dashboard queries inherit time from the UI picker—no explicit `_time` filter needed.
+
+**Validation:** TimeSeries, Statistic, Table, Pie, LogStream, Note, and MonitorList are fully validated by `dashboard-validate`. Heatmap, ScatterPlot, and SmartFilter work but may trigger warnings.
+
+### Statistic
+**When:** Single KPI, current value, threshold comparison.
+
+```apl
+['logs']
+| where service == "api"
+| summarize
+    total = count(),
+    errors = countif(status >= 500)
+| extend error_rate = iff(total > 0, round(100.0 * errors / total, 2), 0.0)
+| project error_rate
+```
+
+**Pitfalls:** Don't use for time series; ensure the query returns a single row.
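+
+A minimal sketch of how a variant of the Statistic query above could be embedded as a chart, combining the fields from "Required Chart Structure" with Statistic options listed in `reference/chart-config.md`. The query adds a divide-by-zero guard in the style of the Golden Signal error-rate query. The `id`, `name`, and threshold values are illustrative assumptions, not recommended defaults.
+
+```json
+{
+  "id": "error-rate",
+  "name": "Error rate (%)",
+  "type": "Statistic",
+  "query": {
+    "apl": "['logs']\n| where service == \"api\"\n| summarize total = count(), errors = countif(status >= 500)\n| extend error_rate = iff(total > 0, round(100.0 * errors / total, 2), 0.0)\n| project error_rate"
+  },
+  "unit": "Percent100",
+  "errorThreshold": "Above",
+  "errorThresholdValue": "5",
+  "warningThreshold": "Above",
+  "warningThresholdValue": "1"
+}
+```
+
+`Percent100` is chosen here because the query already scales the rate to 0-100.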
+
+### TimeSeries
+**When:** Trends over time, before/after comparison, rate changes.
+
+```apl
+// Single metric - use bin_auto for automatic sizing
+['logs']
+| summarize ['req/min'] = count() by bin_auto(_time)
+
+// Latency percentiles - use percentiles_array for proper overlay
+['logs']
+| summarize percentiles_array(duration_ms, 50, 95, 99) by bin_auto(_time)
+```
+
+**Best practices:**
+- Use `bin_auto(_time)` instead of fixed `bin(_time, 1m)` — auto-adjusts to time window
+- Use `percentiles_array()` instead of multiple `percentile()` calls — renders as one chart
+- Too many series = unreadable; use `top N` or filter
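+
+As a sketch, the latency query above might be wrapped in a chart object like this. The `id` and `name` are hypothetical, and only fields shown in "Required Chart Structure" are used. Line breaks in a multi-line query are encoded as `\n` inside the JSON string.
+
+```json
+{
+  "id": "latency-percentiles",
+  "name": "Latency percentiles (ms)",
+  "type": "TimeSeries",
+  "query": {
+    "apl": "['logs']\n| summarize percentiles_array(duration_ms, 50, 95, 99) by bin_auto(_time)"
+  }
+}
+```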
+
+### Table
+**When:** Top-N lists, detailed breakdowns, exportable data.
+
+```apl
+['logs']
+| where status >= 500
+| summarize errors = count() by route, error_message
+| top 10 by errors
+| project route, error_message, errors
+```
+
+**Pitfalls:**
+- Always use `top N` to prevent unbounded results
+- Use `project` to control column order and names
+
+### Pie
+**When:** Share-of-total for LOW cardinality dimensions (≤6 slices).
+
+```apl
+['logs']
+| summarize count() by status_class = case(
+    status < 300, "2xx",
+    status < 400, "3xx",
+    status < 500, "4xx",
+    "5xx"
+  )
+```
+
+**Pitfalls:**
+- Never use for high cardinality (routes, user IDs)
+- Prefer tables for >6 categories
+- Always aggregate to reduce slices
+
+### LogStream
+**When:** Raw event inspection, debugging, evidence gathering.
+
+```apl
+['logs']
+| where service == "api" and status >= 500
+| project-keep _time, trace_id, route, status, error_message, duration_ms
+| take 100
+```
+
+**Pitfalls:**
+- Always include `take N` (100-500 max)
+- Use `project-keep` to show relevant fields only
+- Filter aggressively—raw logs are expensive
+
+### Heatmap
+**When:** Distribution visualization, latency patterns, density analysis.
+
+```apl
+['logs']
+| summarize histogram(duration_ms, 15) by bin_auto(_time)
+```
+
+**Best for:** Latency distributions, response time patterns, identifying outliers.
+
+### Scatter Plot
+**When:** Correlation between two metrics, identifying patterns.
+
+```apl
+['logs']
+| summarize avg(duration_ms), avg(resp_size_bytes) by route
+```
+
+**Best for:** Response size vs latency correlation, resource usage patterns.
+
+### SmartFilter (Filter Bar)
+**When:** Interactive filtering for the entire dashboard.
+
+SmartFilter is a **chart type** that creates dropdown/search filters. Requires:
+1. A `SmartFilter` chart with filter definitions
+2. `declare query_parameters` in each panel query
+
+**Filter types:**
+- `selectType: "apl"` — Dynamic dropdown from APL query
+- `selectType: "list"` — Static dropdown with predefined options
+- `type: "search"` — Free-text input
+
+**Panel query pattern:**
+```apl
+declare query_parameters (country_filter:string = "");
+['logs'] | where isempty(country_filter) or ['geo.country'] == country_filter
+```
+
+See `reference/smartfilter.md` for full JSON structure and cascading filter examples.
+
+### Monitor List
+**When:** Display monitor status on operational dashboards.
+
+No APL needed—select monitors from the UI. Shows:
+- Monitor status (normal/triggered/off)
+- Run history (green/red squares)
+- Dataset, type, notifiers
+
+### Note
+**When:** Context, instructions, section headers.
+
+Use GitHub Flavored Markdown for:
+- Dashboard purpose and audience
+- Runbook links
+- Section dividers
+- On-call instructions
+
+---
+
+## Chart Configuration
+
+Charts support JSON configuration options beyond the query. See `reference/chart-config.md` for full details.
+
+**Quick reference:**
+
+| Chart Type | Key Options |
+|------------|-------------|
+| Statistic | `colorScheme`, `customUnits`, `unit`, `showChart` (sparkline), `errorThreshold`/`warningThreshold` |
+| TimeSeries | `aggChartOpts`: `variant` (line/area/bars), `scaleDistr` (linear/log), `displayNull` |
+| LogStream/Table | `tableSettings`: `columns`, `fontSize`, `highlightSeverity`, `wrapLines` |
+| Pie | `hideHeader` |
+| Note | `text` (markdown), `variant` |
+
+**Common options (all charts):**
+- `overrideDashboardTimeRange`: boolean
+- `overrideDashboardCompareAgainst`: boolean
+- `hideHeader`: boolean
+
+---
+
+## APL Patterns
+
+### Time Filtering in Dashboards vs Ad-hoc Queries
+
+**Dashboard panel queries do NOT need explicit time filters.** The dashboard UI time picker automatically scopes all queries to the selected time window.
+
+```apl
+// DASHBOARD QUERY — no time filter needed
+['logs']
+| where service == "api"
+| summarize count() by bin_auto(_time)
+```
+
+**Ad-hoc queries (Axiom Query tab, axiom-sre exploration) MUST have explicit time filters:**
+
+```apl
+// AD-HOC QUERY — always include time filter
+['logs']
+| where _time between (ago(1h) .. now())
+| where service == "api"
+| summarize count() by bin_auto(_time)
+```
+
+### Bin Size Selection
+
+**Prefer `bin_auto(_time)`** — it automatically adjusts to the dashboard time window.
+
+Manual bin sizes (only when auto doesn't fit your needs):
+
+| Time window | Bin size |
+|-------------|----------|
+| 15m | 10s–30s |
+| 1h | 1m |
+| 6h | 5m |
+| 24h | 15m–1h |
+| 7d | 1h–6h |
+
+### Cardinality Guardrails
+Prevent query explosion:
+
+```apl
+// GOOD: bounded
+| summarize count() by route | top 10 by count_
+
+// BAD: unbounded high-cardinality grouping
+| summarize count() by user_id  // millions of rows
+```
+
+### Field Escaping
+Fields with dots need bracket notation:
+
+```apl
+| where ['kubernetes.pod.name'] == "frontend"
+```
+
+Fields with dots IN the name (not hierarchy) need escaping:
+
+```apl
+| where ['kubernetes.labels.app\\.kubernetes\\.io/name'] == "frontend"
+```
+
+### Golden Signal Queries
+
+**Traffic:**
+```apl
+| summarize requests = count() by bin_auto(_time)
+```
+
+**Errors (as rate %):**
+```apl
+| summarize total = count(), errors = countif(status >= 500) by bin_auto(_time)
+| extend error_rate = iff(total > 0, round(100.0 * errors / total, 2), 0.0)
+| project _time, error_rate
+```
+
+**Latency (use percentiles_array for proper chart overlay):**
+```apl
+| summarize percentiles_array(duration_ms, 50, 95, 99) by bin_auto(_time)
+```
+
+---
+
+## Layout Composition
+
+### Grid Principles
+- Dashboard width = 12 units
+- Typical panel: w=3 (quarter), w=4 (third), w=6 (half), w=12 (full)
+- Stats row: 4 panels × w=3, h=2
+- TimeSeries row: 2 panels × w=6, h=4
+- Tables: w=6 or w=12, h=4–6
+- LogStream: w=12, h=6–8
+
+### Section Layout Pattern
+
+```
+Row 0-1:  [Stat w=3] [Stat w=3] [Stat w=3] [Stat w=3]
+Row 2-5:  [TimeSeries w=6, h=4] [TimeSeries w=6, h=4]
+Row 6-9:  [Table w=6, h=4] [Pie w=6, h=4]
+Row 10+:  [LogStream w=12, h=6]
+```
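+
+A sketch of the pattern above as a `layout` array, assuming hypothetical chart IDs that would each have to match a chart `id` in the same dashboard. Only `i`, `x`, `y`, `w`, and `h` are set, since the deploy scripts auto-fill the remaining `react-grid-layout` fields.
+
+```json
+{
+  "layout": [
+    {"i": "error-rate", "x": 0, "y": 0, "w": 3, "h": 2},
+    {"i": "p95-latency", "x": 3, "y": 0, "w": 3, "h": 2},
+    {"i": "traffic-rps", "x": 6, "y": 0, "w": 3, "h": 2},
+    {"i": "active-alerts", "x": 9, "y": 0, "w": 3, "h": 2},
+    {"i": "traffic-over-time", "x": 0, "y": 2, "w": 6, "h": 4},
+    {"i": "latency-percentiles", "x": 6, "y": 2, "w": 6, "h": 4},
+    {"i": "top-failing-routes", "x": 0, "y": 6, "w": 6, "h": 4},
+    {"i": "status-distribution", "x": 6, "y": 6, "w": 6, "h": 4},
+    {"i": "error-logs", "x": 0, "y": 10, "w": 12, "h": 6}
+  ]
+}
+```
+
+Each row's `y` equals the previous row's `y` plus its `h`, and the `w` values in every row sum to the 12-unit grid width.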
+
+### Naming Conventions
+- Use question-style titles: "Error rate by route" not "Errors"
+- Prefix with context if multi-service: "[API] Error rate"
+- Include units: "Latency (ms)", "Traffic (req/s)"
+
+---
+
+## Dashboard Settings
+
+### Refresh Rate
+The dashboard auto-refreshes at the configured interval. Options: 15s, 30s, 1m, 5m, etc.
+
+**⚠️ Query cost warning:** Short refresh (15s) + long time range (90d) = expensive queries running constantly.
+
+Recommendations:
+| Use case | Refresh rate |
+|----------|-------------|
+| Oncall/real-time | 15s–30s |
+| Team health | 1m–5m |
+| Executive/weekly | 5m–15m |
+
+### Sharing
+All dashboards created via API tokens are shared with everyone in the org (`owner: "X-AXIOM-EVERYONE"`). Private dashboards are not supported with API tokens.
+
+Data visibility is still governed by dataset permissions—users only see data from datasets they can access.
+
+### URL Time Range Parameters
+
+`?t_qr=24h` (quick range), `?t_ts=...&t_te=...` (custom), `?t_against=-1d` (comparison)
+
+---
+
+## Setup
+
+Run `scripts/setup` to check requirements (curl, jq, ~/.axiom.toml).
+
+Config in `~/.axiom.toml` (shared with axiom-sre):
+```toml
+[deployments.prod]
+url = "https://api.axiom.co"
+token = "xaat-your-token"
+org_id = "your-org-id"
+```
+
+---
+
+## Deployment
+
+### Scripts
+
+| Script | Usage |
+|--------|-------|
+| `scripts/dashboard-list <deploy>` | List all dashboards |
+| `scripts/dashboard-get <deploy> <id>` | Fetch dashboard JSON |
+| `scripts/dashboard-validate <file>` | Validate JSON structure |
+| `scripts/dashboard-create <deploy> <file>` | Create dashboard |
+| `scripts/dashboard-update <deploy> <id> <file>` | Update (needs version) |
+| `scripts/dashboard-chart-patch <deploy> <id> <chart-id> <patch-file> (--version <version> \| --overwrite)` | Patch one chart |
+| `scripts/dashboard-copy <deploy> <id>` | Clone dashboard |
+| `scripts/dashboard-link <deploy> <id>` | Get shareable URL |
+| `scripts/dashboard-delete <deploy> <id>` | Delete (with confirm) |
+| `scripts/axiom-api <deploy> <method> <path>` | **Dashboard/app API only** (rewrites to `app.*`). For data/metrics endpoints use `scripts/metrics/axiom-api` |
+| `scripts/metrics/axiom-api <deploy> <method> <path>` | **Data/metrics API** (supports `AXIOM_URL_OVERRIDE` for edge routing) |
+| `scripts/metrics/datasets <deploy>` | List datasets with `kind` and edge deployment |
+| `scripts/metrics/metrics-spec <deploy> <dataset>` | Fetch MPL query specification |
+| `scripts/metrics/metrics-info <deploy> <dataset> ...` | Discover metrics, tags, and values |
+| `scripts/metrics/metrics-query <deploy> <mpl> <start> <end>` | Execute a metrics query |
+
+> **⚠️ Two `axiom-api` scripts exist with different behaviors.** `scripts/axiom-api` rewrites URLs for the dashboard app API (`app.*`). `scripts/metrics/axiom-api` uses raw URLs and supports edge deployment routing. Using the wrong one will produce 404 errors.
+
+### Targeted Chart Updates
+
+Use `scripts/dashboard-chart-patch` when changing one existing chart and the dashboard layout, metadata, and other charts should remain untouched. It calls `PATCH /v2/dashboards/uid/{uid}/charts/{chartId}` with a JSON Merge Patch under the `chart` request field.
+
+Patch files contain only the chart fields to change:
+
+```json
+{
+  "name": "Error Rate (5m)",
+  "query": { "apl": "['logs'] | summarize errors=countif(status >= 500)" },
+  "config": { "stale": null }
+}
+```
+
+`null` removes an existing field. Nested objects merge recursively. If `id` is present in the patch, it must match the `<chart-id>` path argument. The server validates the resulting full dashboard before saving.
+
+Use `--version <version>` for optimistic concurrency after fetching the dashboard with `dashboard-get`. Use `--overwrite` only when last-write-wins behavior is intended. Continue using `dashboard-update` for layout changes, multi-chart edits, dashboard metadata, owner, refresh interval, or time window updates.
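+
+To illustrate the recursive merge, a patch file like the following sketch would change only the pipeline of a metrics chart. Under JSON Merge Patch semantics, an existing `query.metricsDataset` stays in place because the patch does not mention it. The MPL string reuses the example from "Metrics/MPL Chart Contract" with a different, hypothetical service name.
+
+```json
+{
+  "query": {
+    "apl": "`otel-metrics`:`http.server.duration`\n| where `service.name` == \"checkout\"\n| align to $__interval using avg\n| group by `service.name` using avg"
+  }
+}
+```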
+
+### Workflow
+
+**⚠️ CRITICAL: Always validate queries BEFORE deploying.**
+
+**APL workflow:**
+1. Design dashboard (sections + panels)
+2. Write APL for each panel
+3. Build JSON (from template or manually)
+4. **Validate queries** using axiom-sre with explicit time filter
+5. `dashboard-validate` to check structure
+6. `dashboard-create` or `dashboard-update` to deploy
+7. **`dashboard-link` to get URL** — NEVER construct Axiom URLs manually (org IDs and base URLs vary per deployment)
+8. Share link with user
+
+**Metrics/MPL workflow:**
+1. Run `scripts/metrics/metrics-spec` to learn MPL syntax
+2. Run `scripts/metrics/metrics-info` to discover metrics and tags
+3. Design dashboard using the Metrics/MPL Blueprint
+4. Write MPL for each panel
+5. **Validate queries** with `scripts/metrics/metrics-query` using explicit time range
+6. Build JSON: put the full MPL string in `query.apl` AND set `query.metricsDataset` to the dataset name (required — denotes the chart as MPL). Do not set `query.mpl` (rejected by create API).
+7. `dashboard-validate` to check structure
+8. `dashboard-create` or `dashboard-update` to deploy
+9. **`dashboard-link` to get URL**
+10. Share link with user
+
+---
+
+## Sibling Skill Integration
+
+**spl-to-apl:** Translate Splunk SPL → APL. Map `timechart` → TimeSeries, `stats` → Statistic/Table. See `reference/splunk-migration.md`.
+
+**axiom-sre:** Discover schema with `getschema`, explore baselines, identify dimensions, then productize into panels.
+
+**query-metrics:** Discover metrics datasets, metric names, tags, and tag values. Metrics discovery scripts are also vendored locally in `scripts/metrics/`.
+
+---
+
+## Templates
+
+Pre-built templates in `reference/templates/`:
+
+| Template | Use case |
+|----------|----------|
+| `service-overview.json` | Single service oncall dashboard with Heatmap |
+| `service-overview-with-filters.json` | Same with SmartFilter (route/status dropdowns) |
+| `api-health.json` | HTTP API with traffic/errors/latency |
+| `blank.json` | Minimal skeleton |
+
+**Placeholders:** `{{service}}`, `{{dataset}}`
+
+**Usage:**
+```bash
+scripts/dashboard-from-template service-overview "my-service" "my-dataset" ./dashboard.json
+scripts/dashboard-validate ./dashboard.json
+scripts/dashboard-create prod ./dashboard.json
+```
+
+**⚠️ Templates assume field names** (`service`, `status`, `route`, `duration_ms`). Discover your schema first and use `sed` to fix mismatches.
+
+---
+
+## Common Pitfalls
+
+| Problem | Cause | Solution |
+|---------|-------|----------|
+| "unable to find dataset" errors | Dataset name doesn't exist in your org | Check available datasets in Axiom UI |
+| "creating private dashboards" 403 | API tokens can only create shared dashboards | Use `owner: "X-AXIOM-EVERYONE"` (the default) |
+| All panels show errors | Field names don't match your schema | Discover schema first, use sed to fix field names |
+| Dashboard shows no data | Service filter too restrictive | Remove or adjust `where service == 'x'` filters |
+| Queries time out | Missing time filter or too broad | Dashboard inherits time from picker; ad-hoc queries need explicit time filter |
+| Wrong org in dashboard URL | Manually constructed URL | **Always use `dashboard-link <deploy> <id>`** — never guess org IDs or base URLs |
+| `getschema` returns 0 rows | Dataset is `otel:metrics:v1`, not events | Run `scripts/metrics/datasets <deploy>` to check kind; use `scripts/metrics/metrics-info` for metrics discovery |
+| Metrics discovery returns empty | Sparse metrics (sensors, batch, cron) outside default 24h window | Retry with `--start` set to 7 days ago; some metrics only report intermittently |
+| 404 from metrics API calls | Used `scripts/axiom-api` (dashboard) instead of `scripts/metrics/axiom-api` (data) | Use `scripts/metrics/axiom-api` for all `/v1/query/`, `/v1/datasets` paths |
+| `find-metrics` returns unexpected results | It searches tag values, not metric names | Use `metrics-info <deploy> <dataset> metrics` to list metric names; `find-metrics` finds metrics associated with a known tag value |
+| Metrics chart renders blank or wrong values | Missing `query.metricsDataset` — backend treats `apl` as APL, not MPL | Set `query.metricsDataset` to the dataset name alongside `query.apl` |
+| `query.mpl` rejected on create | GET may return `query.mpl` for existing metrics charts, but create expects `query.apl` | Move/copy the MPL string into `query.apl` before deploy |
+| `decimals` rejected on create | Create API does not accept chart-level `decimals` even though GET may return it | Omit `decimals` from create payloads |
+
+---
+
+## Reference
+
+- `reference/chart-config.md` — All chart configuration options (JSON)
+- `reference/metrics-mpl.md` — Metrics/MPL chart contract and discovery scripts
+- `reference/smartfilter.md` — SmartFilter/FilterBar full configuration
+- `reference/chart-cookbook.md` — APL patterns per chart type
+- `reference/layout-recipes.md` — Grid layouts and section blueprints
+- `reference/splunk-migration.md` — Splunk panel → Axiom mapping
+- `reference/design-playbook.md` — Decision-first design principles
+- `reference/templates/` — Ready-to-use dashboard JSON files
+
+For APL syntax: https://axiom.co/docs/apl/introduction
diff --git a/.agents/skills/building-dashboards/reference/chart-config.md b/.agents/skills/building-dashboards/reference/chart-config.md
new file mode 100644
index 00000000..30974821
--- /dev/null
+++ b/.agents/skills/building-dashboards/reference/chart-config.md
@@ -0,0 +1,285 @@
+# Chart Configuration Options
+
+Charts support JSON configuration options beyond the query. These are set at the chart level.
+
+## Common Options (All Charts)
+
+```json
+{
+  "overrideDashboardTimeRange": false,
+  "overrideDashboardCompareAgainst": false,
+  "hideHeader": false
+}
+```
+
+## Metrics/MPL Query (MetricsDB Charts)
+
+Metrics charts require both `query.apl` (the MPL pipeline string) and `query.metricsDataset` (the dataset name, e.g. `"otel-metrics"`). The `metricsDataset` field is what flags the chart as MPL; without it the backend treats `apl` as APL and the chart can render blank or show wrong values. Do not send `query.mpl`, because the create API rejects it. Run `scripts/metrics/metrics-spec` to learn the full syntax before composing queries.
+
+### Minimal Metrics Query
+
+```json
+{
+  "type": "TimeSeries",
+  "query": {
+    "apl": "`otel-metrics`:`system.cpu.utilization`",
+    "metricsDataset": "otel-metrics"
+  }
+}
+```
+
+### Metrics Query with Filters and Transformations
+
+```json
+{
+  "type": "TimeSeries",
+  "query": {
+    "apl": "`otel-metrics`:`http.server.duration`\n| where `service.name` == \"api\"\n| where `deployment.environment` == \"prod\"\n| align to $__interval using avg\n| group by `service.name` using avg",
+    "metricsDataset": "otel-metrics"
+  }
+}
+```
+
+For full contract details, see `reference/metrics-mpl.md`.
+
+## Statistic Options
+
+```json
+{
+  "type": "Statistic",
+  "colorScheme": "Blue",
+  "customUnits": "req/s",
+  "unit": "Auto",
+  "showChart": true,
+  "hideValue": false,
+  "errorThreshold": "Above",
+  "errorThresholdValue": "100",
+  "warningThreshold": "Above",
+  "warningThresholdValue": "50",
+  "invertTheme": false
+}
+```
+
+> **API gotcha:** `decimals` is returned by GET and may appear in existing dashboards, but the create API rejects it. Omit `decimals` from create payloads.
+
+| Option | Values | Description |
+|--------|--------|-------------|
+| `colorScheme` | Blue, Orange, Red, Purple, Teal, Yellow, Green, Pink, Grey, Brown | Color theme |
+| `customUnits` | string | Unit suffix (e.g., "ms", "req/s") |
+| `unit` | Auto, Abbreviated, Byte, KB, MB, GB, TimeMS, TimeSec, Percent, etc. | Value formatting |
+| `decimals` | number | Decimal places in readback/GET payloads; omit on create because the API rejects it |
+| `showChart` | boolean | Show sparkline |
+| `hideValue` | boolean | Hide the main value |
+| `errorThreshold` | Above, AboveOrEqual, Below, BelowOrEqual, AboveOrBelow | Error condition |
+| `errorThresholdValue` | string | Error threshold value |
+| `warningThreshold` | same as error | Warning condition |
+| `warningThresholdValue` | string | Warning threshold value |
+| `invertTheme` | boolean | Invert colors |
+
+### Available Units
+
+- **Numbers**: `Auto`, `Abbreviated`
+- **Data**: `Byte`, `Kilobyte`, `Megabyte`, `Gigabyte`
+- **Data rates**: `BitsSec`, `BytesSec`, `KilobitsSec`, `KilobytesSec`, `MegabitsSec`, `MegabytesSec`, `GigabitsSec`, `GigabytesSec`
+- **Time**: `TimeNS`, `TimeUS`, `TimeMS`, `TimeSec`, `TimeMin`, `TimeHour`, `TimeDay`
+- **Percent**: `Percent` (0-1), `Percent100` (0-100)
+- **Currency**: `CurrencyUSD`, `CurrencyEUR`, `CurrencyGBP`, `CurrencyCAD`, `CurrencyAUD`, `CurrencyJPY`, `CurrencyINR`, `CurrencyCZK`, `CurrencyPLN`
+- **Date**: `DateDateTime`, `DateFromNow`, `DateYYYYMMDDHHmmss`
+
+## TimeSeries Options
+
+TimeSeries chart options are stored in `query.queryOptions.aggChartOpts` as a JSON string.
+
+### Key Formats
+
+**Important:** The `"*"` wildcard is unreliable. Always use the specific key format derived from your query.
+
+#### Deriving the Key
+
+The key format depends on how the column is computed:
+
+| Query Pattern | Key Format |
+|---------------|------------|
+| `summarize count()` | `{"alias":"count_","op":"count"}` |
+| `summarize sum(field)` | `{"alias":"sum_field","op":"sum"}` |
+| `summarize ['Name'] = sum(field) / 1000` | `{"alias":"Name","field":"field","op":"computed"}` |
+| `summarize ['Name'] = round(sum(field), 1)` | `{"alias":"Name","field":"field","op":"computed"}` |
+
+**Rule:** If the column uses any expression (math, `round()`, etc.), use `"op":"computed"` and include the source `"field"`.
+
+#### Simple Aggregation Example
+
+```json
+{
+  "type": "TimeSeries",
+  "query": {
+    "apl": "['logs'] | summarize count() by bin_auto(_time)",
+    "queryOptions": {
+      "aggChartOpts": "{\"{\\\"alias\\\":\\\"count_\\\",\\\"op\\\":\\\"count\\\"}\":{\"variant\":\"bars\"}}"
+    }
+  }
+}
+```
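+
+The `aggChartOpts` value above is JSON encoded twice, which is why it is hard to read. Decoding the string one level (a sketch for reading purposes only, since the chart still expects the encoded string) shows the structure: the outer object maps a stringified key object to that series' options.
+
+```json
+{
+  "{\"alias\":\"count_\",\"op\":\"count\"}": {
+    "variant": "bars"
+  }
+}
+```
+
+Re-encoding this object as a string yields the `aggChartOpts` value shown in the example.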
+
+#### Computed Column Example
+
+For `['Ingest GB'] = round(sum(['properties.hourly_ingest_bytes']) / 1e9, 1)`:
+
+```json
+{
+  "aggChartOpts": "{\"{\\\"alias\\\":\\\"Ingest GB\\\",\\\"field\\\":\\\"properties.hourly_ingest_bytes\\\",\\\"op\\\":\\\"computed\\\"}\":{\"variant\":\"bars\",\"displayNull\":\"auto\"}}"
+}
+```
+
+**Note:** The `field` value is the source field name exactly as written in the query, including any dotted path such as `properties.`, but without the surrounding `['...']` brackets and quotes.
+
+### View Mode (timeSeriesView)
+
+Controls what the TimeSeries panel displays. Set in `query.queryOptions.timeSeriesView`.
+
+| Value | Description |
+|-------|-------------|
+| `charts` | Chart only (default) |
+| `resultsTable` | Summary totals table only |
+| `charts\|resultsTable` | Chart with totals table below — shows both the time series and an aggregated summary |
+
+```json
+{
+  "type": "TimeSeries",
+  "query": {
+    "apl": "['logs'] | summarize count() by bin_auto(_time), service",
+    "queryOptions": {
+      "timeSeriesView": "charts|resultsTable"
+    }
+  }
+}
+```
+
+### Per-Series Options (inside aggChartOpts)
+
+| Option | Values | Description |
+|--------|--------|-------------|
+| `variant` | `line`, `area`, `bars` | Chart display mode |
+| `scaleDistr` | `linear`, `log` | Y-axis scale |
+| `displayNull` | `auto`, `null`, `span`, `zero` | Missing data handling |
+
+### displayNull Values
+
+- `auto`: Best representation based on chart type
+- `null`: Skip/ignore missing values (gaps in chart)
+- `span`: Join adjacent values across gaps
+- `zero`: Fill missing with zeros
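+
+As an illustration, the three per-series options can be combined in one entry. This sketch reuses the `count_` key from the simple aggregation example; the option values chosen here (`line`, `log`, `zero`) are arbitrary picks from the tables above, not recommended defaults.
+
+```json
+{
+  "aggChartOpts": "{\"{\\\"alias\\\":\\\"count_\\\",\\\"op\\\":\\\"count\\\"}\":{\"variant\":\"line\",\"scaleDistr\":\"log\",\"displayNull\":\"zero\"}}"
+}
+```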
+
+## LogStream / Table Options
+
+```json
+{
+  "type": "LogStream",
+  "tableSettings": {
+    "columns": [
+      {"name": "_time", "width": 150},
+      {"name": "message", "width": 400}
+    ],
+    "settings": {
+      "fontSize": "12px",
+      "highlightSeverity": true,
+      "showRaw": true,
+      "showEvent": true,
+      "showTimestamp": true,
+      "wrapLines": true,
+      "hideNulls": true
+    }
+  }
+}
+```
+
+| Option | Type | Description |
+|--------|------|-------------|
+| `columns` | array | Column order and widths (objects with `name` and `width`) |
+| `fontSize` | string | Font size (e.g., "12px") |
+| `highlightSeverity` | boolean | Color-code by log level |
+| `showRaw` | boolean | Show raw JSON |
+| `showEvent` | boolean | Show event column |
+| `showTimestamp` | boolean | Show timestamp column |
+| `wrapLines` | boolean | Wrap long lines |
+| `hideNulls` | boolean | Hide null values |
+
+## Pie Options
+
+```json
+{
+  "type": "Pie",
+  "hideHeader": false
+}
+```
+
+## Note Options
+
+```json
+{
+  "type": "Note",
+  "text": "## Section Header\n\nMarkdown content here.",
+  "variant": "default"
+}
+```
+
+Note content supports GitHub Flavored Markdown.
+
+## Heatmap Options
+
+Heatmap charts use the default options. The color scheme is fixed to a blue gradient.
+
+```json
+{
+  "type": "Heatmap",
+  "query": {
+    "apl": "['logs'] | summarize histogram(duration_ms, 15) by bin_auto(_time)"
+  }
+}
+```
+
+## Annotations
+
+Display deployment markers, incidents, or custom events on charts.
+
+Annotations are managed via the Axiom API `/v2/annotations` endpoint:
+
+```bash
+curl -X 'POST' 'https://api.axiom.co/v2/annotations' \
+  -H "Authorization: Bearer $AXIOM_TOKEN" \
+  -H 'Content-Type: application/json' \
+  -d '{
+    "time": "2024-03-18T08:39:28.382Z",
+    "type": "deploy",
+    "datasets": ["http-logs"],
+    "title": "Production deployment",
+    "description": "Deploy v2.1.0",
+    "url": "https://github.com/org/repo/releases/tag/v2.1.0"
+  }'
+```
+
+Or use GitHub Actions:
+```yaml
+- name: Add annotation
+  uses: axiomhq/annotation-action@v0.1.0
+  with:
+    axiomToken: ${{ secrets.AXIOM_TOKEN }}
+    datasets: http-logs
+    type: "deploy"
+    title: "Production deployment"
+```
+
+## Comparison Period (Against)
+
+Compare current time range against a historical period:
+- `-1D`: Same time yesterday
+- `-1W`: Same time last week
+- Custom offset
+
+Use in dashboard URL: `?t_qr=24h&t_against=-1d`
+
+## Custom Time Range per Panel
+
+Individual panels can override the dashboard time range:
+- Set `overrideDashboardTimeRange: true` in chart config
+- Via UI: Edit panel → Time range → Custom
diff --git a/.agents/skills/building-dashboards/reference/chart-cookbook.md b/.agents/skills/building-dashboards/reference/chart-cookbook.md
new file mode 100644
index 00000000..a91d283a
--- /dev/null
+++ b/.agents/skills/building-dashboards/reference/chart-cookbook.md
@@ -0,0 +1,472 @@
+# Chart Cookbook
+
+Detailed APL patterns for each chart type with real-world examples.
+
+> **Note:** Dashboard panel queries inherit time from the UI picker—no explicit `_time` filter needed. The examples below show ad-hoc query patterns with time filters for testing in the Query tab. Remove the `where _time between (...)` line when using these in dashboards.
+
+---
+
+## Statistic
+
+Single-value panels for KPIs and current state.
+
+### Error Rate (Percentage)
+```apl
+['http-logs']
+| where _time between (ago(5m) .. now())
+| where service == "api-gateway"
+| summarize 
+    total = count(),
+    errors = countif(status >= 500)
+| extend error_rate = round(100.0 * errors / total, 2)
+| project error_rate
+```
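+
+A possible chart definition for this query, using the Statistic options documented in `reference/chart-config.md`. This is a sketch: the time filter is dropped because the panel inherits the dashboard range, `Percent100` is chosen because the query returns a 0-100 value, and the threshold values (`5` and `1`) are illustrative assumptions to tune per service.
+
+```json
+{
+  "type": "Statistic",
+  "query": {
+    "apl": "['http-logs'] | where service == \"api-gateway\" | summarize total = count(), errors = countif(status >= 500) | extend error_rate = round(100.0 * errors / total, 2) | project error_rate"
+  },
+  "unit": "Percent100",
+  "errorThreshold": "Above",
+  "errorThresholdValue": "5",
+  "warningThreshold": "Above",
+  "warningThresholdValue": "1"
+}
+```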
+
+### Current p95 Latency
+```apl
+['http-logs']
+| where _time between (ago(5m) .. now())
+| where service == "api-gateway"
+| summarize p95 = percentile(duration_ms, 95)
+```
+
+### Request Rate (per second)
+```apl
+['http-logs']
+| where _time between (ago(5m) .. now())
+| where service == "api-gateway"
+| summarize requests = count()
+| extend rps = round(requests / 300.0, 1)  // 300 seconds = 5 min
+| project rps
+```
+
+### Active Errors (count)
+```apl
+['http-logs']
+| where _time between (ago(5m) .. now())
+| where status >= 500
+| summarize error_count = count()
+```
+
+### Comparison to Baseline
+```apl
+['http-logs']
+| where _time between (ago(1h) .. now())
+| summarize 
+    last_5m = countif(_time >= ago(5m) and status >= 500),
+    prev_55m = countif(_time < ago(5m) and status >= 500)
+| extend change_pct = round(100.0 * (last_5m - prev_55m / 11.0) / (prev_55m / 11.0 + 0.001), 1)  // 55 min = 11 five-minute windows; 11.0 avoids integer division
+| project last_5m, change_pct
+```
+
+---
+
+## TimeSeries
+
+Time-based trends with proper binning.
+
+### Traffic Over Time
+```apl
+['http-logs']
+| where _time between (ago(1h) .. now())
+| where service == "api-gateway"
+| summarize requests = count() by bin(_time, 1m)
+```
+
+### Error Rate Over Time
+```apl
+['http-logs']
+| where _time between (ago(1h) .. now())
+| where service == "api-gateway"
+| summarize 
+    total = count(),
+    errors = countif(status >= 500)
+  by bin(_time, 1m)
+| extend error_rate = 100.0 * errors / total
+| project _time, error_rate
+```
+
+### Latency Percentiles Over Time
+```apl
+['http-logs']
+| where _time between (ago(1h) .. now())
+| where service == "api-gateway"
+| summarize 
+    p50 = percentile(duration_ms, 50),
+    p95 = percentile(duration_ms, 95),
+    p99 = percentile(duration_ms, 99)
+  by bin(_time, 1m)
+```
+
+### Traffic by Status Class (Stacked)
+```apl
+['http-logs']
+| where _time between (ago(1h) .. now())
+| extend status_class = case(
+    status < 300, "2xx",
+    status < 400, "3xx",
+    status < 500, "4xx",
+    "5xx"
+  )
+| summarize count() by bin(_time, 1m), status_class
+```
+
+### Multi-Service Comparison
+```apl
+['http-logs']
+| where _time between (ago(1h) .. now())
+| where service in ("api-gateway", "auth-service", "payment-service")
+| summarize requests = count() by bin(_time, 1m), service
+```
+
+### Rate of Change (Derivative)
+```apl
+['http-logs']
+| where _time between (ago(1h) .. now())
+| summarize requests = count() by bin(_time, 1m)
+| order by _time asc
+| extend prev = prev(requests)
+| extend rate_change = requests - prev
+| where isnotnull(prev)
+```
+
+---
+
+## Table
+
+Top-N breakdowns and detailed lists.
+
+### Top 10 Failing Routes
+```apl
+['http-logs']
+| where _time between (ago(1h) .. now())
+| where status >= 500
+| summarize errors = count() by route
+| top 10 by errors
+| project Route = route, Errors = errors
+```
+
+### Top Error Messages
+```apl
+['http-logs']
+| where _time between (ago(1h) .. now())
+| where status >= 500
+| summarize count = count() by error_message
+| top 10 by count
+| project Message = error_message, Count = count
+```
+
+### Worst Pods by Error Rate
+```apl
+['http-logs']
+| where _time between (ago(1h) .. now())
+| summarize 
+    total = count(),
+    errors = countif(status >= 500)
+  by pod = ['kubernetes.pod.name']
+| extend error_rate = round(100.0 * errors / total, 2)
+| where total >= 100  // minimum sample size
+| top 10 by error_rate
+| project Pod = pod, ['Error Rate %'] = error_rate, Total = total, Errors = errors
+```
+
+### Latency by Route
+```apl
+['http-logs']
+| where _time between (ago(1h) .. now())
+| summarize 
+    requests = count(),
+    p50 = percentile(duration_ms, 50),
+    p95 = percentile(duration_ms, 95),
+    p99 = percentile(duration_ms, 99)
+  by route
+| top 10 by p95
+| project Route = route, Requests = requests, ['p50 (ms)'] = p50, ['p95 (ms)'] = p95, ['p99 (ms)'] = p99
+```
+
+### Recent Errors with Details
+```apl
+['http-logs']
+| where _time between (ago(15m) .. now())
+| where status >= 500
+| top 20 by _time
+| project Time = _time, Route = route, Status = status, Message = error_message, TraceID = trace_id
+```
+
+### Customer Impact Summary
+```apl
+['http-logs']
+| where _time between (ago(1h) .. now())
+| where status >= 500
+| summarize 
+    errors = count(),
+    affected_requests = dcount(trace_id)
+  by customer_id
+| top 10 by errors
+| project Customer = customer_id, Errors = errors, ['Affected Requests'] = affected_requests
+```
+
+---
+
+## Pie
+
+Share-of-total for low-cardinality dimensions only.
+
+### Status Code Distribution
+```apl
+['http-logs']
+| where _time between (ago(1h) .. now())
+| extend status_class = case(
+    status < 300, "2xx Success",
+    status < 400, "3xx Redirect",
+    status < 500, "4xx Client Error",
+    "5xx Server Error"
+  )
+| summarize count() by status_class
+```
+
+### Traffic by Region
+```apl
+['http-logs']
+| where _time between (ago(1h) .. now())
+| summarize count() by region
+| top 6 by count_  // Limit slices
+```
+
+### Error Types Distribution
+```apl
+['http-logs']
+| where _time between (ago(1h) .. now())
+| where status >= 500
+| extend error_type = case(
+    status == 500, "Internal Error",
+    status == 502, "Bad Gateway",
+    status == 503, "Service Unavailable",
+    status == 504, "Gateway Timeout",
+    "Other 5xx"
+  )
+| summarize count() by error_type
+```
+
+### Request Method Mix
+```apl
+['http-logs']
+| where _time between (ago(1h) .. now())
+| summarize count() by method
+```
+
+**Warning:** If a dimension has more than 6 values, use a Table instead.
+
+---
+
+## LogStream
+
+Raw event inspection with focused fields.
+
+### Recent Errors
+```apl
+['http-logs']
+| where _time between (ago(15m) .. now())
+| where status >= 500
+| project-keep _time, trace_id, service, route, status, error_message, duration_ms
+| order by _time desc
+| take 100
+```
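+
+One way to pair this query with the `tableSettings` options from `reference/chart-config.md`. Treat it as a sketch: it assumes a LogStream chart takes its query in `query.apl` like the other chart types, and the column selection and widths are illustrative.
+
+```json
+{
+  "type": "LogStream",
+  "query": {
+    "apl": "['http-logs'] | where status >= 500 | project-keep _time, trace_id, service, route, status, error_message, duration_ms | order by _time desc | take 100"
+  },
+  "tableSettings": {
+    "columns": [
+      {"name": "_time", "width": 150},
+      {"name": "route", "width": 200},
+      {"name": "status", "width": 80},
+      {"name": "error_message", "width": 400}
+    ],
+    "settings": {
+      "highlightSeverity": true,
+      "wrapLines": true,
+      "hideNulls": true
+    }
+  }
+}
+```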
+
+### Slow Requests
+```apl
+['http-logs']
+| where _time between (ago(15m) .. now())
+| where duration_ms > 5000
+| project-keep _time, trace_id, service, route, duration_ms, status
+| order by duration_ms desc
+| take 100
+```
+
+### Authentication Failures
+```apl
+['auth-logs']
+| where _time between (ago(1h) .. now())
+| where event_type == "login_failed"
+| project-keep _time, user_id, ip_address, failure_reason, user_agent
+| order by _time desc
+| take 100
+```
+
+### Kubernetes Events
+```apl
+['k8s-events']
+| where _time between (ago(1h) .. now())
+| where type in ("Warning", "Error")
+| project-keep _time, type, reason, ['involvedObject.name'], message
+| order by _time desc
+| take 100
+```
+
+### Filtered by Trace ID
+```apl
+['http-logs']
+| where _time between (ago(24h) .. now())
+| where trace_id == "abc123xyz"
+| project-keep _time, service, route, status, duration_ms, error_message
+| order by _time asc
+```
+
+---
+
+## SmartFilter
+
+No APL needed—configure these fields for interactive filtering:
+
+### Recommended Filter Fields
+- `service` — Which service to focus on
+- `environment` — prod/staging/dev
+- `region` — Geographic region
+- `route` — API endpoint
+- `status` — HTTP status code
+- `customer_id` — For multi-tenant systems
+- `kubernetes.namespace` — K8s namespace
+- `kubernetes.pod.name` — Specific pod
+
+### Configuration Tips
+- Place SmartFilter at top of dashboard
+- Include 3–5 most useful filter dimensions
+- Avoid high-cardinality fields as primary filters (trace_id, request_id)
+
+---
+
+## Note
+
+Markdown panels for context and navigation.
+
+### Dashboard Header
+```markdown
+# API Gateway - Oncall Dashboard
+
+**Purpose:** Quick triage for API-related incidents.
+
+**Escalation:** If error rate > 5%, page #platform-oncall.
+
+**Runbook:** [API Incident Response](https://wiki.example.com/api-runbook)
+```
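+
+In a chart definition the Markdown goes into the Note's `text` field as a single JSON string, with line breaks written as `\n`. A sketch of the header above using the Note fields from `reference/chart-config.md`:
+
+```json
+{
+  "type": "Note",
+  "text": "# API Gateway - Oncall Dashboard\n\n**Purpose:** Quick triage for API-related incidents.\n\n**Escalation:** If error rate > 5%, page #platform-oncall.\n\n**Runbook:** [API Incident Response](https://wiki.example.com/api-runbook)",
+  "variant": "default"
+}
+```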
+
+### Section Divider
+```markdown
+---
+## Error Analysis
+```
+
+### Instructions
+```markdown
+### How to Use This Dashboard
+
+1. Check the error rate stat (top-left)
+2. If elevated, check the "Top Failing Routes" table
+3. Click a route to filter logs below
+4. Copy trace_id for detailed investigation
+```
+
+---
+
+## Heatmap
+
+Visualize distributions and density patterns.
+
+### Latency Distribution Over Time
+```apl
+['http-logs']
+| where _time between (ago(1h) .. now())
+| summarize histogram(duration_ms, 20) by bin_auto(_time)
+```
+
+### Response Size Distribution
+```apl
+['http-logs']
+| where _time between (ago(1h) .. now())
+| summarize histogram(resp_body_size_bytes, 15) by bin_auto(_time)
+```
+
+### Request Rate by Hour of Day
+```apl
+['http-logs']
+| where _time between (ago(7d) .. now())
+| extend hour = hourofday(_time), day = dayofweek(_time)
+| summarize count() by hour, day
+```
+
+---
+
+## Scatter Plot
+
+Identify correlations between metrics.
+
+### Latency vs Response Size
+```apl
+['http-logs']
+| where _time between (ago(1h) .. now())
+| summarize avg(duration_ms), avg(resp_body_size_bytes) by route
+```
+
+### Request Rate vs Error Rate by Route
+```apl
+['http-logs']
+| where _time between (ago(1h) .. now())
+| summarize 
+    requests = count(),
+    error_rate = round(100.0 * countif(status >= 500) / count(), 2)
+  by route
+| where requests >= 10
+```
+
+### CPU vs Memory by Pod
+```apl
+['metrics']
+| where _time between (ago(1h) .. now())
+| summarize avg(cpu_percent), avg(memory_percent) by pod
+```
+
+---
+
+## Filter Bar
+
+Interactive filters for dashboard-wide filtering.
+
+### Dynamic Country Filter Query
+```apl
+['http-logs']
+| where _time between (ago(1h) .. now())
+| distinct ['geo.country']
+| project key=['geo.country'], value=['geo.country']
+| sort by key asc
+```
+
+### Panel Using Filters
+```apl
+declare query_parameters (_country:string = "", _status:string = "");
+['http-logs']
+| where _time between (ago(1h) .. now())
+| where isempty(_country) or ['geo.country'] == _country
+| where isempty(_status) or tostring(status) == _status
+| summarize count() by bin_auto(_time)
+```
+
+### Dependent City Filter (depends on country)
+```apl
+declare query_parameters (_country:string = "");
+['http-logs']
+| where _time between (ago(1h) .. now())
+| where isnotempty(['geo.country']) and isnotempty(['geo.city'])
+| where ['geo.country'] == _country
+| distinct ['geo.city']
+| project key=['geo.city'], value=['geo.city']
+| sort by key asc
+```
+
+### Dataset Selector Filter
+For multi-dataset dashboards, let users choose which dataset to view:
+```apl
+declare query_parameters (_dataset:string = "http-logs");
+table(_dataset)
+| where _time between (ago(1h) .. now())
+| summarize count() by bin_auto(_time)
+```
diff --git a/.agents/skills/building-dashboards/reference/design-playbook.md b/.agents/skills/building-dashboards/reference/design-playbook.md
new file mode 100644
index 00000000..cb028939
--- /dev/null
+++ b/.agents/skills/building-dashboards/reference/design-playbook.md
@@ -0,0 +1,182 @@
+# Dashboard Design Playbook
+
+## Decision-First Design
+
+Every dashboard exists to help someone make a decision. Before adding panels, answer:
+
+1. **Who is the audience?**
+   - Oncall engineer (needs fast triage, error focus)
+   - Team lead (needs weekly trends, SLO tracking)
+   - Executive (needs high-level health, business impact)
+
+2. **What decisions will they make?**
+   - "Should I page someone?"
+   - "Which service is causing this?"
+   - "Are we meeting our SLOs?"
+   - "What changed after the deploy?"
+
+3. **What actions follow?**
+   - Rollback, scale, investigate, escalate, ignore
+
+If a panel doesn't inform a decision → remove it.
+
+---
+
+## The Overview → Drilldown → Evidence Pattern
+
+Structure dashboards in layers:
+
+```
+┌─────────────────────────────────────────────────────────┐
+│ OVERVIEW: Is anything broken? (Stats + TimeSeries)      │
+│ Answer in <5 seconds                                    │
+├─────────────────────────────────────────────────────────┤
+│ DRILLDOWN: Where is it broken? (Tables + Pies)          │
+│ Identify the component/route/customer                   │
+├─────────────────────────────────────────────────────────┤
+│ EVIDENCE: What exactly happened? (LogStream)            │
+│ Raw events for root cause                               │
+└─────────────────────────────────────────────────────────┘
+```
+
+Users should be able to:
+1. Glance at overview → "something's wrong with errors"
+2. Scan drilldown → "it's the /checkout route"
+3. Dive into evidence → "null pointer in payment handler"
+
+---
+
+## Audience-Specific Defaults
+
+### Oncall Dashboard
+- **Time window:** 15m–1h
+- **Refresh:** 30s–1m
+- **Focus:** Errors, latency spikes, recent changes
+- **Stats:** Current error rate, p95, traffic
+- **Priority:** Speed over completeness
+
+### Team Health Dashboard
+- **Time window:** 24h–7d
+- **Refresh:** 5m–15m
+- **Focus:** SLO tracking, trends, regression detection
+- **Stats:** SLO budget remaining, weekly error rate
+- **Priority:** Context over immediacy
+
+### Executive Dashboard
+- **Time window:** 7d–30d
+- **Refresh:** 1h
+- **Focus:** Business metrics, availability, cost
+- **Stats:** Uptime %, request volume, top customers
+- **Priority:** Clarity over detail
+
+---
+
+## Anti-Patterns
+
+### Too Many Panels
+**Problem:** Cognitive overload, slow rendering, no clear hierarchy.
+**Fix:** Limit to 8–12 panels max. If more are needed, split into multiple dashboards.
+
+### Pie Charts for High Cardinality
+**Problem:** 50+ slices = unreadable rainbow.
+**Fix:** Use tables for high cardinality. Pies only for ≤6 categories.
+
+### Missing Time Filters (Ad-hoc Queries Only)
+**Problem:** Ad-hoc queries scan entire dataset history.
+**Fix:** Always add `where _time between (...)` as the first filter in the Query tab.
+**Note:** Dashboard panel queries don't need this—they inherit time from the UI picker.
+
+### Averages Without Percentiles
+**Problem:** Averages hide tail latency that affects real users.
+**Fix:** Show p50, p95, p99 together. If only one, show p95 or p99.
+
+### Unbounded GROUP BY
+**Problem:** `summarize by user_id` returns millions of rows.
+**Fix:** Always add `| top N by ...` after high-cardinality groupings.
+
+### No Drilldown Path
+**Problem:** Dashboard shows "errors are high" but offers no way to find where.
+**Fix:** Always include breakdown tables that show top contributors.
+
+### Stale Data with Fast Refresh
+**Problem:** Dashboard refreshes every 30s but queries 7 days.
+**Fix:** Match refresh to time window. Fast refresh = short window.
+
+### Generic Panel Names
+**Problem:** "Errors", "Latency", "Traffic" don't explain what you're looking at.
+**Fix:** Use descriptive names that state the metric and its breakdown: "Error rate by route", "p95 latency trend", "Requests per minute".
+
+---
+
+## Golden Signals Coverage
+
+Every service dashboard should cover the four golden signals:
+
+| Signal | What to show | Chart type |
+|--------|--------------|------------|
+| **Traffic** | Requests/sec over time | TimeSeries |
+| **Errors** | Error rate %, error count by type | TimeSeries + Table |
+| **Latency** | p50/p95/p99 over time | TimeSeries |
+| **Saturation** | CPU, memory, connections, queue depth | TimeSeries |
+
+If you can't show all four, prioritize: Errors > Latency > Traffic > Saturation.
+
+---
+
+## Time Window Guidelines
+
+| Use case | Window | Bin size |
+|----------|--------|----------|
+| Active incident | 15m–1h | 10s–1m |
+| Recent regression | 6h–24h | 5m–15m |
+| Weekly review | 7d | 1h |
+| Capacity planning | 30d | 6h–1d |
+
+**Rule of thumb:** Aim for 50–200 data points per series.
+- 1h window ÷ 1m bins = 60 points ✓
+- 24h window ÷ 1m bins = 1440 points ✗ (too dense)
+- 24h window ÷ 15m bins = 96 points ✓
+
+---
+
+## Refresh Rate Guidelines
+
+| Dashboard type | Refresh |
+|----------------|---------|
+| Oncall/incident | 30s–1m |
+| Operational | 1m–5m |
+| Daily health | 5m–15m |
+| Reporting | Manual or 1h |
+
+Fast refresh on long time windows wastes resources. Match them.
+
+---
+
+## Panel Ordering Principles
+
+1. **Most critical at top-left** (Stats row)
+2. **Time series below stats** (context for the numbers)
+3. **Breakdowns in middle** (drilldown path)
+4. **Raw logs at bottom** (evidence, least used)
+
+Visual flow should match investigation flow: notice → narrow → verify.
+
+---
+
+## Naming Conventions
+
+### Dashboard Names
+- Include service/scope: "API Gateway - Oncall"
+- Include purpose: "Payment Service - SLO Tracking"
+- Avoid generic: "Dashboard 1", "Main"
+
+### Panel Titles
+- Question format: "What is the error rate by route?"
+- Include units: "Latency (ms)", "Traffic (req/s)"
+- Include scope if multi-service: "[API] Error Rate"
+
+### Field Aliases
+In APL, use `project` or aliases to create readable column names:
+```apl
+| project Route = route, Errors = error_count, ['Error Rate %'] = error_rate
+```
diff --git a/.agents/skills/building-dashboards/reference/layout-recipes.md b/.agents/skills/building-dashboards/reference/layout-recipes.md
new file mode 100644
index 00000000..a5b793a1
--- /dev/null
+++ b/.agents/skills/building-dashboards/reference/layout-recipes.md
@@ -0,0 +1,226 @@
+# Layout Recipes
+
+Grid-based layout patterns for common dashboard structures.
+
+---
+
+## Grid Basics
+
+- **Dashboard width:** 24 units
+- **Minimum panel width:** 3 units
+- **Panel positioning:** (x, y) coordinates with (w)idth and (h)eight
+
+### Common Panel Sizes
+
+| Panel type | Width (w) | Height (h) | Description |
+|------------|-----------|------------|-------------|
+| Statistic (compact) | 6 | 2 | Quarter width, KPI |
+| Statistic (large) | 8 | 3 | Third width, featured KPI |
+| TimeSeries (half) | 12 | 4 | Side-by-side charts |
+| TimeSeries (full) | 24 | 4–6 | Full-width trend |
+| Table (half) | 12 | 4–6 | Side-by-side tables |
+| Table (full) | 24 | 5–8 | Detailed breakdown |
+| Pie | 8–12 | 4 | Share visualization |
+| LogStream | 24 | 6–10 | Raw events |
+| Note (header) | 24 | 1–2 | Section title |
+| SmartFilter | 24 | 2 | Dashboard filters |
+
+---
+
+## Service Overview Layout
+
+Classic 4-section structure for oncall dashboards.
+
+```
+┌──────────────────────────────────────────────────────────────────────────────┐
+│ Row 0-1: Stats (h=2)                                                          │
+│ ┌─────────────┬─────────────┬─────────────┬─────────────┐                    │
+│ │ Error Rate  │ p95 Latency │ Traffic/s   │ Active      │                    │
+│ │ x=0, w=6    │ x=6, w=6    │ x=12, w=6   │ Alerts x=18 │                    │
+│ └─────────────┴─────────────┴─────────────┴─────────────┘                    │
+├──────────────────────────────────────────────────────────────────────────────┤
+│ Row 2-5: TimeSeries (h=4)                                                     │
+│ ┌────────────────────────────────┬────────────────────────────────┐          │
+│ │ Traffic + Errors Over Time     │ Latency Percentiles            │          │
+│ │ x=0, w=12                      │ x=12, w=12                     │          │
+│ └────────────────────────────────┴────────────────────────────────┘          │
+├──────────────────────────────────────────────────────────────────────────────┤
+│ Row 6-9: Tables (h=4)                                                         │
+│ ┌────────────────────────────────┬────────────────────────────────┐          │
+│ │ Top Failing Routes             │ Top Error Messages             │          │
+│ │ x=0, w=12                      │ x=12, w=12                     │          │
+│ └────────────────────────────────┴────────────────────────────────┘          │
+├──────────────────────────────────────────────────────────────────────────────┤
+│ Row 10-15: LogStream (h=6)                                                    │
+│ ┌────────────────────────────────────────────────────────────────────────────┐
+│ │ Recent Errors                                                               │
+│ │ x=0, w=24                                                                   │
+│ └────────────────────────────────────────────────────────────────────────────┘
+└──────────────────────────────────────────────────────────────────────────────┘
+```
+
+**Layout JSON:**
+```json
+[
+  {"i": "error-rate", "x": 0, "y": 0, "w": 6, "h": 2},
+  {"i": "p95-latency", "x": 6, "y": 0, "w": 6, "h": 2},
+  {"i": "traffic", "x": 12, "y": 0, "w": 6, "h": 2},
+  {"i": "alerts", "x": 18, "y": 0, "w": 6, "h": 2},
+  {"i": "traffic-errors-ts", "x": 0, "y": 2, "w": 12, "h": 4},
+  {"i": "latency-ts", "x": 12, "y": 2, "w": 12, "h": 4},
+  {"i": "top-routes", "x": 0, "y": 6, "w": 12, "h": 4},
+  {"i": "top-errors", "x": 12, "y": 6, "w": 12, "h": 4},
+  {"i": "logs", "x": 0, "y": 10, "w": 24, "h": 6}
+]
+```
+
+---
+
+## Multi-Service Comparison Layout
+
+Side-by-side comparison of multiple services.
+
+```
+┌──────────────────────────────────────────────────────────────────────────────┐
+│ Row 0: SmartFilter                                                            │
+│ ┌────────────────────────────────────────────────────────────────────────────┐
+│ │ Filters: environment, region                                                │
+│ └────────────────────────────────────────────────────────────────────────────┘
+├──────────────────────────────────────────────────────────────────────────────┤
+│ Row 2-5: TimeSeries (by service)                                              │
+│ ┌────────────────────────────────────────────────────────────────────────────┐
+│ │ Traffic by Service (stacked)                                                │
+│ └────────────────────────────────────────────────────────────────────────────┘
+├──────────────────────────────────────────────────────────────────────────────┤
+│ Row 6-9: TimeSeries (by service)                                              │
+│ ┌────────────────────────────────────────────────────────────────────────────┐
+│ │ Error Rate by Service                                                       │
+│ └────────────────────────────────────────────────────────────────────────────┘
+├──────────────────────────────────────────────────────────────────────────────┤
+│ Row 10-13: Per-service columns                                                │
+│ ┌──────────────────┬──────────────────┬──────────────────┐                   │
+│ │ API Gateway      │ Auth Service     │ Payment Service  │                   │
+│ │ Stats + Table    │ Stats + Table    │ Stats + Table    │                   │
+│ └──────────────────┴──────────────────┴──────────────────┘                   │
+└──────────────────────────────────────────────────────────────────────────────┘
+```
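+
+A layout JSON sketch for the diagram above, in the same shape as the Service Overview layout. The panel ids are hypothetical, the heights follow the row ranges in the diagram, and each per-service column is simplified to a single 8-unit panel; splitting a column into a Statistic plus a Table would need two entries per column.
+
+```json
+[
+  {"i": "filters", "x": 0, "y": 0, "w": 24, "h": 2},
+  {"i": "traffic-by-service", "x": 0, "y": 2, "w": 24, "h": 4},
+  {"i": "error-rate-by-service", "x": 0, "y": 6, "w": 24, "h": 4},
+  {"i": "api-gateway", "x": 0, "y": 10, "w": 8, "h": 4},
+  {"i": "auth-service", "x": 8, "y": 10, "w": 8, "h": 4},
+  {"i": "payment-service", "x": 16, "y": 10, "w": 8, "h": 4}
+]
+```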
+
+---
+
+## SLO Tracking Layout
+
+Focus on service level objectives and budget.
+
+```
+┌──────────────────────────────────────────────────────────────────────────────┐
+│ Row 0-2: SLO Stats (large)                                                    │
+│ ┌───────────────────┬───────────────────┬───────────────────┐                │
+│ │ Availability      │ Latency SLO       │ Error Budget      │                │
+│ │ 99.95%            │ p99 < 500ms       │ 23% remaining     │                │
+│ │ x=0, w=8, h=3     │ x=8, w=8, h=3     │ x=16, w=8, h=3    │                │
+│ └───────────────────┴───────────────────┴───────────────────┘                │
+├──────────────────────────────────────────────────────────────────────────────┤
+│ Row 3-8: SLO Trends                                                           │
+│ ┌────────────────────────────────────────────────────────────────────────────┐
+│ │ Availability Over Time (7d) with SLO threshold line                         │
+│ └────────────────────────────────────────────────────────────────────────────┘
+│ ┌────────────────────────────────────────────────────────────────────────────┐
+│ │ Error Budget Burn Rate                                                      │
+│ └────────────────────────────────────────────────────────────────────────────┘
+├──────────────────────────────────────────────────────────────────────────────┤
+│ Row 14+: SLO Violations                                                       │
+│ ┌────────────────────────────────────────────────────────────────────────────┐
+│ │ Table: SLO Violations by Route/Time                                         │
+│ └────────────────────────────────────────────────────────────────────────────┘
+└──────────────────────────────────────────────────────────────────────────────┘
+```
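+
+The Error Budget statistic can be derived from the same request counts as the availability figure. A minimal sketch, assuming a 99.9% availability target, a `['logs']` dataset, and a numeric `status` field where 5xx responses count against the budget (all illustrative, not taken from a real deployment):
+
+```apl
+// Assumed: 99.9% availability SLO, so 0.1% of requests may fail.
+// Assumed: dataset ['logs'] with a numeric `status` field.
+['logs']
+| summarize total = count(), errors = countif(status >= 500)
+| extend allowed_errors = total * 0.001
+| extend budget_remaining = iff(allowed_errors > 0, round(100.0 * (allowed_errors - errors) / allowed_errors, 1), 100.0)
+| project budget_remaining
+```
+
+A negative result means the budget for the selected time range is already overspent.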
+
+---
+
+## Incident Investigation Layout
+
+Detailed drilldown for active incidents.
+
+```
+┌──────────────────────────────────────────────────────────────────────────────┐
+│ Row 0: SmartFilter (service, route, status, trace_id)                         │
+├──────────────────────────────────────────────────────────────────────────────┤
+│ Row 2-5: Impact Overview                                                      │
+│ ┌─────────────┬─────────────┬─────────────┬─────────────┐                    │
+│ │ Error Count │ Affected    │ Start Time  │ Duration    │                    │
+│ │             │ Customers   │             │             │                    │
+│ └─────────────┴─────────────┴─────────────┴─────────────┘                    │
+├──────────────────────────────────────────────────────────────────────────────┤
+│ Row 6-11: Timeline                                                            │
+│ ┌────────────────────────────────────────────────────────────────────────────┐
+│ │ Error Timeline (narrow bins, 10s-30s)                                       │
+│ └────────────────────────────────────────────────────────────────────────────┘
+├──────────────────────────────────────────────────────────────────────────────┤
+│ Row 12-17: Breakdown                                                          │
+│ ┌────────────────────────────────┬────────────────────────────────┐          │
+│ │ Errors by Route                │ Errors by Error Message        │          │
+│ └────────────────────────────────┴────────────────────────────────┘          │
+│ ┌────────────────────────────────┬────────────────────────────────┐          │
+│ │ Errors by Pod                  │ Errors by Customer             │          │
+│ └────────────────────────────────┴────────────────────────────────┘          │
+├──────────────────────────────────────────────────────────────────────────────┤
+│ Row 18+: Evidence (large LogStream)                                           │
+│ ┌────────────────────────────────────────────────────────────────────────────┐
+│ │ Raw Error Logs (h=10)                                                       │
+│ └────────────────────────────────────────────────────────────────────────────┘
+└──────────────────────────────────────────────────────────────────────────────┘
+```
+
+---
+
+## Kubernetes Cluster Overview
+
+Infrastructure-focused layout.
+
+```
+┌──────────────────────────────────────────────────────────────────────────────┐
+│ Row 0: Cluster Health Stats                                                   │
+│ ┌─────────────┬─────────────┬─────────────┬─────────────┐                    │
+│ │ Nodes Ready │ Pods Running│ Restarts    │ OOMKills    │                    │
+│ └─────────────┴─────────────┴─────────────┴─────────────┘                    │
+├──────────────────────────────────────────────────────────────────────────────┤
+│ Row 2-5: Resource Usage                                                       │
+│ ┌────────────────────────────────┬────────────────────────────────┐          │
+│ │ CPU Usage by Namespace         │ Memory Usage by Namespace      │          │
+│ └────────────────────────────────┴────────────────────────────────┘          │
+├──────────────────────────────────────────────────────────────────────────────┤
+│ Row 6-9: Pod Issues                                                           │
+│ ┌────────────────────────────────┬────────────────────────────────┐          │
+│ │ Pods with Restarts             │ Pods with High CPU             │          │
+│ └────────────────────────────────┴────────────────────────────────┘          │
+├──────────────────────────────────────────────────────────────────────────────┤
+│ Row 10+: Events                                                               │
+│ ┌────────────────────────────────────────────────────────────────────────────┐
+│ │ Warning/Error Events (LogStream)                                            │
+│ └────────────────────────────────────────────────────────────────────────────┘
+└──────────────────────────────────────────────────────────────────────────────┘
+```
+
+---
+
+## Layout Best Practices
+
+### Alignment
+- Align related panels vertically or horizontally
+- Keep consistent heights within rows
+- Don't mix units in adjacent panels without clear separation
+
+### Visual Hierarchy
+- Most important panels: top-left, larger
+- Supporting context: smaller, below/right
+- Evidence/logs: bottom, full-width
+
+### Responsive Considerations
+- Minimum useful width: w=6 for stats, w=12 for charts/tables
+- Full-width panels (w=24) for logs and complex tables
+- Test at common screen sizes
+
+### Section Separation
+- Use Note panels as section headers
+- Or use vertical spacing (leave y gaps)
+- Group related panels by theme/question
diff --git a/.agents/skills/building-dashboards/reference/metrics-mpl.md b/.agents/skills/building-dashboards/reference/metrics-mpl.md
new file mode 100644
index 00000000..8298d8df
--- /dev/null
+++ b/.agents/skills/building-dashboards/reference/metrics-mpl.md
@@ -0,0 +1,61 @@
+# Metrics/MPL Chart Contract
+
+This reference documents the chart query contract for *metrics-backed* dashboard charts.
+
+Metrics charts require **two** fields:
+
+- `query.apl` — the MPL pipeline string (same field name used for APL queries).
+- `query.metricsDataset` — the dataset name (e.g. `"otel-metrics"`). This field is what tells the backend to interpret `apl` as MPL. Without it, the chart will not behave correctly even if the pipeline string is well-formed.
+
+Do not send `query.mpl` in create payloads — the create API rejects it even though GET responses for existing metrics dashboards may include it.
+
+> **CRITICAL:** Run `scripts/metrics/metrics-spec <deployment> <dataset>` before composing your first MPL query in a session. NEVER guess MPL syntax.
+
+## Canonical JSON Shape
+
+```json
+{
+  "type": "TimeSeries",
+  "query": {
+    "apl": "`otel-metrics`:`http.server.duration`\n| where `service.name` == \"api\"\n| align to $__interval using avg\n| group by `service.name` using avg",
+    "metricsDataset": "otel-metrics"
+  }
+}
+```
+
+### Required and Optional Fields
+
+| Field | Required? | Description |
+|-------|-----------|-------------|
+| `apl` | ✅ Yes | The MPL pipeline string. Use this field even for MPL content. |
+| `metricsDataset` | ✅ Yes (for metrics charts) | Dataset name (e.g. `"otel-metrics"`). Denotes the chart as MPL — without it the backend treats `apl` as APL. |
+| `mpl` | ❌ No (rejected) | GET may return it for existing metrics charts, but create rejects it. Put the MPL string in `apl` instead. |
+| `metricsMetric` | ❌ No | UI/editor metadata; not needed for hand-authored create payloads |
+| `metricsFilter` | ❌ No | UI/editor metadata; not needed for hand-authored create payloads |
+| `metricsTransformations` | ❌ No | UI/editor metadata; not needed for hand-authored create payloads |
+
+> **Why both `apl` and `metricsDataset`?** The dashboard create API uses `apl` as the query text field for both APL and MPL queries. `metricsDataset` is the discriminator that flags the chart as MPL. The dataset/metric selector is also embedded in the MPL string itself (e.g. `` `otel-metrics`:`http.server.duration` ``), but `metricsDataset` must still be set explicitly.
+
+## Authoring Checklist
+
+When generating metrics chart JSON:
+
+1. Confirm dataset kind is `otel:metrics:v1` via `scripts/metrics/datasets <deploy>`.
+2. Run `scripts/metrics/metrics-spec <deploy> <dataset>` to learn the full MPL syntax — **mandatory, never guess**.
+3. Discover available metrics and tags with `scripts/metrics/metrics-info`. If results are empty, retry with `--start` set to 7 days ago (sparse metrics may not have data in the default 24h window).
+4. Put the full MPL pipeline in `query.apl` AND set `query.metricsDataset` to the dataset name. Do not set `query.mpl` — the create API rejects it.
+5. **Use `align to $__interval`, not a fixed window.** The dashboard runtime injects `$__interval` based on the time picker and panel width; a fixed `align to 1m` produces broken granularity outside its design range. Do not add `param $__interval: Duration;` to the chart string — the runtime injects it. Pre-validation via `scripts/metrics/metrics-query` requires substituting a concrete duration for that call only.
+6. Validate your query with `scripts/metrics/metrics-query` before embedding in the dashboard.
+
+> **Note:** `find-metrics <value>` searches tag values, not metric names. Use `metrics-info <deploy> <dataset> metrics` to list metric names.
+
+## Metrics Discovery & Query Scripts
+
+| Script | Usage |
+|--------|-------|
+| `scripts/metrics/datasets <deploy> [--kind <kind>]` | List datasets (with edge deployment info) |
+| `scripts/metrics/metrics-spec <deploy> <dataset>` | Fetch MPL query specification |
+| `scripts/metrics/metrics-info <deploy> <dataset> ...` | Discover metrics, tags, and values |
+| `scripts/metrics/metrics-query <deploy> <mpl> <start> <end>` | Execute a metrics query |
+
+> These scripts are vendored from `query-metrics`. Keep in sync if upstream behavior changes.
diff --git a/.agents/skills/building-dashboards/reference/smartfilter.md b/.agents/skills/building-dashboards/reference/smartfilter.md
new file mode 100644
index 00000000..1d0a01a8
--- /dev/null
+++ b/.agents/skills/building-dashboards/reference/smartfilter.md
@@ -0,0 +1,135 @@
+# SmartFilter (Filter Bar) Configuration
+
+SmartFilter is a **chart type** that creates dropdown/search filters. It requires TWO parts:
+1. A `SmartFilter` chart in the `charts` array with filter definitions
+2. `declare query_parameters` in each panel query that should respond to filters
+
+## SmartFilter Chart JSON Structure
+
+```json
+{
+  "id": "country-filter",
+  "name": "Filters",
+  "type": "SmartFilter",
+  "query": {"apl": ""},
+  "filters": [
+    {
+      "id": "country_filter",
+      "name": "Country",
+      "type": "select",
+      "selectType": "apl",
+      "active": true,
+      "apl": {
+        "apl": "['logs'] | distinct ['geo.country'] | project key=['geo.country'], value=['geo.country'] | sort by key asc",
+        "queryOptions": {"quickRange": "1h"}
+      },
+      "options": [
+        {"key": "All", "value": "", "default": true}
+      ]
+    }
+  ]
+}
+```
+
+## Filter Types
+
+### Dynamic APL Dropdown (`selectType: "apl"`)
+
+Populates options from an APL query.
+
+**Requirements:**
+- `apl.apl`: Query returning `key` and `value` columns
+- `apl.queryOptions.quickRange`: Time range for the query (e.g., `"1h"`, `"7d"`)
+- `options`: Must include at least `[{"key": "All", "value": "", "default": true}]`
+
+### Static List Dropdown (`selectType: "list"`)
+
+Uses predefined options only.
+
+```json
+{
+  "id": "status_filter",
+  "name": "Status",
+  "type": "select",
+  "selectType": "list",
+  "active": true,
+  "options": [
+    {"key": "All", "value": "", "default": true},
+    {"key": "2xx", "value": "2"},
+    {"key": "4xx", "value": "4"},
+    {"key": "5xx", "value": "5"}
+  ]
+}
+```
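+
+A panel query can consume this filter with a prefix match on the status code. The sketch below assumes the option values (`"2"`, `"4"`, `"5"`) are meant as status-class prefixes and that the `['logs']` dataset has a numeric `status` field; neither assumption is confirmed by the filter definition itself:
+
+```apl
+// Assumed: `status` is numeric, so it is converted to a string before the prefix match.
+// An empty status_filter ("All") disables the filter.
+declare query_parameters (status_filter:string = "");
+['logs']
+| where isempty(status_filter) or tostring(status) startswith status_filter
+| summarize count() by bin_auto(_time)
+```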
+
+### Search Filter (`type: "search"`)
+
+Free-text input instead of dropdown:
+
+```json
+{
+  "id": "trace_id",
+  "name": "Trace ID",
+  "type": "search",
+  "selectType": "list",
+  "active": true,
+  "options": [{"key": "All", "value": "", "default": true}]
+}
+```
+
+## Panel Query Integration
+
+Panel queries must declare the filter parameters and handle the empty ("All") case:
+
+```apl
+declare query_parameters (country_filter:string = "");
+['logs']
+| where isempty(country_filter) or ['geo.country'] == country_filter
+| summarize count() by bin_auto(_time)
+```
+
+## Filter Query for Dynamic Dropdowns
+
+```apl
+['logs']
+| distinct ['geo.country']
+| project key=['geo.country'], value=['geo.country']
+| sort by key asc
+```
+
+## Dependent/Cascading Filters
+
+A filter can depend on another filter by declaring the parent filter's parameter in its own APL query:
+
+```json
+{
+  "id": "city_filter",
+  "name": "City",
+  "type": "select",
+  "selectType": "apl",
+  "active": true,
+  "apl": {
+    "apl": "declare query_parameters (country_filter:string=\"\");\n['logs']\n| where isempty(country_filter) or ['geo.country'] == country_filter\n| distinct ['geo.city']\n| project key=['geo.city'], value=['geo.city']",
+    "queryOptions": {"quickRange": "1h"}
+  },
+  "options": [{"key": "All", "value": "", "default": true}]
+}
+```
+
+The city dropdown re-queries when `country_filter` changes and shows only cities in the selected country, or every city when Country is set to "All".
+
+## Layout
+
+Place the SmartFilter at y=0 and full width (w=12, h=1), then shift the other panels down. The layout entry's `i` must match the SmartFilter chart's `id`:
+
+```json
+{"i": "country-filter", "x": 0, "y": 0, "w": 12, "h": 1}
+```
+
+## Best Practices
+
+- Filter `id` must match the parameter name in `declare query_parameters`
+- Use `isempty(filter)` check so "All" option works (empty string = no filter)
+- One SmartFilter chart can contain multiple filters
+- Place at top of dashboard (y=0) for visibility
+- For cascading filters, order matters: parent filter should come before dependent filters
diff --git a/.agents/skills/building-dashboards/reference/splunk-migration.md b/.agents/skills/building-dashboards/reference/splunk-migration.md
new file mode 100644
index 00000000..beb87fa0
--- /dev/null
+++ b/.agents/skills/building-dashboards/reference/splunk-migration.md
@@ -0,0 +1,243 @@
+# Splunk Dashboard Migration
+
+Guide for converting Splunk dashboards to Axiom dashboards.
+
+---
+
+## Migration Workflow
+
+1. **Export Splunk dashboard** (XML or JSON)
+2. **Inventory panels** — list each panel with its SPL query and visualization type
+3. **Translate SPL → APL** using the `spl-to-apl` skill
+4. **Map visualization types** (see table below)
+5. **Test queries** with explicit time filters in the Query tab (dashboards inherit the time range from the UI picker)
+6. **Adjust binning** by mapping `span=` to `bin(_time, ...)` or `bin_auto(_time)`
+7. **Build Axiom dashboard** using templates (remove time filters from panel queries)
+8. **Validate and deploy** with `dashboard-validate` and `dashboard-create`
+
+---
+
+## Visualization Type Mapping
+
+| Splunk Visualization | Axiom Chart Type | Notes |
+|---------------------|------------------|-------|
+| Single Value | Statistic | Direct mapping |
+| Line Chart | TimeSeries | Group by `bin(_time, ...)` or `bin_auto(_time)` |
+| Area Chart | TimeSeries | Same as line |
+| Column Chart | TimeSeries | Axiom renders as bars |
+| Bar Chart (horizontal) | Table | No horizontal bar; use table |
+| Pie Chart | Pie | Limit to ≤6 categories |
+| Table | Table | Direct mapping |
+| Events List | LogStream | Add `take N` and `project-keep` |
+| Choropleth Map | Table | No map support; use table |
+| Scatter Plot | Table | No scatter; use table with dimensions |
+
+---
+
+## Panel Translation Examples
+
+**Note:** Dashboard panel queries do NOT need time filters—the dashboard UI time picker applies to all panels automatically. The examples below show the final dashboard query format.
+
+### Single Value → Statistic
+
+**Splunk:**
+```spl
+index=web status>=500
+| stats count as errors
+```
+
+**Axiom (dashboard panel):**
+```apl
+['web-logs']
+| where status >= 500
+| summarize errors = count()
+```
+
+### Timechart → TimeSeries
+
+**Splunk:**
+```spl
+index=web
+| timechart span=5m count by status
+```
+
+**Axiom (dashboard panel):**
+```apl
+['web-logs']
+| summarize count() by bin_auto(_time), status
+```
+
+### Stats Table → Table
+
+**Splunk:**
+```spl
+index=web status>=500
+| stats count by uri
+| sort - count
+| head 10
+```
+
+**Axiom (dashboard panel):**
+```apl
+['web-logs']
+| where status >= 500
+| summarize count = count() by uri
+| top 10 by count
+| project URI = uri, Errors = count
+```
+
+### Top Command → Table
+
+**Splunk:**
+```spl
+index=web
+| top limit=10 user_agent
+```
+
+**Axiom (dashboard panel):**
+```apl
+['web-logs']
+| summarize count() by user_agent
+| top 10 by count_
+| project "User Agent" = user_agent, Count = count_
+```
+
+### Events Search → LogStream
+
+**Splunk:**
+```spl
+index=web status>=500
+| table _time, uri, status, error_message
+```
+
+**Axiom (dashboard panel):**
+```apl
+['web-logs']
+| where status >= 500
+| project-keep _time, uri, status, error_message
+| order by _time desc
+| take 100
+```
+
+### Chart with Eval → TimeSeries
+
+**Splunk:**
+```spl
+index=web
+| timechart span=5m count as total, count(eval(status>=500)) as errors
+| eval error_rate = round(errors/total*100, 2)
+```
+
+**Axiom (dashboard panel):**
+```apl
+['web-logs']
+| summarize 
+    total = count(),
+    errors = countif(status >= 500)
+  by bin_auto(_time)
+| extend error_rate = round(100.0 * errors / total, 2)
+| project _time, error_rate
+```
+
+---
+
+## Time Range Translation
+
+Splunk dashboards use time pickers. Axiom dashboards also have a time picker that automatically scopes all queries—**panel queries don't need explicit time filters**.
+
+For **ad-hoc testing** in the Query tab, use these time filters:
+
+| Splunk Time Picker | Axiom APL (for Query tab testing) |
+|-------------------|-----------------------------------|
+| Last 15 minutes | `where _time between (ago(15m) .. now())` |
+| Last 60 minutes | `where _time between (ago(1h) .. now())` |
+| Last 4 hours | `where _time between (ago(4h) .. now())` |
+| Last 24 hours | `where _time between (ago(24h) .. now())` |
+| Last 7 days | `where _time between (ago(7d) .. now())` |
+| Today | `where _time between (startofday(now()) .. now())` |
+| Yesterday | `where _time between (startofday(ago(1d)) .. startofday(now()))` |
+
+**Remember:** Remove time filters when placing queries in dashboard panels.
+
+---
+
+## Binning Adjustment
+
+Splunk `timechart span=` maps to Axiom `bin(_time, ...)`.
+
+| Splunk | Axiom |
+|--------|-------|
+| `span=1m` | `bin(_time, 1m)` |
+| `span=5m` | `bin(_time, 5m)` |
+| `span=1h` | `bin(_time, 1h)` |
+| `span=1d` | `bin(_time, 1d)` |
+
+Or use `bin_auto(_time)` for automatic sizing based on time range.
+
+---
+
+## Field Name Mapping
+
+Splunk and Axiom may have different field names for the same data.
+
+| Concept | Splunk (common) | Axiom (common) |
+|---------|-----------------|----------------|
+| Timestamp | `_time` | `_time` |
+| Raw event | `_raw` | `_raw` or structured fields |
+| Source | `source` | `_source` or custom |
+| Host | `host` | `host` or `['kubernetes.node.name']` |
+| Index | `index` | N/A (use dataset) |
+
+**Tip:** Run `getschema` on your Axiom dataset to discover actual field names:
+```apl
+['your-dataset'] | where _time between (ago(1h) .. now()) | getschema
+```
+
+---
+
+## Features Without Direct Equivalents
+
+| Splunk Feature | Axiom Approach |
+|----------------|----------------|
+| `transaction` | Use `summarize` with `make_list()` grouped by session/trace |
+| `streamstats` | No direct equivalent; approximate with window functions |
+| `eventstats` | Use subquery + join |
+| Drilldown actions | Use SmartFilter for interactive filtering |
+| Trellis layout | Create separate panels per dimension |
+| Real-time search | Use short time window + fast refresh |
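+
+As an illustration of the `transaction` row, the sketch below groups events by a session key and collects the visited URIs. The `session_id` field is hypothetical, and `make_list()` does not guarantee the order of the collected values:
+
+```apl
+// Assumed: `session_id` identifies the events that belong to one "transaction".
+// Assumed: dataset ['web-logs'] with a `uri` field, as in the examples above.
+['web-logs']
+| summarize events = make_list(uri), first_seen = min(_time), last_seen = max(_time), steps = count() by session_id
+| extend duration_s = datetime_diff('second', last_seen, first_seen)
+```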
+
+---
+
+## Common Migration Pitfalls
+
+### Unbounded Results
+**Problem:** Splunk applies implicit result limits; Axiom may return every matching row.
+**Fix:** Add `| top N by ...` or `| take N` for tables/logs.
+
+### Case Sensitivity
+**Problem:** Splunk search is case-insensitive by default.
+**Fix:** Use `has` (case-insensitive) or `tolower()` for matching.
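+
+Both fixes can be sketched as follows, assuming `error_message` and `method` are string fields in the dataset (field names borrowed from the examples in this guide, not verified against a schema):
+
+```apl
+// `has` matches whole terms and ignores case.
+// tolower() normalizes the field before an exact, case-sensitive comparison.
+['web-logs']
+| where error_message has "timeout"
+| where tolower(method) == "get"
+```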
+
+### Field Escaping
+**Problem:** Splunk uses bare field names; Axiom requires bracket quoting for field names that contain dots.
+**Fix:** `field.name` → `['field.name']`
+
+### Different Aggregation Names
+**Problem:** Function names differ between SPL and APL.
+**Fix:** Consult `spl-to-apl` skill for complete mapping.
+
+---
+
+## Migration Checklist
+
+- [ ] Inventory all panels from Splunk dashboard
+- [ ] Map each panel's visualization type
+- [ ] Translate SPL queries using spl-to-apl
+- [ ] Verify field names with getschema
+- [ ] Test queries in Query tab (with time filters for testing)
+- [ ] Add `top N` or `take N` where needed
+- [ ] Test each query individually in Axiom
+- [ ] Build dashboard JSON (remove time filters from panel queries)
+- [ ] Validate with `dashboard-validate`
+- [ ] Deploy with `dashboard-create`
+- [ ] Compare visually to original Splunk dashboard
diff --git a/.agents/skills/building-dashboards/reference/templates/api-health.json b/.agents/skills/building-dashboards/reference/templates/api-health.json
new file mode 100644
index 00000000..39f345c7
--- /dev/null
+++ b/.agents/skills/building-dashboards/reference/templates/api-health.json
@@ -0,0 +1,131 @@
+{
+  "name": "{{service}} - API Health",
+  "description": "HTTP API health dashboard showing golden signals: traffic, errors, latency, and status distribution.",
+  "owner": "X-AXIOM-EVERYONE",
+  "datasets": ["{{dataset}}"],
+  "refreshTime": 60,
+  "schemaVersion": 2,
+  "timeWindowStart": "qr-now-1h",
+  "timeWindowEnd": "qr-now",
+  "charts": [
+    {
+      "id": "total-requests",
+      "name": "Total Requests",
+      "type": "Statistic",
+      "query": {
+        "apl": "['{{dataset}}'] | summarize total = count()"
+      }
+    },
+    {
+      "id": "error-rate",
+      "name": "Error Rate (%)",
+      "type": "Statistic",
+      "query": {
+        "apl": "['{{dataset}}'] | summarize total = count(), errors = countif(status >= 500) | extend error_rate = iff(total > 0, round(100.0 * errors / total, 2), 0.0) | project error_rate"
+      }
+    },
+    {
+      "id": "p50-latency",
+      "name": "p50 Latency (ms)",
+      "type": "Statistic",
+      "query": {
+        "apl": "['{{dataset}}'] | summarize p50 = percentile(duration_ms, 50)"
+      }
+    },
+    {
+      "id": "p99-latency",
+      "name": "p99 Latency (ms)",
+      "type": "Statistic",
+      "query": {
+        "apl": "['{{dataset}}'] | summarize p99 = percentile(duration_ms, 99)"
+      }
+    },
+    {
+      "id": "traffic-ts",
+      "name": "Request Rate Over Time",
+      "type": "TimeSeries",
+      "query": {
+        "apl": "['{{dataset}}'] | summarize requests = count() by bin_auto(_time)"
+      }
+    },
+    {
+      "id": "error-rate-ts",
+      "name": "Error Rate Over Time",
+      "type": "TimeSeries",
+      "query": {
+        "apl": "['{{dataset}}'] | summarize total = count(), errors = countif(status >= 500) by bin_auto(_time) | extend error_rate = iff(total > 0, round(100.0 * errors / total, 2), 0.0) | project _time, error_rate"
+      }
+    },
+    {
+      "id": "latency-percentiles-ts",
+      "name": "Latency Percentiles Over Time",
+      "type": "TimeSeries",
+      "query": {
+        "apl": "['{{dataset}}'] | summarize percentiles_array(duration_ms, 50, 95, 99) by bin_auto(_time)"
+      }
+    },
+    {
+      "id": "status-distribution",
+      "name": "Status Code Distribution",
+      "type": "Pie",
+      "query": {
+        "apl": "['{{dataset}}'] | extend status_class = case(status < 300, '2xx', status < 400, '3xx', status < 500, '4xx', '5xx') | summarize count() by status_class"
+      }
+    },
+    {
+      "id": "traffic-by-status-ts",
+      "name": "Traffic by Status Class",
+      "type": "TimeSeries",
+      "query": {
+        "apl": "['{{dataset}}'] | extend status_class = case(status < 300, '2xx', status < 400, '3xx', status < 500, '4xx', '5xx') | summarize count() by bin_auto(_time), status_class"
+      }
+    },
+    {
+      "id": "top-routes-by-traffic",
+      "name": "Top Routes by Traffic",
+      "type": "Table",
+      "query": {
+        "apl": "['{{dataset}}'] | summarize requests = count(), errors = countif(status >= 500), p95 = percentile(duration_ms, 95) by route | top 10 by requests | project Route = route, Requests = requests, Errors = errors, 'p95 (ms)' = p95"
+      }
+    },
+    {
+      "id": "top-routes-by-errors",
+      "name": "Top Routes by Errors",
+      "type": "Table",
+      "query": {
+        "apl": "['{{dataset}}'] | where status >= 500 | summarize errors = count() by route, status | top 10 by errors | project Route = route, Status = status, Errors = errors"
+      }
+    },
+    {
+      "id": "slowest-routes",
+      "name": "Slowest Routes (by p99)",
+      "type": "Table",
+      "query": {
+        "apl": "['{{dataset}}'] | summarize requests = count(), p99 = percentile(duration_ms, 99) by route | where requests >= 10 | top 10 by p99 | project Route = route, 'p99 (ms)' = p99, Requests = requests"
+      }
+    },
+    {
+      "id": "recent-errors",
+      "name": "Recent Errors",
+      "type": "LogStream",
+      "query": {
+        "apl": "['{{dataset}}'] | where status >= 500 | project-keep _time, trace_id, method, route, status, error_message, duration_ms | order by _time desc | take 100"
+      }
+    }
+  ],
+  "layout": [
+    {"i": "total-requests", "x": 0, "y": 0, "w": 3, "h": 2},
+    {"i": "error-rate", "x": 3, "y": 0, "w": 3, "h": 2},
+    {"i": "p50-latency", "x": 6, "y": 0, "w": 3, "h": 2},
+    {"i": "p99-latency", "x": 9, "y": 0, "w": 3, "h": 2},
+    {"i": "traffic-ts", "x": 0, "y": 2, "w": 6, "h": 4},
+    {"i": "error-rate-ts", "x": 6, "y": 2, "w": 6, "h": 4},
+    {"i": "latency-percentiles-ts", "x": 0, "y": 6, "w": 6, "h": 4},
+    {"i": "status-distribution", "x": 6, "y": 6, "w": 3, "h": 4},
+    {"i": "traffic-by-status-ts", "x": 9, "y": 6, "w": 3, "h": 4},
+    {"i": "top-routes-by-traffic", "x": 0, "y": 10, "w": 4, "h": 4},
+    {"i": "top-routes-by-errors", "x": 4, "y": 10, "w": 4, "h": 4},
+    {"i": "slowest-routes", "x": 8, "y": 10, "w": 4, "h": 4},
+    {"i": "recent-errors", "x": 0, "y": 14, "w": 12, "h": 6}
+  ]
+}
diff --git a/.agents/skills/building-dashboards/reference/templates/blank.json b/.agents/skills/building-dashboards/reference/templates/blank.json
new file mode 100644
index 00000000..636cc1a2
--- /dev/null
+++ b/.agents/skills/building-dashboards/reference/templates/blank.json
@@ -0,0 +1,12 @@
+{
+  "name": "{{name}}",
+  "description": "{{description}}",
+  "owner": "X-AXIOM-EVERYONE",
+  "datasets": ["{{dataset}}"],
+  "refreshTime": 60,
+  "schemaVersion": 2,
+  "timeWindowStart": "qr-now-1h",
+  "timeWindowEnd": "qr-now",
+  "charts": [],
+  "layout": []
+}
diff --git a/.agents/skills/building-dashboards/reference/templates/org-usage-cost-control.json b/.agents/skills/building-dashboards/reference/templates/org-usage-cost-control.json
new file mode 100644
index 00000000..08bfc155
--- /dev/null
+++ b/.agents/skills/building-dashboards/reference/templates/org-usage-cost-control.json
@@ -0,0 +1,384 @@
+{
+  "charts": [
+    {
+      "filters": [
+        {
+          "active": true,
+          "apl": {
+            "apl": "['axiom-audit'] | where action == 'usageCalculated' | distinct ['resource.id'] | project key=['resource.id'], value=['resource.id'] | sort by key asc",
+            "queryOptions": {
+              "quickRange": "24h"
+            }
+          },
+          "id": "org_filter",
+          "name": "Organization",
+          "options": [
+            {
+              "default": true,
+              "key": "All Orgs",
+              "value": ""
+            }
+          ],
+          "selectType": "apl",
+          "type": "select"
+        },
+        {
+          "active": true,
+          "apl": {
+            "apl": "['axiom-audit'] | where action == 'usageCalculated' | distinct tostring(['properties.dataset']) | project key=tostring(['properties.dataset']), value=tostring(['properties.dataset']) | sort by key asc",
+            "queryOptions": {
+              "quickRange": "24h"
+            }
+          },
+          "id": "dataset_filter",
+          "name": "Dataset",
+          "options": [
+            {
+              "default": true,
+              "key": "All Datasets",
+              "value": ""
+            }
+          ],
+          "selectType": "apl",
+          "type": "select"
+        },
+        {
+          "active": true,
+          "apl": {
+            "apl": "[\"axiom-audit\"] | where action == \"runAPLQueryCost\" | extend User = case(isnotempty([\"actor.email\"]), [\"actor.email\"], isnotempty([\"actor.name\"]), strcat(\"[\", [\"actor.name\"], \"]\"), \"[unknown]\") | distinct User | project key=User, value=User | sort by key asc",
+            "queryOptions": {
+              "quickRange": "24h"
+            }
+          },
+          "id": "user_filter",
+          "name": "User",
+          "options": [
+            {
+              "default": true,
+              "key": "All Users",
+              "value": ""
+            }
+          ],
+          "selectType": "apl",
+          "type": "select"
+        },
+        {
+          "active": true,
+          "id": "contract_gb",
+          "name": "Contract (GB/mo)",
+          "type": "search",
+          "options": [
+            {
+              "default": true,
+              "key": "",
+              "value": ""
+            }
+          ]
+        }
+      ],
+      "id": "filters",
+      "name": "Filters",
+      "query": {
+        "apl": ""
+      },
+      "type": "SmartFilter"
+    },
+    {
+      "colorScheme": "Blue",
+      "customUnits": "GB",
+      "id": "total-ingest-gb",
+      "name": "Total Ingest",
+      "query": {
+        "apl": "declare query_parameters (org_filter:string = \"\", dataset_filter:string = \"\");\n['axiom-audit']\n| where action == 'usageCalculated'\n| where isempty(org_filter) or ['resource.id'] == org_filter\n| where isempty(dataset_filter) or tostring(['properties.dataset']) == dataset_filter\n| summarize total_bytes = sum(['properties.hourly_ingest_bytes'])\n| extend ['Total'] = total_bytes / 1000000000\n| project ['Total']"
+      },
+      "type": "Statistic"
+    },
+    {
+      "colorScheme": "Orange",
+      "customUnits": "GB/day",
+      "id": "daily-burn-rate",
+      "name": "Daily Burn Rate",
+      "query": {
+        "apl": "declare query_parameters (org_filter:string = \"\", dataset_filter:string = \"\");\n['axiom-audit']\n| where action == 'usageCalculated'\n| where isempty(org_filter) or ['resource.id'] == org_filter\n| where isempty(dataset_filter) or tostring(['properties.dataset']) == dataset_filter\n| summarize total_bytes = sum(['properties.hourly_ingest_bytes']), day_count = dcount(bin(_time, 1d))\n| extend days = iff(day_count == 0, 1.0, todouble(day_count))\n| extend ['GB/day'] = round(total_bytes / days / 1000000000, 0)\n| project ['GB/day']"
+      },
+      "showChart": true,
+      "type": "Statistic",
+      "unit": "Gigabyte"
+    },
+    {
+      "colorScheme": "Purple",
+      "customUnits": "GB",
+      "errorThreshold": "Above",
+      "errorThresholdValue": "15000000",
+      "id": "monthly-projection",
+      "name": "30-Day Projection",
+      "query": {
+        "apl": "declare query_parameters (org_filter:string = \"\", dataset_filter:string = \"\");\n['axiom-audit']\n| where action == 'usageCalculated'\n| where isempty(org_filter) or ['resource.id'] == org_filter\n| where isempty(dataset_filter) or tostring(['properties.dataset']) == dataset_filter\n| summarize total_bytes = sum(['properties.hourly_ingest_bytes']), day_count = dcount(bin(_time, 1d))\n| extend days = iff(day_count == 0, 1.0, todouble(day_count))\n| extend daily_rate = total_bytes / days\n| extend ['30d GB'] = round(daily_rate * 30 / 1000000000, 0)\n| project ['30d GB']"
+      },
+      "type": "Statistic",
+      "warningThreshold": "Above",
+      "warningThresholdValue": "5000000"
+    },
+    {
+      "colorScheme": "Teal",
+      "customUnits": "GB·ms",
+      "id": "total-query-cost",
+      "name": "Query Cost",
+      "query": {
+        "apl": "declare query_parameters (org_filter:string = \"\", dataset_filter:string = \"\");\n[\"axiom-audit\"]\n| where action == \"usageCalculated\"\n| where isempty(org_filter) or [\"resource.id\"] == org_filter\n| where isempty(dataset_filter) or tostring([\"properties.dataset\"]) == dataset_filter\n| summarize [\"GB·ms\"] = round(sum([\"properties.hourly_billable_query_gbms\"]), 0)\n| project [\"GB·ms\"]"
+      },
+      "type": "Statistic"
+    },
+    {
+      "colorScheme": "Yellow",
+      "customUnits": "%",
+      "errorThreshold": "Above",
+      "errorThresholdValue": "25",
+      "id": "wow-change",
+      "name": "Week-over-Week",
+      "overrideDashboardTimeRange": true,
+      "query": {
+        "apl": "declare query_parameters (org_filter:string = \"\", dataset_filter:string = \"\");\n['axiom-audit']\n| where _time between (ago(14d) .. now())\n| where action == 'usageCalculated'\n| where isempty(org_filter) or ['resource.id'] == org_filter\n| where isempty(dataset_filter) or tostring(['properties.dataset']) == dataset_filter\n| summarize this_week = sumif(['properties.hourly_ingest_bytes'], _time >= ago(7d)), last_week = sumif(['properties.hourly_ingest_bytes'], _time < ago(7d) and _time >= ago(14d))\n| extend ['WoW %'] = iff(last_week == 0, real(null), round(100.0 * (this_week - last_week) / last_week, 1))\n| project ['WoW %']"
+      },
+      "type": "Statistic",
+      "warningThreshold": "Above",
+      "warningThresholdValue": "10"
+    },
+
+    {
+      "colorScheme": "Green",
+      "id": "active-datasets",
+      "name": "Active Datasets",
+      "query": {
+        "apl": "declare query_parameters (org_filter:string = \"\", dataset_filter:string = \"\");\n['axiom-audit']\n| where action == 'usageCalculated'\n| where isempty(org_filter) or ['resource.id'] == org_filter\n| where isempty(dataset_filter) or tostring(['properties.dataset']) == dataset_filter\n| summarize ['Datasets'] = dcount(['properties.dataset'])"
+      },
+      "type": "Statistic"
+    },
+    {
+      "datasetId": "axiom-audit",
+      "id": "ingest-by-dataset-ts",
+      "modified": 1769193868070,
+      "name": "Daily Ingest by Dataset (GB)",
+      "numSeries": 1,
+      "overrideDashboardCompareAgainst": false,
+      "overrideDashboardTimeRange": false,
+      "query": {
+        "apl": "declare query_parameters (org_filter:string = \"\", dataset_filter:string = \"\");\n[\"axiom-audit\"]\n| where action == \"usageCalculated\"\n| where isempty(org_filter) or [\"resource.id\"] == org_filter\n| where isempty(dataset_filter) or tostring([\"properties.dataset\"]) == dataset_filter\n| summarize [\"GB\"] = round(sum([\"properties.hourly_ingest_bytes\"]) / 1000000000, 1) by bin(_time, 1d), Dataset = tostring([\"properties.dataset\"])",
+        "endTime": "",
+        "libraries": [],
+        "queryOptions": {
+          "additionalQueryOptions": {
+            "aggChartOpts": "{\"*\":{\"variant\":\"bars\"},\"{\\\"alias\\\":\\\"GB\\\",\\\"field\\\":\\\"properties.hourly_ingest_bytes\\\",\\\"op\\\":\\\"computed\\\"}\":{\"displayNull\":\"auto\",\"variant\":\"bars\"}}"
+          },
+          "aggChartOpts": "{\"*\":{\"variant\":\"bars\"},\"{\\\"alias\\\":\\\"GB\\\",\\\"field\\\":\\\"properties.hourly_ingest_bytes\\\",\\\"op\\\":\\\"computed\\\"}\":{\"displayNull\":\"auto\",\"variant\":\"bars\"}}",
+          "containsTimeFilter": "false",
+          "endTime": "2026-01-23T18:44:21.579Z",
+          "quickRange": "7d",
+          "startTime": "2026-01-16T18:44:21.579Z",
+          "timeSeriesView": "charts"
+        },
+        "resolution": "auto",
+        "startTime": ""
+      },
+      "type": "TimeSeries"
+    },
+    {
+      "id": "top-datasets-ingest",
+      "name": "Top Datasets by Ingest",
+      "query": {
+        "apl": "declare query_parameters (org_filter:string = \"\", dataset_filter:string = \"\");\n[\"axiom-audit\"]\n| where action == \"usageCalculated\"\n| where isempty(org_filter) or [\"resource.id\"] == org_filter\n| where isempty(dataset_filter) or tostring([\"properties.dataset\"]) == dataset_filter\n| summarize ingest_gb = round(sum([\"properties.hourly_ingest_bytes\"]) / 1000000000, 1), query_gbms = round(sum([\"properties.hourly_billable_query_gbms\"]), 0) by Dataset = tostring([\"properties.dataset\"])\n| sort by ingest_gb desc\n| limit 15\n| project Dataset, [\"Ingest GB\"] = ingest_gb, [\"Query GB·ms\"] = query_gbms"
+      },
+      "tableSettings": {
+        "columns": [
+          {"name": "Dataset", "width": 200},
+          {"name": "Ingest GB", "width": 100},
+          {"name": "Query GB·ms", "width": 120}
+        ]
+      },
+      "type": "Table"
+    },
+    {
+      "id": "waste-candidates",
+      "name": "Lowest Query Activity",
+      "query": {
+        "apl": "declare query_parameters (org_filter:string = \"\", dataset_filter:string = \"\");\n[\"axiom-audit\"]\n| where action == \"usageCalculated\"\n| where isempty(org_filter) or [\"resource.id\"] == org_filter\n| where isempty(dataset_filter) or tostring([\"properties.dataset\"]) == dataset_filter\n| summarize ingest_bytes = sum([\"properties.hourly_ingest_bytes\"]), query_gbms = sum([\"properties.hourly_billable_query_gbms\"]) by Dataset = tostring([\"properties.dataset\"])\n| extend ingest_gb = round(ingest_bytes / 1000000000, 1)\n| extend work_per_gb = round(query_gbms / (ingest_gb + 0.001), 0)\n| where ingest_gb > 1\n| order by work_per_gb asc\n| project Dataset, [\"Ingest GB\"] = ingest_gb, [\"Query GB·ms\"] = round(query_gbms, 0), [\"Work/GB\"] = work_per_gb\n| take 10"
+      },
+      "tableSettings": {
+        "columns": [
+          {"name": "Dataset", "width": 200},
+          {"name": "Ingest GB", "width": 100},
+          {"name": "Query GB·ms", "width": 120},
+          {"name": "Work/GB", "width": 90}
+        ]
+      },
+      "type": "Table"
+    },
+    {
+      "id": "top-orgs",
+      "name": "Top Orgs by Usage",
+      "query": {
+        "apl": "declare query_parameters (org_filter:string = \"\", dataset_filter:string = \"\");\n[\"axiom-audit\"]\n| where action == \"usageCalculated\"\n| where isempty(org_filter) or [\"resource.id\"] == org_filter\n| where isempty(dataset_filter) or tostring([\"properties.dataset\"]) == dataset_filter\n| summarize ingest_gb = round(sum([\"properties.hourly_ingest_bytes\"]) / 1000000000, 1), query_gbms = round(sum([\"properties.hourly_billable_query_gbms\"]), 0), datasets = dcount([\"properties.dataset\"]) by Org = [\"resource.id\"]\n| sort by ingest_gb desc\n| limit 10\n| project Org, [\"Ingest GB\"] = ingest_gb, [\"Query GB·ms\"] = query_gbms, Datasets = datasets"
+      },
+      "tableSettings": {
+        "columns": [
+          {"name": "Org", "width": 180},
+          {"name": "Ingest GB", "width": 100},
+          {"name": "Query GB·ms", "width": 120},
+          {"name": "Datasets", "width": 80}
+        ]
+      },
+      "type": "Table"
+    },
+    {
+      "id": "note-actions",
+      "name": "Cost Optimization Actions",
+      "query": {},
+      "text": "## Cost Control Playbook\n\n### Understanding Work/GB\n\n**Work/GB** = Query Cost (GB·ms) ÷ Ingest GB\n\nThis ratio measures how much query activity occurs relative to the amount of data ingested. Lower values indicate data that is stored but rarely queried.\n\n- **0** = Data ingested but never queried\n- **Low values** = Candidates for optimization\n- **Higher values** = Actively used data\n\nThe **Lowest Query Activity** panel ranks datasets by this ratio, with the least-queried at the top.\n\n### Optimization Actions\n\n| Signal | Action |\n|--------|--------|\n| **Work/GB = 0** | Consider dropping the dataset or stopping ingestion |\n| **Low Work/GB + High Ingest** | Partition, sample, or filter at source |\n| **WoW spike** | Investigate recent deploys or new services |\n\n### Investigate Further\n\nUse **[axiom-sre](https://github.com/axiomhq/skills)** to investigate which logs are ingested but never queried, or which applications contribute the most events.\n\n```\nnpx skills add axiomhq/skills -s axiom-sre\n```",
+      "type": "Note"
+    },
+    {
+      "id": "query-cost-by-dataset-ts",
+      "name": "Daily Query Cost by Dataset (GB·ms)",
+      "query": {
+        "apl": "declare query_parameters (org_filter:string = \"\", dataset_filter:string = \"\");\n[\"axiom-audit\"]\n| where action == \"usageCalculated\"\n| where isempty(org_filter) or [\"resource.id\"] == org_filter\n| where isempty(dataset_filter) or tostring([\"properties.dataset\"]) == dataset_filter\n| summarize [\"GB·ms\"] = round(sum([\"properties.hourly_billable_query_gbms\"]), 0) by bin(_time, 1d), Dataset = tostring([\"properties.dataset\"])",
+        "queryOptions": {
+          "aggChartOpts": "{\"{\\\"alias\\\":\\\"GB·ms\\\",\\\"field\\\":\\\"properties.hourly_billable_query_gbms\\\",\\\"op\\\":\\\"computed\\\"}\":{\"variant\":\"bars\",\"displayNull\":\"auto\"}}"
+        }
+      },
+      "type": "TimeSeries"
+    },
+    {
+      "id": "wow-ingest-delta",
+      "name": "Top Ingest Movers (WoW)",
+      "query": {
+        "apl": "declare query_parameters (org_filter:string = \"\", dataset_filter:string = \"\");\n[\"axiom-audit\"]\n| where _time between (ago(14d) .. now())\n| where action == \"usageCalculated\"\n| where isempty(org_filter) or [\"resource.id\"] == org_filter\n| where isempty(dataset_filter) or tostring([\"properties.dataset\"]) == dataset_filter\n| summarize this_week = sumif([\"properties.hourly_ingest_bytes\"], _time >= ago(7d)), last_week = sumif([\"properties.hourly_ingest_bytes\"], _time < ago(7d) and _time >= ago(14d)) by Dataset = tostring([\"properties.dataset\"])\n| extend delta_gb = (this_week - last_week) / 1000000000\n| extend [\"This Week GB\"] = round(this_week / 1000000000, 1), [\"Last Week GB\"] = round(last_week / 1000000000, 1)\n| extend [\"Delta GB\"] = round(delta_gb, 1)\n| extend [\"Delta %\"] = iff(last_week == 0, 100.0, round(100.0 * (this_week - last_week) / last_week, 1))\n| where delta_gb > 10\n| sort by delta_gb desc\n| limit 10\n| project Dataset, [\"This Week GB\"], [\"Last Week GB\"], [\"Delta GB\"], [\"Delta %\"]"
+      },
+      "tableSettings": {
+        "columns": [
+          {"name": "Dataset", "width": 180},
+          {"name": "This Week GB", "width": 110},
+          {"name": "Last Week GB", "width": 110},
+          {"name": "Delta GB", "width": 90},
+          {"name": "Delta %", "width": 80}
+        ]
+      },
+      "type": "Table"
+    },
+    {
+      "id": "top-users-query-cost",
+      "name": "Top 10 Users by Query Cost",
+      "query": {
+        "apl": "declare query_parameters (org_filter:string = \"\", dataset_filter:string = \"\", user_filter:string = \"\");\n[\"axiom-audit\"]\n| where action == \"runAPLQueryCost\"\n| where isempty(org_filter) or [\"resource.id\"] == org_filter\n| extend User = case(isnotempty([\"actor.email\"]), [\"actor.email\"], isnotempty([\"actor.name\"]), strcat(\"[\", [\"actor.name\"], \"]\"), isnotempty([\"actor.userAgent\"]), strcat(\"[\", [\"actor.userAgent\"], \"]\"), isnotempty([\"actor.id\"]), strcat(\"[id:\", substring([\"actor.id\"], 0, 8), \"]\"), isnotempty(source), strcat(\"[source:\", source, \"]\"), \"[unknown]\")\n| where isempty(user_filter) or User == user_filter\n| summarize query_cost = sum([\"properties.query_cost_gbms\"]), queries = count() by User\n| sort by query_cost desc\n| limit 10\n| project User, [\"Cost GB·ms\"] = round(query_cost, 0), Queries = queries"
+      },
+      "tableSettings": {
+        "columns": [
+          {"name": "User", "width": 250},
+          {"name": "Cost GB·ms", "width": 120},
+          {"name": "Queries", "width": 80}
+        ]
+      },
+      "type": "Table"
+    },
+    {
+      "id": "top-expensive-queries",
+      "name": "Top 10 Expensive Queries",
+      "query": {
+        "apl": "declare query_parameters (org_filter:string = \"\", dataset_filter:string = \"\", user_filter:string = \"\");\n[\"axiom-audit\"]\n| where action == \"runAPLQueryCost\"\n| where [\"properties.query_cost_gbms\"] > 0\n| where isempty(org_filter) or [\"resource.id\"] == org_filter\n| extend User = case(isnotempty([\"actor.email\"]), [\"actor.email\"], isnotempty([\"actor.name\"]), strcat(\"[\", [\"actor.name\"], \"]\"), isnotempty([\"actor.userAgent\"]), strcat(\"[\", [\"actor.userAgent\"], \"]\"), isnotempty([\"actor.id\"]), strcat(\"[id:\", substring([\"actor.id\"], 0, 8), \"]\"), isnotempty(source), strcat(\"[source:\", source, \"]\"), \"[unknown]\")\n| where isempty(user_filter) or User == user_filter\n| sort by [\"properties.query_cost_gbms\"] desc\n| limit 10\n| project User, [\"Cost GB·ms\"] = round([\"properties.query_cost_gbms\"], 0), Query = substring([\"properties.query_string\"], 0, 100)"
+      },
+      "tableSettings": {
+        "columns": [
+          {"name": "User", "width": 180},
+          {"name": "Cost GB·ms", "width": 100},
+          {"name": "Query", "width": 300}
+        ]
+      },
+      "type": "Table"
+    },
+    {
+      "id": "queries-per-user-ts",
+      "name": "Queries per User",
+      "query": {
+        "apl": "declare query_parameters (org_filter:string = \"\", dataset_filter:string = \"\", user_filter:string = \"\");\n[\"axiom-audit\"]\n| where action == \"runAPLQueryCost\"\n| where isempty(org_filter) or [\"resource.id\"] == org_filter\n| extend User = case(isnotempty([\"actor.email\"]), [\"actor.email\"], isnotempty([\"actor.name\"]), strcat(\"[\", [\"actor.name\"], \"]\"), isnotempty([\"actor.userAgent\"]), strcat(\"[\", [\"actor.userAgent\"], \"]\"), isnotempty([\"actor.id\"]), strcat(\"[id:\", substring([\"actor.id\"], 0, 8), \"]\"), isnotempty(source), strcat(\"[source:\", source, \"]\"), \"[unknown]\")\n| where isempty(user_filter) or User == user_filter\n| summarize Queries = count() by bin(_time, 1d), User",
+        "queryOptions": {
+          "aggChartOpts": "{\"{\\\"alias\\\":\\\"Queries\\\",\\\"op\\\":\\\"count\\\"}\":{\"variant\":\"line\",\"scaleDistr\":\"log\"}}"
+        }
+      },
+      "type": "TimeSeries"
+    },
+    {
+      "id": "over-contract-pct",
+      "name": "% Over Contract",
+      "type": "Statistic",
+      "colorScheme": "Red",
+      "customUnits": "%",
+      "errorThreshold": "Above",
+      "errorThresholdValue": "50",
+      "warningThreshold": "Above",
+      "warningThresholdValue": "20",
+      "query": {
+        "apl": "declare query_parameters (org_filter:string = \"\", dataset_filter:string = \"\", contract_gb:string = \"\");\n[\"axiom-audit\"]\n| where action == \"usageCalculated\"\n| where isempty(org_filter) or [\"resource.id\"] == org_filter\n| where isempty(dataset_filter) or tostring([\"properties.dataset\"]) == dataset_filter\n| summarize total_bytes = sum([\"properties.hourly_ingest_bytes\"]), day_count = dcount(bin(_time, 1d))\n| extend days = iff(day_count == 0, 1.0, todouble(day_count))\n| extend daily_gb = total_bytes / days / 1000000000\n| extend monthly_gb = daily_gb * 30\n| extend contract = todouble(contract_gb)\n| extend [\"%\"] = iff(isempty(contract_gb) or contract <= 0, real(null), round(100.0 * (monthly_gb - contract) / contract, 0))\n| project [\"%\"]"
+      }
+    },
+    {
+      "id": "required-cut-pct",
+      "name": "Required Cut",
+      "type": "Statistic",
+      "colorScheme": "Orange",
+      "customUnits": "%",
+      "query": {
+        "apl": "declare query_parameters (org_filter:string = \"\", dataset_filter:string = \"\", contract_gb:string = \"\");\n[\"axiom-audit\"]\n| where action == \"usageCalculated\"\n| where isempty(org_filter) or [\"resource.id\"] == org_filter\n| where isempty(dataset_filter) or tostring([\"properties.dataset\"]) == dataset_filter\n| summarize total_bytes = sum([\"properties.hourly_ingest_bytes\"]), day_count = dcount(bin(_time, 1d))\n| extend days = iff(day_count == 0, 1.0, todouble(day_count))\n| extend daily_gb = total_bytes / days / 1000000000\n| extend monthly_gb = daily_gb * 30\n| extend contract = todouble(contract_gb)\n| extend [\"%\"] = iff(isempty(contract_gb) or contract <= 0 or monthly_gb <= contract, real(null), round(100.0 * (monthly_gb - contract) / monthly_gb, 0))\n| project [\"%\"]"
+      }
+    },
+    {
+      "id": "top-queried-fields",
+      "name": "Query Filter Patterns",
+      "type": "Table",
+      "query": {
+        "apl": "declare query_parameters (org_filter:string = \"\", dataset_filter:string = \"\");\n[\"axiom-history\"]\n| where isnotempty([\"query.apl\"])\n| extend parsed = parse_apl([\"query.apl\"])\n| extend dataset = tostring(parsed.body.source.name.name)\n| where isempty(dataset_filter) or dataset == dataset_filter\n| extend ops = todynamic(tostring(parsed.body.operations))\n| mv-expand ops\n| where ops.kind == \"Where\"\n| extend p = ops.predicate\n| extend is_logical = p.op in (\"and\", \"or\")\n| extend f1 = iff(not(is_logical) and p.kind == \"BinaryExpr\", tostring(p.left.name), \"\")\n| extend o1 = iff(not(is_logical) and p.kind == \"BinaryExpr\", tostring(p.op), \"\")\n| extend v1 = iff(not(is_logical) and p.kind == \"BinaryExpr\", coalesce(tostring(p.right.value), tostring(p.right.name)), \"\")\n| extend f2 = iff(tostring(p.left.op) !in (\"and\", \"or\", \"\"), tostring(p.left.left.name), \"\")\n| extend o2 = iff(tostring(p.left.op) !in (\"and\", \"or\", \"\"), tostring(p.left.op), \"\")\n| extend v2 = coalesce(tostring(p.left.right.value), tostring(p.left.right.name))\n| extend f3 = iff(tostring(p.right.op) !in (\"and\", \"or\", \"\"), tostring(p.right.left.name), \"\")\n| extend o3 = iff(tostring(p.right.op) !in (\"and\", \"or\", \"\"), tostring(p.right.op), \"\")\n| extend v3 = coalesce(tostring(p.right.right.value), tostring(p.right.right.name))\n| extend in_field = iff(p.kind == \"InExpr\", tostring(p.left.name), \"\")\n| extend in_op = iff(p.kind == \"InExpr\", tostring(p.op), \"\")\n| extend in_vals = iff(p.kind == \"InExpr\", strcat(tostring(p.right.list[0].value), \", \", tostring(p.right.list[1].value), \"...\"), \"\")\n| extend func_name = case(p.kind == \"CallExpr\", tostring(p.func.name), p.left.kind == \"CallExpr\", tostring(p.left.func.name), p.right.kind == \"CallExpr\", tostring(p.right.func.name), \"\")\n| extend func_params_arr = case(p.kind == \"CallExpr\", p.params, p.left.kind == \"CallExpr\", p.left.params, p.right.kind == \"CallExpr\", p.right.params, dynamic([]))\n| extend func_fields = strcat(tostring(func_params_arr[0].expr.name), iff(array_length(func_params_arr) > 1, strcat(\", \", tostring(func_params_arr[1].expr.name)), \"\"))\n| extend field = coalesce(iff(isnotempty(in_field), in_field, \"\"), iff(isnotempty(func_fields) and func_fields != \"_time\", func_fields, \"\"), iff(isnotempty(f1) and f1 != \"_time\", f1, \"\"), iff(isnotempty(f2) and f2 != \"_time\", f2, \"\"), iff(isnotempty(f3) and f3 != \"_time\", f3, \"\"))\n| extend op = coalesce(iff(isnotempty(in_op), in_op, \"\"), iff(isnotempty(func_name), func_name, \"\"), iff(isnotempty(o1), o1, \"\"), iff(isnotempty(o2), o2, \"\"), iff(isnotempty(o3), o3, \"\"))\n| extend val = coalesce(iff(isnotempty(in_vals), in_vals, \"\"), iff(isnotempty(v1), v1, \"\"), iff(isnotempty(v2), v2, \"\"), iff(isnotempty(v3), v3, \"\"))\n| where isnotempty(field) and isnotempty(op)\n| summarize Queries = count() by dataset, field, op, val\n| sort by Queries desc\n| limit 20\n| project Dataset=dataset, Field=field, Op=op, Value=substring(val, 0, 40), Queries"
+      },
+      "tableSettings": {
+        "columns": [
+          {"name": "Dataset", "width": 140},
+          {"name": "Field", "width": 180},
+          {"name": "Op", "width": 70},
+          {"name": "Value", "width": 150},
+          {"name": "Queries", "width": 80}
+        ]
+      }
+    }
+  ],
+  "datasets": [
+    "axiom-audit",
+    "axiom-history"
+  ],
+  "description": "Usage monitoring dashboard for tracking ingest volume, query costs, burn rate projections, and waste identification. Filter by org to scope analysis.",
+  "layout": [
+    { "i": "filters", "x": 0, "y": 0, "w": 12, "h": 1 },
+    { "i": "total-ingest-gb", "x": 0, "y": 1, "w": 3, "h": 2 },
+    { "i": "daily-burn-rate", "x": 3, "y": 1, "w": 3, "h": 2 },
+    { "i": "monthly-projection", "x": 6, "y": 1, "w": 3, "h": 2 },
+    { "i": "over-contract-pct", "x": 9, "y": 1, "w": 3, "h": 2 },
+    { "i": "required-cut-pct", "x": 0, "y": 3, "w": 3, "h": 2 },
+    { "i": "wow-change", "x": 3, "y": 3, "w": 3, "h": 2 },
+    { "i": "total-query-cost", "x": 6, "y": 3, "w": 3, "h": 2 },
+    { "i": "active-datasets", "x": 9, "y": 3, "w": 3, "h": 2 },
+    { "i": "ingest-by-dataset-ts", "x": 0, "y": 5, "w": 6, "h": 4 },
+    { "i": "query-cost-by-dataset-ts", "x": 6, "y": 5, "w": 6, "h": 4 },
+    { "i": "wow-ingest-delta", "x": 0, "y": 9, "w": 6, "h": 4 },
+    { "i": "waste-candidates", "x": 6, "y": 9, "w": 6, "h": 4 },
+    { "i": "top-datasets-ingest", "x": 0, "y": 13, "w": 6, "h": 4 },
+    { "i": "top-orgs", "x": 6, "y": 13, "w": 6, "h": 4 },
+    { "i": "top-queried-fields", "x": 0, "y": 17, "w": 6, "h": 4 },
+    { "i": "note-actions", "x": 6, "y": 17, "w": 6, "h": 4 },
+    { "i": "top-users-query-cost", "x": 0, "y": 21, "w": 6, "h": 4 },
+    { "i": "top-expensive-queries", "x": 6, "y": 21, "w": 6, "h": 4 },
+    { "i": "queries-per-user-ts", "x": 0, "y": 25, "w": 12, "h": 4 }
+  ],
+  "name": "Org Usage & Cost Control",
+  "owner": "8bc79245-bc38-4a2b-a37e-2e5b2fd5ec70",
+  "refreshTime": 300,
+  "schemaVersion": 2,
+  "timeWindowEnd": "qr-now",
+  "timeWindowStart": "qr-now-30d",
+  "version": "1769205867543185447"
+}
diff --git a/.agents/skills/building-dashboards/reference/templates/service-overview-with-filters.json b/.agents/skills/building-dashboards/reference/templates/service-overview-with-filters.json
new file mode 100644
index 00000000..8f5405ae
--- /dev/null
+++ b/.agents/skills/building-dashboards/reference/templates/service-overview-with-filters.json
@@ -0,0 +1,132 @@
+{
+  "name": "{{service}} - Service Overview (Filtered)",
+  "description": "Interactive dashboard for {{service}} with SmartFilter. Allows filtering by route and status code.",
+  "owner": "X-AXIOM-EVERYONE",
+  "datasets": ["{{dataset}}"],
+  "refreshTime": 60,
+  "schemaVersion": 2,
+  "timeWindowStart": "qr-now-1h",
+  "timeWindowEnd": "qr-now",
+  "charts": [
+    {
+      "id": "filters",
+      "name": "Filters",
+      "type": "SmartFilter",
+      "query": {"apl": ""},
+      "filters": [
+        {
+          "id": "route_filter",
+          "name": "Route",
+          "type": "select",
+          "selectType": "apl",
+          "active": true,
+          "apl": {
+            "apl": "['{{dataset}}'] | where service == '{{service}}' | distinct route | project key=route, value=route | sort by key asc",
+            "queryOptions": {"quickRange": "1h"}
+          },
+          "options": [
+            {"key": "All", "value": "", "default": true}
+          ]
+        },
+        {
+          "id": "status_filter",
+          "name": "Status",
+          "type": "select",
+          "selectType": "list",
+          "active": true,
+          "options": [
+            {"key": "All", "value": "", "default": true},
+            {"key": "2xx", "value": "2"},
+            {"key": "3xx", "value": "3"},
+            {"key": "4xx", "value": "4"},
+            {"key": "5xx", "value": "5"}
+          ]
+        }
+      ]
+    },
+    {
+      "id": "error-rate",
+      "name": "Error Rate",
+      "type": "Statistic",
+      "query": {
+        "apl": "declare query_parameters (route_filter:string = \"\", status_filter:string = \"\");\n['{{dataset}}']\n| where service == '{{service}}'\n| where isempty(route_filter) or route == route_filter\n| where isempty(status_filter) or tostring(status) startswith status_filter\n| summarize total = count(), errors = countif(status >= 500)\n| extend error_rate = iff(total > 0, round(100.0 * errors / total, 2), 0.0)\n| project ['Error Rate %'] = error_rate"
+      }
+    },
+    {
+      "id": "p95-latency",
+      "name": "p95 Latency",
+      "type": "Statistic",
+      "query": {
+        "apl": "declare query_parameters (route_filter:string = \"\", status_filter:string = \"\");\n['{{dataset}}']\n| where service == '{{service}}'\n| where isempty(route_filter) or route == route_filter\n| where isempty(status_filter) or tostring(status) startswith status_filter\n| summarize ['p95 (ms)'] = round(percentile(duration_ms, 95), 1)"
+      }
+    },
+    {
+      "id": "traffic-rps",
+      "name": "Total Requests",
+      "type": "Statistic",
+      "query": {
+        "apl": "declare query_parameters (route_filter:string = \"\", status_filter:string = \"\");\n['{{dataset}}']\n| where service == '{{service}}'\n| where isempty(route_filter) or route == route_filter\n| where isempty(status_filter) or tostring(status) startswith status_filter\n| summarize ['Total Requests'] = count()"
+      }
+    },
+    {
+      "id": "error-count",
+      "name": "Errors",
+      "type": "Statistic",
+      "query": {
+        "apl": "declare query_parameters (route_filter:string = \"\", status_filter:string = \"\");\n['{{dataset}}']\n| where service == '{{service}}' and status >= 500\n| where isempty(route_filter) or route == route_filter\n| where isempty(status_filter) or tostring(status) startswith status_filter\n| summarize Errors = count()"
+      }
+    },
+    {
+      "id": "request-rate-ts",
+      "name": "Request Rate Over Time",
+      "type": "TimeSeries",
+      "query": {
+        "apl": "declare query_parameters (route_filter:string = \"\", status_filter:string = \"\");\n['{{dataset}}']\n| where service == '{{service}}'\n| where isempty(route_filter) or route == route_filter\n| where isempty(status_filter) or tostring(status) startswith status_filter\n| summarize ['req/min'] = count() by bin_auto(_time)"
+      }
+    },
+    {
+      "id": "error-rate-ts",
+      "name": "Error Rate Over Time (%)",
+      "type": "TimeSeries",
+      "query": {
+        "apl": "declare query_parameters (route_filter:string = \"\", status_filter:string = \"\");\n['{{dataset}}']\n| where service == '{{service}}'\n| where isempty(route_filter) or route == route_filter\n| where isempty(status_filter) or tostring(status) startswith status_filter\n| summarize total = count(), errors = countif(status >= 500) by bin_auto(_time)\n| extend ['error_rate_%'] = iff(total > 0, round(100.0 * errors / total, 2), 0.0)\n| project _time, ['error_rate_%']"
+      }
+    },
+    {
+      "id": "latency-heatmap",
+      "name": "Latency Distribution",
+      "type": "Heatmap",
+      "query": {
+        "apl": "declare query_parameters (route_filter:string = \"\", status_filter:string = \"\");\n['{{dataset}}']\n| where service == '{{service}}'\n| where isempty(route_filter) or route == route_filter\n| where isempty(status_filter) or tostring(status) startswith status_filter\n| summarize histogram(duration_ms, 15) by bin_auto(_time)"
+      }
+    },
+    {
+      "id": "top-routes",
+      "name": "Top Routes by Traffic",
+      "type": "Table",
+      "query": {
+        "apl": "declare query_parameters (route_filter:string = \"\", status_filter:string = \"\");\n['{{dataset}}']\n| where service == '{{service}}'\n| where isempty(route_filter) or route == route_filter\n| where isempty(status_filter) or tostring(status) startswith status_filter\n| summarize Requests = count(), Errors = countif(status >= 500), ['p95 (ms)'] = round(percentile(duration_ms, 95), 0) by route\n| top 10 by Requests\n| project Route = route, Requests, Errors, ['p95 (ms)']"
+      }
+    },
+    {
+      "id": "recent-errors",
+      "name": "Recent Errors",
+      "type": "LogStream",
+      "query": {
+        "apl": "declare query_parameters (route_filter:string = \"\", status_filter:string = \"\");\n['{{dataset}}']\n| where service == '{{service}}' and status >= 500\n| where isempty(route_filter) or route == route_filter\n| where isempty(status_filter) or tostring(status) startswith status_filter\n| project-keep _time, trace_id, route, status, error_message, duration_ms\n| order by _time desc\n| take 100"
+      }
+    }
+  ],
+  "layout": [
+    {"i": "filters", "x": 0, "y": 0, "w": 12, "h": 1},
+    {"i": "error-rate", "x": 0, "y": 1, "w": 3, "h": 2},
+    {"i": "p95-latency", "x": 3, "y": 1, "w": 3, "h": 2},
+    {"i": "traffic-rps", "x": 6, "y": 1, "w": 3, "h": 2},
+    {"i": "error-count", "x": 9, "y": 1, "w": 3, "h": 2},
+    {"i": "request-rate-ts", "x": 0, "y": 3, "w": 6, "h": 3},
+    {"i": "error-rate-ts", "x": 6, "y": 3, "w": 6, "h": 3},
+    {"i": "latency-heatmap", "x": 0, "y": 6, "w": 12, "h": 3},
+    {"i": "top-routes", "x": 0, "y": 9, "w": 6, "h": 4},
+    {"i": "recent-errors", "x": 6, "y": 9, "w": 6, "h": 4}
+  ]
+}
diff --git a/.agents/skills/building-dashboards/reference/templates/service-overview.json b/.agents/skills/building-dashboards/reference/templates/service-overview.json
new file mode 100644
index 00000000..2bbf598c
--- /dev/null
+++ b/.agents/skills/building-dashboards/reference/templates/service-overview.json
@@ -0,0 +1,113 @@
+{
+  "name": "{{service}} - Service Overview",
+  "description": "On-call dashboard for the {{service}} service. Shows traffic, errors, latency, and recent error logs.",
+  "owner": "X-AXIOM-EVERYONE",
+  "datasets": ["{{dataset}}"],
+  "refreshTime": 60,
+  "schemaVersion": 2,
+  "timeWindowStart": "qr-now-1h",
+  "timeWindowEnd": "qr-now",
+  "charts": [
+    {
+      "id": "error-rate",
+      "name": "Error Rate",
+      "type": "Statistic",
+      "query": {
+        "apl": "['{{dataset}}'] | where service == '{{service}}' | summarize total = count(), errors = countif(status >= 500) | extend error_rate = iff(total > 0, round(100.0 * errors / total, 2), 0.0) | project ['Error Rate %'] = error_rate"
+      }
+    },
+    {
+      "id": "p95-latency",
+      "name": "p95 Latency",
+      "type": "Statistic",
+      "query": {
+        "apl": "['{{dataset}}'] | where service == '{{service}}' | summarize ['p95 (ms)'] = round(percentile(duration_ms, 95), 1)"
+      }
+    },
+    {
+      "id": "traffic-rps",
+      "name": "Total Requests",
+      "type": "Statistic",
+      "query": {
+        "apl": "['{{dataset}}'] | where service == '{{service}}' | summarize ['Total Requests'] = count()"
+      }
+    },
+    {
+      "id": "error-count",
+      "name": "Errors",
+      "type": "Statistic",
+      "query": {
+        "apl": "['{{dataset}}'] | where service == '{{service}}' and status >= 500 | summarize Errors = count()"
+      }
+    },
+    {
+      "id": "request-rate-ts",
+      "name": "Request Rate Over Time",
+      "type": "TimeSeries",
+      "query": {
+        "apl": "['{{dataset}}'] | where service == '{{service}}' | summarize ['req/min'] = count() by bin_auto(_time)"
+      }
+    },
+    {
+      "id": "error-rate-ts",
+      "name": "Error Rate Over Time (%)",
+      "type": "TimeSeries",
+      "query": {
+        "apl": "['{{dataset}}'] | where service == '{{service}}' | summarize total = count(), errors = countif(status >= 500) by bin_auto(_time) | extend ['error_rate_%'] = iff(total > 0, round(100.0 * errors / total, 2), 0.0) | project _time, ['error_rate_%']"
+      }
+    },
+    {
+      "id": "latency-ts",
+      "name": "Latency Percentiles (ms)",
+      "type": "TimeSeries",
+      "query": {
+        "apl": "['{{dataset}}'] | where service == '{{service}}' | summarize percentiles_array(duration_ms, 50, 95, 99) by bin_auto(_time)"
+      }
+    },
+    {
+      "id": "latency-heatmap",
+      "name": "Latency Distribution",
+      "type": "Heatmap",
+      "query": {
+        "apl": "['{{dataset}}'] | where service == '{{service}}' | summarize histogram(duration_ms, 15) by bin_auto(_time)"
+      }
+    },
+    {
+      "id": "status-distribution",
+      "name": "Status Codes",
+      "type": "Pie",
+      "query": {
+        "apl": "['{{dataset}}'] | where service == '{{service}}' | extend status_class = case(status < 300, '2xx', status < 400, '3xx', status < 500, '4xx', '5xx') | summarize count() by status_class"
+      }
+    },
+    {
+      "id": "top-routes",
+      "name": "Top Routes by Traffic",
+      "type": "Table",
+      "query": {
+        "apl": "['{{dataset}}'] | where service == '{{service}}' | summarize Requests = count(), Errors = countif(status >= 500), ['p95 (ms)'] = round(percentile(duration_ms, 95), 0) by route | top 10 by Requests | project Route = route, Requests, Errors, ['p95 (ms)']"
+      }
+    },
+    {
+      "id": "recent-errors",
+      "name": "Recent Errors",
+      "type": "LogStream",
+      "query": {
+        "apl": "['{{dataset}}'] | where service == '{{service}}' and status >= 500 | project-keep _time, trace_id, route, status, error_message, duration_ms | order by _time desc | take 100"
+      }
+    }
+  ],
+  "layout": [
+    {"i": "error-rate", "x": 0, "y": 0, "w": 3, "h": 2},
+    {"i": "p95-latency", "x": 3, "y": 0, "w": 3, "h": 2},
+    {"i": "traffic-rps", "x": 6, "y": 0, "w": 3, "h": 2},
+    {"i": "error-count", "x": 9, "y": 0, "w": 3, "h": 2},
+    {"i": "request-rate-ts", "x": 0, "y": 2, "w": 6, "h": 3},
+    {"i": "error-rate-ts", "x": 6, "y": 2, "w": 6, "h": 3},
+    {"i": "latency-ts", "x": 0, "y": 5, "w": 6, "h": 3},
+    {"i": "latency-heatmap", "x": 6, "y": 5, "w": 6, "h": 3},
+    {"i": "status-distribution", "x": 0, "y": 8, "w": 4, "h": 3},
+    {"i": "top-routes", "x": 4, "y": 8, "w": 8, "h": 3},
+    {"i": "recent-errors", "x": 0, "y": 11, "w": 12, "h": 4}
+  ]
+}
diff --git a/.agents/skills/building-dashboards/scripts/axiom-api b/.agents/skills/building-dashboards/scripts/axiom-api
new file mode 100755
index 00000000..27ad5f97
--- /dev/null
+++ b/.agents/skills/building-dashboards/scripts/axiom-api
@@ -0,0 +1,89 @@
+#!/usr/bin/env bash
+# axiom-api: Make authenticated requests to the Axiom DASHBOARD/APP API
+#
+# ⚠️ This script targets the dashboard API by appending /v2 to the configured URL.
+# For data/metrics API calls (/v1/query/*, /v1/datasets), use scripts/metrics/axiom-api instead.
+#
+# Usage: axiom-api <deployment> <method> <path> [json-body]
+#
+# Reads credentials from ~/.axiom.toml (shared with axiom-sre)
+#
+# Examples:
+#   axiom-api prod GET /dashboards
+#   axiom-api prod GET /dashboards/uid/abc123
+#   axiom-api prod POST /dashboards '{"name":"Test",...}'
+#   axiom-api prod GET /user
+
+set -euo pipefail
+
+DEPLOYMENT="${1:-}"
+METHOD="${2:-}"
+PATH_="${3:-}"
+BODY="${4:-}"
+
+if [[ -z "$DEPLOYMENT" || -z "$METHOD" || -z "$PATH_" ]]; then
+    echo "Usage: axiom-api <deployment> <method> <path> [json-body]" >&2
+    exit 1
+fi
+
+# Reject data/metrics paths that should use scripts/metrics/axiom-api
+case "$PATH_" in
+    /v1/query/*|/v1/datasets*)
+        echo "Error: This script is for the dashboard/app API." >&2
+        echo "For data/metrics endpoints ($PATH_), use scripts/metrics/axiom-api instead." >&2
+        exit 2
+        ;;
+esac
+
+CONFIG_FILE="$HOME/.axiom.toml"
+if [[ ! -f "$CONFIG_FILE" ]]; then
+    echo "Error: $CONFIG_FILE not found" >&2
+    exit 1
+fi
+
+# Parse TOML for deployment config
+extract_value() {
+    local key="$1"
+    awk -v deployment="$DEPLOYMENT" -v key="$key" '
+        /^[[:space:]]*\[deployments\./ { in_deployment = ($0 ~ "\\[deployments\\." deployment "\\]") }
+        in_deployment {
+            gsub(/^[[:space:]]+/, "")
+            if ($1 == key) {
+                sub(/^[^=]*=[[:space:]]*/, "")
+                if (match($0, /^"[^"]*"/)) {
+                    $0 = substr($0, RSTART+1, RLENGTH-2)
+                } else {
+                    sub(/[[:space:]]*#.*$/, "")
+                }
+                print
+                exit
+            }
+        }
+    ' "$CONFIG_FILE"
+}
+
+URL=$(extract_value "url")
+TOKEN=$(extract_value "token")
+ORG_ID=$(extract_value "org_id")
+
+if [[ -z "$URL" || -z "$TOKEN" || -z "$ORG_ID" ]]; then
+    echo "Error: Deployment '$DEPLOYMENT' is undefined or missing url, token, or org_id in $CONFIG_FILE" >&2
+    exit 1
+fi
+
+API_URL="${URL%/}/v2"
+
+CURL_ARGS=(
+    -s
+    -X "$METHOD"
+    -H "Authorization: Bearer $TOKEN"
+    -H "X-Axiom-Org-Id: $ORG_ID"
+    -H "Content-Type: application/json"
+    -H "Accept: application/json"
+)
+
+if [[ -n "$BODY" ]]; then
+    CURL_ARGS+=(-d "$BODY")
+fi
+
+curl "${CURL_ARGS[@]}" "${API_URL}${PATH_}"
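Review note on `extract_value`: the awk program scopes the key lookup to one `[deployments.<name>]` section and prints nothing when the key is absent, which is why the caller checks for empty values. A minimal, self-contained sketch of that behaviour follows; the config contents, deployment names, and the two-argument form of `extract_value` are assumptions made for the demo, not part of the script.

```shell
#!/bin/sh
# Illustrative config only: deployment names, URLs and tokens are made up.
config=$(mktemp)
trap 'rm -f "$config"' EXIT
cat > "$config" <<'EOF'
[deployments.staging]
url = "https://api.staging.example.com"
token = "xaat-staging"

[deployments.prod]
url = "https://api.example.com"  # trailing comment
token = "xaat-prod"
org_id = "acme"
EOF

# Same awk logic as in the script, with the deployment passed as an argument.
extract_value() {
    awk -v deployment="$1" -v key="$2" '
        # A section header switches the lookup on or off.
        /^[[:space:]]*\[deployments\./ { in_deployment = ($0 ~ "\\[deployments\\." deployment "\\]") }
        in_deployment {
            gsub(/^[[:space:]]+/, "")
            if ($1 == key) {
                sub(/^[^=]*=[[:space:]]*/, "")
                if (match($0, /^"[^"]*"/)) {
                    # Quoted value: drop the quotes and anything after them.
                    $0 = substr($0, RSTART+1, RLENGTH-2)
                } else {
                    # Bare value: drop a trailing comment.
                    sub(/[[:space:]]*#.*$/, "")
                }
                print
                exit
            }
        }
    ' "$config"
}

extract_value prod url        # value from the prod section, comment stripped
extract_value staging org_id  # key absent in staging: prints nothing
```

Because a missing key yields an empty string rather than an error, a single `-z` check covers both an undefined deployment and an incomplete one.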
diff --git a/.agents/skills/building-dashboards/scripts/dashboard-chart-patch b/.agents/skills/building-dashboards/scripts/dashboard-chart-patch
new file mode 100755
index 00000000..df969dad
--- /dev/null
+++ b/.agents/skills/building-dashboards/scripts/dashboard-chart-patch
@@ -0,0 +1,122 @@
+#!/usr/bin/env bash
+# dashboard-chart-patch: Patch one chart in an existing dashboard
+#
+# Usage:
+#   dashboard-chart-patch <deployment> <dashboard-uid> <chart-id> <patch-json-file> (--version <version> | --overwrite) [--message <message>]
+#
+# The patch file must contain a JSON object. It is sent as the `chart` JSON
+# merge patch, so null values remove existing chart fields.
+#
+# Examples:
+#   dashboard-chart-patch prod dash-uid error-rate ./chart.patch.json --version 12
+#   dashboard-chart-patch prod dash-uid error-rate ./chart.patch.json --overwrite --message "Update error chart"
+
+set -euo pipefail
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+
+usage() {
+    echo "Usage: dashboard-chart-patch <deployment> <dashboard-uid> <chart-id> <patch-json-file> (--version <version> | --overwrite) [--message <message>]" >&2
+}
+
+DEPLOYMENT="${1:-}"
+DASHBOARD_UID="${2:-}"
+CHART_ID="${3:-}"
+PATCH_FILE="${4:-}"
+
+if [[ $# -lt 4 || -z "$DEPLOYMENT" || -z "$DASHBOARD_UID" || -z "$CHART_ID" || -z "$PATCH_FILE" ]]; then
+    usage
+    exit 1
+fi
+
+shift 4
+
+VERSION=""
+OVERWRITE="false"
+MESSAGE=""
+
+while [[ $# -gt 0 ]]; do
+    case "$1" in
+        --version)
+            if [[ $# -lt 2 || -z "${2:-}" ]]; then
+                echo "Error: --version requires a value" >&2
+                usage
+                exit 1
+            fi
+            VERSION="$2"
+            shift 2
+            ;;
+        --overwrite)
+            OVERWRITE="true"
+            shift
+            ;;
+        --message)
+            if [[ $# -lt 2 ]]; then
+                echo "Error: --message requires a value" >&2
+                usage
+                exit 1
+            fi
+            MESSAGE="$2"
+            shift 2
+            ;;
+        -h|--help)
+            usage
+            exit 0
+            ;;
+        *)
+            echo "Error: Unknown option: $1" >&2
+            usage
+            exit 1
+            ;;
+    esac
+done
+
+if [[ ! -f "$PATCH_FILE" ]]; then
+    echo "Error: File not found: $PATCH_FILE" >&2
+    exit 1
+fi
+
+if [[ "$OVERWRITE" == "true" && -n "$VERSION" ]]; then
+    echo "Error: Use either --version or --overwrite, not both" >&2
+    exit 1
+fi
+
+if [[ "$OVERWRITE" == "false" && -z "$VERSION" ]]; then
+    echo "Error: --version is required unless --overwrite is set" >&2
+    echo "Fetch the current dashboard first: dashboard-get $DEPLOYMENT $DASHBOARD_UID" >&2
+    exit 1
+fi
+
+if [[ -n "$VERSION" && ! "$VERSION" =~ ^[0-9]+$ ]]; then
+    echo "Error: --version must be a numeric dashboard version" >&2
+    exit 1
+fi
+
+if ! jq -e 'type == "object"' "$PATCH_FILE" > /dev/null; then
+    echo "Error: chart patch must be a JSON object" >&2
+    exit 1
+fi
+
+PATCH_ID=$(jq -r 'if has("id") then .id else empty end' "$PATCH_FILE")
+if [[ -n "$PATCH_ID" && "$PATCH_ID" != "$CHART_ID" ]]; then
+    echo "Error: chart patch id must match chart id '$CHART_ID'" >&2
+    exit 1
+fi
+
+CHART_PATCH=$(jq -c '.' "$PATCH_FILE")
+
+BODY=$(jq -n \
+    --argjson chart "$CHART_PATCH" \
+    --arg message "$MESSAGE" \
+    --argjson overwrite "$OVERWRITE" \
+    '{chart: $chart}
+     + (if $overwrite then {overwrite: true} else {} end)
+     + (if $message != "" then {message: $message} else {} end)')
+
+if [[ -n "$VERSION" ]]; then
+    BODY=$(echo "$BODY" | jq --argjson version "$VERSION" '. + {version: $version}')
+fi
+
+RESPONSE=$("$SCRIPT_DIR/axiom-api" "$DEPLOYMENT" PATCH "/dashboards/uid/$DASHBOARD_UID/charts/$CHART_ID" "$BODY")
+
+echo "$RESPONSE" | jq .
diff --git a/.agents/skills/building-dashboards/scripts/dashboard-copy b/.agents/skills/building-dashboards/scripts/dashboard-copy
new file mode 100755
index 00000000..deb7321e
--- /dev/null
+++ b/.agents/skills/building-dashboards/scripts/dashboard-copy
@@ -0,0 +1,51 @@
+#!/usr/bin/env bash
+# dashboard-copy: Clone an existing dashboard
+#
+# Usage: dashboard-copy <deployment> <id> [new-name]
+#
+# Examples:
+#   dashboard-copy prod abc123                    # Creates "Original Name (copy)"
+#   dashboard-copy prod abc123 "My New Dashboard" # Creates with custom name
+
+set -euo pipefail
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+
+DEPLOYMENT="${1:-}"
+ID="${2:-}"
+NEW_NAME="${3:-}"
+
+if [[ -z "$DEPLOYMENT" || -z "$ID" ]]; then
+    echo "Usage: dashboard-copy <deployment> <id> [new-name]" >&2
+    exit 1
+fi
+
+# Fetch original
+ORIGINAL=$("$SCRIPT_DIR/axiom-api" "$DEPLOYMENT" GET "/dashboards/uid/$ID")
+
+# Get original name if new name not provided
+if [[ -z "$NEW_NAME" ]]; then
+    ORIG_NAME=$(echo "$ORIGINAL" | jq -r '.dashboard.name')
+    NEW_NAME="${ORIG_NAME} (copy)"
+fi
+
+# Strip server fields and set new name on the dashboard subobject
+BODY=$(echo "$ORIGINAL" | jq --arg name "$NEW_NAME" '
+    .dashboard |
+    del(.id, .uid, .version, .createdAt, .updatedAt, .createdBy, .updatedBy) |
+    .name = $name
+' | jq '{dashboard: .}')
+
+RESPONSE=$("$SCRIPT_DIR/axiom-api" "$DEPLOYMENT" POST "/dashboards" "$BODY")
+
+# Print new UID and name
+ID=$(echo "$RESPONSE" | jq -r '.dashboard.uid // empty')
+NAME=$(echo "$RESPONSE" | jq -r '.dashboard.dashboard.name // empty')
+
+if [[ -n "$ID" ]]; then
+    echo -e "${ID}\t${NAME}"
+else
+    echo "Error copying dashboard:" >&2
+    echo "$RESPONSE" | jq . >&2
+    exit 1
+fi
diff --git a/.agents/skills/building-dashboards/scripts/dashboard-create b/.agents/skills/building-dashboards/scripts/dashboard-create
new file mode 100755
index 00000000..00c81641
--- /dev/null
+++ b/.agents/skills/building-dashboards/scripts/dashboard-create
@@ -0,0 +1,55 @@
+#!/usr/bin/env bash
+# dashboard-create: Create a dashboard from JSON file
+#
+# Usage: dashboard-create <deployment> <json-file>
+#
+# Server-managed fields (id, uid, version, createdAt, updatedAt, createdBy, updatedBy)
+# are stripped before upload. Use templates or dashboard-from-template to generate valid JSON.
+#
+# Examples:
+#   dashboard-create prod ./my-dashboard.json
+#   dashboard-create staging ./dashboard.json
+
+set -euo pipefail
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+
+DEPLOYMENT="${1:-}"
+JSON_FILE="${2:-}"
+
+if [[ -z "$DEPLOYMENT" || -z "$JSON_FILE" ]]; then
+    echo "Usage: dashboard-create <deployment> <json-file>" >&2
+    exit 1
+fi
+
+if [[ ! -f "$JSON_FILE" ]]; then
+    echo "Error: File not found: $JSON_FILE" >&2
+    exit 1
+fi
+
+# Validate dashboard structure before deploying
+if ! "$SCRIPT_DIR/dashboard-validate" "$JSON_FILE" --strict >&2; then
+    echo "Error: Dashboard validation failed. Fix the errors above before deploying." >&2
+    exit 1
+fi
+
+# Read, strip server-managed fields, and normalize layout for react-grid-layout
+BODY=$(jq -L "$SCRIPT_DIR" '
+  include "dashboard-normalize";
+  del(.id, .uid, .version, .createdAt, .updatedAt, .createdBy, .updatedBy) |
+  normalize_dashboard_layout
+' "$JSON_FILE")
+
+BODY=$(echo "$BODY" | jq '{dashboard: .}')
+
+RESPONSE=$("$SCRIPT_DIR/axiom-api" "$DEPLOYMENT" POST "/dashboards" "$BODY")
+
+# Extract and print the new dashboard UID
+ID=$(echo "$RESPONSE" | jq -r '.dashboard.uid // empty')
+if [[ -n "$ID" ]]; then
+    echo "$ID"
+else
+    echo "Error creating dashboard:" >&2
+    echo "$RESPONSE" | jq . >&2
+    exit 1
+fi
diff --git a/.agents/skills/building-dashboards/scripts/dashboard-delete b/.agents/skills/building-dashboards/scripts/dashboard-delete
new file mode 100755
index 00000000..7ae99eaf
--- /dev/null
+++ b/.agents/skills/building-dashboards/scripts/dashboard-delete
@@ -0,0 +1,32 @@
+#!/usr/bin/env bash
+# dashboard-delete: Delete a dashboard
+#
+# Usage: dashboard-delete <deployment> <id>
+#
+# ⚠️  This is irreversible! Axiom cannot restore deleted dashboards.
+#
+# Examples:
+#   dashboard-delete prod abc123
+
+set -euo pipefail
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+
+DEPLOYMENT="${1:-}"
+ID="${2:-}"
+
+if [[ -z "$DEPLOYMENT" || -z "$ID" ]]; then
+    echo "Usage: dashboard-delete <deployment> <id>" >&2
+    exit 1
+fi
+
+# Confirm
+read -p "Delete dashboard $ID? This cannot be undone. [y/N] " -n 1 -r
+echo
+if [[ ! $REPLY =~ ^[Yy]$ ]]; then
+    echo "Cancelled"
+    exit 0
+fi
+
+"$SCRIPT_DIR/axiom-api" "$DEPLOYMENT" DELETE "/dashboards/uid/$ID"
+echo "Deleted: $ID"
diff --git a/.agents/skills/building-dashboards/scripts/dashboard-from-template b/.agents/skills/building-dashboards/scripts/dashboard-from-template
new file mode 100755
index 00000000..40255660
--- /dev/null
+++ b/.agents/skills/building-dashboards/scripts/dashboard-from-template
@@ -0,0 +1,57 @@
+#!/usr/bin/env bash
+# dashboard-from-template: Instantiate a dashboard template with substitutions
+#
+# Usage: dashboard-from-template <template> <service> <dataset> [output-file]
+#
+# Arguments:
+#   template    - Template name: service-overview, api-health, blank
+#   service     - Service name
+#   dataset     - Dataset name
+#   output-file - Output path (default: stdout)
+#
+# Examples:
+#   dashboard-from-template service-overview "api-gateway" "http-logs"
+#   dashboard-from-template api-health "payment-api" "payment-logs" ./payment.json
+
+set -euo pipefail
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+TEMPLATE_DIR="$SCRIPT_DIR/../reference/templates"
+
+if [[ $# -lt 3 ]]; then
+    echo "Usage: dashboard-from-template <template> <service> <dataset> [output-file]" >&2
+    echo "" >&2
+    echo "Available templates:" >&2
+    for f in "$TEMPLATE_DIR"/*.json; do
+        basename "$f" .json >&2
+    done
+    exit 1
+fi
+
+TEMPLATE_NAME="$1"
+SERVICE="$2"
+DATASET="$3"
+OUTPUT="${4:-/dev/stdout}"
+
+TEMPLATE="$TEMPLATE_DIR/$TEMPLATE_NAME.json"
+
+if [[ ! -f "$TEMPLATE" ]]; then
+    echo "Error: Template '$TEMPLATE_NAME' not found" >&2
+    echo "" >&2
+    echo "Available templates:" >&2
+    for f in "$TEMPLATE_DIR"/*.json; do
+        basename "$f" .json >&2
+    done
+    exit 1
+fi
+
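+# Replace placeholders; escape \, / and & so values are safe as sed replacements
+esc() { printf '%s' "$1" | sed -e 's|[\\/&]|\\&|g'; }
+sed -e "s/{{service}}/$(esc "$SERVICE")/g" \
+    -e "s/{{dataset}}/$(esc "$DATASET")/g" \
+    -e "s/{{name}}/$(esc "$SERVICE")/g" \
+    -e "s/{{description}}/Dashboard for $(esc "$SERVICE")/g" "$TEMPLATE" > "$OUTPUT"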
+
+if [[ "$OUTPUT" != "/dev/stdout" ]]; then
+    echo "Created: $OUTPUT" >&2
+fi
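Review note on placeholder substitution: in a sed `s` command the replacement text treats `/` (the delimiter), `&` (the matched text) and `\` specially, so a service or dashboard name containing any of them needs escaping before it is spliced in. A minimal sketch, assuming a hypothetical helper named `escape_sed_replacement` and a one-line stand-in for a template:

```shell
#!/bin/sh
# Hypothetical helper: backslash-escape \, / and & in a value so it can be
# used verbatim as the replacement part of a sed s/// command.
escape_sed_replacement() {
    # "|" is the delimiter here so "/" needs no escaping inside the bracket list.
    printf '%s' "$1" | sed -e 's|[\\/&]|\\&|g'
}

# Render a one-line stand-in template with the given service name.
render() {
    value=$(escape_sed_replacement "$1")
    printf '%s\n' '{"name": "{{service}}"}' | sed -e "s/{{service}}/$value/g"
}

render 'billing/payments & refunds'
# prints {"name": "billing/payments & refunds"}
```

This only covers sed's own metacharacters. A value containing a double quote would still need JSON escaping, which is out of scope for this sketch.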
diff --git a/.agents/skills/building-dashboards/scripts/dashboard-get b/.agents/skills/building-dashboards/scripts/dashboard-get
new file mode 100755
index 00000000..5e17186e
--- /dev/null
+++ b/.agents/skills/building-dashboards/scripts/dashboard-get
@@ -0,0 +1,22 @@
+#!/usr/bin/env bash
+# dashboard-get: Get a dashboard by ID
+#
+# Usage: dashboard-get <deployment> <id>
+#
+# Examples:
+#   dashboard-get prod abc123
+#   dashboard-get prod abc123 > dashboard.json
+
+set -euo pipefail
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+
+DEPLOYMENT="${1:-}"
+ID="${2:-}"
+
+if [[ -z "$DEPLOYMENT" || -z "$ID" ]]; then
+    echo "Usage: dashboard-get <deployment> <id>" >&2
+    exit 1
+fi
+
+"$SCRIPT_DIR/axiom-api" "$DEPLOYMENT" GET "/dashboards/uid/$ID" | jq '.dashboard'
diff --git a/.agents/skills/building-dashboards/scripts/dashboard-link b/.agents/skills/building-dashboards/scripts/dashboard-link
new file mode 100755
index 00000000..2e1870f3
--- /dev/null
+++ b/.agents/skills/building-dashboards/scripts/dashboard-link
@@ -0,0 +1,70 @@
+#!/usr/bin/env bash
+# dashboard-link: Generate a link to an Axiom dashboard
+#
+# Usage: dashboard-link <deployment> <dashboard-id>
+#
+# Examples:
+#   dashboard-link prod EHYTHcQmO0ZZCK0zdw
+#   dashboard-link staging abc123
+
+set -euo pipefail
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+
+DEPLOYMENT="${1:-}"
+DASHBOARD_ID="${2:-}"
+
+if [[ -z "$DEPLOYMENT" || -z "$DASHBOARD_ID" ]]; then
+    echo "Usage: dashboard-link <deployment> <dashboard-id>" >&2
+    exit 1
+fi
+
+CONFIG_FILE="$HOME/.axiom.toml"
+if [[ ! -f "$CONFIG_FILE" ]]; then
+    echo "Error: $CONFIG_FILE not found" >&2
+    exit 1
+fi
+
+# Parse TOML for deployment config
+extract_value() {
+    local key="$1"
+    awk -v deployment="$DEPLOYMENT" -v key="$key" '
+        /^[[:space:]]*\[deployments\./ { in_deployment = ($0 ~ "\\[deployments\\." deployment "\\]") }
+        in_deployment {
+            gsub(/^[[:space:]]+/, "")
+            if ($1 == key) {
+                sub(/^[^=]*=[[:space:]]*/, "")
+                if (match($0, /^"[^"]*"/)) {
+                    $0 = substr($0, RSTART+1, RLENGTH-2)
+                } else {
+                    sub(/[[:space:]]*#.*$/, "")
+                }
+                print
+                exit
+            }
+        }
+    ' "$CONFIG_FILE"
+}
+
+URL=$(extract_value "url")
+ORG_ID=$(extract_value "org_id")
+
+
+if [[ -z "$URL" || -z "$ORG_ID" ]]; then
+    echo "Error: Could not find deployment '$DEPLOYMENT' in $CONFIG_FILE" >&2
+    exit 1
+fi
+
+# Resolve short ID for the UI URL
+LINK_ID=$("$SCRIPT_DIR/axiom-api" "$DEPLOYMENT" GET "/dashboards/uid/$DASHBOARD_ID" | jq -r '.id // empty')
+if [[ -z "$LINK_ID" ]]; then
+    echo "Error: Could not resolve dashboard ID '$DASHBOARD_ID'" >&2
+    exit 1
+fi
+
+# Convert API URL to app URL
+# api.axiom.co -> app.axiom.co
+# api.dev.axiomtestlabs.co -> app.dev.axiomtestlabs.co
+APP_URL="${URL/api./app.}"
+
+echo "${APP_URL}/${ORG_ID}/dashboards/${LINK_ID}"
diff --git a/.agents/skills/building-dashboards/scripts/dashboard-list b/.agents/skills/building-dashboards/scripts/dashboard-list
new file mode 100755
index 00000000..9e37d405
--- /dev/null
+++ b/.agents/skills/building-dashboards/scripts/dashboard-list
@@ -0,0 +1,28 @@
+#!/usr/bin/env bash
+# dashboard-list: List all dashboards
+#
+# Usage: dashboard-list <deployment> [--json]
+#
+# Examples:
+#   dashboard-list prod              # tab-separated id, name
+#   dashboard-list prod --json       # full JSON
+
+set -euo pipefail
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+
+DEPLOYMENT="${1:-}"
+FORMAT="${2:-}"
+
+if [[ -z "$DEPLOYMENT" ]]; then
+    echo "Usage: dashboard-list <deployment> [--json]" >&2
+    exit 1
+fi
+
+RESPONSE=$("$SCRIPT_DIR/axiom-api" "$DEPLOYMENT" GET "/dashboards?limit=1000")
+
+if [[ "$FORMAT" == "--json" ]]; then
+    echo "$RESPONSE" | jq .
+else
+    echo "$RESPONSE" | jq -r '.[] | [.uid, .dashboard.name] | @tsv'
+fi
diff --git a/.agents/skills/building-dashboards/scripts/dashboard-new b/.agents/skills/building-dashboards/scripts/dashboard-new
new file mode 100755
index 00000000..67ee7171
--- /dev/null
+++ b/.agents/skills/building-dashboards/scripts/dashboard-new
@@ -0,0 +1,43 @@
+#!/usr/bin/env bash
+# dashboard-new: Create a new dashboard from the blank template
+#
+# Usage: dashboard-new <name> <dataset> [output-file]
+#
+# Arguments:
+#   name        - Dashboard name
+#   dataset     - Primary dataset name
+#   output-file - Output path (default: stdout)
+#
+# Examples:
+#   dashboard-new "API Gateway" "http-logs"
+#   dashboard-new "Payment Service" "payment-logs" ./payment.json
+
+set -euo pipefail
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+TEMPLATE_DIR="$SCRIPT_DIR/../reference/templates"
+TEMPLATE="$TEMPLATE_DIR/blank.json"
+
+if [[ $# -lt 2 ]]; then
+    echo "Usage: dashboard-new <name> <dataset> [output-file]" >&2
+    exit 1
+fi
+
+NAME="$1"
+DATASET="$2"
+OUTPUT="${3:-/dev/stdout}"
+
+if [[ ! -f "$TEMPLATE" ]]; then
+    echo "Error: Template not found at $TEMPLATE" >&2
+    exit 1
+fi
+
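+# Replace placeholders; escape \, / and & so values are safe as sed replacements
+esc() { printf '%s' "$1" | sed -e 's|[\\/&]|\\&|g'; }
+sed -e "s/{{name}}/$(esc "$NAME")/g" \
+    -e "s/{{description}}/Dashboard for $(esc "$NAME")/g" \
+    -e "s/{{dataset}}/$(esc "$DATASET")/g" "$TEMPLATE" > "$OUTPUT"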
+
+if [[ "$OUTPUT" != "/dev/stdout" ]]; then
+    echo "Created: $OUTPUT" >&2
+fi
diff --git a/.agents/skills/building-dashboards/scripts/dashboard-normalize.jq b/.agents/skills/building-dashboards/scripts/dashboard-normalize.jq
new file mode 100755
index 00000000..3dbbc86f
--- /dev/null
+++ b/.agents/skills/building-dashboards/scripts/dashboard-normalize.jq
@@ -0,0 +1,10 @@
+def normalize_dashboard_layout:
+  if .layout then
+    .layout = [.layout[] |
+      .minH = (.minH // (if .h <= 2 then .h else 2 end)) |
+      .minW = (.minW // 2) |
+      .moved = (.moved // false) |
+      .static = (.static // false)
+    ]
+  else .
+  end;
diff --git a/.agents/skills/building-dashboards/scripts/dashboard-update b/.agents/skills/building-dashboards/scripts/dashboard-update
new file mode 100755
index 00000000..12342a38
--- /dev/null
+++ b/.agents/skills/building-dashboards/scripts/dashboard-update
@@ -0,0 +1,60 @@
+#!/usr/bin/env bash
+# dashboard-update: Update an existing dashboard
+#
+# Usage: dashboard-update <deployment> <id> <json-file>
+#
+# The JSON file must contain the version field from the current dashboard
+# to avoid conflicts. Use dashboard-get to fetch the current version first.
+#
+# Examples:
+#   dashboard-get prod abc123 > dashboard.json
+#   # ... edit dashboard.json ...
+#   dashboard-update prod abc123 dashboard.json
+
+set -euo pipefail
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+
+DEPLOYMENT="${1:-}"
+ID="${2:-}"
+JSON_FILE="${3:-}"
+
+if [[ -z "$DEPLOYMENT" || -z "$ID" || -z "$JSON_FILE" ]]; then
+    echo "Usage: dashboard-update <deployment> <id> <json-file>" >&2
+    exit 1
+fi
+
+if [[ ! -f "$JSON_FILE" ]]; then
+    echo "Error: File not found: $JSON_FILE" >&2
+    exit 1
+fi
+
+# Validate dashboard structure before deploying
+if ! "$SCRIPT_DIR/dashboard-validate" "$JSON_FILE" --strict >&2; then
+    echo "Error: Dashboard validation failed. Fix the errors above before deploying." >&2
+    exit 1
+fi
+
+# Normalize layout for react-grid-layout
+BODY=$(jq -L "$SCRIPT_DIR" '
+  include "dashboard-normalize";
+  normalize_dashboard_layout
+' "$JSON_FILE")
+
+# Check for version field (required for PUT)
+VERSION=$(echo "$BODY" | jq -r '.version // empty')
+if [[ -z "$VERSION" ]]; then
+    echo "Error: version field is required for updates" >&2
+    echo "Fetch the current dashboard first: dashboard-get $DEPLOYMENT $ID" >&2
+    exit 1
+fi
+
+# Wrap body in v2 envelope: {dashboard: {...}, version: N}
+# Version must be a numeric int64. jq loses precision on large integers,
+# so we inject it via string substitution.
+DASHBOARD=$(echo "$BODY" | jq 'del(.version)')
+BODY="{\"dashboard\":${DASHBOARD},\"version\":${VERSION}}"
+
+RESPONSE=$("$SCRIPT_DIR/axiom-api" "$DEPLOYMENT" PUT "/dashboards/uid/$ID" "$BODY")
+
+echo "$RESPONSE" | jq .
diff --git a/.agents/skills/building-dashboards/scripts/dashboard-validate b/.agents/skills/building-dashboards/scripts/dashboard-validate
new file mode 100755
index 00000000..e41054a6
--- /dev/null
+++ b/.agents/skills/building-dashboards/scripts/dashboard-validate
@@ -0,0 +1,154 @@
+#!/usr/bin/env bash
+# dashboard-validate: Validate a dashboard JSON file
+#
+# Usage: dashboard-validate <path-to-json> [--strict]
+#
+# Checks:
+#   1. Valid JSON structure
+#   2. Required fields present (name, owner, charts, layout)
+#   3. All charts have an id field
+#   4. No duplicate chart IDs
+#   5. Chart IDs match layout IDs
+#   6. LogStream queries have take limits (warning)
+#   7. Grid layout doesn't exceed 12 columns; schemaVersion is 2 (warnings)
+#
+# Options:
+#   --strict  Treat warnings as errors
+#
+# Examples:
+#   dashboard-validate ./dashboard.json
+#   dashboard-validate ./dashboard.json --strict
+
+set -uo pipefail
+
+STRICT=false
+FILE=""
+
+for arg in "$@"; do
+    case $arg in
+        --strict)
+            STRICT=true
+            ;;
+        *)
+            FILE="$arg"
+            ;;
+    esac
+done
+
+if [[ -z "$FILE" ]]; then
+    echo "Usage: dashboard-validate <path-to-json> [--strict]" >&2
+    exit 2
+fi
+
+if [[ ! -f "$FILE" ]]; then
+    echo "Error: File not found: $FILE" >&2
+    exit 2
+fi
+
+errors=0
+warnings=0
+
+error() {
+    echo "ERROR: $1"
+    errors=$((errors + 1))
+}
+
+warn() {
+    echo "WARN: $1"
+    warnings=$((warnings + 1))
+}
+
+info() {
+    echo "INFO: $1"
+}
+
+# 1. Check valid JSON
+if ! jq empty "$FILE" 2>/dev/null; then
+    echo "ERROR: Invalid JSON"
+    exit 1
+fi
+info "Valid JSON structure"
+
+# 2. Check required fields
+for field in name owner; do
+    if [[ $(jq -r ".$field // empty" "$FILE") == "" ]]; then
+        error "Missing required field: $field"
+    fi
+done
+
+# 3. Check charts array exists
+if [[ $(jq -r '.charts // "null"' "$FILE") == "null" ]]; then
+    error "Missing charts array"
+fi
+
+# 4. Check layout array exists
+if [[ $(jq -r '.layout // "null"' "$FILE") == "null" ]]; then
+    error "Missing layout array"
+fi
+
+# 5. Check all charts have an id field
+missing_ids=$(jq -r '.charts | to_entries[] | select(.value.id == null or .value.id == "") | .key' "$FILE" 2>/dev/null || true)
+if [[ -n "$missing_ids" ]]; then
+    for idx in $missing_ids; do
+        chart_name=$(jq -r ".charts[$idx].name // \"(unnamed)\"" "$FILE")
+        error "Chart at index $idx ($chart_name) is missing required 'id' field"
+    done
+fi
+
+# 6. Check for duplicate chart IDs
+duplicate_ids=$(jq -r '[.charts[].id // empty] | group_by(.) | map(select(length > 1) | .[0]) | .[]' "$FILE" 2>/dev/null | sort -u || true)
+if [[ -n "$duplicate_ids" ]]; then
+    for dup_id in $duplicate_ids; do
+        error "Duplicate chart id: '$dup_id'"
+    done
+fi
+
+# 7. Check chart IDs match layout IDs
+chart_ids=$(jq -r '.charts[].id // empty' "$FILE" 2>/dev/null | sort)
+layout_ids=$(jq -r '.layout[].i' "$FILE" 2>/dev/null | sort)
+
+if [[ "$chart_ids" != "$layout_ids" ]]; then
+    error "Chart IDs and layout IDs don't match"
+    echo "  Charts: $(echo $chart_ids | tr '\n' ' ')"
+    echo "  Layout: $(echo $layout_ids | tr '\n' ' ')"
+fi
+
+# 8. Check LogStream queries for take limits
+logstream_without_take=$(jq -r '.charts[] | select(.type == "LogStream") | select(.query.apl != null) | select(.query.apl | test("take ") | not) | .id' "$FILE" 2>/dev/null || true)
+if [[ -n "$logstream_without_take" ]]; then
+    for id in $logstream_without_take; do
+        warn "LogStream '$id' may be missing 'take N' limit"
+    done
+fi
+
+# 9. Check grid width (should be 12 columns max)
+max_right=$(jq '[.layout[] | (.x + .w)] | max // 0' "$FILE" 2>/dev/null || echo 0)
+if [[ "$max_right" -gt 12 ]]; then
+    warn "Layout exceeds 12-column grid (max right edge = $max_right)"
+fi
+
+# 10. Check schemaVersion
+schema_version=$(jq -r '.schemaVersion // 0' "$FILE")
+if [[ "$schema_version" != "2" ]]; then
+    warn "schemaVersion is $schema_version, expected 2"
+fi
+
+# 11. Check layout entries have minH/minW
+missing_min=$(jq -r '.layout[] | select(.minH == null or .minW == null) | .i' "$FILE" 2>/dev/null || true)
+if [[ -n "$missing_min" ]]; then
+    info "$(echo "$missing_min" | wc -l | tr -d ' ') layout entries missing minH/minW (will be auto-fixed on deploy)"
+fi
+
+# Summary
+echo ""
+echo "Validation complete: $errors errors, $warnings warnings"
+
+if [[ $errors -gt 0 ]]; then
+    exit 1
+fi
+
+if [[ "$STRICT" == "true" && $warnings -gt 0 ]]; then
+    exit 1
+fi
+
+exit 0
diff --git a/.agents/skills/building-dashboards/scripts/metrics/axiom-api b/.agents/skills/building-dashboards/scripts/metrics/axiom-api
new file mode 100755
index 00000000..d821c9fe
--- /dev/null
+++ b/.agents/skills/building-dashboards/scripts/metrics/axiom-api
@@ -0,0 +1,80 @@
+#!/usr/bin/env bash
+# axiom-api: Make authenticated requests to the Axiom DATA/METRICS API
+#
+# ⚠️ This script uses raw URLs for the data API.
+# For dashboard/app API calls, use scripts/axiom-api instead.
+#
+# Usage: axiom-api <deployment> <method> <path> [json-body]
+#
+# Reads credentials from ~/.axiom.toml (shared with axiom-sre)
+# Set AXIOM_URL_OVERRIDE to route requests to a specific edge deployment endpoint.
+#
+# Examples:
+#   axiom-api prod GET /v1/datasets
+#   axiom-api prod POST /v1/query/_mpl '{"mpl":"..."}'
+
+set -euo pipefail
+
+DEPLOYMENT="${1:-}"
+METHOD="${2:-}"
+PATH_="${3:-}"
+BODY="${4:-}"
+
+if [[ -z "$DEPLOYMENT" || -z "$METHOD" || -z "$PATH_" ]]; then
+    echo "Usage: axiom-api <deployment> <method> <path> [json-body]" >&2
+    exit 1
+fi
+
+CONFIG_FILE="$HOME/.axiom.toml"
+if [[ ! -f "$CONFIG_FILE" ]]; then
+    echo "Error: $CONFIG_FILE not found. Run scripts/setup for help." >&2
+    exit 1
+fi
+
+# Parse TOML for deployment config
+extract_value() {
+    local key="$1"
+    awk -v deployment="$DEPLOYMENT" -v key="$key" '
+        /^[[:space:]]*\[deployments\./ { in_deployment = ($0 ~ "\\[deployments\\." deployment "\\]") }
+        in_deployment && $1 == key { gsub(/[" ]/, "", $3); print $3; exit }
+    ' "$CONFIG_FILE"
+}
+
+URL="${AXIOM_URL_OVERRIDE:-$(extract_value "url")}"
+TOKEN=$(extract_value "token")
+ORG_ID=$(extract_value "org_id")
+
+if [[ -z "$URL" || -z "$TOKEN" || -z "$ORG_ID" ]]; then
+    echo "Error: Could not find deployment '$DEPLOYMENT' in $CONFIG_FILE" >&2
+    echo "" >&2
+    echo "Available deployments:" >&2
+    grep '[[:space:]]*\[deployments\.' "$CONFIG_FILE" | sed 's/.*\[deployments\.\(.*\)\]/  - \1/' >&2
+    exit 1
+fi
+
+CURL_ARGS=(
+    -s
+    -w '\n%{http_code}'
+    -X "$METHOD"
+    -H "Authorization: Bearer $TOKEN"
+    -H "X-Axiom-Org-Id: $ORG_ID"
+    -H "Content-Type: application/json"
+    -H "Accept: ${AXIOM_ACCEPT:-application/json}"
+)
+
+if [[ -n "$BODY" ]]; then
+    CURL_ARGS+=(-d "$BODY")
+fi
+
+RESPONSE=$(curl "${CURL_ARGS[@]}" "${URL}${PATH_}")
+
+HTTP_CODE=$(echo "$RESPONSE" | tail -1)
+BODY_CONTENT=$(echo "$RESPONSE" | sed '$d')
+
+if [[ "$HTTP_CODE" -ge 200 && "$HTTP_CODE" -lt 300 ]]; then
+    echo "$BODY_CONTENT"
+else
+    echo "Error: HTTP $HTTP_CODE from $METHOD ${URL}${PATH_}" >&2
+    echo "$BODY_CONTENT" >&2
+    exit 1
+fi
diff --git a/.agents/skills/building-dashboards/scripts/metrics/datasets b/.agents/skills/building-dashboards/scripts/metrics/datasets
new file mode 100755
index 00000000..be5e773e
--- /dev/null
+++ b/.agents/skills/building-dashboards/scripts/metrics/datasets
@@ -0,0 +1,39 @@
+#!/usr/bin/env bash
+# datasets: List datasets in an Axiom deployment
+#
+# Usage:
+#   datasets <deployment>                  # List all datasets
+#   datasets <deployment> --kind <kind>    # Filter by kind (e.g., otel:metrics:v1)
+#
+# Examples:
+#   datasets prod
+#   datasets prod --kind otel:metrics:v1
+
+set -euo pipefail
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+
+DEPLOYMENT="${1:-}"
+shift || true
+
+if [[ -z "$DEPLOYMENT" ]]; then
+    echo "Usage: datasets <deployment> [--kind <kind>]" >&2
+    exit 1
+fi
+
+KIND_FILTER=""
+while [[ $# -gt 0 ]]; do
+    case "$1" in
+        --kind) KIND_FILTER="$2"; shift 2 ;;
+        *) echo "Unknown option: $1" >&2; exit 1 ;;
+    esac
+done
+
+RESPONSE=$("$SCRIPT_DIR/axiom-api" "$DEPLOYMENT" GET /v1/datasets)
+
+if [[ -n "$KIND_FILTER" ]]; then
+    echo "$RESPONSE" | jq --arg kind "$KIND_FILTER" \
+        '[.[] | select(.kind == $kind) | {name, edgeDeployment, kind}]'
+else
+    echo "$RESPONSE" | jq '[.[] | {name, edgeDeployment, kind}]'
+fi
diff --git a/.agents/skills/building-dashboards/scripts/metrics/metrics-info b/.agents/skills/building-dashboards/scripts/metrics/metrics-info
new file mode 100755
index 00000000..b75ac2e5
--- /dev/null
+++ b/.agents/skills/building-dashboards/scripts/metrics/metrics-info
@@ -0,0 +1,132 @@
+#!/usr/bin/env bash
+# metrics-info: Discover metrics, tags, and tag values in a dataset
+#
+# Usage:
+#   metrics-info <deployment> <dataset> metrics [--start T --end T]
+#   metrics-info <deployment> <dataset> tags    [--start T --end T]
+#   metrics-info <deployment> <dataset> tags <tag> values [--start T --end T]
+#   metrics-info <deployment> <dataset> metrics <metric> tags [--start T --end T]
+#   metrics-info <deployment> <dataset> metrics <metric> tags <tag> values [--start T --end T]
+#   metrics-info <deployment> <dataset> find-metrics <search-value> [--start T --end T]
+#
+# find-metrics searches TAG VALUES, not metric names. It returns metrics that have
+# a tag containing the given value. Use it when you know a specific entity name
+# (service, host, device) to find which metrics are associated with it.
+# To list metric names, use the "metrics" subcommand instead.
+#
+# --start and --end default to the last 24 hours if omitted.
+# For sparse metrics (sensors, batch jobs), try --start with a wider range (e.g. 7 days).
+#
+# Examples:
+#   metrics-info prod my-dataset metrics                    # List all metric names
+#   metrics-info prod my-dataset tags service.name values   # List values for a tag
+#   metrics-info prod my-dataset find-metrics "frontend"    # Find metrics with tag value "frontend"
+#   metrics-info prod my-dataset metrics http.server.duration tags --start 2025-06-01T00:00:00Z --end 2025-06-02T00:00:00Z
+
+set -euo pipefail
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+
+show_usage() {
+    echo "Usage:" >&2
+    echo "  metrics-info <deploy> <dataset> metrics" >&2
+    echo "  metrics-info <deploy> <dataset> tags" >&2
+    echo "  metrics-info <deploy> <dataset> tags <tag> values" >&2
+    echo "  metrics-info <deploy> <dataset> metrics <metric> tags" >&2
+    echo "  metrics-info <deploy> <dataset> metrics <metric> tags <tag> values" >&2
+    echo "  metrics-info <deploy> <dataset> find-metrics <search-value>  (searches tag values, not metric names)" >&2
+    echo "" >&2
+    echo "Options:" >&2
+    echo "  --start T   Start time (RFC3339). Default: 24h ago" >&2
+    echo "  --end T     End time (RFC3339). Default: now" >&2
+    exit 1
+}
+
+DEPLOYMENT="${1:-}"
+DATASET="${2:-}"
+
+if [[ -z "$DEPLOYMENT" || -z "$DATASET" ]]; then
+    show_usage
+fi
+
+shift 2
+
+# Collect positional args and parse --start/--end
+POSITIONAL=()
+START=""
+END=""
+while [[ $# -gt 0 ]]; do
+    case "$1" in
+        --start) [[ $# -ge 2 ]] || show_usage; START="$2"; shift 2 ;;
+        --end)   [[ $# -ge 2 ]] || show_usage; END="$2"; shift 2 ;;
+        *)       POSITIONAL+=("$1"); shift ;;
+    esac
+done
+
+# Default time range: last 24 hours
+if [[ -z "$START" ]]; then
+    if date --version &>/dev/null; then
+        START=$(date -u -d '24 hours ago' '+%Y-%m-%dT%H:%M:%SZ')
+    else
+        START=$(date -u -v-24H '+%Y-%m-%dT%H:%M:%SZ')
+    fi
+fi
+if [[ -z "$END" ]]; then
+    END=$(date -u '+%Y-%m-%dT%H:%M:%SZ')
+fi
+
+TIME_PARAMS="start=${START}&end=${END}"
+BASE="/v1/query/metrics/info/datasets/${DATASET}"
+
+# Resolve the regional edge URL for this dataset
+RESOLVED_URL=$("$SCRIPT_DIR/resolve-url" "$DEPLOYMENT" "$DATASET" 2>/dev/null || true)
+if [[ -n "$RESOLVED_URL" ]]; then
+    export AXIOM_URL_OVERRIDE="$RESOLVED_URL"
+fi
+
+if [[ ${#POSITIONAL[@]} -eq 0 ]]; then
+    show_usage
+fi
+
+case "${POSITIONAL[0]}" in
+    metrics)
+        if [[ ${#POSITIONAL[@]} -eq 1 ]]; then
+            # List metrics
+            AXIOM_ACCEPT="application/vnd.metrics-info.v2+json" "$SCRIPT_DIR/axiom-api" "$DEPLOYMENT" GET "${BASE}/metrics?${TIME_PARAMS}"
+        elif [[ ${#POSITIONAL[@]} -eq 3 && "${POSITIONAL[2]}" == "tags" ]]; then
+            # List tags for a metric
+            METRIC="${POSITIONAL[1]}"
+            "$SCRIPT_DIR/axiom-api" "$DEPLOYMENT" GET "${BASE}/metrics/${METRIC}/tags?${TIME_PARAMS}"
+        elif [[ ${#POSITIONAL[@]} -eq 5 && "${POSITIONAL[2]}" == "tags" && "${POSITIONAL[4]}" == "values" ]]; then
+            # List tag values for a metric+tag
+            METRIC="${POSITIONAL[1]}"
+            TAG="${POSITIONAL[3]}"
+            "$SCRIPT_DIR/axiom-api" "$DEPLOYMENT" GET "${BASE}/metrics/${METRIC}/tags/${TAG}/values?${TIME_PARAMS}"
+        else
+            show_usage
+        fi
+        ;;
+    tags)
+        if [[ ${#POSITIONAL[@]} -eq 1 ]]; then
+            # List tags
+            "$SCRIPT_DIR/axiom-api" "$DEPLOYMENT" GET "${BASE}/tags?${TIME_PARAMS}"
+        elif [[ ${#POSITIONAL[@]} -eq 3 && "${POSITIONAL[2]}" == "values" ]]; then
+            # List values for a tag
+            TAG="${POSITIONAL[1]}"
+            "$SCRIPT_DIR/axiom-api" "$DEPLOYMENT" GET "${BASE}/tags/${TAG}/values?${TIME_PARAMS}"
+        else
+            show_usage
+        fi
+        ;;
+    find-metrics)
+        if [[ ${#POSITIONAL[@]} -ne 2 ]]; then
+            show_usage
+        fi
+        VALUE="${POSITIONAL[1]}"
+        BODY=$(jq -n --arg v "$VALUE" '{value: $v}')
+        "$SCRIPT_DIR/axiom-api" "$DEPLOYMENT" POST "${BASE}/metrics?${TIME_PARAMS}" "$BODY"
+        ;;
+    *)
+        show_usage
+        ;;
+esac
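Reviewer note on the default time range in metrics-info: the header comment suggests widening `--start` for sparse metrics (for example 7 days). A minimal sketch of how such a value could be computed portably is shown here. The helper name `rfc3339_hours_ago` is hypothetical and not part of the PR, and the sketch assumes either GNU `date` or BSD/macOS `date` is installed, mirroring the detection the script already uses.

```shell
# Hypothetical helper (not part of the PR): print an RFC3339 UTC timestamp
# N hours in the past, using GNU date when available and BSD date otherwise.
rfc3339_hours_ago() {
    hours="$1"
    if date --version >/dev/null 2>&1; then
        # GNU date understands relative expressions via -d
        date -u -d "${hours} hours ago" '+%Y-%m-%dT%H:%M:%SZ'
    else
        # BSD/macOS date adjusts the current time with -v
        date -u -v"-${hours}H" '+%Y-%m-%dT%H:%M:%SZ'
    fi
}

# Example: a 7-day window for a sparse metric (7 * 24 = 168 hours)
START_7D=$(rfc3339_hours_ago 168)
echo "start=${START_7D}"
```

The result could then be passed as `metrics-info prod my-dataset metrics --start "$START_7D"`, using only the flags documented in the script header.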
diff --git a/.agents/skills/building-dashboards/scripts/metrics/metrics-query b/.agents/skills/building-dashboards/scripts/metrics/metrics-query
new file mode 100755
index 00000000..596a323b
--- /dev/null
+++ b/.agents/skills/building-dashboards/scripts/metrics/metrics-query
@@ -0,0 +1,58 @@
+#!/usr/bin/env bash
+# metrics-query: Execute a metrics query against Axiom MetricsDB
+#
+# Usage: metrics-query <deployment> <mpl> <startTime> <endTime>
+#
+# Times: RFC3339 (e.g. 2025-01-01T00:00:00Z) or relative (e.g. now-1h, now-1d).
+#
+# Examples:
+#   metrics-query prod '`otel-metrics`:`http.server.duration` | align to 5m using avg | group by `endpoint` using sum' \
+#     '2025-06-01T00:00:00Z' '2025-06-02T00:00:00Z'
+#   metrics-query prod '`otel-metrics`:`http.server.duration` | align to 5m using avg' now-1h now
+
+set -euo pipefail
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+
+DEPLOYMENT="${1:-}"
+MPL="${2:-}"
+START_TIME="${3:-}"
+END_TIME="${4:-}"
+
+if [[ -z "$DEPLOYMENT" || -z "$MPL" || -z "$START_TIME" || -z "$END_TIME" ]]; then
+    echo "Usage: metrics-query <deployment> <mpl> <startTime> <endTime>" >&2
+    echo "" >&2
+    echo "Times: RFC3339 (e.g. 2025-01-01T00:00:00Z) or relative (e.g. now-1h, now-1d)." >&2
+    exit 1
+fi
+
+# Extract dataset name (text before the first ':') from MPL: `dataset`:`metric` ... or dataset:`metric` ...
+DATASET=$(printf '%s' "${MPL%%:*}" | tr -d '`[:space:]')
+
+QUERY_EDGE_DEPLOYMENT=""
+if [[ -n "$DATASET" ]]; then
+    RESOLVED_URL=$("$SCRIPT_DIR/resolve-url" "$DEPLOYMENT" "$DATASET" 2>/dev/null || true)
+    if [[ -n "$RESOLVED_URL" ]]; then
+        export AXIOM_URL_OVERRIDE="$RESOLVED_URL"
+        # Derive queryEdgeDeployment from edge URL: https://eu-central-1.aws.edge.axiom.co → cloud.eu-central-1.aws
+        if [[ "$RESOLVED_URL" == *".edge.axiom.co" ]]; then
+            QUERY_EDGE_DEPLOYMENT="cloud.$(echo "$RESOLVED_URL" | sed 's|https://||;s|\.edge\.axiom\.co||')"
+        fi
+    fi
+fi
+
+JQ_ARGS=(
+    --arg apl "$MPL"
+    --arg startT "$START_TIME"
+    --arg endT "$END_TIME"
+)
+JQ_EXPR='{"apl": $apl, "startTime": $startT, "endTime": $endT}'
+
+if [[ -n "$QUERY_EDGE_DEPLOYMENT" ]]; then
+    JQ_ARGS+=(--arg edgeDeployment "$QUERY_EDGE_DEPLOYMENT")
+    JQ_EXPR='{"apl": $apl, "startTime": $startT, "endTime": $endT, "queryEdgeDeployment": $edgeDeployment}'
+fi
+
+BODY=$(jq -n "${JQ_ARGS[@]}" "$JQ_EXPR")
+
+AXIOM_ACCEPT="application/json+metrics.v2" "$SCRIPT_DIR/axiom-api" "$DEPLOYMENT" POST "/v1/query/_mpl" "$BODY"
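Reviewer note on metrics-query: the script derives two values from strings, the dataset name from the MPL text and `queryEdgeDeployment` from the resolved edge URL. The sketch below isolates both steps as small functions so they can be checked without calling the API. The function names `mpl_dataset` and `edge_deployment_from_url` are hypothetical, and the sketch assumes the dataset name never contains a colon.

```shell
# Hypothetical helpers (not part of the PR) mirroring the string handling
# in metrics-query.

# Dataset name: everything before the first ':' in the MPL text, with
# backticks and whitespace removed. Works for multi-line pipelines too.
mpl_dataset() {
    printf '%s' "${1%%:*}" | tr -d '`[:space:]'
}

# queryEdgeDeployment: https://<region>.edge.axiom.co -> cloud.<region>.
# Prints nothing for URLs that are not edge URLs.
edge_deployment_from_url() {
    case "$1" in
        https://*.edge.axiom.co)
            host="${1#https://}"
            printf 'cloud.%s\n' "${host%.edge.axiom.co}"
            ;;
    esac
}

mpl_dataset '`otel-metrics`:`http.server.duration` | align to 5m using avg'
echo
edge_deployment_from_url 'https://eu-central-1.aws.edge.axiom.co'
```

Keeping the parsing in pure parameter expansion avoids per-line tools such as `cut`, which treat each line of a multi-line pipeline separately.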
diff --git a/.agents/skills/building-dashboards/scripts/metrics/metrics-spec b/.agents/skills/building-dashboards/scripts/metrics/metrics-spec
new file mode 100755
index 00000000..5f022253
--- /dev/null
+++ b/.agents/skills/building-dashboards/scripts/metrics/metrics-spec
@@ -0,0 +1,31 @@
+#!/usr/bin/env bash
+# metrics-spec: Fetch the metrics query specification from Axiom
+#
+# Usage: metrics-spec <deployment> <dataset>
+#
+# Calls OPTIONS /v1/query/_mpl to retrieve the complete metrics query
+# spec with syntax, operators, and examples. Read this before composing queries.
+#
+# The dataset is needed to resolve the correct edge deployment URL.
+#
+# Example:
+#   metrics-spec prod my-metrics-dataset
+
+set -euo pipefail
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+
+DEPLOYMENT="${1:-}"
+DATASET="${2:-}"
+
+if [[ -z "$DEPLOYMENT" || -z "$DATASET" ]]; then
+    echo "Usage: metrics-spec <deployment> <dataset>" >&2
+    exit 1
+fi
+
+RESOLVED_URL=$("$SCRIPT_DIR/resolve-url" "$DEPLOYMENT" "$DATASET" 2>/dev/null || true)
+if [[ -n "$RESOLVED_URL" ]]; then
+    export AXIOM_URL_OVERRIDE="$RESOLVED_URL"
+fi
+
+AXIOM_ACCEPT="text/markdown" "$SCRIPT_DIR/axiom-api" "$DEPLOYMENT" OPTIONS "/v1/query/_mpl"
diff --git a/.agents/skills/building-dashboards/scripts/metrics/resolve-url b/.agents/skills/building-dashboards/scripts/metrics/resolve-url
new file mode 100755
index 00000000..be559818
--- /dev/null
+++ b/.agents/skills/building-dashboards/scripts/metrics/resolve-url
@@ -0,0 +1,68 @@
+#!/usr/bin/env bash
+# resolve-url: Resolve the regional edge URL for a dataset
+#
+# Usage: resolve-url <deployment> <dataset>
+#
+# Fetches the dataset's edgeDeployment from the Axiom API and maps it to the
+# correct edge URL. Prints the URL to stdout.
+#
+# Edge deployment mapping:
+#   cloud.us-east-1.aws    → https://us-east-1.aws.edge.axiom.co
+#   cloud.eu-central-1.aws → https://eu-central-1.aws.edge.axiom.co
+#   (null/empty)           → falls back to deployment URL from config
+
+set -euo pipefail
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+
+DEPLOYMENT="${1:-}"
+DATASET="${2:-}"
+
+if [[ -z "$DEPLOYMENT" || -z "$DATASET" ]]; then
+    echo "Usage: resolve-url <deployment> <dataset>" >&2
+    exit 1
+fi
+
+CONFIG_FILE="$HOME/.axiom.toml"
+CACHE_DIR="${TMPDIR:-/tmp}/axiom-resolve-url"
+CACHE_FILE="${CACHE_DIR}/${DEPLOYMENT}__${DATASET}"
+CACHE_TTL=3600  # 1 hour
+
+# Return cached result if fresh
+if [[ -f "$CACHE_FILE" ]]; then
+    if [[ "$(uname)" == "Darwin" ]]; then
+        FILE_AGE=$(( $(date +%s) - $(stat -f %m "$CACHE_FILE") ))
+    else
+        FILE_AGE=$(( $(date +%s) - $(stat -c %Y "$CACHE_FILE") ))
+    fi
+    if [[ "$FILE_AGE" -lt "$CACHE_TTL" ]]; then
+        cat "$CACHE_FILE"
+        exit 0
+    fi
+fi
+
+extract_value() {
+    local key="$1"
+    awk -v deployment="$DEPLOYMENT" -v key="$key" '
+        /^[[:space:]]*\[deployments\./ { in_deployment = ($0 ~ "\\[deployments\\." deployment "\\]") }
+        in_deployment && $1 == key { gsub(/[" ]/, "", $3); print $3; exit }
+    ' "$CONFIG_FILE"
+}
+
+FALLBACK_URL=$(extract_value "url")
+
+EDGE_DEPLOYMENT=$("$SCRIPT_DIR/axiom-api" "$DEPLOYMENT" GET /v1/datasets \
+    | jq -r --arg name "$DATASET" '.[] | select(.name == $name) | .edgeDeployment // empty')
+
+RESOLVED_URL=""
+if [[ -z "$EDGE_DEPLOYMENT" || "$EDGE_DEPLOYMENT" == "null" ]]; then
+    RESOLVED_URL="$FALLBACK_URL"
+else
+    # cloud.us-east-1.aws → https://us-east-1.aws.edge.axiom.co
+    EDGE_HOST="${EDGE_DEPLOYMENT#cloud.}"
+    RESOLVED_URL="https://${EDGE_HOST}.edge.axiom.co"
+fi
+
+mkdir -p "$CACHE_DIR"
+echo "$RESOLVED_URL" > "$CACHE_FILE"
+echo "$RESOLVED_URL"
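Reviewer note on resolve-url: the mapping from `edgeDeployment` to an edge URL, including the fallback for null or empty values, can be expressed as a pure function. This is a sketch only; the name `edge_url_for` is hypothetical, and the URL pattern is taken from the mapping table in the script header rather than from any Axiom documentation.

```shell
# Hypothetical helper (not part of the PR): map an edgeDeployment value to
# its edge URL, or fall back to the configured deployment URL.
edge_url_for() {
    edge_deployment="$1"
    fallback_url="$2"
    if [ -z "$edge_deployment" ] || [ "$edge_deployment" = "null" ]; then
        # No regional edge recorded for the dataset: use the URL from config
        printf '%s\n' "$fallback_url"
    else
        # cloud.us-east-1.aws -> https://us-east-1.aws.edge.axiom.co
        printf 'https://%s.edge.axiom.co\n' "${edge_deployment#cloud.}"
    fi
}

edge_url_for "cloud.us-east-1.aws" "https://api.axiom.co"
edge_url_for "" "https://api.axiom.co"
```

Separating the mapping from the API call and the cache would let the mapping be covered by the same style of assertions used in test-toml-parsing.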
diff --git a/.agents/skills/building-dashboards/scripts/setup b/.agents/skills/building-dashboards/scripts/setup
new file mode 100755
index 00000000..252e1533
--- /dev/null
+++ b/.agents/skills/building-dashboards/scripts/setup
@@ -0,0 +1,91 @@
+#!/usr/bin/env bash
+# Setup building-dashboards skill
+# Usage: scripts/setup
+#
+# This script:
+#   1. Checks for required tools (curl, jq)
+#   2. Makes scripts executable
+#   3. Checks for ~/.axiom.toml (shared with axiom-sre)
+
+set -euo pipefail
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+
+echo "=== building-dashboards Setup ==="
+echo ""
+
+# --- Check required tools ---
+echo "[1/3] Checking required tools..."
+
+MISSING=()
+for cmd in curl jq; do
+    if command -v "$cmd" &> /dev/null; then
+        echo "✓ $cmd found"
+    else
+        echo "✗ $cmd not found"
+        MISSING+=("$cmd")
+    fi
+done
+
+if [[ ${#MISSING[@]} -gt 0 ]]; then
+    echo ""
+    echo "Install missing tools:"
+    for cmd in "${MISSING[@]}"; do
+        case "$cmd" in
+            jq) echo "  brew install jq  # or apt-get install jq" ;;
+            curl) echo "  brew install curl  # or apt-get install curl" ;;
+        esac
+    done
+    exit 1
+fi
+
+# --- Make scripts executable ---
+echo ""
+echo "[2/3] Making scripts executable..."
+chmod +x "$SCRIPT_DIR"/*
+echo "✓ Scripts ready"
+
+# --- Check Axiom config ---
+echo ""
+echo "[3/3] Checking Axiom configuration..."
+
+AXIOM_CONFIG="$HOME/.axiom.toml"
+if [[ -f "$AXIOM_CONFIG" ]]; then
+    DEPLOYMENTS=$(grep -cE '^\s*\[deployments\.' "$AXIOM_CONFIG" 2>/dev/null || true)
+    echo "✓ Found $AXIOM_CONFIG with $DEPLOYMENTS deployment(s)"
+    
+    # List deployments
+    echo "  Deployments:"
+    grep -E '^\s*\[deployments\.' "$AXIOM_CONFIG" | sed 's/^[[:space:]]*//' | sed 's/\[deployments\.\(.*\)\]/    - \1/' || true
+else
+    echo "⚠ $AXIOM_CONFIG not found"
+    echo ""
+    echo "Create it to enable dashboard operations:"
+    echo ""
+    cat << 'EOF'
+[deployments.prod]
+url = "https://api.axiom.co"
+token = "xaat-your-token-here"
+org_id = "your-org-id"
+
+[deployments.staging]
+url = "https://api.axiom.co"
+token = "xaat-your-staging-token"
+org_id = "your-org-id"
+EOF
+    echo ""
+fi
+
+echo ""
+echo "=== Setup Complete ==="
+echo ""
+echo "Usage:"
+echo "  scripts/dashboard-list prod           # List dashboards"
+echo "  scripts/dashboard-get prod <id>       # Get dashboard JSON"
+echo "  scripts/dashboard-create prod <file>  # Create dashboard"
+echo "  scripts/dashboard-copy prod <id>      # Clone dashboard"
+echo "  scripts/dashboard-link prod <id>      # Get dashboard URL"
+echo ""
+echo "Templates:"
+echo "  scripts/dashboard-from-template service-overview <service> <dataset>"
+echo ""
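Reviewer note on counting deployments in setup: `grep -c` prints the count even when it is `0`, but it also exits with status 1 in that case. Appending `|| echo 0` inside the command substitution would therefore produce two lines (`0` twice) for a config with no deployments. The sketch below shows the pattern that test-toml-parsing already uses; the function name `count_deployments` is hypothetical.

```shell
# Hypothetical helper (not part of the PR): count [deployments.*] sections.
count_deployments() {
    # grep -c already prints "0" on no match; "|| true" only swallows the
    # non-zero exit status so the caller does not abort under "set -e".
    count=$(grep -cE '^[[:space:]]*\[deployments\.' "$1" 2>/dev/null) || true
    printf '%s\n' "${count:-0}"
}

# Example with a throwaway config file
demo_config=$(mktemp)
printf '[deployments.prod]\n    [deployments.staging]\n' > "$demo_config"
count_deployments "$demo_config"
rm -f "$demo_config"
```

The `${count:-0}` default covers the remaining case where `grep` prints nothing at all, for example when the file cannot be read.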
diff --git a/.agents/skills/building-dashboards/scripts/test-toml-parsing b/.agents/skills/building-dashboards/scripts/test-toml-parsing
new file mode 100755
index 00000000..9cb57c43
--- /dev/null
+++ b/.agents/skills/building-dashboards/scripts/test-toml-parsing
@@ -0,0 +1,275 @@
+#!/usr/bin/env bash
+# Tests for TOML parsing logic used across building-dashboards scripts.
+#
+# Covers extract_value (from axiom-api, dashboard-link) and
+# the grep-based deployment listing in setup.
+#
+# Usage: scripts/test-toml-parsing
+
+set -euo pipefail
+
+PASS=0
+FAIL=0
+
+assert_eq() {
+    local label="$1" expected="$2" actual="$3"
+    if [[ "$expected" == "$actual" ]]; then
+        echo "  ✓ $label"
+        PASS=$((PASS + 1))
+    else
+        echo "  ✗ $label"
+        echo "    expected: $(printf '%q' "$expected")"
+        echo "    actual:   $(printf '%q' "$actual")"
+        FAIL=$((FAIL + 1))
+    fi
+}
+
+# --- extract_value (shared by axiom-api, dashboard-link) ---
+# Extracts a value from ~/.axiom.toml for a given deployment.
+# We inline the awk logic here to test it in isolation.
+run_extract_value() {
+    local config_file="$1" deployment="$2" key="$3"
+    awk -v deployment="$deployment" -v key="$key" '
+        /^[[:space:]]*\[deployments\./ { in_deployment = ($0 ~ "\\[deployments\\." deployment "\\]") }
+        in_deployment {
+            gsub(/^[[:space:]]+/, "")
+            if ($1 == key) {
+                sub(/^[^=]*=[[:space:]]*/, "")
+                if (match($0, /^"[^"]*"/)) {
+                    $0 = substr($0, RSTART+1, RLENGTH-2)
+                } else {
+                    sub(/[[:space:]]*#.*$/, "")
+                }
+                print
+                exit
+            }
+        }
+    ' "$config_file"
+}
+
+# --- setup grep logic ---
+# Count and list deployments from config.
+run_count_deployments() {
+    local config_file="$1"
+    local count
+    count=$(grep -cE '^\s*\[deployments\.' "$config_file" 2>/dev/null) || true
+    echo "${count:-0}"
+}
+
+run_list_deployments() {
+    local config_file="$1"
+    grep -E '^\s*\[deployments\.' "$config_file" | sed 's/^[[:space:]]*//' | sed 's/\[deployments\.\(.*\)\]/\1/'
+}
+
+# ========== Test fixtures ==========
+
+TMPDIR_TEST=$(mktemp -d)
+trap 'rm -rf "$TMPDIR_TEST"' EXIT
+
+# Fixture: standard (no indentation)
+cat > "$TMPDIR_TEST/standard.toml" << 'EOF'
+[deployments.prod]
+url = "https://api.axiom.co"
+token = "xaat-prod-token"
+org_id = "org-prod-123"
+
+[deployments.staging]
+url = "https://api.staging.axiom.co"
+token = "xaat-staging-token"
+org_id = "org-staging-456"
+EOF
+
+# Fixture: indented sections (the bug this PR fixes)
+cat > "$TMPDIR_TEST/indented.toml" << 'EOF'
+    [deployments.prod]
+    url = "https://api.axiom.co"
+    token = "xaat-prod-token"
+    org_id = "org-prod-123"
+
+    [deployments.staging]
+    url = "https://api.staging.axiom.co"
+    token = "xaat-staging-token"
+    org_id = "org-staging-456"
+EOF
+
+# Fixture: mixed indentation
+cat > "$TMPDIR_TEST/mixed.toml" << 'EOF'
+[deployments.prod]
+url = "https://api.axiom.co"
+token = "xaat-prod-token"
+org_id = "org-prod-123"
+
+    [deployments.staging]
+    url = "https://api.staging.axiom.co"
+    token = "xaat-staging-token"
+    org_id = "org-staging-456"
+EOF
+
+# Fixture: tabs
+cat > "$TMPDIR_TEST/tabs.toml" << 'EOF'
+	[deployments.prod]
+	url = "https://api.axiom.co"
+	token = "xaat-prod-token"
+	org_id = "org-prod-123"
+EOF
+
+# Fixture: single deployment
+cat > "$TMPDIR_TEST/single.toml" << 'EOF'
+[deployments.prod]
+url = "https://api.axiom.co"
+token = "xaat-prod-token"
+org_id = "org-prod-123"
+EOF
+
+# Fixture: unquoted values
+cat > "$TMPDIR_TEST/unquoted.toml" << 'EOF'
+[deployments.prod]
+url = https://api.axiom.co
+token = xaat-prod-token
+org_id = org-prod-123
+EOF
+
+# Fixture: values with extra spacing around =
+cat > "$TMPDIR_TEST/spacing.toml" << 'EOF'
+[deployments.prod]
+url   =   "https://api.axiom.co"
+token =    "xaat-prod-token"
+org_id =  "org-prod-123"
+EOF
+
+# Fixture: inline comments
+cat > "$TMPDIR_TEST/comments.toml" << 'EOF'
+[deployments.prod]
+url = "https://api.axiom.co" # production API
+token = "xaat-prod-token" # keep secret
+org_id = "org-prod-123"
+EOF
+
+
+# Fixture: values with hash inside quotes
+cat > "$TMPDIR_TEST/hash_in_value.toml" << 'EOF'
+[deployments.prod]
+url = "https://example.com/path#fragment"
+token = "xaat-prod-token"
+org_id = "org-prod-123"
+EOF
+
+# Fixture: values with brackets in values (e.g. IPv6)
+cat > "$TMPDIR_TEST/brackets.toml" << 'EOF'
+[deployments.prod]
+url = "http://[::1]:3000"
+token = "xaat-prod-token"
+org_id = "org-prod-123"
+EOF
+
+# Fixture: empty (no deployments)
+cat > "$TMPDIR_TEST/empty.toml" << 'EOF'
+# No deployments configured
+EOF
+
+# ========== Tests ==========
+
+echo "=== extract_value: standard config ==="
+assert_eq "url from prod"     "https://api.axiom.co"          "$(run_extract_value "$TMPDIR_TEST/standard.toml" prod url)"
+assert_eq "token from prod"   "xaat-prod-token"               "$(run_extract_value "$TMPDIR_TEST/standard.toml" prod token)"
+assert_eq "org_id from prod"  "org-prod-123"                  "$(run_extract_value "$TMPDIR_TEST/standard.toml" prod org_id)"
+assert_eq "url from staging"  "https://api.staging.axiom.co"  "$(run_extract_value "$TMPDIR_TEST/standard.toml" staging url)"
+
+echo ""
+echo "=== extract_value: indented sections ==="
+assert_eq "url from prod"     "https://api.axiom.co"          "$(run_extract_value "$TMPDIR_TEST/indented.toml" prod url)"
+assert_eq "token from prod"   "xaat-prod-token"               "$(run_extract_value "$TMPDIR_TEST/indented.toml" prod token)"
+assert_eq "org_id from prod"  "org-prod-123"                  "$(run_extract_value "$TMPDIR_TEST/indented.toml" prod org_id)"
+assert_eq "url from staging"  "https://api.staging.axiom.co"  "$(run_extract_value "$TMPDIR_TEST/indented.toml" staging url)"
+
+echo ""
+echo "=== extract_value: mixed indentation ==="
+assert_eq "url from prod (not indented)"    "https://api.axiom.co"          "$(run_extract_value "$TMPDIR_TEST/mixed.toml" prod url)"
+assert_eq "url from staging (indented)"     "https://api.staging.axiom.co"  "$(run_extract_value "$TMPDIR_TEST/mixed.toml" staging url)"
+assert_eq "token from staging (indented)"   "xaat-staging-token"            "$(run_extract_value "$TMPDIR_TEST/mixed.toml" staging token)"
+
+echo ""
+echo "=== extract_value: tab indentation ==="
+assert_eq "url from prod"    "https://api.axiom.co"  "$(run_extract_value "$TMPDIR_TEST/tabs.toml" prod url)"
+assert_eq "token from prod"  "xaat-prod-token"       "$(run_extract_value "$TMPDIR_TEST/tabs.toml" prod token)"
+
+echo ""
+echo "=== extract_value: unquoted values ==="
+assert_eq "url unquoted"    "https://api.axiom.co"  "$(run_extract_value "$TMPDIR_TEST/unquoted.toml" prod url)"
+assert_eq "token unquoted"  "xaat-prod-token"       "$(run_extract_value "$TMPDIR_TEST/unquoted.toml" prod token)"
+
+echo ""
+echo "=== extract_value: extra spacing ==="
+assert_eq "url with spacing"     "https://api.axiom.co"  "$(run_extract_value "$TMPDIR_TEST/spacing.toml" prod url)"
+assert_eq "token with spacing"   "xaat-prod-token"       "$(run_extract_value "$TMPDIR_TEST/spacing.toml" prod token)"
+assert_eq "org_id with spacing"  "org-prod-123"          "$(run_extract_value "$TMPDIR_TEST/spacing.toml" prod org_id)"
+
+echo ""
+echo "=== extract_value: inline comments ==="
+assert_eq "url with comment"    "https://api.axiom.co"  "$(run_extract_value "$TMPDIR_TEST/comments.toml" prod url)"
+assert_eq "token with comment"  "xaat-prod-token"       "$(run_extract_value "$TMPDIR_TEST/comments.toml" prod token)"
+assert_eq "org_id no comment"   "org-prod-123"          "$(run_extract_value "$TMPDIR_TEST/comments.toml" prod org_id)"
+
+
+echo ""
+echo "=== extract_value: hash inside quoted value ==="
+assert_eq "url with hash fragment"  "https://example.com/path#fragment"  "$(run_extract_value "$TMPDIR_TEST/hash_in_value.toml" prod url)"
+assert_eq "token after hash url"    "xaat-prod-token"                    "$(run_extract_value "$TMPDIR_TEST/hash_in_value.toml" prod token)"
+
+echo ""
+echo "=== extract_value: brackets inside value ==="
+assert_eq "ipv6 url"             "http://[::1]:3000"  "$(run_extract_value "$TMPDIR_TEST/brackets.toml" prod url)"
+assert_eq "token after bracket"  "xaat-prod-token"    "$(run_extract_value "$TMPDIR_TEST/brackets.toml" prod token)"
+assert_eq "org_id after bracket" "org-prod-123"       "$(run_extract_value "$TMPDIR_TEST/brackets.toml" prod org_id)"
+
+echo ""
+echo "=== extract_value: missing deployment ==="
+assert_eq "nonexistent deployment"  ""  "$(run_extract_value "$TMPDIR_TEST/standard.toml" nonexistent url)"
+
+echo ""
+echo "=== extract_value: missing key ==="
+assert_eq "nonexistent key"  ""  "$(run_extract_value "$TMPDIR_TEST/standard.toml" prod nonexistent_key)"
+
+echo ""
+echo "=== extract_value: no cross-section leaking ==="
+assert_eq "prod token stays in prod"       "xaat-prod-token"     "$(run_extract_value "$TMPDIR_TEST/standard.toml" prod token)"
+assert_eq "staging token stays in staging" "xaat-staging-token"   "$(run_extract_value "$TMPDIR_TEST/standard.toml" staging token)"
+
+echo ""
+echo "=== count deployments: standard ==="
+assert_eq "2 deployments"  "2"  "$(run_count_deployments "$TMPDIR_TEST/standard.toml")"
+
+echo ""
+echo "=== count deployments: indented ==="
+assert_eq "2 deployments (indented)"  "2"  "$(run_count_deployments "$TMPDIR_TEST/indented.toml")"
+
+echo ""
+echo "=== count deployments: mixed ==="
+assert_eq "2 deployments (mixed)"  "2"  "$(run_count_deployments "$TMPDIR_TEST/mixed.toml")"
+
+echo ""
+echo "=== count deployments: single ==="
+assert_eq "1 deployment"  "1"  "$(run_count_deployments "$TMPDIR_TEST/single.toml")"
+
+echo ""
+echo "=== count deployments: empty ==="
+assert_eq "0 deployments"  "0"  "$(run_count_deployments "$TMPDIR_TEST/empty.toml")"
+
+echo ""
+echo "=== list deployments: standard ==="
+LISTED=$(run_list_deployments "$TMPDIR_TEST/standard.toml")
+assert_eq "lists prod"     "prod"     "$(echo "$LISTED" | head -1)"
+assert_eq "lists staging"  "staging"  "$(echo "$LISTED" | tail -1)"
+
+echo ""
+echo "=== list deployments: indented ==="
+LISTED=$(run_list_deployments "$TMPDIR_TEST/indented.toml")
+assert_eq "lists prod (indented)"     "prod"     "$(echo "$LISTED" | head -1)"
+assert_eq "lists staging (indented)"  "staging"  "$(echo "$LISTED" | tail -1)"
+
+echo ""
+echo "==========================="
+echo "Results: $PASS passed, $FAIL failed"
+if [[ $FAIL -gt 0 ]]; then
+    exit 1
+fi
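Reviewer note on test coverage: every fixture above has whitespace between the key and `=`, and the awk logic under test compares `$1` to the key, so a line such as `url="https://api.axiom.co"` would not be matched. If that form should be supported, one possible approach is to split on the first `=` instead of relying on field splitting. The following is a sketch under that assumption; `extract_value_tolerant` is a hypothetical name and not the implementation used by axiom-api or dashboard-link.

```shell
# Hypothetical variant (not part of the PR): extract a key from a
# [deployments.<name>] section, tolerating "key=value" without spaces.
extract_value_tolerant() {
    config_file="$1"; deployment="$2"; key="$3"
    awk -v deployment="$deployment" -v key="$key" '
        # Any section header switches the "inside our deployment" flag
        /^[[:space:]]*\[/ {
            in_deployment = ($0 ~ "^[[:space:]]*\\[deployments\\." deployment "\\]")
            next
        }
        in_deployment {
            line = $0
            sub(/^[[:space:]]+/, "", line)
            eq = index(line, "=")            # split on the first "="
            if (eq == 0) next
            name = substr(line, 1, eq - 1)
            sub(/[[:space:]]+$/, "", name)
            if (name != key) next
            value = substr(line, eq + 1)
            sub(/^[[:space:]]+/, "", value)
            if (match(value, /^"[^"]*"/)) {
                # Quoted value: keep the text between the quotes
                value = substr(value, RSTART + 1, RLENGTH - 2)
            } else {
                # Unquoted value: drop a trailing inline comment
                sub(/[[:space:]]*#.*$/, "", value)
            }
            print value
            exit
        }
    ' "$config_file"
}

demo=$(mktemp)
printf '[deployments.prod]\nurl="https://api.axiom.co"\n' > "$demo"
extract_value_tolerant "$demo" prod url
rm -f "$demo"
```

Like the original, this treats the deployment name as a regular expression fragment, so names containing regex metacharacters would need escaping.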
diff --git a/.agents/skills/building-dashboards/tests/manual-tests.md b/.agents/skills/building-dashboards/tests/manual-tests.md
new file mode 100644
index 00000000..e9d36cd4
--- /dev/null
+++ b/.agents/skills/building-dashboards/tests/manual-tests.md
@@ -0,0 +1,479 @@
+# Building Dashboards - Manual Test Guide
+
+Comprehensive manual testing for all skill features. Run through each section to validate the skill works correctly.
+
+**Test environment:** Use [Axiom Playground](https://play.axiom.co) with `sample-http-logs` dataset.
+
+---
+
+## Prerequisites
+
+Before testing:
+
+1. Run setup (checks config):
+   ```bash
+   cd skills/building-dashboards
+   ./scripts/setup
+   ```
+
+2. Create `~/.axiom.toml` with your credentials (for deployment tests):
+   ```toml
+   [deployments.prod]
+   url = "https://api.axiom.co"
+   token = "xaat-your-token"
+   org_id = "your-org-id"
+   ```
+   
+   This config is shared with axiom-sre.
+
+---
+
+## Test 1: Skill Loading
+
+**Prompt:** "Help me build a dashboard"
+
+**Expected behavior:**
+- Skill should activate (look for building-dashboards patterns)
+- Agent should ask intake questions: audience, scope, datasets, signals
+
+**Validation:**
+- [ ] Skill activates on dashboard-related requests
+- [ ] Agent asks clarifying questions before designing
+
+---
+
+## Test 2: Intake Workflow
+
+**Prompt:** "I want to create an oncall dashboard for our API gateway service"
+
+**Expected behavior:**
+- Agent asks about datasets
+- Agent asks about key metrics (errors, latency, traffic)
+- Agent asks about drilldown dimensions
+
+**Validation:**
+- [ ] Agent identifies this as oncall use case
+- [ ] Agent requests dataset information
+- [ ] Agent proposes golden signals coverage
+
+---
+
+## Test 3: Template Usage
+
+**Prompt:** "Create a service overview dashboard for 'payment-api' using the 'http-logs' dataset"
+
+**Expected behavior:**
+- Agent uses `dashboard-from-template` or manually applies service-overview template
+- Replaces placeholders with provided values
+- Outputs valid dashboard JSON
+
+**Validation:**
+- [ ] Uses service-overview template
+- [ ] Replaces {{service}}, {{dataset}} correctly
+- [ ] Output is valid JSON
+- [ ] Chart queries reference correct dataset
+
+---
+
+## Test 4: Chart Type Selection
+
+**Prompt:** "What chart type should I use to show error rate over time?"
+
+**Expected:** TimeSeries
+
+**Prompt:** "What chart type for a single KPI like current p95 latency?"
+
+**Expected:** Statistic
+
+**Prompt:** "What chart type to show top 10 failing routes?"
+
+**Expected:** Table
+
+**Prompt:** "What chart type for status code distribution with 4 categories?"
+
+**Expected:** Pie
+
+**Prompt:** "What chart type to show raw error logs?"
+
+**Expected:** LogStream
+
+**Validation:**
+- [ ] Correctly recommends TimeSeries for trends
+- [ ] Correctly recommends Statistic for single values
+- [ ] Correctly recommends Table for top-N lists
+- [ ] Correctly recommends Pie for low-cardinality distributions
+- [ ] Correctly recommends LogStream for raw events
+
+---
+
+## Test 5: APL Query Generation
+
+Test each chart type APL pattern against `sample-http-logs` in Axiom Playground.
+
+**Note:** These are **ad-hoc queries** for the Query tab, so they include explicit `_time` filters. Dashboard panel queries don't need time filters because they inherit the time range from the UI picker.
+
+### 5.1 Statistic - Error Rate
+```apl
+['sample-http-logs']
+| where _time between (ago(1h) .. now())
+| summarize total = count(), errors = countif(toint(status) >= 500)
+| extend error_rate = round(100.0 * errors / total, 2)
+| project error_rate
+```
+
+**Expected:** Returns single row with error_rate percentage
+
+- [ ] Query executes without error
+- [ ] Returns single numeric value
+
+### 5.2 TimeSeries - Traffic Over Time
+```apl
+['sample-http-logs']
+| where _time between (ago(1h) .. now())
+| summarize requests = count() by bin_auto(_time)
+```
+
+**Expected:** Returns time-binned counts
+
+- [ ] Query executes without error
+- [ ] Returns multiple rows with `_time` and `requests`
+
+### 5.3 TimeSeries - Latency Percentiles
+```apl
+['sample-http-logs']
+| where _time between (ago(1h) .. now())
+| summarize percentiles_array(req_duration_ms, 50, 95, 99) by bin_auto(_time)
+```
+
+**Expected:** Returns percentile array over time, renders as overlaid series
+
+- [ ] Query executes without error
+- [ ] Chart shows p50, p95, p99 as overlaid lines (not stacked rows)
+
+### 5.4 Table - Top Routes by Traffic
+```apl
+['sample-http-logs']
+| where _time between (ago(1h) .. now())
+| summarize requests = count() by uri
+| top 10 by requests
+| project URI = uri, Requests = requests
+```
+
+**Expected:** Returns top 10 URIs by request count
+
+- [ ] Query executes without error
+- [ ] Returns exactly 10 rows (or fewer if less data)
+- [ ] Sorted by requests descending
+
+### 5.5 Pie - Status Distribution
+```apl
+['sample-http-logs']
+| where _time between (ago(1h) .. now())
+| extend status_class = case(
+    toint(status) < 300, "2xx",
+    toint(status) < 400, "3xx",
+    toint(status) < 500, "4xx",
+    "5xx"
+  )
+| summarize count() by status_class
+```
+
+**Expected:** Returns 2-4 rows with status class counts
+
+- [ ] Query executes without error
+- [ ] Returns low cardinality (≤6 categories)
+
+### 5.6 LogStream - Recent Requests
+```apl
+['sample-http-logs']
+| where _time between (ago(15m) .. now())
+| project-keep _time, method, uri, status, req_duration_ms
+| order by _time desc
+| take 100
+```
+
+**Expected:** Returns raw log entries
+
+- [ ] Query executes without error
+- [ ] Returns ≤100 rows
+- [ ] Only specified columns shown
+
+### 5.7 Metrics MPL Query Shape
+
+Verify metrics charts set BOTH `query.apl` (MPL pipeline) and `query.metricsDataset` (dataset name), and do NOT set `query.mpl` (rejected by create API).
+
+**Prompt:**
+"Create a metrics TimeSeries chart for `otel-metrics:http.server.duration` filtered to `service.name=api` and `deployment.environment=prod`, with `align to 1m using avg`."
+
+**Expected chart query shape:**
+```json
+{
+  "query": {
+    "apl": "`otel-metrics`:`http.server.duration`\n| where `service.name` == \"api\"\n| where `deployment.environment` == \"prod\"\n| align to 1m using avg",
+    "metricsDataset": "otel-metrics"
+  }
+}
+```
+
+**Validation:**
+- [ ] Agent sets `query.apl` to the MPL pipeline string (the field name is `apl`, not `mpl`)
+- [ ] Agent sets `query.metricsDataset` to the dataset name
+- [ ] Agent does NOT set `query.mpl` (rejected on create)
+- [ ] Agent runs `scripts/metrics/metrics-spec` before composing MPL queries
+- [ ] Pipeline order matches intended execution order
+- [ ] Dotted identifiers are backtick-escaped in MPL
+
+---
+
+## Test 6: Layout Recommendations
+
+**Prompt:** "How should I layout a dashboard with 4 stats, 2 timeseries, 2 tables, and a logstream?"
+
+**Expected behavior:**
+- Agent recommends grid-based layout
+- Stats at top (row 0-1, w=6 each)
+- TimeSeries below (row 2-5, w=12 each)
+- Tables middle (row 6-9, w=12 each)
+- LogStream bottom (row 10+, w=12)
+
+**Validation:**
+- [ ] Recommends logical section ordering
+- [ ] Suggests appropriate widths/heights
+- [ ] Follows overview → drilldown → evidence pattern
+
+---
+
+## Test 7: Script Execution
+
+### 7.1 dashboard-new
+```bash
+cd skills/building-dashboards
+./scripts/dashboard-new "Test Dashboard" "synthetic_http" /tmp/test-new.json
+jq . /tmp/test-new.json
+```
+
+**Expected:** Valid JSON with provided values
+
+- [ ] Script runs without error
+- [ ] Output is valid JSON
+- [ ] name, owner, dataset fields populated
+
+### 7.2 dashboard-from-template
+```bash
+./scripts/dashboard-from-template service-overview "test-api" "synthetic_http" /tmp/test-template.json
+jq . /tmp/test-template.json
+```
+
+**Expected:** Full dashboard JSON from template
+
+- [ ] Script runs without error
+- [ ] All {{placeholders}} replaced
+- [ ] Charts and layout present
+
+### 7.3 dashboard-validate
+```bash
+./scripts/dashboard-validate /tmp/test-template.json
+```
+
+**Expected:** Validation passes
+
+- [ ] Script runs without error
+- [ ] Reports any warnings (take limits, grid width)
+- [ ] Exits 0 if valid
+
+### 7.4 dashboard-validate with bad input
+```bash
+echo '{"name": "bad"}' > /tmp/bad-dashboard.json
+./scripts/dashboard-validate /tmp/bad-dashboard.json
+```
+
+**Expected:** Reports missing fields
+
+- [ ] Script reports missing charts/layout
+- [ ] Exits non-zero
+
+### 7.5 dashboard-list
+```bash
+./scripts/dashboard-list prod
+```
+
+**Expected:** Tab-separated list of dashboard IDs and names
+
+- [ ] Script runs without error
+- [ ] Output shows id<TAB>name format
+
+### 7.6 dashboard-get
+```bash
+./scripts/dashboard-get prod <dashboard-id>
+```
+
+**Expected:** Full dashboard JSON
+
+- [ ] Script runs without error
+- [ ] Output is valid JSON with charts, layout, etc.
+
+### 7.7 axiom-api (low-level)
+```bash
+./scripts/axiom-api prod GET /dashboards | jq '.[0].uid'
+```
+
+**Expected:** Returns a dashboard UID
+
+- [ ] Script runs without error
+- [ ] Returns valid JSON
+
+---
+
+## Test 8: Splunk Migration
+
+**Prompt:** "Convert this Splunk dashboard panel to Axiom:
+```spl
+index=http_logs status>=500
+| timechart span=5m count by host
+```
+"
+
+**Expected behavior:**
+- Agent recognizes SPL and suggests using spl-to-apl
+- Translates to APL with:
+  - Explicit time filter
+  - `summarize count() by bin_auto(_time), host`
+- Recommends TimeSeries chart type
+
+**Validation:**
+- [ ] Recognizes SPL syntax
+- [ ] Adds time filter
+- [ ] Correct summarize/bin pattern
+- [ ] Recommends appropriate chart type
+
+---
+
+## Test 9: Integration with axiom-sre
+
+**Prompt:** "I have a dataset called 'app-logs' but I don't know what fields are available. Help me design a dashboard for it."
+
+**Expected behavior:**
+- Agent suggests running getschema first
+- Provides query: `['app-logs'] | where _time between (ago(1h) .. now()) | getschema`
+- After schema discovery, proposes dashboard structure based on available fields
+
+**Validation:**
+- [ ] Recommends schema discovery first
+- [ ] Doesn't guess field names
+- [ ] Adapts recommendations to actual schema
+
+---
+
+## Test 10: Design Best Practices
+
+**Prompt:** "I want to create a pie chart showing errors by user_id"
+
+**Expected behavior:**
+- Agent warns about high cardinality
+- Recommends Table instead of Pie
+- Suggests `top N` to limit rows
+
+**Validation:**
+- [ ] Warns about cardinality issues
+- [ ] Recommends Table over Pie for high cardinality
+- [ ] Suggests bounded query with top N
+
+**Prompt:** "Create a dashboard with 20 panels"
+
+**Expected behavior:**
+- Agent warns about cognitive overload
+- Recommends 8-12 panels max
+- Suggests splitting into multiple dashboards
+
+**Validation:**
+- [ ] Warns about too many panels
+- [ ] Recommends focused dashboards
+
+---
+
+## Test 11: End-to-End Dashboard Creation
+
+**Prompt:** "Create a complete oncall dashboard for 'sample-http-logs' with:
+- Error rate stat
+- p95 latency stat
+- Traffic over time
+- Error rate over time
+- Top failing URIs
+- Recent errors
+
+Output the complete dashboard JSON."
+
+**Expected behavior:**
+- Agent produces complete, valid dashboard JSON
+- All queries target sample-http-logs
+- Dashboard queries have no explicit time filters (the time range is inherited from the UI picker)
+- Layout is logical (stats top, timeseries middle, table/logs bottom)
+
+**Validation:**
+- [ ] Complete JSON output
+- [ ] All 6 panels present
+- [ ] Queries syntactically correct
+- [ ] Layout makes sense
+- [ ] Can be validated with dashboard-validate
+
+---
+
+## Test 12: Dashboard Deployment (Optional)
+
+Requires Axiom API access and completed setup.
+
+```bash
+# Generate dashboard
+./scripts/dashboard-from-template service-overview "test-api" "synthetic_http" /tmp/deploy-test.json
+
+# Validate
+./scripts/dashboard-validate /tmp/deploy-test.json
+
+# Deploy to prod (uses ~/.axiom.toml)
+DASHBOARD_ID=$(./scripts/dashboard-create prod /tmp/deploy-test.json)
+
+# Get link
+./scripts/dashboard-link prod "$DASHBOARD_ID"
+```
+
+**Validation:**
+- [ ] Dashboard created in Axiom
+- [ ] All panels render correctly
+- [ ] Queries execute without error
+
+---
+
+## Validation Checklist
+
+### Core Features
+- [ ] Skill activates on dashboard requests
+- [ ] Intake workflow asks right questions
+- [ ] Template instantiation works
+- [ ] Chart type recommendations correct
+- [ ] APL patterns execute successfully
+- [ ] Layout recommendations sensible
+
+### Scripts
+- [ ] dashboard-new works
+- [ ] dashboard-from-template works
+- [ ] dashboard-validate catches issues
+- [ ] dashboard-list shows dashboards
+- [ ] dashboard-get fetches dashboard JSON
+- [ ] dashboard-create deploys dashboard
+- [ ] axiom-api makes authenticated requests
+
+### Integration
+- [ ] SPL migration triggers spl-to-apl patterns
+- [ ] Unknown schemas trigger discovery workflow
+- [ ] Best practices warnings fire correctly
+
+### Quality
+- [ ] No hardcoded field names (uses placeholders)
+- [ ] Dashboard APL has NO time filters (inherits from UI picker)
+- [ ] Ad-hoc/exploration APL has explicit time filters
+- [ ] LogStream queries have take limits
+- [ ] Layout IDs match chart IDs
+
+---
+
+**Last validated:** _(fill in after testing)_
diff --git a/.agents/skills/building-dashboards/tests/test-normalize.sh b/.agents/skills/building-dashboards/tests/test-normalize.sh
new file mode 100755
index 00000000..d27bf3f4
--- /dev/null
+++ b/.agents/skills/building-dashboards/tests/test-normalize.sh
@@ -0,0 +1,99 @@
+#!/usr/bin/env bash
+# test-normalize.sh: Test dashboard-normalize.jq layout normalization
+#
+# Usage: ./test-normalize.sh
+
+set -euo pipefail
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+SCRIPTS_DIR="$SCRIPT_DIR/../scripts"
+
+passed=0
+failed=0
+
+ok() {
+    ((passed++)) || true
+    echo "  ✓ $1"
+}
+
+fail() {
+    ((failed++)) || true
+    echo "  ✗ $1: $2"
+}
+
+normalize() {
+    jq -L "$SCRIPTS_DIR" 'include "dashboard-normalize"; normalize_dashboard_layout'
+}
+
+echo "Layout Normalization"
+echo "===================="
+
+# 1. Fills missing fields with defaults
+result=$(echo '{"layout":[{"i":"a","x":0,"y":0,"w":6,"h":4}]}' | normalize)
+if echo "$result" | jq -e '.layout[0] | .minH == 2 and .minW == 2 and .moved == false and .static == false' > /dev/null 2>&1; then
+    ok "fills missing fields with defaults"
+else
+    fail "fills missing fields with defaults" "got: $result"
+fi
+
+# 2. Preserves existing values
+result=$(echo '{"layout":[{"i":"a","x":0,"y":0,"w":6,"h":4,"minH":3,"minW":4,"moved":true,"static":true}]}' | normalize)
+if echo "$result" | jq -e '.layout[0] | .minH == 3 and .minW == 4 and .moved == true and .static == true' > /dev/null 2>&1; then
+    ok "preserves existing values"
+else
+    fail "preserves existing values" "got: $result"
+fi
+
+# 3. minH capped at h when h <= 2
+result=$(echo '{"layout":[{"i":"a","x":0,"y":0,"w":6,"h":1}]}' | normalize)
+if echo "$result" | jq -e '.layout[0].minH == 1' > /dev/null 2>&1; then
+    ok "minH equals h when h <= 2"
+else
+    fail "minH equals h when h <= 2" "got minH=$(echo "$result" | jq '.layout[0].minH')"
+fi
+
+# 4. minH defaults to 2 when h > 2
+result=$(echo '{"layout":[{"i":"a","x":0,"y":0,"w":6,"h":8}]}' | normalize)
+if echo "$result" | jq -e '.layout[0].minH == 2' > /dev/null 2>&1; then
+    ok "minH defaults to 2 when h > 2"
+else
+    fail "minH defaults to 2 when h > 2" "got minH=$(echo "$result" | jq '.layout[0].minH')"
+fi
+
+# 5. Empty layout array
+result=$(echo '{"layout":[]}' | normalize)
+if echo "$result" | jq -e '.layout == []' > /dev/null 2>&1; then
+    ok "empty layout array"
+else
+    fail "empty layout array" "got: $result"
+fi
+
+# 6. No layout key (should pass through unchanged)
+result=$(echo '{"name":"test"}' | normalize)
+if echo "$result" | jq -e '.name == "test" and .layout == null' > /dev/null 2>&1; then
+    ok "no layout key passes through"
+else
+    fail "no layout key passes through" "got: $result"
+fi
+
+# 7. Multiple layout entries
+result=$(echo '{"layout":[{"i":"a","x":0,"y":0,"w":6,"h":4},{"i":"b","x":6,"y":0,"w":6,"h":1}]}' | normalize)
+if echo "$result" | jq -e '.layout | length == 2 and .[0].minH == 2 and .[1].minH == 1' > /dev/null 2>&1; then
+    ok "multiple layout entries"
+else
+    fail "multiple layout entries" "got: $result"
+fi
+
+# 8. Other fields preserved
+result=$(echo '{"name":"dash","owner":"me","layout":[{"i":"a","x":0,"y":0,"w":6,"h":4}]}' | normalize)
+if echo "$result" | jq -e '.name == "dash" and .owner == "me"' > /dev/null 2>&1; then
+    ok "other fields preserved"
+else
+    fail "other fields preserved" "got: $result"
+fi
+
+echo ""
+echo "===================="
+echo "Passed: $passed | Failed: $failed"
+
+[[ $failed -eq 0 ]]
diff --git a/.agents/skills/building-dashboards/tests/test-script-output.sh b/.agents/skills/building-dashboards/tests/test-script-output.sh
new file mode 100755
index 00000000..42375435
--- /dev/null
+++ b/.agents/skills/building-dashboards/tests/test-script-output.sh
@@ -0,0 +1,132 @@
+#!/usr/bin/env bash
+# test-script-output.sh: Ensure deployment scripts keep machine-readable stdout
+#
+# Usage: ./test-script-output.sh
+
+set -euo pipefail
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+SCRIPTS_DIR="$SCRIPT_DIR/../scripts"
+
+passed=0
+failed=0
+
+ok() {
+    ((passed++)) || true
+    echo "  ✓ $1"
+}
+
+fail() {
+    ((failed++)) || true
+    echo "  ✗ $1: $2"
+}
+
+TMPDIR="$(mktemp -d)"
+trap 'rm -rf "$TMPDIR"' EXIT
+
+cp "$SCRIPTS_DIR/dashboard-create" "$TMPDIR/"
+cp "$SCRIPTS_DIR/dashboard-update" "$TMPDIR/"
+cp "$SCRIPTS_DIR/dashboard-chart-patch" "$TMPDIR/"
+cp "$SCRIPTS_DIR/dashboard-validate" "$TMPDIR/"
+cp "$SCRIPTS_DIR/dashboard-normalize.jq" "$TMPDIR/"
+
+chmod +x "$TMPDIR/dashboard-create" "$TMPDIR/dashboard-update" "$TMPDIR/dashboard-chart-patch" "$TMPDIR/dashboard-validate"
+
+cat > "$TMPDIR/input.json" <<'JSON'
+{
+  "id": "dashboard-root-id",
+  "version": "v1",
+  "createdAt": "2026-02-01T10:00:00Z",
+  "updatedAt": "2026-02-02T11:00:00Z",
+  "createdBy": "alice@example.com",
+  "updatedBy": "bob@example.com",
+  "schemaVersion": 2,
+  "name": "Test Dashboard",
+  "description": "Test",
+  "owner": "user-123",
+  "charts": [
+    {
+      "id": "error-rate",
+      "name": "Error Rate",
+      "type": "Statistic",
+      "query": { "apl": "['logs'] | summarize c=count()" }
+    }
+  ],
+  "layout": [
+    { "i": "error-rate", "x": 0, "y": 0, "w": 3, "h": 2 }
+  ]
+}
+JSON
+
+cat > "$TMPDIR/chart.patch.json" <<'JSON'
+{
+  "name": "Error Rate (5m)",
+  "query": { "apl": "['logs'] | summarize errors=countif(status >= 500)" },
+  "config": { "stale": null }
+}
+JSON
+
+cat > "$TMPDIR/axiom-api" <<'BASH'
+#!/usr/bin/env bash
+set -euo pipefail
+METHOD="${2:-}"
+PATH_="${3:-}"
+BODY="${4:-}"
+
+case "$METHOD:$PATH_" in
+  "POST:/dashboards")
+    echo '{"status":"created","dashboard":{"uid":"created-uid","id":"created-id","version":1,"dashboard":{"name":"Test Dashboard"},"createdAt":"2026-02-01T10:00:00Z","updatedAt":"2026-02-01T10:00:00Z","createdBy":"alice@example.com","updatedBy":"alice@example.com"}}'
+    ;;
+  "PUT:/dashboards/uid/dashboard-root-id")
+    echo '{"status":"updated","dashboard":{"uid":"dashboard-root-id","id":"dashboard-root-id","version":2,"dashboard":{"name":"Test Dashboard","updated":true},"createdAt":"2026-02-01T10:00:00Z","updatedAt":"2026-02-02T11:00:00Z","createdBy":"alice@example.com","updatedBy":"bob@example.com"}}'
+    ;;
+  "PATCH:/dashboards/uid/dashboard-root-id/charts/error-rate")
+    echo "$BODY" | jq -e '
+      .chart.name == "Error Rate (5m)" and
+      .chart.query.apl == "['\''logs'\''] | summarize errors=countif(status >= 500)" and
+      (.chart.config | has("stale")) and
+      .chart.config.stale == null and
+      .version == 7 and
+      .message == "Tune error chart" and
+      (.overwrite | not)
+    ' > /dev/null
+    echo '{"status":"updated","dashboard":{"uid":"dashboard-root-id","id":"dashboard-root-id","version":8,"dashboard":{"name":"Test Dashboard","chartPatched":true},"createdAt":"2026-02-01T10:00:00Z","updatedAt":"2026-02-02T12:00:00Z","createdBy":"alice@example.com","updatedBy":"bob@example.com"}}'
+    ;;
+  *)
+    echo "Unexpected call: $METHOD $PATH_" >&2
+    exit 1
+    ;;
+esac
+BASH
+
+chmod +x "$TMPDIR/axiom-api"
+
+echo "Script Stdout Contract"
+echo "======================"
+
+create_out=$("$TMPDIR/dashboard-create" prod "$TMPDIR/input.json")
+if [[ "$create_out" == "created-uid" ]]; then
+    ok "dashboard-create outputs only dashboard UID"
+else
+    fail "dashboard-create outputs only dashboard UID" "got: $create_out"
+fi
+
+update_out=$("$TMPDIR/dashboard-update" prod dashboard-root-id "$TMPDIR/input.json")
+if echo "$update_out" | jq -e '.dashboard.uid == "dashboard-root-id" and .dashboard.dashboard.updated == true' > /dev/null 2>&1; then
+    ok "dashboard-update outputs valid JSON only"
+else
+    fail "dashboard-update outputs valid JSON only" "got: $update_out"
+fi
+
+patch_out=$("$TMPDIR/dashboard-chart-patch" prod dashboard-root-id error-rate "$TMPDIR/chart.patch.json" --version 7 --message "Tune error chart")
+if echo "$patch_out" | jq -e '.dashboard.uid == "dashboard-root-id" and .dashboard.dashboard.chartPatched == true' > /dev/null 2>&1; then
+    ok "dashboard-chart-patch outputs valid JSON only"
+else
+    fail "dashboard-chart-patch outputs valid JSON only" "got: $patch_out"
+fi
+
+echo ""
+echo "======================"
+echo "Passed: $passed | Failed: $failed"
+
+[[ $failed -eq 0 ]]
diff --git a/.agents/skills/building-dashboards/tests/test-templates.sh b/.agents/skills/building-dashboards/tests/test-templates.sh
new file mode 100755
index 00000000..f37cb896
--- /dev/null
+++ b/.agents/skills/building-dashboards/tests/test-templates.sh
@@ -0,0 +1,79 @@
+#!/usr/bin/env bash
+# test-templates.sh: Validate dashboard templates
+#
+# Checks only high-value things:
+#   - Valid JSON
+#   - Chart IDs match layout IDs (catches real bugs)
+#   - Scripts are executable
+#
+# Usage: ./test-templates.sh
+
+set -euo pipefail
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+TEMPLATES_DIR="$SCRIPT_DIR/../reference/templates"
+SCRIPTS_DIR="$SCRIPT_DIR/../scripts"
+
+passed=0
+failed=0
+
+ok() {
+    ((passed++)) || true
+    echo "  ✓ $1"
+}
+
+fail() {
+    ((failed++)) || true
+    echo "  ✗ $1: $2"
+}
+
+echo "Template Validation"
+echo "==================="
+
+# Check jq is available
+if ! command -v jq &> /dev/null; then
+    echo "Error: jq is required" >&2
+    exit 1
+fi
+
+echo ""
+echo "[Templates]"
+
+for template in "$TEMPLATES_DIR"/*.json; do
+    name=$(basename "$template")
+    
+    # Valid JSON?
+    if ! jq empty "$template" 2>/dev/null; then
+        fail "$name" "invalid JSON"
+        continue
+    fi
+    
+    # Chart IDs match layout IDs?
+    chart_ids=$(jq -r '.charts[].id // empty' "$template" 2>/dev/null | sort)
+    layout_ids=$(jq -r '.layout[].i // empty' "$template" 2>/dev/null | sort)
+    
+    if [[ "$chart_ids" == "$layout_ids" ]]; then
+        ok "$name"
+    else
+        fail "$name" "chart/layout ID mismatch"
+    fi
+done
+
+echo ""
+echo "[Scripts]"
+
+for script in "$SCRIPTS_DIR"/*; do
+    name=$(basename "$script")
+    
+    if [[ -x "$script" ]]; then
+        ok "$name"
+    else
+        fail "$name" "not executable"
+    fi
+done
+
+echo ""
+echo "==================="
+echo "Passed: $passed | Failed: $failed"
+
+[[ $failed -eq 0 ]]
diff --git a/.agents/skills/query-metrics/README.md b/.agents/skills/query-metrics/README.md
new file mode 100644
index 00000000..82a4f0b4
--- /dev/null
+++ b/.agents/skills/query-metrics/README.md
@@ -0,0 +1,89 @@
+# query-metrics
+
+Runs metrics queries against Axiom MetricsDB and discovers available metrics, tags, and tag values.
+
+## What It Does
+
+- **Dataset Discovery** - List datasets with edge deployment info, auto-resolve regional edge URLs
+- **Metrics Queries** - Execute queries against OpenTelemetry metrics stored in Axiom MetricsDB
+- **Discovery** - List metrics, tags, and tag values in a dataset before writing queries
+- **Search** - Find metrics matching a known tag value (e.g., a service name)
+- **Spec** - Fetch the self-describing query specification with syntax and examples
+
+## Installation
+
+```bash
+# Amp
+amp skill add axiomhq/skills/query-metrics
+
+# npx (Claude Code, Cursor, Codex, and more)
+npx skills add axiomhq/skills -s query-metrics
+```
+
+## Prerequisites
+
+- Target dataset must be of kind `otel:metrics:v1`
+- Tools: `jq`, `curl`
+
+## Configuration
+
+Create `~/.axiom.toml` with your Axiom deployment(s):
+
+```toml
+[deployments.prod]
+url = "https://api.axiom.co"
+token = "xaat-your-api-token"
+org_id = "your-org-id"
+```
+
+Get your `org_id` from Settings → Organization. For the `token`, create a scoped **API token** (Settings → API Tokens) with only the permissions your workflow needs. Avoid Personal Access Tokens for automated tooling.
+
+**Tip:** Run `scripts/setup` from the `axiom-sre` skill for interactive configuration.
+
+## Usage
+
+```bash
+# Setup and check requirements
+scripts/setup
+
+# List all datasets (with edge deployment info)
+scripts/datasets prod
+
+# List only metrics datasets
+scripts/datasets prod --kind otel:metrics:v1
+
+# Fetch the metrics query spec
+scripts/metrics-spec prod my-dataset
+
+# List available metrics in a dataset
+scripts/metrics-info prod my-dataset metrics
+
+# List tags and tag values
+scripts/metrics-info prod my-dataset tags
+scripts/metrics-info prod my-dataset tags service.name values
+
+# Find metrics matching a value
+scripts/metrics-info prod my-dataset find-metrics "frontend"
+
+# Run a metrics query
+scripts/metrics-query prod \
+  '`my-dataset`:`http.server.duration` | align to 5m using avg | group by `endpoint` using sum' \
+  '2025-06-01T00:00:00Z' '2025-06-02T00:00:00Z'
+```
+
+## Scripts
+
+| Script | Purpose |
+|--------|---------|
+| `setup` | Check requirements and config |
+| `datasets` | List datasets with edge deployment info |
+| `metrics-spec` | Fetch metrics query specification |
+| `metrics-query` | Execute a metrics query (auto-resolves edge deployment) |
+| `metrics-info` | Discover metrics, tags, and values (auto-resolves edge deployment) |
+| `resolve-url` | Resolve dataset to edge deployment URL |
+| `axiom-api` | Low-level authenticated API calls |
+
+## Related Skills
+
+- `axiom-sre` - For running APL log queries and schema discovery
+- `building-dashboards` - For creating dashboards that include metrics panels
diff --git a/.agents/skills/query-metrics/SKILL.md b/.agents/skills/query-metrics/SKILL.md
new file mode 100644
index 00000000..c163affc
--- /dev/null
+++ b/.agents/skills/query-metrics/SKILL.md
@@ -0,0 +1,203 @@
+---
+name: query-metrics
+description: Runs metrics queries against Axiom MetricsDB via scripts. Discovers available metrics, tags, and tag values. Use when asked to query metrics, explore metric datasets, check metric values, or investigate OTel metrics data.
+---
+
+> **CRITICAL:** ALL script paths are relative to this skill's folder. Always include the `scripts/` prefix (e.g., `scripts/metrics-query`) and resolve it against that folder.
+
+# Querying Axiom Metrics
+
+Query OpenTelemetry metrics stored in Axiom's MetricsDB.
+
+## Setup
+
+Run `scripts/setup` to check requirements (curl, jq, ~/.axiom.toml).
+
+Config in `~/.axiom.toml` (shared with axiom-sre):
+```toml
+[deployments.prod]
+url = "https://api.axiom.co"
+token = "xaat-your-token"
+org_id = "your-org-id"
+```
+
+The target dataset must be of kind `otel:metrics:v1`.
+
+---
+
+## Discovering Datasets
+
+List all datasets in a deployment:
+
+```bash
+scripts/datasets <deployment>
+```
+
+Filter to only metrics datasets:
+
+```bash
+scripts/datasets <deployment> --kind otel:metrics:v1
+```
+
+This returns each dataset's `name`, `edgeDeployment`, and `kind`. Use the dataset name in subsequent `metrics-info` and `metrics-query` calls.
+
+---
+
+## Edge Deployment Resolution
+
+Datasets can live in different edge deployments (e.g., `us-east-1` vs `eu-central-1`). The scripts **automatically resolve** the correct regional edge URL before querying. No manual configuration is needed — `metrics-info` and `metrics-query` detect the dataset's edge deployment and route requests to the right endpoint.
+
+| Edge Deployment | Edge Endpoint |
+|---|---|
+| `cloud.us-east-1.aws` | `https://us-east-1.aws.edge.axiom.co` |
+| `cloud.eu-central-1.aws` | `https://eu-central-1.aws.edge.axiom.co` |
+
+If resolution fails or the edge deployment is unknown, requests fall back to the deployment URL in `~/.axiom.toml`.
+
+---
+
+## Learning the Metrics Query Syntax
+
+> **CRITICAL:** You MUST run `metrics-spec` before composing your first query in a session. NEVER guess MPL syntax — it changes over time and the spec is the only source of truth.
+
+```bash
+scripts/metrics-spec <deployment> <dataset>
+```
+
+Re-consult the spec when using an unfamiliar operator, when a query returns a syntax error, or when constructing histogram/multi-metric queries.
+
+---
+
+## Workflow
+
+1. **List datasets**: Run `scripts/datasets <deployment>` to see available datasets and their edge deployments
+2. **Fetch the spec**: Run `scripts/metrics-spec <deployment> <dataset>` — **this step is mandatory before writing any query**
+3. **Discover metrics**: List available metrics via `scripts/metrics-info <deployment> <dataset> metrics`
+4. **Explore tags**: List tags and tag values to understand filtering options. If metrics listing fails, use tags and tag values to identify relevant entities, then use those to list metrics for specific tags.
+5. **Write and execute query**: Compose a metrics query and run it via `scripts/metrics-query`
+6. **Iterate**: Refine filters, aggregations, and groupings based on results
+
+If the user provides a specific service, host, or entity name to search for, use `find-metrics` to locate matching metrics:
+```bash
+scripts/metrics-info <deployment> <dataset> find-metrics "frontend"
+```
+Do NOT use `find-metrics` as a general discovery step — it requires a known search value.
+
+---
+
+## Query Metrics
+
+Execute a metrics query against a dataset:
+
+```bash
+scripts/metrics-query <deployment> '<mpl>' '<startTime>' '<endTime>'
+```
+
+**Examples:**
+```bash
+# Simple query
+scripts/metrics-query prod \
+  '`my-dataset`:`http.server.duration` | align to 5m using avg' \
+  '2025-06-01T00:00:00Z' \
+  '2025-06-02T00:00:00Z'
+
+# Query with filtering (note backticks on dotted tag names)
+scripts/metrics-query prod \
+  '`my-dataset`:`http.server.duration` | where `service.name` == "frontend" and method == "GET" | align to 5m using avg | group by status_code using sum' \
+  'now-1d' \
+  'now'
+```
+
+| Parameter | Required | Description |
+|-----------|----------|-------------|
+| `deployment` | Yes | Name from `~/.axiom.toml` (e.g., `prod`) |
+| `mpl` | Yes | Metrics query string. Dataset is extracted from the query itself. |
+| `startTime` | Yes | RFC3339 (e.g., `2025-01-01T00:00:00Z`) or relative expression (e.g., `now-1h`, `now-1d`) |
+| `endTime` | Yes | RFC3339 (e.g., `2025-01-02T00:00:00Z`) or relative expression (e.g., `now`) |
+
+---
+
+## Discovery (Info Endpoints)
+
+Use `scripts/metrics-info` to explore what metrics, tags, and values exist in a dataset before writing queries. Time range defaults to the last 24 hours; override with `--start` and `--end`.
+
+### List metrics in a dataset
+
+```bash
+scripts/metrics-info <deployment> <dataset> metrics
+```
+
+### List tags in a dataset
+
+```bash
+scripts/metrics-info <deployment> <dataset> tags
+```
+
+### List values for a specific tag
+
+```bash
+scripts/metrics-info <deployment> <dataset> tags <tag> values
+```
+
+### List tags for a specific metric
+
+```bash
+scripts/metrics-info <deployment> <dataset> metrics <metric> tags
+```
+
+### List tag values for a specific metric and tag
+
+```bash
+scripts/metrics-info <deployment> <dataset> metrics <metric> tags <tag> values
+```
+
+### Find metrics matching a tag value
+
+```bash
+scripts/metrics-info <deployment> <dataset> find-metrics "<search-value>"
+```
+
+### Custom time range
+
+All info commands accept `--start` and `--end` for custom time ranges:
+
+```bash
+scripts/metrics-info prod my-dataset metrics \
+  --start 2025-06-01T00:00:00Z \
+  --end 2025-06-02T00:00:00Z
+```
+
+---
+
+## Error Handling
+
+HTTP errors return JSON with `message`, `code`, and optional `detail` fields:
+```json
+{"message": "description", "code": 400, "detail": {"errorType": 1, "message": "raw error"}}
+```
+
+Common status codes:
+- 400 — Invalid query syntax or bad dataset name
+- 401 — Missing or invalid authentication
+- 403 — No permission to query/ingest this dataset
+- 404 — Dataset not found
+- 429 — Rate limited
+- 500 — Internal server error
+
+On a **500 error**, re-run the failing request with `curl -v` to capture the response headers, then report the `traceparent` or `x-axiom-trace-id` header value to the user. This trace ID is essential for debugging the failure with the backend team.
+
+---
+
+## Scripts
+
+| Script | Usage |
+|--------|-------|
+| `scripts/setup` | Check requirements and config |
+| `scripts/datasets <deploy> [--kind <kind>]` | List datasets (with edge deployment info) |
+| `scripts/metrics-spec <deploy> <dataset>` | Fetch metrics query specification |
+| `scripts/metrics-query <deploy> <mpl> <start> <end>` | Execute a metrics query |
+| `scripts/metrics-info <deploy> <dataset> ...` | Discover metrics, tags, and values |
+| `scripts/axiom-api <deploy> <method> <path> [body]` | Low-level API calls |
+| `scripts/resolve-url <deploy> <dataset>` | Resolve dataset to edge deployment URL |
+
+Run any script without arguments to see full usage.
diff --git a/.agents/skills/query-metrics/scripts/axiom-api b/.agents/skills/query-metrics/scripts/axiom-api
new file mode 100755
index 00000000..8b9d7227
--- /dev/null
+++ b/.agents/skills/query-metrics/scripts/axiom-api
@@ -0,0 +1,77 @@
+#!/usr/bin/env bash
+# axiom-api: Make authenticated requests to Axiom API
+#
+# Usage: axiom-api <deployment> <method> <path> [json-body]
+#
+# Reads credentials from ~/.axiom.toml (shared with axiom-sre)
+# Set AXIOM_URL_OVERRIDE to route requests to a specific edge deployment endpoint.
+#
+# Examples:
+#   axiom-api prod GET /v1/datasets
+#   axiom-api prod POST /v1/query/_mpl '{"mpl":"..."}'
+
+set -euo pipefail
+
+DEPLOYMENT="${1:-}"
+METHOD="${2:-}"
+PATH_="${3:-}"
+BODY="${4:-}"
+
+if [[ -z "$DEPLOYMENT" || -z "$METHOD" || -z "$PATH_" ]]; then
+    echo "Usage: axiom-api <deployment> <method> <path> [json-body]" >&2
+    exit 1
+fi
+
+CONFIG_FILE="$HOME/.axiom.toml"
+if [[ ! -f "$CONFIG_FILE" ]]; then
+    echo "Error: $CONFIG_FILE not found. Run scripts/setup for help." >&2
+    exit 1
+fi
+
+# Parse TOML for deployment config
+extract_value() {
+    local key="$1"
+    awk -v deployment="$DEPLOYMENT" -v key="$key" '
+        /^[[:space:]]*\[deployments\./ { in_deployment = ($0 ~ "\\[deployments\\." deployment "\\]") }
+        in_deployment && $1 == key { gsub(/[" ]/, "", $3); print $3; exit }
+    ' "$CONFIG_FILE"
+}
+
+URL="${AXIOM_URL_OVERRIDE:-$(extract_value "url")}"
+TOKEN=$(extract_value "token")
+ORG_ID=$(extract_value "org_id")
+
+if [[ -z "$URL" || -z "$TOKEN" || -z "$ORG_ID" ]]; then
+    echo "Error: Could not find deployment '$DEPLOYMENT' in $CONFIG_FILE" >&2
+    echo "" >&2
+    echo "Available deployments:" >&2
+    grep '[[:space:]]*\[deployments\.' "$CONFIG_FILE" | sed 's/.*\[deployments\.\(.*\)\]/  - \1/' >&2
+    exit 1
+fi
+
+CURL_ARGS=(
+    -s
+    -w '\n%{http_code}'
+    -X "$METHOD"
+    -H "Authorization: Bearer $TOKEN"
+    -H "X-Axiom-Org-Id: $ORG_ID"
+    -H "Content-Type: application/json"
+    -H "Accept: ${AXIOM_ACCEPT:-application/json}"
+)
+
+if [[ -n "$BODY" ]]; then
+    CURL_ARGS+=(-d "$BODY")
+fi
+
+RESPONSE=$(curl "${CURL_ARGS[@]}" "${URL}${PATH_}")
+
+HTTP_CODE=$(echo "$RESPONSE" | tail -1)
+BODY_CONTENT=$(echo "$RESPONSE" | sed '$d')
+
+if [[ "$HTTP_CODE" -ge 200 && "$HTTP_CODE" -lt 300 ]]; then
+    echo "$BODY_CONTENT"
+else
+    echo "Error: HTTP $HTTP_CODE from $METHOD ${URL}${PATH_}" >&2
+    echo "$BODY_CONTENT" >&2
+    exit 1
+fi
diff --git a/.agents/skills/query-metrics/scripts/datasets b/.agents/skills/query-metrics/scripts/datasets
new file mode 100755
index 00000000..be5e773e
--- /dev/null
+++ b/.agents/skills/query-metrics/scripts/datasets
@@ -0,0 +1,39 @@
+#!/usr/bin/env bash
+# datasets: List datasets in an Axiom deployment
+#
+# Usage:
+#   datasets <deployment>                  # List all datasets
+#   datasets <deployment> --kind <kind>    # Filter by kind (e.g., otel:metrics:v1)
+#
+# Examples:
+#   datasets prod
+#   datasets prod --kind otel:metrics:v1
+
+set -euo pipefail
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+
+DEPLOYMENT="${1:-}"
+shift || true
+
+if [[ -z "$DEPLOYMENT" ]]; then
+    echo "Usage: datasets <deployment> [--kind <kind>]" >&2
+    exit 1
+fi
+
+KIND_FILTER=""
+while [[ $# -gt 0 ]]; do
+    case "$1" in
+        --kind) KIND_FILTER="$2"; shift 2 ;;
+        *) echo "Unknown option: $1" >&2; exit 1 ;;
+    esac
+done
+
+RESPONSE=$("$SCRIPT_DIR/axiom-api" "$DEPLOYMENT" GET /v1/datasets)
+
+if [[ -n "$KIND_FILTER" ]]; then
+    echo "$RESPONSE" | jq --arg kind "$KIND_FILTER" \
+        '[.[] | select(.kind == $kind) | {name, edgeDeployment, kind}]'
+else
+    echo "$RESPONSE" | jq '[.[] | {name, edgeDeployment, kind}]'
+fi
diff --git a/.agents/skills/query-metrics/scripts/metrics-info b/.agents/skills/query-metrics/scripts/metrics-info
new file mode 100755
index 00000000..b75ac2e5
--- /dev/null
+++ b/.agents/skills/query-metrics/scripts/metrics-info
@@ -0,0 +1,132 @@
+#!/usr/bin/env bash
+# metrics-info: Discover metrics, tags, and tag values in a dataset
+#
+# Usage:
+#   metrics-info <deployment> <dataset> metrics [--start T --end T]
+#   metrics-info <deployment> <dataset> tags    [--start T --end T]
+#   metrics-info <deployment> <dataset> tags <tag> values [--start T --end T]
+#   metrics-info <deployment> <dataset> metrics <metric> tags [--start T --end T]
+#   metrics-info <deployment> <dataset> metrics <metric> tags <tag> values [--start T --end T]
+#   metrics-info <deployment> <dataset> find-metrics <search-value> [--start T --end T]
+#
+# find-metrics searches TAG VALUES, not metric names. It returns metrics that have
+# a tag containing the given value. Use it when you know a specific entity name
+# (service, host, device) to find which metrics are associated with it.
+# To list metric names, use the "metrics" subcommand instead.
+#
+# --start and --end default to the last 24 hours if omitted.
+# For sparse metrics (sensors, batch jobs), try --start with a wider range (e.g. 7 days).
+#
+# Examples:
+#   metrics-info prod my-dataset metrics                    # List all metric names
+#   metrics-info prod my-dataset tags service.name values   # List values for a tag
+#   metrics-info prod my-dataset find-metrics "frontend"    # Find metrics with tag value "frontend"
+#   metrics-info prod my-dataset metrics http.server.duration tags --start 2025-06-01T00:00:00Z --end 2025-06-02T00:00:00Z
+
+set -euo pipefail
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+
+show_usage() {
+    echo "Usage:" >&2
+    echo "  metrics-info <deploy> <dataset> metrics" >&2
+    echo "  metrics-info <deploy> <dataset> tags" >&2
+    echo "  metrics-info <deploy> <dataset> tags <tag> values" >&2
+    echo "  metrics-info <deploy> <dataset> metrics <metric> tags" >&2
+    echo "  metrics-info <deploy> <dataset> metrics <metric> tags <tag> values" >&2
+    echo "  metrics-info <deploy> <dataset> find-metrics <search-value>  (searches tag values, not metric names)" >&2
+    echo "" >&2
+    echo "Options:" >&2
+    echo "  --start T   Start time (RFC3339). Default: 24h ago" >&2
+    echo "  --end T     End time (RFC3339). Default: now" >&2
+    exit 1
+}
+
+DEPLOYMENT="${1:-}"
+DATASET="${2:-}"
+
+if [[ -z "$DEPLOYMENT" || -z "$DATASET" ]]; then
+    show_usage
+fi
+
+shift 2
+
+# Collect positional args and parse --start/--end
+POSITIONAL=()
+START=""
+END=""
+while [[ $# -gt 0 ]]; do
+    case "$1" in
+        --start) START="${2:?--start requires a value}"; shift 2 ;;
+        --end)   END="${2:?--end requires a value}"; shift 2 ;;
+        *)       POSITIONAL+=("$1"); shift ;;
+    esac
+done
+
+# Default time range: last 24 hours
+if [[ -z "$START" ]]; then
+    if date --version >/dev/null 2>&1; then  # GNU date
+        START=$(date -u -d '24 hours ago' '+%Y-%m-%dT%H:%M:%SZ')
+    else  # BSD/macOS date
+        START=$(date -u -v-24H '+%Y-%m-%dT%H:%M:%SZ')
+    fi
+fi
+if [[ -z "$END" ]]; then
+    END=$(date -u '+%Y-%m-%dT%H:%M:%SZ')
+fi
+
+TIME_PARAMS="start=${START}&end=${END}"
+BASE="/v1/query/metrics/info/datasets/${DATASET}"
+
+if [[ ${#POSITIONAL[@]} -eq 0 ]]; then
+    show_usage
+fi
+
+# Resolve the regional edge URL for this dataset (after the subcommand check, so a bare call makes no API request)
+RESOLVED_URL=$("$SCRIPT_DIR/resolve-url" "$DEPLOYMENT" "$DATASET" 2>/dev/null || true)
+if [[ -n "$RESOLVED_URL" ]]; then
+    export AXIOM_URL_OVERRIDE="$RESOLVED_URL"
+fi
+
+case "${POSITIONAL[0]}" in
+    metrics)
+        if [[ ${#POSITIONAL[@]} -eq 1 ]]; then
+            # List metrics
+            AXIOM_ACCEPT="application/vnd.metrics-info.v2+json" "$SCRIPT_DIR/axiom-api" "$DEPLOYMENT" GET "${BASE}/metrics?${TIME_PARAMS}"
+        elif [[ ${#POSITIONAL[@]} -eq 3 && "${POSITIONAL[2]}" == "tags" ]]; then
+            # List tags for a metric
+            METRIC="${POSITIONAL[1]}"
+            "$SCRIPT_DIR/axiom-api" "$DEPLOYMENT" GET "${BASE}/metrics/${METRIC}/tags?${TIME_PARAMS}"
+        elif [[ ${#POSITIONAL[@]} -eq 5 && "${POSITIONAL[2]}" == "tags" && "${POSITIONAL[4]}" == "values" ]]; then
+            # List tag values for a metric+tag
+            METRIC="${POSITIONAL[1]}"
+            TAG="${POSITIONAL[3]}"
+            "$SCRIPT_DIR/axiom-api" "$DEPLOYMENT" GET "${BASE}/metrics/${METRIC}/tags/${TAG}/values?${TIME_PARAMS}"
+        else
+            show_usage
+        fi
+        ;;
+    tags)
+        if [[ ${#POSITIONAL[@]} -eq 1 ]]; then
+            # List tags
+            "$SCRIPT_DIR/axiom-api" "$DEPLOYMENT" GET "${BASE}/tags?${TIME_PARAMS}"
+        elif [[ ${#POSITIONAL[@]} -eq 3 && "${POSITIONAL[2]}" == "values" ]]; then
+            # List values for a tag
+            TAG="${POSITIONAL[1]}"
+            "$SCRIPT_DIR/axiom-api" "$DEPLOYMENT" GET "${BASE}/tags/${TAG}/values?${TIME_PARAMS}"
+        else
+            show_usage
+        fi
+        ;;
+    find-metrics)
+        if [[ ${#POSITIONAL[@]} -ne 2 ]]; then
+            show_usage
+        fi
+        VALUE="${POSITIONAL[1]}"
+        BODY=$(jq -n --arg v "$VALUE" '{value: $v}')
+        "$SCRIPT_DIR/axiom-api" "$DEPLOYMENT" POST "${BASE}/metrics?${TIME_PARAMS}" "$BODY"
+        ;;
+    *)
+        show_usage
+        ;;
+esac
diff --git a/.agents/skills/query-metrics/scripts/metrics-query b/.agents/skills/query-metrics/scripts/metrics-query
new file mode 100755
index 00000000..596a323b
--- /dev/null
+++ b/.agents/skills/query-metrics/scripts/metrics-query
@@ -0,0 +1,58 @@
+#!/usr/bin/env bash
+# metrics-query: Execute a metrics query against Axiom MetricsDB
+#
+# Usage: metrics-query <deployment> <mpl> <startTime> <endTime>
+#
+# Times: RFC3339 (e.g. 2025-01-01T00:00:00Z) or relative (e.g. now-1h, now-1d).
+#
+# Examples:
+#   metrics-query prod '`otel-metrics`:`http.server.duration` | align to 5m using avg | group by `endpoint` using sum' \
+#     '2025-06-01T00:00:00Z' '2025-06-02T00:00:00Z'
+#   metrics-query prod '`otel-metrics`:`http.server.duration` | align to 5m using avg' now-1h now
+
+set -euo pipefail
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+
+DEPLOYMENT="${1:-}"
+MPL="${2:-}"
+START_TIME="${3:-}"
+END_TIME="${4:-}"
+
+if [[ -z "$DEPLOYMENT" || -z "$MPL" || -z "$START_TIME" || -z "$END_TIME" ]]; then
+    echo "Usage: metrics-query <deployment> <mpl> <startTime> <endTime>" >&2
+    echo "" >&2
+    echo "Times: RFC3339 (e.g. 2025-01-01T00:00:00Z) or relative (e.g. now-1h, now-1d)." >&2
+    exit 1
+fi
+
+# Extract dataset name from MPL: `dataset`:`metric` ... or dataset:`metric` ...
+DATASET=$(echo "$MPL" | sed 's/`//g' | cut -d: -f1 | tr -d '[:space:]')
+
+QUERY_EDGE_DEPLOYMENT=""
+if [[ -n "$DATASET" ]]; then
+    RESOLVED_URL=$("$SCRIPT_DIR/resolve-url" "$DEPLOYMENT" "$DATASET" 2>/dev/null || true)
+    if [[ -n "$RESOLVED_URL" ]]; then
+        export AXIOM_URL_OVERRIDE="$RESOLVED_URL"
+        # Derive queryEdgeDeployment from edge URL: https://eu-central-1.aws.edge.axiom.co → cloud.eu-central-1.aws
+        if [[ "$RESOLVED_URL" == *".edge.axiom.co" ]]; then
+            QUERY_EDGE_DEPLOYMENT="cloud.$(echo "$RESOLVED_URL" | sed 's|https://||;s|\.edge\.axiom\.co||')"
+        fi
+    fi
+fi
+
+JQ_ARGS=(
+    --arg apl "$MPL"
+    --arg startT "$START_TIME"
+    --arg endT "$END_TIME"
+)
+JQ_EXPR='{"apl": $apl, "startTime": $startT, "endTime": $endT}'
+
+if [[ -n "$QUERY_EDGE_DEPLOYMENT" ]]; then
+    JQ_ARGS+=(--arg edgeDeployment "$QUERY_EDGE_DEPLOYMENT")
+    JQ_EXPR='{"apl": $apl, "startTime": $startT, "endTime": $endT, "queryEdgeDeployment": $edgeDeployment}'
+fi
+
+BODY=$(jq -n "${JQ_ARGS[@]}" "$JQ_EXPR")
+
+AXIOM_ACCEPT="application/json+metrics.v2" "$SCRIPT_DIR/axiom-api" "$DEPLOYMENT" POST "/v1/query/_mpl" "$BODY"
diff --git a/.agents/skills/query-metrics/scripts/metrics-spec b/.agents/skills/query-metrics/scripts/metrics-spec
new file mode 100755
index 00000000..5f022253
--- /dev/null
+++ b/.agents/skills/query-metrics/scripts/metrics-spec
@@ -0,0 +1,31 @@
+#!/usr/bin/env bash
+# metrics-spec: Fetch the metrics query specification from Axiom
+#
+# Usage: metrics-spec <deployment> <dataset>
+#
+# Calls OPTIONS /v1/query/_mpl to retrieve the complete metrics query
+# spec with syntax, operators, and examples. Read this before composing queries.
+#
+# The dataset is needed to resolve the correct edge deployment URL.
+#
+# Example:
+#   metrics-spec prod my-metrics-dataset
+
+set -euo pipefail
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+
+DEPLOYMENT="${1:-}"
+DATASET="${2:-}"
+
+if [[ -z "$DEPLOYMENT" || -z "$DATASET" ]]; then
+    echo "Usage: metrics-spec <deployment> <dataset>" >&2
+    exit 1
+fi
+
+RESOLVED_URL=$("$SCRIPT_DIR/resolve-url" "$DEPLOYMENT" "$DATASET" 2>/dev/null || true)
+if [[ -n "$RESOLVED_URL" ]]; then
+    export AXIOM_URL_OVERRIDE="$RESOLVED_URL"
+fi
+
+AXIOM_ACCEPT="text/markdown" "$SCRIPT_DIR/axiom-api" "$DEPLOYMENT" OPTIONS "/v1/query/_mpl"
diff --git a/.agents/skills/query-metrics/scripts/resolve-url b/.agents/skills/query-metrics/scripts/resolve-url
new file mode 100755
index 00000000..be559818
--- /dev/null
+++ b/.agents/skills/query-metrics/scripts/resolve-url
@@ -0,0 +1,68 @@
+#!/usr/bin/env bash
+# resolve-url: Resolve the regional edge URL for a dataset
+#
+# Usage: resolve-url <deployment> <dataset>
+#
+# Fetches the dataset's edgeDeployment from the Axiom API and maps it to the
+# correct edge URL. Prints the URL to stdout.
+#
+# Edge deployment mapping:
+#   cloud.us-east-1.aws    → https://us-east-1.aws.edge.axiom.co
+#   cloud.eu-central-1.aws → https://eu-central-1.aws.edge.axiom.co
+#   (null/empty)           → falls back to deployment URL from config
+
+set -euo pipefail
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+
+DEPLOYMENT="${1:-}"
+DATASET="${2:-}"
+
+if [[ -z "$DEPLOYMENT" || -z "$DATASET" ]]; then
+    echo "Usage: resolve-url <deployment> <dataset>" >&2
+    exit 1
+fi
+
+CONFIG_FILE="$HOME/.axiom.toml"
+CACHE_DIR="${TMPDIR:-/tmp}/axiom-resolve-url"
+CACHE_FILE="${CACHE_DIR}/${DEPLOYMENT}__${DATASET}"
+CACHE_TTL=3600  # 1 hour
+
+# Return cached result if fresh
+if [[ -f "$CACHE_FILE" ]]; then
+    if [[ "$(uname)" == "Darwin" ]]; then
+        FILE_AGE=$(( $(date +%s) - $(stat -f %m "$CACHE_FILE") ))
+    else
+        FILE_AGE=$(( $(date +%s) - $(stat -c %Y "$CACHE_FILE") ))
+    fi
+    if [[ "$FILE_AGE" -lt "$CACHE_TTL" ]]; then
+        cat "$CACHE_FILE"
+        exit 0
+    fi
+fi
+
+extract_value() {
+    local key="$1"
+    awk -v deployment="$DEPLOYMENT" -v key="$key" '
+        /^[[:space:]]*\[deployments\./ { in_deployment = ($0 ~ "\\[deployments\\." deployment "\\]") }
+        in_deployment && $1 == key { gsub(/[" ]/, "", $3); print $3; exit }
+    ' "$CONFIG_FILE"
+}
+
+FALLBACK_URL=$(extract_value "url")
+
+EDGE_DEPLOYMENT=$("$SCRIPT_DIR/axiom-api" "$DEPLOYMENT" GET /v1/datasets \
+    | jq -r --arg name "$DATASET" '.[] | select(.name == $name) | .edgeDeployment // empty')
+
+RESOLVED_URL=""
+if [[ -z "$EDGE_DEPLOYMENT" || "$EDGE_DEPLOYMENT" == "null" ]]; then
+    RESOLVED_URL="$FALLBACK_URL"
+else
+    # cloud.us-east-1.aws → https://us-east-1.aws.edge.axiom.co
+    EDGE_HOST="${EDGE_DEPLOYMENT#cloud.}"
+    RESOLVED_URL="https://${EDGE_HOST}.edge.axiom.co"
+fi
+
+mkdir -p "$CACHE_DIR"
+echo "$RESOLVED_URL" > "$CACHE_FILE"
+echo "$RESOLVED_URL"
diff --git a/.agents/skills/query-metrics/scripts/setup b/.agents/skills/query-metrics/scripts/setup
new file mode 100755
index 00000000..7467df38
--- /dev/null
+++ b/.agents/skills/query-metrics/scripts/setup
@@ -0,0 +1,86 @@
+#!/usr/bin/env bash
+# Setup query-metrics skill
+# Usage: scripts/setup
+#
+# This script:
+#   1. Checks for required tools (curl, jq)
+#   2. Makes scripts executable
+#   3. Checks for ~/.axiom.toml (shared with axiom-sre)
+
+set -euo pipefail
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+
+echo "=== query-metrics Setup ==="
+echo ""
+
+# --- Check required tools ---
+echo "[1/3] Checking required tools..."
+
+MISSING=()
+for cmd in curl jq; do
+    if command -v "$cmd" &> /dev/null; then
+        echo "✓ $cmd found"
+    else
+        echo "✗ $cmd not found"
+        MISSING+=("$cmd")
+    fi
+done
+
+if [[ ${#MISSING[@]} -gt 0 ]]; then
+    echo ""
+    echo "Install missing tools:"
+    for cmd in "${MISSING[@]}"; do
+        case "$cmd" in
+            jq) echo "  brew install jq  # or apt-get install jq" ;;
+            curl) echo "  brew install curl  # or apt-get install curl" ;;
+        esac
+    done
+    exit 1
+fi
+
+# --- Make scripts executable ---
+echo ""
+echo "[2/3] Making scripts executable..."
+chmod +x "$SCRIPT_DIR"/*
+echo "✓ Scripts ready"
+
+# --- Check Axiom config ---
+echo ""
+echo "[3/3] Checking Axiom configuration..."
+
+AXIOM_CONFIG="$HOME/.axiom.toml"
+if [[ -f "$AXIOM_CONFIG" ]]; then
+    DEPLOYMENTS=$(grep -c '^[[:space:]]*\[deployments\.' "$AXIOM_CONFIG" 2>/dev/null || true)
+    echo "✓ Found $AXIOM_CONFIG with ${DEPLOYMENTS:-0} deployment(s)"
+
+    # List deployments
+    echo "  Deployments:"
+    grep '^[[:space:]]*\[deployments\.' "$AXIOM_CONFIG" | sed 's/.*\[deployments\.\(.*\)\]/    - \1/' || true
+else
+    echo "⚠ $AXIOM_CONFIG not found"
+    echo ""
+    echo "Create it to enable metrics queries:"
+    echo ""
+    cat << 'EOF'
+[deployments.prod]
+url = "https://api.axiom.co"
+token = "xaat-your-token-here"
+org_id = "your-org-id"
+EOF
+    echo ""
+    echo "Get your org_id from Settings → Organization."
+    echo "For the token, create a scoped API token (Settings → API Tokens) with the permissions your workflow needs."
+fi
+
+echo ""
+echo "=== Setup Complete ==="
+echo ""
+echo "Usage:"
+echo "  scripts/datasets prod                                   # List datasets"
+echo "  scripts/datasets prod --kind otel:metrics:v1            # List metrics datasets"
+echo "  scripts/metrics-spec prod <dataset>                     # Fetch query spec"
+echo "  scripts/metrics-info prod <dataset> metrics             # List metrics"
+echo "  scripts/metrics-info prod <dataset> tags                # List tags"
+echo "  scripts/metrics-query prod '<mpl>' '<start>' '<end>'    # Run query"
+echo ""
diff --git a/skills-lock.json b/skills-lock.json
new file mode 100644
index 00000000..0ea9025b
--- /dev/null
+++ b/skills-lock.json
@@ -0,0 +1,17 @@
+{
+  "version": 1,
+  "skills": {
+    "building-dashboards": {
+      "source": "axiomhq/skills",
+      "sourceType": "github",
+      "skillPath": "skills/building-dashboards/SKILL.md",
+      "computedHash": "b261d40b92fad95950857cd302e57e8477d64b0cc3e23d98877e5c30f857bc70"
+    },
+    "query-metrics": {
+      "source": "axiomhq/skills",
+      "sourceType": "github",
+      "skillPath": "skills/query-metrics/SKILL.md",
+      "computedHash": "49868aa447c3196fd85ae3ba1462429a3fea8b2106a49d4f46d0f8925f1b8eb9"
+    }
+  }
+}