diff --git a/context/skills/self-driving/description.md b/context/skills/self-driving/description.md index 3c3adbf..c0a4b17 100644 --- a/context/skills/self-driving/description.md +++ b/context/skills/self-driving/description.md @@ -1,6 +1,6 @@ # PostHog Self-driving setup -This skill configures PostHog Signals for a project that already has PostHog installed: it switches on the signal sources (the inbox's "Responders") that match what the product actually uses, makes sure the GitHub integration is connected so Signals can research and fix issues in code, tunes the scout troop, designs custom scouts for the watchable surfaces the canonical troop doesn't cover (always proposed to the user first). Organization-level AI data processing approval — which everything downstream depends on — is enforced by the wizard itself before this skill runs. +This skill configures PostHog Signals for a project that already has PostHog installed: it switches on the signal sources (the inbox's "Responders") that match what the product actually uses, makes sure the GitHub integration is connected so Signals can research and fix issues in code, tunes the scout troop, designs custom scouts for the watchable surfaces the built-in troop doesn't cover (always proposed to the user first). Organization-level AI data processing approval — which everything downstream depends on — is enforced by the wizard itself before this skill runs. The wizard's run prompt supplies the project URLs (integrations settings, organization AI settings, new warehouse source, Signals inbox). Use those exact URLs whenever a step sends the user to the browser. @@ -18,7 +18,7 @@ Each step file points to the next. Run them in order. **Start by reading `refere - **Every write must be idempotent.** List before you create. A duplicate `inbox-source-configs-create` returns 400 — recover by finding the existing row's `id` and calling `inbox-source-configs-partial-update` with `enabled: true`. - **Never disable a source the user already enabled.** You only switch things on (and tune scouts off); existing enabled rows are someone's deliberate choice. - **Never enable a connected-tool source the user hasn't confirmed they use.** GitHub Issues, Linear, Zendesk, and pganalyze are ask-then-connect, never blind. -- **Stay off the internal surfaces.** Don't call `signals-scout-emit-signal` or any scratchpad-write tool, and don't change a scout's `emit` flag or `run_interval_minutes` — on configs, this skill only flips `enabled`. **Canonical scout bodies are never edited.** New scout skills are created in exactly one place: step 6b, and only ones the user approved there. +- **Stay off the internal surfaces.** Don't call `signals-scout-emit-signal` or any scratchpad-write tool, and don't change a scout's `emit` flag or `run_interval_minutes` — on configs, this skill only flips `enabled`. **Built-in scout bodies are never edited.** New scout skills are created in exactly one place: step 6b, and only ones the user approved there. - **Keep the scout troop small.** Every enabled scout is a recurring LLM spend. Step 6 enables only `signals-scout-general` plus the **one or two** specialists for the products this project uses most — never error tracking or session replay (those reach the inbox as native sources) — and step 6b adds **at most two** custom scouts. Everything else stays disabled. - **Batch your questions.** `wizard_ask` has a small per-run budget; one multi-select beats four yes/nos. Don't skip a step or drop a connector (e.g. Linear) or custom scouts setup to save calls. - **Decline goes first.** Every `wizard_ask` that offers choices must include a plain-language decline option (skip / none / "keep what's there"), and it must be the **first** option so it is the default highlight — an accidental `enter` then declines instead of committing the user to something. The **one exception is step 3's GitHub gate**: the run cannot proceed without GitHub, so there the affirmative ("Done — I've installed it") stays first and the decline ("I can't connect right now", which aborts) stays last. diff --git a/context/skills/self-driving/references/6-scouts.md b/context/skills/self-driving/references/6-scouts.md index 7195ffd..23dae09 100644 --- a/context/skills/self-driving/references/6-scouts.md +++ b/context/skills/self-driving/references/6-scouts.md @@ -20,7 +20,7 @@ Load via `ToolSearch select:mcp__posthog-wizard__signals-scout-config-sync,mcp__ ## Do -1. **Materialize**: call `signals-scout-config-sync`. It is idempotent — it seeds the canonical scout skills for this team and creates any missing configs, then returns the troop. +1. **Materialize**: call `signals-scout-config-sync`. It is idempotent — it seeds the built-in scout skills for this team and creates any missing configs, then returns the troop. **Soft-degrade if the tool is missing or fails**: fall back to `signals-scout-config-list`. If that returns rows, tune those. If it returns nothing, the troop hasn't been materialized yet — record a follow-up ("the scout troop materializes automatically within ~30 minutes; tune it later in PostHog or re-run this setup") and continue to step 7. **Not an abort.** diff --git a/context/skills/self-driving/references/6b-tailor-scouts.md b/context/skills/self-driving/references/6b-tailor-scouts.md index ad7f60b..d2c1ee9 100644 --- a/context/skills/self-driving/references/6b-tailor-scouts.md +++ b/context/skills/self-driving/references/6b-tailor-scouts.md @@ -4,9 +4,9 @@ next_step: 7-report.md # Step 6b — Custom scouts for this product -The canonical troop covers generic surfaces (errors, anomalies, observability gaps, health). You are the only actor in this pipeline that has read the repo — you know what the events *mean*, which ones form a funnel, and which domain surfaces matter. This step turns that into coverage: custom scouts for the watchable surfaces no canonical scout owns. +The built-in troop covers generic surfaces (errors, anomalies, observability gaps, health). You are the only actor in this pipeline that has read the repo — you know what the events *mean*, which ones form a funnel, and which domain surfaces matter. This step turns that into coverage: custom scouts for the watchable surfaces no built-in scout owns. -**Canonical scout bodies are never edited** — not here, not anywhere in this setup. Tuning happens in step 6 (`enabled` flags only); new coverage happens here as new, separately-named scouts. This step is **propose-first and fully skippable**: nothing is created until the user approves, and a decline (or any tool failure) means you record the decision and continue to step 7. **Not an abort.** +**Built-in scout bodies are never edited** — not here, not anywhere in this setup. Tuning happens in step 6 (`enabled` flags only); new coverage happens here as new, separately-named scouts. This step is **propose-first and fully skippable**: nothing is created until the user approves, and a decline (or any tool failure) means you record the decision and continue to step 7. **Not an abort.** ## Status @@ -24,11 +24,11 @@ Load via `ToolSearch select:mcp__posthog-wizard__llma-skill-get,mcp__posthog-wiz 1. **Read the authoring guide.** `llma-skill-get {"skill_name": "authoring-signals-scouts"}` — step 6's sync seeded it into this team's skills store alongside the troop. It defines the scout anatomy (quick close-out → orient → discriminator → explore patterns → save-memory → decide → disqualifiers → close-out), the emit contract, and the quality bar. Follow it for every scout you write; pull its bundled references via `llma-skill-file-get` only for the sections you need. - **Soft-degrade if it 404s** (older PostHog deploy that doesn't seed companions): read a canonical scout body via `llma-skill-get` (e.g. `signals-scout-general`) and use it as your only template. If neither is readable, record a follow-up ("add custom scouts once the authoring guide is available") and continue to step 7. + **Soft-degrade if it 404s** (older PostHog deploy that doesn't seed companions): read a built-in scout body via `llma-skill-get` (e.g. `signals-scout-general`) and use it as your only template. If neither is readable, record a follow-up ("add custom scouts once the authoring guide is available") and continue to step 7. -2. **Do the gap analysis — this is the thinking step, take it seriously.** Lay the project evidence (the setup report's event taxonomy above all, plus the step-2 checklist: funnel structure, payment/LLM/survey surfaces, warehouse sources, integrations) against what the canonical troop already watches. For each candidate surface ask, in order: +2. **Do the gap analysis — this is the thinking step, take it seriously.** Lay the project evidence (the setup report's event taxonomy above all, plus the step-2 checklist: funnel structure, payment/LLM/survey surfaces, warehouse sources, integrations) against what the built-in troop already watches. For each candidate surface ask, in order: - **Is it watchable?** Concrete events with names you can list, a funnel with ordered steps, a domain loop with a success/failure pair. "It's a web app" is not a surface. - - **Is it uncovered?** Three things can already own a surface, and a custom scout that duplicates any of them adds noise, not coverage: (1) a canonical scout step 6 kept enabled — the 1–2 specialists or `signals-scout-general`, which sweeps cross-product surfaces every run (e.g. generic anomalies belong to `signals-scout-anomaly-detection` if it was picked); (2) a **native source** — error tracking and session replay are consumed as sources in step 4, so never propose a custom scout for error bursts or replay analysis even though their canonical scouts are now disabled. If the surface is only watched by a canonical scout step 6 *disabled* (not `general`, not a native source), it is genuinely uncovered and fair game. + - **Is it uncovered?** Three things can already own a surface, and a custom scout that duplicates any of them adds noise, not coverage: (1) a built-in scout step 6 kept enabled — the 1–2 specialists or `signals-scout-general`, which sweeps cross-product surfaces every run (e.g. generic anomalies belong to `signals-scout-anomaly-detection` if it was picked); (2) a **native source** — error tracking and session replay are consumed as sources in step 4, so never propose a custom scout for error bursts or replay analysis even though their built-in scouts are now disabled. If the surface is only watched by a built-in scout step 6 *disabled* (not `general`, not a native source), it is genuinely uncovered and fair game. - **Would its scout pass the quality bar?** You must be able to name its signal-vs-noise discriminator and 2–4 concrete explore patterns *before* proposing it. If you can't, the surface isn't ready for a scout — record it as a report note instead. Typical shapes that survive all three filters: the product's core funnel (creation → completion → conversion), a domain job pipeline with success/failure events, a critical third-party dependency the events expose (e.g. an external API search that can silently degrade). **Propose at most two custom scouts — never more, even if more surfaces look watchable.** Zero is a perfectly good outcome and one or two is the norm; if three or more look worthwhile, the filters were too loose — keep only the two highest-value ones and record the rest as report notes. Every scout is a recurring scheduled LLM spend — every tick costs a full run even when it's quiet — so each must earn its keep, and the hard cap also keeps the proposal readable in the terminal, where each scout needs room for its explanation. @@ -36,7 +36,7 @@ Load via `ToolSearch select:mcp__posthog-wizard__llma-skill-get,mcp__posthog-wiz 3. **Propose them in ONE `wizard_ask`.** If the gap analysis surfaced **no** candidate, skip this ask entirely and go straight to the status line ("Custom scouts: none"). Otherwise emit one multi-select question — one option per proposed scout (**at most two**), plus a leading "none" option. Write everything for a **human who has never heard the word "scout"**: define the term once in the question `prompt`, in one plain sentence (e.g. "Scouts are scheduled checks that watch your data and flag issues for your inbox."). Each scout option carries a short `label` **and** a `description`: - **`label`** — a plain-language title of what it would watch for, in product terms — e.g. "Watch your signup funnel for conversion drops", not "signals-scout-signup-funnel". One short line. - **`description`** — one or two sentences saying **what it watches and what would make it speak up**, in words a product person reads naturally. This renders dimmed and wrapped beneath the label, so it is where the real explanation lives — **never leave it empty, and never collapse it back into the label.** Do **not** surface raw event names (`run_failed`/`run_started`), internal metric tokens (`p95 duration_s`, `not_matched/candidates_total`), or jargon labels like "Discriminator:" / "Not covered by:" — translate those into plain English. - - **Make the first option an explicit decline** so declining is always one keystroke away and is the safe default: `{ "label": "None — keep the canonical troop", "value": "none", "description": "Skip custom scouts; the built-in troop already covers this project." }`. It must be **first** — it is the default highlight, so a user who just presses enter declines rather than accidentally accepting a scout. + - **Make the first option an explicit decline** so declining is always one keystroke away and is the safe default: `{ "label": "None — keep the built-in troop", "value": "none", "description": "Skip custom scouts; the built-in troop already covers this project." }`. It must be **first** — it is the default highlight, so a user who just presses enter declines rather than accidentally accepting a scout. - Keep the machine name `signals-scout-` (prefix mandatory — anything else never runs) **internal**: you still need it for `llma-skill-create`, but it never appears in any text the user reads. Shape (one scout shown; add a second only if a second survived the filters): @@ -49,7 +49,7 @@ Load via `ToolSearch select:mcp__posthog-wizard__llma-skill-get,mcp__posthog-wiz "kind": "multi", "prompt": "Scouts are scheduled checks that watch your data and flag issues for your inbox. Based on your project I found a gap the built-in troop doesn't cover — add it, or none.", "options": [ - { "label": "None — keep the canonical troop", "value": "none", "description": "Skip custom scouts; the built-in troop already covers this project." }, + { "label": "None — keep the built-in troop", "value": "none", "description": "Skip custom scouts; the built-in troop already covers this project." }, { "label": "Watch your signup funnel for conversion drops", "value": "signals-scout-signup-funnel", "description": "Speaks up when sign-up completion falls below its recent norm, so a broken or regressed onboarding step gets caught fast." } ] } @@ -69,6 +69,6 @@ Load via `ToolSearch select:mcp__posthog-wizard__llma-skill-get,mcp__posthog-wiz [STATUS] Custom scouts: created run-pipeline; declined: none ``` -(adjust to the actual decisions; if nothing was warranted or the user declined everything, say "Custom scouts: none — canonical troop covers this project".) +(adjust to the actual decisions; if nothing was warranted or the user declined everything, say "Custom scouts: none — built-in troop covers this project".) -Record for the report: each created scout's design rationale (surface, discriminator, why no canonical covers it), surfaces you considered and ruled out (with the filter that killed them), declined proposals, and the noise escape hatch — if a scout turns out noisy, setting `emit: false` on its config in PostHog switches it to dry-run. +Record for the report: each created scout's design rationale (surface, discriminator, why no built-in covers it), surfaces you considered and ruled out (with the filter that killed them), declined proposals, and the noise escape hatch — if a scout turns out noisy, setting `emit: false` on its config in PostHog switches it to dry-run. diff --git a/context/skills/self-driving/references/7-report.md b/context/skills/self-driving/references/7-report.md index 8862c9c..3482f12 100644 --- a/context/skills/self-driving/references/7-report.md +++ b/context/skills/self-driving/references/7-report.md @@ -24,7 +24,7 @@ Emit: - **Signal sources** — a table of every source you touched or deliberately skipped: `source_product` / `source_type`, action taken (enabled / already enabled / skipped + why / failed). - **Connected tools** — what the user picked, and per tool the step-5 class: "connected by this setup (source id …, first sync started)", "already connected" / "verified connected", "responder enabled but warehouse source not detected (dormant)", or "not used" (only for tools the user didn't pick). Never report a tool as connected unless this run created its source or saw it in `external-data-sources-list`. For sources this run created, note that only the responder-consumed table (issues / tickets) is syncing and more can be enabled in the UI. Any tool the user picked but didn't connect — whether they said "done" or skipped — is "selected but no source detected (dormant)" with a follow-up, never "user confirmed connecting" and never "not used". - **Scout troop** — kept-on scouts, disabled scouts with the one-line reason each, or the not-yet-materialized note from step 6. - - **Custom scouts** — from step 6b: each created scout (name, what it watches, its discriminator, and why no canonical scout covers it) or one line on why none was warranted; surfaces considered and ruled out, with the filter that killed each; declined proposals; and the noise escape hatch (set `emit: false` on a scout's config in PostHog to switch it to dry-run). Omit only if step 6b was skipped entirely. + - **Custom scouts** — from step 6b: each created scout (name, what it watches, its discriminator, and why no built-in scout covers it) or one line on why none was warranted; surfaces considered and ruled out, with the filter that killed each; declined proposals; and the noise escape hatch (set `emit: false` on a scout's config in PostHog to switch it to dry-run). Omit only if step 6b was skipped entirely. - **Follow-ups** — every follow-up recorded along the way, as a checklist. Omit the section if there are none. - **What happens next** — the scout coordinator picks up fresh configs within ~30 minutes; findings cluster into reports in the inbox; immediately-actionable ones can start coding tasks.