feat(self-driving): Self-driving setup skills for Wizard#192
Conversation
Workflow skill for the wizard's new product-autonomy program (wizard autonomy). Eight chained references: probe Signals API access, read the setup report + project profile, confirm org AI-data-processing approval, require the GitHub integration, enable the matching signal sources, ask-then-connect issue trackers, sync + tune the scout fleet, and write posthog-product-autonomy-report.md. Abort strings match the wizard's PRODUCT_AUTONOMY_ABORT_CASES verbatim. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
pnpm 10 refuses to prepare the git-hosted warlock dependency unless it is in onlyBuiltDependencies; the package.json pnpm section overrides pnpm-workspace.yaml, so the allowlist there never applied. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
🧙 Wizard CIRun the Wizard CI and test your changes against wizard-workbench example apps by replying with a GitHub comment using one of the following commands: Test all apps:
Test all apps in a directory:
Test an individual app:
Show more apps
Results will be posted here when complete. |
🤖 Validated self-review — PR #192I ran an automated PR review over this branch, then re-validated every finding against the current code and cross-checked each contract claim against the local Method: one validator + one adversarial skeptic per finding (the skeptic tries to refute each keep, correcting the raw review's "everything is valid" bias). Bar: keep if it could plausibly happen in a real wizard run; drop if theoretical, already-handled, or wrong about the code.
Notably, none of the 7 findings the raw review rated Kept findings (see inline comments)
22 dropped findings + why (click to expand)Factually wrong against the code/contract
Already mitigated by the wizard harness
Under-specified, but the flow already absorbs it
Build-pipeline tooling, out of the agent-run scope
|
| ``` | ||
| { | ||
| id: "github-connect", | ||
| prompt: "Self-driving needs GitHub access to investigate findings in your code and open fixes — setup can't finish without it.\n\nOpen this link to install the PostHog GitHub App in one click, then approve access. Grant it the repos you want Self-driving to work with — include this project's repo so step 5 can also watch its issues:\n\n<github authorize URL>\n\nThen come back here.\n\n(Need to re-link an existing installation instead? Use your integrations settings: <integrations settings URL>.)", |
There was a problem hiding this comment.
A couple of things about this:
- "Then come back here" - I did all of this, came back, and honestly didn't get for a minute that I need to click "Done" manually 😅 I'd expect this to be just like the desktop app - auto-detecting a callback, or polling.
Relatedly, as a user, when I get a wall of text like this, I have an insatiable urge to press "Enter". Basically, at first sight, this feels like an EULA. We should make this feel more like a loading state, which is waiting until you've connected GH - The
<integrations settings URL>link wraps in a way that makes it useless - unable to click and unable to copy (<github authorize URL>is fine as its in its own line)
There was a problem hiding this comment.
Oh, I see you tackled problem 2 (link wrapping) in the Wizard PR – but this didn't seem to work for me in Cursor 🤔
There was a problem hiding this comment.
Removed the re-link, noted the polling, will iterate.
| ``` | ||
| { | ||
| id: "connected-tools", | ||
| prompt: "Self-driving can also watch your other tools and pull their issues into the inbox. Which of these do you use?", |
There was a problem hiding this comment.
Nit: I feel "<...> and pull their issues into the inbox" undersells it a bit – something like this would be more active framing: "Self-driving can also address problems surfaced in your other tools."
But I don't feel strongly here.
I do definitely believe an option like "I don't see a tool in this list" would provide really interesting data. Probs needs a wizard change to support that.
There was a problem hiding this comment.
Good point. updated the copy.
As for custom options - yup, noted.
Twixes
left a comment
There was a problem hiding this comment.
A few comments for your consideration
| 1. Call `inbox-source-configs-list`. | ||
| 2. **Success — including an empty list** — means the API is reachable: proceed. (The probe can't prove beta enrollment — the wizard's detect step and the beta flags own that — but it's the strongest signal available to you.) Keep the returned rows: step 2 and step 4 use them as the already-enabled baseline. Mark your access task completed and continue. |
There was a problem hiding this comment.
Hmm, but the endpoint powering inbox-source-configs-list isn't flagged. It's GET /api/projects/{team_id}/signals/source_configs/ and I don't think there's any reason why it wouldn't be reachable.
No reason, other than the scope not being available, but that would not mean "self-driving is not available for this project", that would means that something is wrong with authentication.
What's this whole step for then?
There was a problem hiding this comment.
Initially it was about FFs, but we decided to drop all FFs. I still find it valuable, as it checks that MCP works, sources configs are pulled, and so on, before we do any edits. Plus, I 100% can imagine us adding new stuff that we would want FF in the future, so I'll keep it.
|
|
||
| 1. **Read `./posthog-setup-report.md`.** It is ground truth for what the base integration instrumented **in this repo**: events, error tracking, feature flags. Do not re-derive what it already states. It is NOT authority over project-level facts — session replay in particular may be instrumented in another repo or via the snippet, so the report can rule replay in but never out (step 4 probes the server for that). |
There was a problem hiding this comment.
Worth clarifying that this won't be present if the codebase was set up with PostHog a while ago. (Or never)
|
|
||
| 2. **Call `signals-scout-project-profile-get`.** It returns products in use, connected integrations, warehouse sources, and the signal source configs split enabled/disabled — one call instead of four. **Tolerate failure**: it can 404 or error on a team without a profile yet. If it fails, fall back to the step-1 source list and the report; do not retry more than once and do not abort. **Note "profile unavailable" in your checklist** — a profile 404 is expected on a first-run team, so any later decision that relies only on the profile must record "unknown", not a confident negative. | ||
|
|
||
| 3. **Server-side product usage.** The run prompt's "Project state" block is authoritative for the opt-ins it lists (session replay recording, exception autocapture, surveys): **opt-in ON = product enabled**, even if no data has arrived yet. Where the block says OFF/unknown and the repo gave no signal, spend ONE cheap probe each for usage evidence (tolerate 403/404 → record "unknown"): |
There was a problem hiding this comment.
Maybe I'm being a bad LLM here:
What is the "run prompt" referring to here?
There was a problem hiding this comment.
Each wizard run has a prompt (prompt.ts) that decides the order of steps and skill to follow that we use.
| 4. **Light scan for what the report, profile, and server state won't cover.** Targeted lookups only — package manifests, config files, a grep or two. You are answering these questions: | ||
| - **Revenue**: is there a payment SDK (Stripe, Paddle, LemonSqueezy, RevenueCat…) or revenue events? | ||
| - **Surveys**: does the code or profile show PostHog surveys in use? | ||
| - **AI/LLM**: are there `$ai_*` events, an LLM SDK, or LLM analytics in the profile? | ||
| - **Logs**: is the PostHog logs product in use (per the profile)? | ||
| - **CSP**: is a Content-Security-Policy with PostHog CSP reporting configured? | ||
| - **Support**: does the team use PostHog support/conversations (per the profile)? | ||
| - **Issue trackers**: any hints of Linear, Zendesk, or pganalyze (you will still ask in step 5 — hints only shape the question, they never authorize enabling). |
There was a problem hiding this comment.
I think error tracking, replay, and analytics are all worth including in the code scan as well, as you might e.g. not yet have errors in PostHog, but indeed have error tracking hooked up
There was a problem hiding this comment.
Fair enough, not sure what "analytics" means (general scout should cover regular analytics), but added bias toward error tracking/replay.
|
|
||
| Load via `ToolSearch select:mcp__posthog-wizard__integrations-github-repos-retrieve,mcp__posthog-wizard__external-data-sources-create`. | ||
|
|
||
| If `integrations-github-repos-retrieve` or `external-data-sources-create` isn't available (older server), skip the auto-create and record GitHub Issues as a dormant source (the dormant fallback below). **Not an abort.** |
There was a problem hiding this comment.
I guess we won't need to worry about an older server here?
| next_step: 6b-tailor-scouts.md | ||
| --- | ||
|
|
||
| # Step 6 — Configure the scout fleet |
There was a problem hiding this comment.
This is now a scout troop in the UI! (that's also what we'll go with marketing-wise, like hedgehog boyscouts troop, literally)
There was a problem hiding this comment.
Good point, renamed.
|
|
||
| 1. **Materialize**: call `signals-scout-config-sync`. It is idempotent — it seeds the canonical scout skills for this team and creates any missing configs, then returns the fleet. | ||
|
|
||
| **Soft-degrade if the tool is missing or fails**: fall back to `signals-scout-config-list`. If that returns rows, tune those. If it returns nothing, the fleet hasn't been materialized yet — record a follow-up ("the scout fleet materializes automatically within ~30 minutes; tune it later in PostHog or re-run this setup") and continue to step 7. **Not an abort.** |
There was a problem hiding this comment.
Hmm, interesting, does it really happen that you have to wait 30 minutes?
There was a problem hiding this comment.
You kinda do. Quoting from other PR:
Scout runs immediately, the sources should fire right away too, but it'll take 30 minutes or so for "get signal -> emit signal -> build repo cache -> pick the repo -> research the signal -> return the report". We can say 20 if you want.
| - `llm_analytics` (internal-only, not a user-facing responder) | ||
| - `logs` (not a v1 responder) | ||
| - Anything with `source_type` `evaluation` or `alert_state_change` | ||
| - The connected-tool sources (`github`, `linear`, `zendesk`, `pganalyze`) — those are step 5, ask-first. |
There was a problem hiding this comment.
Thinking we should def also ask them about non-signal sources, but ones we can connect via MCP, and are context-rich. E.g. Notion is a prime suspect here
There was a problem hiding this comment.
Yup, surely need to be one of the next iterations.
What changed
self-driving-setupskill