Issue 8: Real task bodies + full 1:1 integration flow #627

@gewenyu99

Epic: Task-queue orchestrator runner · Depends on: #626 · PR: #619

Status: implemented. The work shipped well past the original thin slice: the full
1:1 graph below is built and running. See the "As built" section at the end for
what changed from this issue's original plan.

Why

With the machinery proven by the walking skeleton, replace the stubs with the real
integration tasks. This is the first proof that decomposed micro-agents do real
work at parity with the linear baseline. It is also the first real exercise of two
things the design is built for: parallel branches in the task graph, and a plan
step that is a separate skill from the do step.

The task graph

The orchestrator detects the framework, then seeds:

install ┐
init ───┤
        ├─→ identify
        └─→ plan-capture ─→ capture

  • install and init are independent. We know the framework shape, so neither
    waits on the other.
  • identify and plan-capture both follow install and init, and run independent
    of each other.
  • capture follows plan-capture. Planning what to capture and capturing it are
    separate skills.
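
The dependencies above can be written down as seed data. This is a minimal sketch, not the orchestrator's actual code: the `SeedTask` type and the `parallelWaves` helper are hypothetical names, and only the task names and the `dependsOn` field come from this issue. It groups tasks into waves to make the available parallelism visible.

```typescript
// Hypothetical shapes: the issue names the tasks and `dependsOn`, not these types.
type TaskType = "install" | "init" | "identify" | "plan-capture" | "capture";

interface SeedTask {
  type: TaskType;
  dependsOn: TaskType[];
}

// The graph from this section, written as seed data.
const seedGraph: SeedTask[] = [
  { type: "install", dependsOn: [] },
  { type: "init", dependsOn: [] },
  { type: "identify", dependsOn: ["install", "init"] },
  { type: "plan-capture", dependsOn: ["install", "init"] },
  { type: "capture", dependsOn: ["plan-capture"] },
];

// Group tasks into waves: every task in a wave has all of its dependencies
// satisfied by earlier waves, so the tasks in one wave can run in parallel.
function parallelWaves(tasks: SeedTask[]): TaskType[][] {
  const done = new Set<TaskType>();
  const waves: TaskType[][] = [];
  let pending = [...tasks];
  while (pending.length > 0) {
    const ready = pending.filter((t) => t.dependsOn.every((d) => done.has(d)));
    if (ready.length === 0) {
      throw new Error("cycle or missing dependency in task graph");
    }
    waves.push(ready.map((t) => t.type));
    for (const t of ready) done.add(t.type);
    pending = pending.filter((t) => !done.has(t.type));
  }
  return waves;
}

const waves = parallelWaves(seedGraph);
```

Waves are a simplification for reading the graph. A real queue would presumably start capture as soon as plan-capture finishes, whether or not identify is still running.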

Scope / deliverable

Each task is an agent prompt plus a mini-skill (#625). Each sets the cheapest
model that succeeds, validated empirically. Success criteria are plain text in
the prompt, and the agent self-reports through complete_task.

  1. install. Depends on nothing. Add the PostHog SDK with the detected package
    manager (reuse detect_package_manager, and hasDeclaredDependency at
    posthog-integration/index.ts:106). Model: cheap, the work is mechanical.
    Success: the SDK is declared and the install ran clean. Runs alongside init.
  2. init. Depends on nothing. Create the provider or init file and set env
    vars through the wizard-tools MCP (set_env_values, check_env_keys). Model:
    cheap to mid, boilerplate. Success: the env keys are present and the init file
    exists. Runs alongside install.
  3. identify. Depends on install and init. Instrument user identification
    at the right place in the app. Model: mid. Success: an identify call is wired.
    Runs alongside plan-capture.
  4. plan-capture. Depends on install and init. A planning agent. It reads
    the app, decides which events are worth capturing, and writes that list as its
    structured handoff. It does not edit code. Model: mid, it has to understand the
    app. Success: a sensible, non-empty event plan in the handoff. Runs alongside
    identify.
  5. capture. Depends on plan-capture. Reads the event plan from
    plan-capture's handoff and instruments those events. Model: mid. Success: each
    planned event has a posthog.capture that fires on a real user action, not on
    page load.

Plan then do. plan-capture and capture are separate skills. The plan is the
handoff, and capture consumes it. This keeps the "decide what" reasoning out of
the "edit code" step, and it is the first real use of one task's handoff driving
the next.
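
One way to picture the plan-then-do handoff is sketched below. The issue only says the plan is a structured handoff, so the `EventPlanHandoff` shape, its field names, and both helpers are assumptions made for illustration. The validator covers only the mechanical part of "sensible, non-empty"; the page-load check is a crude text heuristic.

```typescript
// Hypothetical handoff shape; the issue does not define the real schema.
interface PlannedEvent {
  name: string;    // event name to pass to posthog.capture
  trigger: string; // the user action that should fire it
}

interface EventPlanHandoff {
  events: PlannedEvent[];
}

// Mechanical checks on plan-capture's output. Returns a list of problems;
// an empty list means the plan passes these checks.
function validateEventPlan(handoff: EventPlanHandoff): string[] {
  const problems: string[] = [];
  if (handoff.events.length === 0) problems.push("event plan is empty");
  const seen = new Set<string>();
  for (const e of handoff.events) {
    if (e.name.trim() === "") problems.push("event with empty name");
    if (seen.has(e.name)) problems.push(`duplicate event: ${e.name}`);
    seen.add(e.name);
    // Crude heuristic: the success criteria rule out firing on page load.
    if (/page ?load/i.test(e.trigger)) {
      problems.push(`${e.name} fires on page load`);
    }
  }
  return problems;
}

// What capture would receive as its input: the plan, passed through unchanged,
// so capture instruments exactly what plan-capture decided.
function captureInputs(handoff: EventPlanHandoff): { events: PlannedEvent[] } {
  return { events: handoff.events };
}
```

Keeping the plan as plain data means capture's prompt can stay about editing code, with the "decide what" reasoning already settled by the time it runs.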

Idempotent. Mini-skills read before write, so a re-run does not
double-instrument. The {type, inputs} dedup guard (#623) backs this up.
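
A dedup guard keyed on {type, inputs} could look roughly like the sketch below. It is an illustration of the idea, not the code from #623: `QueuedTask`, `dedupKey`, and `TaskQueue` are hypothetical names, and inputs are assumed to be JSON-serializable.

```typescript
// Hypothetical queue entry; only the {type, inputs} pair comes from the issue.
interface QueuedTask {
  type: string;
  inputs: Record<string, unknown>;
}

// Serialize with sorted object keys so key order does not defeat the guard.
// Assumes JSON-serializable values.
function stableStringify(value: unknown): string {
  if (Array.isArray(value)) {
    return `[${value.map(stableStringify).join(",")}]`;
  }
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const body = Object.keys(obj)
      .sort()
      .map((k) => `${JSON.stringify(k)}:${stableStringify(obj[k])}`)
      .join(",");
    return `{${body}}`;
  }
  return JSON.stringify(value);
}

function dedupKey(task: QueuedTask): string {
  return `${task.type}:${stableStringify(task.inputs)}`;
}

class TaskQueue {
  private seen = new Set<string>();
  private tasks: QueuedTask[] = [];

  // Returns false and drops the task when an identical {type, inputs}
  // pair was already enqueued.
  enqueue(task: QueuedTask): boolean {
    const key = dedupKey(task);
    if (this.seen.has(key)) return false;
    this.seen.add(key);
    this.tasks.push(task);
    return true;
  }

  get size(): number {
    return this.tasks.length;
  }
}
```

The guard only stops the same task from being queued twice. It does not replace read-before-write inside the mini-skill, which is what keeps a fresh run from re-instrumenting code that is already instrumented.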

Key files

  • agent-prompt and mini-skill markdown for install, init, identify,
    plan-capture, capture, in context-mill on the experiment branch
  • the integrate-posthog seed prompt, seeding the graph above with the right
    dependsOn
  • reuse hasDeclaredDependency, detect_package_manager, set_env_values,
    check_env_keys (wizard-tools.ts)

Acceptance criteria

  • With the wizard-orchestrator flag enabled, on a clean Next.js
    workbench app: the SDK ends up in package.json, the env keys are set, an
    identify call is wired, and each planned event has a posthog.capture that
    fires on a real user action.
  • install and init carry no dependency between them, and identify runs
    independent of the capture branch. The graph expresses the parallelism.
  • plan-capture writes an event plan as its handoff, and capture instruments
    exactly that plan.
  • Each task self-reports done on success, and failed with a reason on a
    deliberately broken run.
  • An idempotent re-run does not double-instrument.
  • The result is at parity with a linear-baseline run. Verify by diffing the
    integrated files.
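
The self-report criterion could be enforced on the receiving side with a check like the one below. This is a sketch under assumptions: the issue names complete_task but not its payload, so the `TaskReport` shape and `parseTaskReport` are hypothetical.

```typescript
// Hypothetical payload for complete_task; the issue names the tool, not its schema.
type TaskReport =
  | { status: "done"; summary: string }
  | { status: "failed"; reason: string };

// Validate an untrusted payload from the agent before the queue acts on it.
// A failed report without a reason is rejected, matching the criterion that
// a broken run reports failed "with a reason".
function parseTaskReport(raw: unknown): TaskReport {
  if (typeof raw !== "object" || raw === null) {
    throw new Error("report must be an object");
  }
  const r = raw as Record<string, unknown>;
  if (r.status === "done" && typeof r.summary === "string") {
    return { status: "done", summary: r.summary };
  }
  if (
    r.status === "failed" &&
    typeof r.reason === "string" &&
    r.reason.trim() !== ""
  ) {
    return { status: "failed", reason: r.reason };
  }
  throw new Error("report must be done, or failed with a non-empty reason");
}
```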

As built (current)

The thin slice grew into the full 1:1 flow against the linear baseline. The seed
(integrate-posthog) now seeds nine tasks:

install ┐
init ───┤
        ├─→ identify
        ├─→ error-tracking
        └─→ plan-capture ─→ capture
              │
   all code ──┴─→ build ─→ dashboard
                       └─→ report

Changes from the original plan:

  • install is manifest-only. It declares the SDK in the manifest and does not
    run the package manager or build. The real install happens in the new build
    task, which also runs the typecheck and build. On an unresolved conflict,
    build reports a one-line summary in its handoff conflict field, which is
    surfaced briefly in the outro and in full in the report.
  • Parity tasks added: error-tracking (captureException around critical
    flows), dashboard (PostHog MCP insights), report (posthog-setup-report.md).
  • Capture stayed a single task (it consumes plan-capture's plan). The per-event
    fan-out is deferred, though its machinery (enqueue label, input injection)
    is in place.
  • Grounding matches the linear flow: each mini-skill carries real PostHog docs
    (docs_urls + {references}), and every task agent is pointed at the detected
    framework's reference EXAMPLE.md (installed under .posthog-wizard/reference/)
    to reference its patterns, not to copy them.
  • UI: tasks carry an agent-set label (short, action-level) shown in the queue
    panel; the agent task tools (TaskCreate/TaskUpdate) and the per-task spinner
    lines are suppressed so the queue panel is the sole progress surface.
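
The nine-task graph can be sanity-checked as data. The sketch below is one reading of the diagram, not the seed's real representation: "all code" is assumed to mean identify, error-tracking, and capture, and `topoOrder` is a hypothetical helper.

```typescript
// As-built graph as an adjacency map: task -> tasks it depends on.
// Assumption: "all code" is read as identify, error-tracking, and capture.
const asBuilt: Record<string, string[]> = {
  install: [],
  init: [],
  identify: ["install", "init"],
  "error-tracking": ["install", "init"],
  "plan-capture": ["install", "init"],
  capture: ["plan-capture"],
  build: ["identify", "error-tracking", "capture"],
  dashboard: ["build"],
  report: ["build"],
};

// Depth-first topological sort; throws on an unknown dependency or a cycle.
function topoOrder(graph: Record<string, string[]>): string[] {
  const order: string[] = [];
  const state = new Map<string, "visiting" | "done">();
  const visit = (name: string): void => {
    if (state.get(name) === "done") return;
    if (state.get(name) === "visiting") throw new Error(`cycle at ${name}`);
    const deps = graph[name];
    if (deps === undefined) throw new Error(`unknown task: ${name}`);
    state.set(name, "visiting");
    for (const d of deps) visit(d);
    state.set(name, "done");
    order.push(name);
  };
  for (const name of Object.keys(graph)) visit(name);
  return order;
}
```

Any valid order places build after every code-editing task and places dashboard and report after build, which is what the manifest-only install relies on.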
