58 changes: 40 additions & 18 deletions ARCHITECTURE.md
@@ -10,9 +10,9 @@ see [vendor/symphony/SPEC.md](vendor/symphony/SPEC.md).

Work Please is a long-running TypeScript daemon that turns issue tracker tasks into autonomous
Claude Code agent sessions. It continuously polls an issue tracker (GitHub Projects v2 or Asana),
creates an isolated workspace for each eligible issue, renders a Liquid prompt template, and
launches a Claude Code agent session inside that workspace via the
`@anthropic-ai/claude-agent-sdk`.
renders a Liquid prompt template, and launches a Claude Code agent session via one of two runners:
the `@anthropic-ai/claude-agent-sdk` (local, with an isolated workspace per issue) or
`claude-code-action` (remote, via GitHub Actions `repository_dispatch`).

The service is primarily a **scheduler/runner** — it does not perform full ticket management.
The orchestrator writes only status labels to GitHub issues. All state transitions, PR
@@ -40,7 +40,12 @@ work-please/ # Monorepo root (Bun + Turborepo)
│ ├── config.ts # YAML front matter → typed ServiceConfig with env-var resolution
│ ├── workflow.ts # WORKFLOW.md parser (YAML front matter + Liquid body)
│ ├── prompt-builder.ts # Liquid template rendering (issue → prompt string)
│ ├── agent-runner.ts # Claude Code agent session via @anthropic-ai/claude-agent-sdk
│ ├── agent-runner.ts # Re-export shim for backward compatibility (delegates to runner/)
│ ├── runner/ # Agent runner abstraction
│ │ ├── types.ts # AgentRunner interface, AgentSession, SessionResult
│ │ ├── sdk-runner.ts # SDK runner: local Claude Code via @anthropic-ai/claude-agent-sdk
│ │ ├── code-action-runner.ts # Code Action runner: GitHub Actions via repository_dispatch
│ │ └── index.ts # createRunner() factory — selects runner based on config
│ ├── workspace.ts # Per-issue directory management, git worktrees, lifecycle hooks
│ ├── server.ts # Optional HTTP dashboard (Bun.serve) and JSON API
│ ├── tools.ts # MCP tool server (asana_api, github_graphql) injected into agent
@@ -86,10 +91,12 @@ work-please/ # Monorepo root (Bun + Turborepo)
┌───────▼──┐ ┌───▼───┐ ┌─▼──────────┐
│ Tracker │ │Workspace│ │ Agent │
│ Client │ │Manager │ │ Runner │
│(GitHub/ │ │(create,│ │(claude- │
│ Asana) │ │ hooks, │ │ agent-sdk) │
└──────────┘ │worktree│ └────────────┘
└────────┘
│(GitHub/ │ │(create,│ │(factory) │
│ Asana) │ │ hooks, │ ├────────────┤
└──────────┘ │worktree│ │ SdkRunner │ ← local via claude-agent-sdk
└────────┘ │ CodeAction │ ← remote via GH Actions
│ Runner │ (repository_dispatch + poll)
└────────────┘
```

### Startup
@@ -116,20 +123,34 @@ Each poll tick executes in order:

### Agent Session Lifecycle

The orchestrator selects a runner via `createRunner(config)` based on `agent.runner` (`sdk` or
`code_action`).
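
A minimal sketch of how a factory like `createRunner()` could select a runner from `agent.runner`. The interface shape, the stub classes, and the `createRunnerSketch` name are assumptions for illustration; the real definitions live in `runner/types.ts` and `runner/index.ts`, which this diff does not show.

```typescript
// Hypothetical, simplified shapes. The real ones live in runner/types.ts.
type RunnerKind = 'sdk' | 'code_action'

interface AgentSession {
  sessionId: string
  workspace: string | null // null models the Code Action runner's virtual session
}

interface AgentRunner {
  readonly kind: RunnerKind
  startSession(): AgentSession
}

interface RunnerConfig {
  agent: { runner?: RunnerKind }
}

// Stand-in for the local SDK runner: always has a workspace path.
class SdkRunnerStub implements AgentRunner {
  readonly kind = 'sdk' as const
  private workspace: string

  constructor(workspace: string) {
    this.workspace = workspace
  }

  startSession(): AgentSession {
    return { sessionId: 'local-session', workspace: this.workspace }
  }
}

// Stand-in for the remote runner: no local workspace exists.
class CodeActionRunnerStub implements AgentRunner {
  readonly kind = 'code_action' as const

  startSession(): AgentSession {
    return { sessionId: 'virtual-session', workspace: null }
  }
}

// Picks a runner from config; 'sdk' is the default, as stated in the text.
function createRunnerSketch(config: RunnerConfig, workspace: string): AgentRunner {
  const kind: RunnerKind = config.agent.runner ?? 'sdk'
  switch (kind) {
    case 'sdk':
      return new SdkRunnerStub(workspace)
    case 'code_action':
      return new CodeActionRunnerStub()
    default: {
      // Compile-time exhaustiveness check: a new RunnerKind must be handled here.
      const unreachable: never = kind
      throw new Error(`unknown_runner: ${String(unreachable)}`)
    }
  }
}
```

Keeping the selection in one factory means the orchestrator only depends on the shared interface and never branches on the runner type itself.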

**SDK Runner (default):**

1. `createWorkspace()` — Creates or reuses a per-issue directory (or git worktree if issue URL
points to a GitHub repo). Runs `after_create` hook on first creation.
2. `runBeforeRunHook()` — Executes the optional `before_run` shell hook.
3. `AppServerClient.startSession()` — Validates workspace path against `workspace.root`
(path traversal prevention) and assigns a local session UUID. No SDK communication occurs
yet — the real session is established when `runTurn()` receives a `system/init` event.
4. `AppServerClient.runTurn()` — Calls `query()` from `@anthropic-ai/claude-agent-sdk` with the
rendered prompt. Translates SDK messages into orchestrator events (`session_started`,
`turn_completed`, `turn_failed`, `notification`).
3. `SdkRunner.startSession()` — Validates workspace path against `workspace.root`
(path traversal prevention) and assigns a local session UUID.
4. `SdkRunner.runTurn()` — Calls `query()` from `@anthropic-ai/claude-agent-sdk` with the
rendered prompt. Translates SDK messages into orchestrator events.
Supports multi-turn: after each turn, refreshes issue state; continues if still active and
under `max_turns`.
5. `runAfterRunHook()` — Executes the optional `after_run` shell hook.
6. On exit — normal exits schedule a 1s continuation retry; failures schedule exponential backoff
retries up to `max_retry_backoff_ms`.
6. On exit — normal exits schedule a 1s continuation retry; failures schedule exponential backoff.
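
The retry timing in step 6 can be sketched as a single delay function. The 1s continuation delay comes from the text; the 10s base, the doubling per attempt, and the function name are assumptions, and the cap mirrors the `max_retry_backoff_ms` setting named in the previous wording of this step.

```typescript
// From the text: normal exits schedule a continuation retry after 1s.
const CONTINUATION_DELAY_MS = 1_000
// Assumption: the base backoff is not stated in this document.
const ASSUMED_BASE_BACKOFF_MS = 10_000

// Returns how long to wait before the next attempt.
// `attempt` is 1 for the first retry after a failure.
function retryDelayMs(
  exit: 'normal' | 'failure',
  attempt: number,
  maxRetryBackoffMs: number,
): number {
  if (exit === 'normal')
    return CONTINUATION_DELAY_MS

  // Double the delay for every failed attempt, then clamp to the configured cap.
  const exponent = Math.max(0, attempt - 1)
  const uncapped = ASSUMED_BASE_BACKOFF_MS * 2 ** exponent
  return Math.min(uncapped, maxRetryBackoffMs)
}
```

With a cap of 300000 ms, the delays under these assumptions grow as 10s, 20s, 40s and stop growing once they reach the cap.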

**Code Action Runner:**

1. No local workspace created — execution happens in a GitHub Actions runner.
2. `CodeActionRunner.startSession()` — Returns a virtual session (workspace: null).
3. `CodeActionRunner.runTurn()` — Dispatches a `repository_dispatch` event via GitHub API with the
prompt and issue context in `client_payload` (including `before_run`/`after_run` hooks).
Polls the GitHub Actions API until the triggered run completes, then maps the conclusion
(`success`/`failure`/`cancelled`) to orchestrator events.
4. `CodeActionRunner.stopSession()` — Cancels the in-progress GitHub Actions run.
5. The target repository must have a workflow file listening for `repository_dispatch` events
that uses `anthropics/claude-code-action@v1`.
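
A rough sketch of the dispatch step and the conclusion mapping described above. The endpoint and body fields follow GitHub's "create a repository dispatch event" REST API; the helper names, the `event_type` value, and the exact mapping onto `turn_completed`/`turn_failed` are assumptions, not taken from `code-action-runner.ts`.

```typescript
type TurnEvent = 'turn_completed' | 'turn_failed'

interface DispatchRequest {
  url: string
  init: { method: 'POST', headers: Record<string, string>, body: string }
}

// Builds the request for POST /repos/{owner}/{repo}/dispatches.
// The prompt and issue context travel in client_payload.
function buildDispatchRequest(
  owner: string,
  repo: string,
  token: string,
  eventType: string,
  clientPayload: Record<string, unknown>,
): DispatchRequest {
  return {
    url: `https://api.github.com/repos/${owner}/${repo}/dispatches`,
    init: {
      method: 'POST',
      headers: {
        'Accept': 'application/vnd.github+json',
        'Authorization': `Bearer ${token}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({ event_type: eventType, client_payload: clientPayload }),
    },
  }
}

// Maps a workflow run conclusion to an orchestrator event.
// null means the run has not finished, so the caller should keep polling.
// Assumption: everything other than 'success' is treated as a failed turn.
function mapConclusion(conclusion: string | null): TurnEvent | null {
  if (conclusion === null)
    return null
  return conclusion === 'success' ? 'turn_completed' : 'turn_failed'
}
```

A caller would pass the result to `fetch(req.url, req.init)` and then poll the repository's workflow runs until `mapConclusion()` returns a non-null event.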

## Architecture Invariants

@@ -182,8 +203,9 @@ for narrowing.

- **Runner:** Bun test (Jest-compatible API)
- **Pattern:** Unit tests co-located with source files (`*.test.ts` alongside `*.ts`)
- **Mocking:** `AppServerClient` accepts an injectable `queryFn` for testing without the real
Claude CLI. Tracker adapters are tested against mock GraphQL/REST responses. Workspace operations
- **Mocking:** `SdkRunner` accepts an injectable `queryFn` for testing without the real
Claude CLI. `CodeActionRunner` tests mock `globalThis.fetch` for GitHub API responses.
Tracker adapters are tested against mock GraphQL/REST responses. Workspace operations
use `spyOn(_git, 'spawnSync')` to mock git commands.
- **Commands:** `bun run test` (all), `bun run test:app` (work-please only)
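
The injectable `queryFn` pattern from the mocking bullet can be sketched as follows. The message shapes (`system`/`init` carrying `session_id`, then `result`/`success`) match the runner code removed in this PR; `fakeQueryFn` and `findSessionId` are hypothetical helpers standing in for a real test and for the runner's message loop.

```typescript
// Same call shape as the runner's QueryFn, minus the SDK options.
type QueryFn = (params: { prompt: string }) => AsyncIterable<unknown>

// Builds a queryFn that replays canned messages instead of calling the Claude CLI.
function fakeQueryFn(messages: unknown[]): QueryFn {
  return async function* replay() {
    for (const message of messages)
      yield message
  }
}

// Hypothetical stand-in for the runner's for-await loop:
// returns the session id announced by the first system/init message.
async function findSessionId(queryFn: QueryFn, prompt: string): Promise<string | null> {
  for await (const raw of queryFn({ prompt })) {
    const msg = raw as { type?: string, subtype?: string, session_id?: string }
    if (msg.type === 'system' && msg.subtype === 'init' && typeof msg.session_id === 'string')
      return msg.session_id
  }
  return null
}
```

In a Bun test, the canned generator would be passed as the `queryFn` argument when constructing the runner, so no Claude process is spawned.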

267 changes: 5 additions & 262 deletions apps/work-please/src/agent-runner.ts
@@ -1,267 +1,10 @@
import type { Options } from '@anthropic-ai/claude-agent-sdk'
import type { AgentMessage, Issue, ServiceConfig } from './types'
import { randomUUID } from 'node:crypto'
import { resolve, sep } from 'node:path'
import { query as sdkQuery } from '@anthropic-ai/claude-agent-sdk'
import { createToolsMcpServer, getToolSpecs } from './tools'
import type { AgentMessage } from './types'

export interface SessionResult {
  turn_id: string
  session_id: string
}

export interface AgentSession {
  sessionId: string
  workspace: string
}

type QueryFn = (params: { prompt: string, options?: Options }) => AsyncIterable<unknown>

const UUID_PATTERN = /^[\da-f]{8}-[\da-f]{4}-[\da-f]{4}-[\da-f]{4}-[\da-f]{12}$/i
const NEWLINE_PATTERN = /[\r\n]/g

// Minimal discriminated shape for SDK messages received in the for-await loop
interface SdkMsgBase { type: string }
interface SdkMsgInit extends SdkMsgBase { type: 'system', subtype: 'init', session_id: string }
interface SdkMsgSuccess extends SdkMsgBase { type: 'result', subtype: 'success', usage: { input_tokens: number, output_tokens: number } }
interface SdkMsgError extends SdkMsgBase { type: 'result', subtype: string, errors: string[] }
interface SdkMsgRateLimit extends SdkMsgBase { type: 'rate_limit_event', rate_limit_info: unknown }
type SdkMsg = SdkMsgInit | SdkMsgSuccess | SdkMsgError | SdkMsgRateLimit | SdkMsgBase

export class AppServerClient {
  private assignedSessionId: string | null = null
  private sessionId: string | null = null
  private abortController: AbortController | null = null
  private workspace: string
  private config: ServiceConfig
  private queryFn: QueryFn
  private agentEnv: Record<string, string> | null = null

  constructor(config: ServiceConfig, workspace: string, queryFn: QueryFn = sdkQuery) {
    this.config = config
    this.workspace = workspace
    this.queryFn = queryFn
  }

  setAgentEnv(env: Record<string, string>): void {
    this.agentEnv = env
  }

  async startSession(sessionId?: string): Promise<AgentSession | Error> {
    // Reset state unconditionally to prevent stale fields on instance reuse or retry
    this.assignedSessionId = null
    this.sessionId = null

    if (sessionId !== undefined && !UUID_PATTERN.test(sessionId)) {
      const preview = String(sessionId).slice(0, 64).replace(NEWLINE_PATTERN, ' ')
      return new Error(`invalid_session_id: expected UUID format, got "${preview}"`)
    }

    const validationErr = this.validateWorkspaceCwd()
    if (validationErr)
      return validationErr

    const id = sessionId ?? randomUUID()
    this.assignedSessionId = id
    // Set sessionId immediately so runTurn uses options.resume (cross-restart resume path).
    // The SDK has NOT confirmed this session — it may reject if the session no longer exists.
    this.sessionId = sessionId ?? null
    return { sessionId: id, workspace: this.workspace }
  }

  async runTurn(
    session: AgentSession,
    prompt: string,
    _issue: Issue,
    onMessage: (msg: AgentMessage) => void,
  ): Promise<SessionResult | Error> {
    const controller = new AbortController()
    this.abortController = controller

    const timeoutHandle = setTimeout(
      () => controller.abort(new Error('turn_timeout')),
      this.config.claude.turn_timeout_ms,
    )

    const options: Options = {
      cwd: session.workspace,
      permissionMode: this.config.claude.permission_mode as Options['permissionMode'],
      abortController: controller,
    }

    if (this.config.claude.permission_mode === 'bypassPermissions') {
      options.allowDangerouslySkipPermissions = true
    }

    if (this.config.claude.allowed_tools.length > 0) {
      options.allowedTools = this.config.claude.allowed_tools
    }

    if (this.sessionId) {
      options.resume = this.sessionId
    }
    else if (this.assignedSessionId) {
      options.sessionId = this.assignedSessionId as `${string}-${string}-${string}-${string}-${string}`
    }

    if (this.config.claude.command !== 'claude') {
      options.pathToClaudeCodeExecutable = this.config.claude.command
    }

    if (this.config.claude.model) {
      options.model = this.config.claude.model
    }

    const sp = this.config.claude.system_prompt
    if (sp.type === 'custom') {
      options.systemPrompt = sp.value
    }
    else {
      options.systemPrompt = sp
    }

    options.effort = this.config.claude.effort

    const toolSpecs = getToolSpecs(this.config)
    if (toolSpecs.length > 0) {
      options.mcpServers = {
        'work-please-tools': createToolsMcpServer(this.config),
      }
    }

    if (this.config.claude.setting_sources.length > 0) {
      options.settingSources = this.config.claude.setting_sources
    }

    if (this.agentEnv) {
      options.env = this.agentEnv
    }

    const turnId = randomUUID()
    let sessionId: string | null = null
    let gotError = false

    try {
      const q = this.queryFn({ prompt, options })

      for await (const rawMsg of q) {
        const msg = rawMsg as SdkMsg
        if (msg.type === 'system' && (msg as SdkMsgInit).subtype === 'init') {
          const initMsg = msg as SdkMsgInit
          sessionId = initMsg.session_id
          this.sessionId = sessionId
          this.assignedSessionId = null // SDK confirmed — proposed ID no longer needed
          onMessage({
            event: 'session_started',
            timestamp: new Date(),
            session_id: sessionId,
            turn_id: turnId,
          })
        }
        else if (msg.type === 'result') {
          const resultMsg = msg as SdkMsgSuccess | SdkMsgError
          if (resultMsg.subtype === 'success') {
            const successMsg = resultMsg as SdkMsgSuccess
            onMessage({
              event: 'turn_completed',
              timestamp: new Date(),
              usage: {
                input_tokens: successMsg.usage.input_tokens,
                output_tokens: successMsg.usage.output_tokens,
                total_tokens: successMsg.usage.input_tokens + successMsg.usage.output_tokens,
              },
            })
          }
          else {
            const errMsg = resultMsg as SdkMsgError
            gotError = true
            onMessage({
              event: 'turn_failed',
              timestamp: new Date(),
              payload: { subtype: errMsg.subtype, errors: errMsg.errors },
            })
          }
        }
        else if (msg.type === 'rate_limit_event') {
          const rlMsg = msg as SdkMsgRateLimit
          onMessage({
            event: 'notification',
            timestamp: new Date(),
            rate_limits: rlMsg.rate_limit_info,
          })
        }
        else {
          onMessage({
            event: 'notification',
            timestamp: new Date(),
            payload: rawMsg,
          })
        }
      }

      clearTimeout(timeoutHandle)

      if (gotError) {
        return new Error('turn_failed')
      }

      if (!sessionId) {
        const err = new Error('no_session_started')
        onMessage({
          event: 'startup_failed',
          timestamp: new Date(),
          payload: { reason: err.message },
        })
        return err
      }

      return { turn_id: turnId, session_id: sessionId }
    }
    catch (err) {
      clearTimeout(timeoutHandle)
      const error = err instanceof Error ? err : new Error(String(err))
      // If init was never received, the session never started — report startup_failed
      // and clear stale resume state so the next runTurn does not retry a poisoned session.
      // If init was already received and the turn was aborted mid-execution, report turn_failed
      // so callers can distinguish a startup failure from a mid-turn failure.
      if (!sessionId) {
        // Preserve resume state on transient pre-init failures so the next turn can retry.
        // Only clear state for new sessions where no session was ever confirmed.
        if (!options.resume) {
          this.sessionId = null
          this.assignedSessionId = null
        }
      }
      onMessage({
        event: sessionId ? 'turn_failed' : 'startup_failed',
        timestamp: new Date(),
        payload: { reason: error.message },
      })
      return error
    }
  }

  stopSession(): void {
    this.abortController?.abort()
    this.assignedSessionId = null
    this.sessionId = null
    this.abortController = null
  }

  private validateWorkspaceCwd(): Error | null {
    const wsPath = resolve(this.workspace)
    const root = resolve(this.config.workspace.root)
    const rootWithSep = root + sep

    if (wsPath === root)
      return new Error(`invalid_workspace_cwd: workspace_root ${wsPath}`)
    if (!wsPath.startsWith(rootWithSep))
      return new Error(`invalid_workspace_cwd: outside_workspace_root ${wsPath}`)
    return null
  }
}
// Re-export SdkRunner as AppServerClient for backward compatibility
export { SdkRunner as AppServerClient } from './runner/sdk-runner'

@cubic-dev-ai cubic-dev-ai Bot Mar 15, 2026

Contributor


P2: Re-exporting SdkRunner directly changes AppServerClient's public session type to allow workspace: null, which breaks the old shim contract and can defer bad input to a runtime failure.


Contributor Author


⏭️ Not applied — SdkRunner.startSession() always returns workspace as non-null string. The AgentSession interface allows null for CodeActionRunner's virtual sessions, but AppServerClient callers always receive a string workspace. No runtime breakage possible.

Contributor


Thanks for the feedback! I've saved this as a new learning to improve future reviews.

export type { AgentRunner, AgentSession, SessionResult } from './runner/types'

// Utility exports (kept for backward compatibility with tests and orchestrator)
// Utility exports (used by tests and orchestrator)
type JsonRpcMessage = Record<string, unknown>

export function extractRateLimits(payload: JsonRpcMessage): { rate_limits?: unknown } {