113 changes: 68 additions & 45 deletions README.md
@@ -15,21 +15,31 @@ A harness for micro-agents.
<a href="https://discord.gg/2xZzrxC9"><strong>Discord</strong></a>
</p>

If you're looking for a way to build practical agentic apps like Claude Code or OpenClaw, you'll need a harness to manage the complexity.

Perstack is a harness for agentic apps. It aims to:

- **Do big things with small models**: If a smaller model can build the same thing, there's no reason to use a bigger one.
- **Quality is a system property, not a model property**: Building agentic software that people use doesn't require an AI science degree, just the knowledge to solve their problems.
- **Keep it simple and reliable**: The biggest mistake is cramming AI into an overly complex harness and ending up with an uncontrollable agent.

## Getting Started

Perstack is designed so that defining experts, running them, and integrating them into applications remain separate concerns.

`create-expert` scaffolds experts, the harness handles orchestration, and deployment stays simple because Perstack runs on standard container and serverless infrastructure.

### Defining your first expert

To get started, you can use the `create-expert` Expert, which helps you focus on the core and build your first agentic AI:

```bash
# Ask `create-expert` to form a team named `ai-gaming`
docker run --pull always --rm -it \
-e FIREWORKS_API_KEY \
-v ./ai-gaming:/workspace \
perstack/perstack start create-expert \
--provider fireworks \
"Form a team named ai-gaming to build a Bun-based CLI cutting-edge indie game playable on Bash."
```

@@ -54,24 +64,68 @@ description = "Tests the game and reports bugs"
instruction = "Play-test the game, find bugs, and verify fixes."
```

### Running your expert

To let your agents work on an actual task, you can use the `perstack start` command to run them interactively:

```bash
# Let `ai-gaming` team build a Wizardry-like dungeon crawler
docker run --pull always --rm -it \
-e FIREWORKS_API_KEY \
-v ./ai-gaming:/workspace \
perstack/perstack start ai-gaming \
--model "haiku-4-5" \
--provider fireworks \
"Make a Wizardry-like dungeon crawler. Make it replayable, so players can dive in, die, and find a way to beat it."
```

Here is an example of a game built with these commands: [demo-dungeon-crawler](https://github.com/perstack-ai/demo-dungeon-crawler). It was built entirely using Kimi K2.5 on Fireworks for under $0.10 in total API cost. You can play it directly:

```bash
npx perstack-demo-dungeon-crawler start
```

### Integrating with your app

Perstack separates the agent harness from the application layer. Your app stays a normal web or terminal app, with no LLM dependencies in the client.

```
┌─────────────────┐ ┌──────────────────┐
│ Your app │ events │ perstack run │
│ (React, TUI…) │◄─────────── │ (@perstack/ │
│ │ SSE / WS / │ runtime) │
│ @perstack/ │ any stream │ │
│ react │ │ │
└─────────────────┘ └──────────────────┘
Frontend Server
```

Swap models, change agent topology, or scale the harness — without touching application code. [`@perstack/react`](https://www.npmjs.com/package/@perstack/react) provides hooks (`useJobStream`, `useRun`) that turn the event stream into React state. See the [documentation](https://perstack.ai/docs/references/react/) for details.
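
As a rough illustration of what consuming the event stream involves, here is a self-contained sketch that folds run events into renderable UI state, the way a hook like `useJobStream` might do internally. The event shapes are assumptions for illustration, not the actual Perstack event schema.

```typescript
// Hypothetical event shapes -- the real Perstack schema may differ.
type RunEvent =
  | { type: "text"; delta: string }
  | { type: "toolCall"; name: string }
  | { type: "finish" }

interface UiState {
  text: string
  toolCalls: string[]
  done: boolean
}

// Fold one event into the current state; a streaming client would call
// this once per event received over SSE/WS.
function reduceEvent(state: UiState, event: RunEvent): UiState {
  switch (event.type) {
    case "text":
      return { ...state, text: state.text + event.delta }
    case "toolCall":
      return { ...state, toolCalls: [...state.toolCalls, event.name] }
    case "finish":
      return { ...state, done: true }
  }
}

const initial: UiState = { text: "", toolCalls: [], done: false }
const events: RunEvent[] = [
  { type: "text", delta: "Hello" },
  { type: "toolCall", name: "readFile" },
  { type: "finish" },
]
const final = events.reduce(reduceEvent, initial)
console.log(final.text, final.toolCalls.length, final.done) // → Hello 1 true
```

Because the reducer is pure, the same function works in a React `useReducer`, a TUI render loop, or a plain log consumer.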

### Deployment

```dockerfile
FROM perstack/perstack:latest
COPY perstack.toml .
RUN perstack install
ENTRYPOINT ["perstack", "run", "my-expert"]
```
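
Build and run it like any other image. Forward whichever provider key your expert uses; `ANTHROPIC_API_KEY` here is just an example:

```bash
docker build -t my-expert .
docker run --rm -e ANTHROPIC_API_KEY my-expert "query"
```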

The image is ubuntu-based, multi-arch (`linux/amd64`, `linux/arm64`) and weighs ~74MB. The runtime can also be imported directly as a TypeScript library ([`@perstack/runtime`](https://www.npmjs.com/package/@perstack/runtime)) for serverless environments. See the [deployment guide](https://perstack.ai/docs/operating-experts/deployment/) for details.
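
A minimal serverless-style invocation of the library entry point looks like this (`env` and `query` are placeholders supplied by your environment):

```typescript
import { run } from "@perstack/runtime"

const checkpoint = await run({
  setting: {
    providerConfig: { providerName: "anthropic", apiKey: env.ANTHROPIC_API_KEY },
    expertKey: "my-expert",
    input: { text: query },
  },
})
```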


### Why micro-agents?

Perstack is a harness for micro-agents — purpose-specific agents with a single responsibility.

- **Reusable**: Delegates are dependency management for agents — like npm packages or crates. Separate concerns through delegate chains, and compose purpose-built experts across different projects.
- **Cost-Effective**: Purpose-specific experts are designed to run on affordable models. A focused agent with the right domain knowledge on a cheap model outperforms a generalist on an expensive one.
- **Fast**: Smaller models generate faster. Fine-grained tasks broken into delegates run concurrently via parallel delegation.
- **Maintainable**: A monolithic system prompt is like refactoring without tests — every change risks breaking something. Single-responsibility experts are independently testable. Test each one, then compose them.

## Prerequisites

- Docker
- An LLM provider API key (see [Providers and Models](https://perstack.ai/docs/references/providers-and-models/))

### Giving API keys

@@ -82,11 +136,11 @@ There are two ways to provide API keys:
Export the key on the host and forward it to the container:

```bash
export FIREWORKS_API_KEY=fw_...
docker run --rm -it \
-e FIREWORKS_API_KEY \
-v ./workspace:/workspace \
perstack/perstack start my-expert "query"
perstack/perstack start my-expert "query" --provider fireworks
```

**2. Store keys in a `.env` file in the workspace**
@@ -95,7 +149,7 @@ Create a `.env` file in the workspace directory. Perstack loads `.env` and `.env

```bash
# ./workspace/.env
FIREWORKS_API_KEY=fw_...
```

```bash
@@ -141,19 +195,19 @@ Perstack organizes the complexity of an agent harness into five layers with clea
| **Definition** | `perstack.toml` | Declarative project config with global defaults (model, reasoning budget, retries, timeout) |
| | Expert definitions | Instruction, description, delegates, tags, version, and minimum runtime version per expert |
| | Skill types | MCP stdio, MCP SSE, and interactive skills with tool pick/omit filtering and domain restrictions |
| | Provider config | 9 providers (Anthropic, OpenAI, Google, Fireworks, DeepSeek, Ollama, Azure OpenAI, Amazon Bedrock, Google Vertex) with per-provider settings |
| | Model tiers | Provider-aware model selection via `defaultModelTier` (low / middle / high) with fallback cascade |
| | Provider tools | Provider-native capabilities (web search, code execution, image generation, etc.) with per-tool options |
| | Lockfile | `perstack.lock` — resolved snapshot of experts and tool definitions for reproducible deployments |
| **Context** | Meta-prompts | Role-specific system prompts (coordinator vs. delegate) with environment injection (time, working directory, sandbox) |
| | Context window tracking | Per-model context window lookup with usage ratio monitoring |
| | Message types | Instruction, user, expert, and tool messages with text, image, file, thinking, and tool-call parts |
| | Prompt caching | Provider-specific cache control with cache-hit tracking |
| | Delegation | Parallel child runs with isolated context, parent history preservation, and result aggregation |
| | Extended thinking | Provider-specific reasoning budgets (Anthropic thinking, OpenAI reasoning effort, Google thinking config) |
| | Token usage | Input, output, reasoning, cached, and total token tracking accumulated across steps and delegations |
| | Resume / continue | Resume from any checkpoint, specific job, or delegation stop point |
| **Runtime** | State machine | 9-state machine (init → generate → call tools → resolve → finish, with delegation and interactive stops) |
| | Event-sourcing | 21 run events, 6 streaming events, and 5 runtime events for full execution observability |
| | Checkpoints | Immutable state snapshots with messages, usage, pending tool calls, and delegation metadata |
| | Skill manager | Dynamic skill lifecycle — connect, discover tools, execute, disconnect — with adapter pattern |
@@ -176,37 +230,6 @@ Perstack organizes the complexity of an agent harness into five layers with clea

</details>



## Documentation

| Topic | Link |
55 changes: 45 additions & 10 deletions docs/references/providers-and-models.md
@@ -32,16 +32,17 @@ npx perstack run my-expert "query" --provider google --model gemini-2.5-pro

## Supported Providers

| Provider | Key | Description |
| ---------------- | ---------------- | ------------------------------------ |
| Anthropic | `anthropic` | Claude models |
| Google | `google` | Gemini models |
| OpenAI | `openai` | GPT and reasoning models |
| Fireworks | `fireworks` | Open-weight models (Kimi, DeepSeek) |
| DeepSeek | `deepseek` | DeepSeek models |
| Ollama | `ollama` | Local model hosting |
| Azure OpenAI | `azure-openai` | Azure-hosted OpenAI models |
| Amazon Bedrock | `amazon-bedrock` | AWS Bedrock-hosted models |
| Google Vertex AI | `google-vertex` | Google Cloud Vertex AI models |

## Anthropic

@@ -201,6 +202,40 @@ export DEEPSEEK_API_KEY=sk-...
npx perstack run my-expert "query" --provider deepseek --model deepseek-chat
```

## Fireworks

**Environment variables:**
| Variable | Required | Description |
| -------------------- | -------- | --------------- |
| `FIREWORKS_API_KEY` | Yes | API key |
| `FIREWORKS_BASE_URL` | No | Custom endpoint |

**perstack.toml settings:**
```toml
[provider]
providerName = "fireworks"
[provider.setting]
baseUrl = "https://custom-endpoint.example.com" # Optional
headers = { "X-Custom-Header" = "value" } # Optional
```

| Setting | Type | Description |
| --------- | ------ | ------------------- |
| `baseUrl` | string | Custom API endpoint |
| `headers` | object | Custom HTTP headers |

**Models:**
| Model | Context | Max Output |
| ---------------------------------------- | ------- | ---------- |
| `accounts/fireworks/models/kimi-k2p5` | 262K | 262K |
| `accounts/fireworks/models/deepseek-v3p2`| 164K | 164K |
| `accounts/fireworks/models/glm-5` | 203K | 203K |

```bash
export FIREWORKS_API_KEY=fw_...
npx perstack run my-expert "query" --provider fireworks --model accounts/fireworks/models/kimi-k2p5
```

## Ollama

**Environment variables:**