From cc5564802b307c889363359b313045faa24cd720 Mon Sep 17 00:00:00 2001 From: HiranoMasaaki Date: Sun, 8 Mar 2026 07:57:27 +0900 Subject: [PATCH] docs: revamp README and add Fireworks provider documentation - Restructure README with clearer Getting Started flow (define, run, integrate, deploy) - Position Fireworks as the reference provider in all examples - Add "Why micro-agents?" section explaining reusability, cost, speed, and maintainability - Add "Integrating with your app" section with architecture diagram - Add Fireworks section to providers-and-models documentation - Fix broken #providers anchor link - Add demo cost info and npx command Co-Authored-By: Claude Opus 4.6 --- README.md | 113 ++++++++++++++---------- docs/references/providers-and-models.md | 55 +++++++++--- 2 files changed, 113 insertions(+), 55 deletions(-) diff --git a/README.md b/README.md index d05b3c70..b9a9b346 100644 --- a/README.md +++ b/README.md @@ -15,21 +15,31 @@ A harness for micro-agents. Discord

-Perstack is an infrastructure for **practical agentic AI** that aims to: +If you're looking for a way to build practical agentic apps like Claude Code or OpenClaw, you'll need a harness to manage the complexity. + +Perstack is a harness for agentic apps. It aims to: + - **Do big things with small models**: If a smaller model can do the same job, there's no reason to use a bigger model. -- **Focus on what you know best**: Building agentic software that people use doesn't require AI science degree — knowledge to solve their problems is what matters. +- **Quality is a system property, not a model property**: Building agentic software that people use doesn't require an AI science degree, just the knowledge to solve their problems. - **Keep it simple and reliable**: The biggest mistake is cramming AI into an overly complex harness and ending up with an uncontrollable agent. ## Getting Started +Perstack is designed so that defining experts, running them, and integrating them into applications remain separate concerns. + +`create-expert` scaffolds experts, the harness handles orchestration, and deployment stays simple because Perstack runs on standard container and serverless infrastructure. + +### Defining your first expert + To get started, you can use the `create-expert` Expert that helps you focus on the core and build your first agentic AI: ```bash # Ask `create-expert` to form a team named `ai-gaming` docker run --pull always --rm -it \ - -e ANTHROPIC_API_KEY \ + -e FIREWORKS_API_KEY \ -v ./ai-gaming:/workspace \ perstack/perstack start create-expert \ + --provider fireworks \ "Form a team named ai-gaming to build a Bun-based CLI cutting-edge indie game playable on Bash." ``` @@ -54,24 +64,68 @@ description = "Tests the game and reports bugs" instruction = "Play-test the game, find bugs, and verify fixes."
``` +### Running your expert + To let your agents work on an actual task, you can use the `perstack start` command to run them interactively: ```bash # Let `ai-gaming` team build a Wizardry-like dungeon crawler docker run --pull always --rm -it \ - -e ANTHROPIC_API_KEY \ + -e FIREWORKS_API_KEY \ -v ./ai-gaming:/workspace \ perstack/perstack start ai-gaming \ - --model "haiku-4-5" \ + --provider fireworks \ "Make a Wizardry-like dungeon crawler. Make it replayable, so players can dive in, die, and find a way to beat it." ``` -Here is an example of a game built with these commands: [demo-dungeon-crawler](https://github.com/perstack-ai/demo-dungeon-crawler). It was built entirely on Claude 4.5 Haiku. +Here is an example of a game built with these commands: [demo-dungeon-crawler](https://github.com/perstack-ai/demo-dungeon-crawler). It was built entirely using Kimi K2.5 on Fireworks for under $0.10 in total API cost. You can play it directly: + +```bash +npx perstack-demo-dungeon-crawler start +``` + +### Integrating with your app + +Perstack separates the agent harness from the application layer. Your app stays a normal web or terminal app, with no LLM dependencies in the client. + +``` +┌─────────────────┐ ┌──────────────────┐ +│ Your app │ events │ perstack run │ +│ (React, TUI…) │◄─────────── │ (@perstack/ │ +│ │ SSE / WS / │ runtime) │ +│ @perstack/ │ any stream │ │ +│ react │ │ │ +└─────────────────┘ └──────────────────┘ + Frontend Server +``` + +Swap models, change agent topology, or scale the harness — without touching application code. [`@perstack/react`](https://www.npmjs.com/package/@perstack/react) provides hooks (`useJobStream`, `useRun`) that turn the event stream into React state. See the [documentation](https://perstack.ai/docs/references/react/) for details. + +### Deployment + +```dockerfile +FROM perstack/perstack:latest +COPY perstack.toml . 
+RUN perstack install +ENTRYPOINT ["perstack", "run", "my-expert"] +``` + +The image is Ubuntu-based, multi-arch (`linux/amd64`, `linux/arm64`), and weighs ~74MB. The runtime can also be imported directly as a TypeScript library ([`@perstack/runtime`](https://www.npmjs.com/package/@perstack/runtime)) for serverless environments. See the [deployment guide](https://perstack.ai/docs/operating-experts/deployment/) for details. + + +### Why micro-agents? + +Perstack is a harness for micro-agents — purpose-specific agents with a single responsibility. + +- **Reusable**: Delegates are dependency management for agents — like npm packages or crates. Separate concerns through delegate chains, and compose purpose-built experts across different projects. + +- **Cost-Effective**: Purpose-specific experts are designed to run on affordable models. A focused agent with the right domain knowledge on a cheap model outperforms a generalist on an expensive one. + +- **Fast**: Smaller models generate faster. Fine-grained tasks broken into delegates run concurrently via parallel delegation. + +- **Maintainable**: Changing a monolithic system prompt is like refactoring without tests — every change risks breaking something. Single-responsibility experts are independently testable. Test each one, then compose them. ## Prerequisites - Docker -- An LLM provider API key (see [Providers](#providers)) +- An LLM provider API key (see [Providers and Models](https://perstack.ai/docs/references/providers-and-models/)) ### Giving API keys @@ -82,11 +136,11 @@ There are two ways to provide API keys: Export the key on the host and forward it to the container: ```bash -export ANTHROPIC_API_KEY=sk-ant-... +export FIREWORKS_API_KEY=fw_... docker run --rm -it \ - -e ANTHROPIC_API_KEY \ + -e FIREWORKS_API_KEY \ -v ./workspace:/workspace \ - perstack/perstack start my-expert "query" + perstack/perstack start my-expert "query" --provider fireworks ``` **2.
Store keys in a `.env` file in the workspace** @@ -95,7 +149,7 @@ Create a `.env` file in the workspace directory. Perstack loads `.env` and `.env ```bash # ./workspace/.env -ANTHROPIC_API_KEY=sk-ant-... +FIREWORKS_API_KEY=fw_... ``` ```bash @@ -141,19 +195,19 @@ Perstack organizes the complexity of an agent harness into five layers with clea | **Definition** | `perstack.toml` | Declarative project config with global defaults (model, reasoning budget, retries, timeout) | | | Expert definitions | Instruction, description, delegates, tags, version, and minimum runtime version per expert | | | Skill types | MCP stdio, MCP SSE, and interactive skills with tool pick/omit filtering and domain restrictions | -| | Provider config | 8 providers (Anthropic, OpenAI, Google, Azure OpenAI, Amazon Bedrock, Google Vertex, Ollama, DeepSeek) with per-provider settings | +| | Provider config | 9 providers (Anthropic, OpenAI, Google, Fireworks, DeepSeek, Ollama, Azure OpenAI, Amazon Bedrock, Google Vertex) with per-provider settings | | | Model tiers | Provider-aware model selection via `defaultModelTier` (low / middle / high) with fallback cascade | | | Provider tools | Provider-native capabilities (web search, code execution, image generation, etc.) with per-tool options | | | Lockfile | `perstack.lock` — resolved snapshot of experts and tool definitions for reproducible deployments | | **Context** | Meta-prompts | Role-specific system prompts (coordinator vs. 
delegate) with environment injection (time, working directory, sandbox) | | | Context window tracking | Per-model context window lookup with usage ratio monitoring | | | Message types | Instruction, user, expert, and tool messages with text, image, file, thinking, and tool-call parts | -| | Prompt caching | Anthropic ephemeral cache control with cache-hit tracking | +| | Prompt caching | Provider-specific cache control with cache-hit tracking | | | Delegation | Parallel child runs with isolated context, parent history preservation, and result aggregation | | | Extended thinking | Provider-specific reasoning budgets (Anthropic thinking, OpenAI reasoning effort, Google thinking config) | | | Token usage | Input, output, reasoning, cached, and total token tracking accumulated across steps and delegations | | | Resume / continue | Resume from any checkpoint, specific job, or delegation stop point | -| **Runtime** | State machine | 9-state XState machine (init → generate → call tools → resolve → finish, with delegation and interactive stops) | +| **Runtime** | State machine | 9-state machine (init → generate → call tools → resolve → finish, with delegation and interactive stops) | | | Event-sourcing | 21 run events, 6 streaming events, and 5 runtime events for full execution observability | | | Checkpoints | Immutable state snapshots with messages, usage, pending tool calls, and delegation metadata | | | Skill manager | Dynamic skill lifecycle — connect, discover tools, execute, disconnect — with adapter pattern | @@ -176,37 +230,6 @@ Perstack organizes the complexity of an agent harness into five layers with clea - -## Deployment - -```dockerfile -FROM perstack/perstack:latest -COPY perstack.toml . -RUN perstack install -ENTRYPOINT ["perstack", "run", "my-expert"] -``` - -```bash -docker build -t my-expert . -docker run --rm -e ANTHROPIC_API_KEY my-expert "query" -``` - -The image is ubuntu-based, multi-arch (`linux/amd64`, `linux/arm64`) and weighs ~74MB. 
- -The runtime can also be imported directly as a TypeScript library for serverless environments (Cloudflare Workers, Vercel, etc.) or integrated into your own applications: - -```typescript -import { run } from "@perstack/runtime" - -const checkpoint = await run({ - setting: { - providerConfig: { providerName: "anthropic", apiKey: env.ANTHROPIC_API_KEY }, - expertKey: "my-expert", - input: { text: query }, - }, -}) -``` - ## Documentation | Topic | Link | diff --git a/docs/references/providers-and-models.md b/docs/references/providers-and-models.md index 4f256789..08adb1cb 100644 --- a/docs/references/providers-and-models.md +++ b/docs/references/providers-and-models.md @@ -32,16 +32,17 @@ npx perstack run my-expert "query" --provider google --model gemini-2.5-pro ## Supported Providers -| Provider | Key | Description | -| ---------------- | ---------------- | ----------------------------- | -| Anthropic | `anthropic` | Claude models (default) | -| Google | `google` | Gemini models | -| OpenAI | `openai` | GPT and reasoning models | -| DeepSeek | `deepseek` | DeepSeek models | -| Ollama | `ollama` | Local model hosting | -| Azure OpenAI | `azure-openai` | Azure-hosted OpenAI models | -| Amazon Bedrock | `amazon-bedrock` | AWS Bedrock-hosted models | -| Google Vertex AI | `google-vertex` | Google Cloud Vertex AI models | +| Provider | Key | Description | +| ---------------- | ---------------- | ------------------------------------ | +| Anthropic | `anthropic` | Claude models | +| Google | `google` | Gemini models | +| OpenAI | `openai` | GPT and reasoning models | +| Fireworks | `fireworks` | Open-weight models (Kimi, DeepSeek) | +| DeepSeek | `deepseek` | DeepSeek models | +| Ollama | `ollama` | Local model hosting | +| Azure OpenAI | `azure-openai` | Azure-hosted OpenAI models | +| Amazon Bedrock | `amazon-bedrock` | AWS Bedrock-hosted models | +| Google Vertex AI | `google-vertex` | Google Cloud Vertex AI models | ## Anthropic @@ -201,6 +202,40 @@ export 
DEEPSEEK_API_KEY=sk-... npx perstack run my-expert "query" --provider deepseek --model deepseek-chat ``` +## Fireworks + +**Environment variables:** +| Variable | Required | Description | +| -------------------- | -------- | --------------- | +| `FIREWORKS_API_KEY` | Yes | API key | +| `FIREWORKS_BASE_URL` | No | Custom endpoint | + +**perstack.toml settings:** +```toml +[provider] +providerName = "fireworks" +[provider.setting] +baseUrl = "https://custom-endpoint.example.com" # Optional +headers = { "X-Custom-Header" = "value" } # Optional +``` + +| Setting | Type | Description | +| --------- | ------ | ------------------- | +| `baseUrl` | string | Custom API endpoint | +| `headers` | object | Custom HTTP headers | + +**Models:** +| Model | Context | Max Output | +| ---------------------------------------- | ------- | ---------- | +| `accounts/fireworks/models/kimi-k2p5` | 262K | 262K | +| `accounts/fireworks/models/deepseek-v3p2`| 164K | 164K | +| `accounts/fireworks/models/glm-5` | 203K | 203K | + +```bash +export FIREWORKS_API_KEY=fw_... +npx perstack run my-expert "query" --provider fireworks --model accounts/fireworks/models/kimi-k2p5 +``` + ## Ollama **Environment variables:**