# llmview

Chrome DevTools for LLM APIs.

A local proxy that intercepts, inspects, replays, and budgets your LLM API calls, with a real-time dashboard. Zero code changes. One binary. Interactive debugging.
## Quick start

```bash
# Install (pick one)
go install github.com/uucz/llmview@latest
# or: download from https://github.com/uucz/llmview/releases
# or: docker run -p 4700:4700 ghcr.io/uucz/llmview

# Run
llmview

# Point your AI tools at it
export OPENAI_BASE_URL=http://localhost:4700/proxy/openai/v1
export ANTHROPIC_BASE_URL=http://localhost:4700/proxy/anthropic
```

Open http://localhost:4700 and every LLM call now streams through your dashboard.
## Why

I spent $47 debugging an AI agent last week. Blind. No idea which calls were burning tokens, which prompts were bloated, or where the agent was looping.
Existing tools are either:
- Cloud-hosted (Helicone, AgentOps) — your prompts leave your machine
- SDK-based (Langfuse, Phoenix) — requires code changes and framework lock-in
- CLI-only (llm-interceptor) — no UI, just raw logs
llmview sits between your AI tools and the API. You change one environment variable. That's it. Your prompts never leave your machine. Everything shows up in a real-time dashboard.
## Features

| Feature | Description |
|---|---|
| Real-time timeline | Watch API calls stream in as they happen |
| Live token streaming | See the response being generated token-by-token |
| Request/response viewer | Parsed message threads + syntax-highlighted JSON bodies |
| Filter & search | Filter by provider, status code, model. Search across all calls. |
| Request replay | Re-send any call with modified parameters or a different model |
| Budget enforcement | Set a session cost ceiling — proxy returns 402 when exceeded |
| Export | Download session data as JSON or CSV |
| Cost tracking | Per-call and session-total cost with per-model pricing |
| Multi-provider | OpenAI, Anthropic, Ollama — all through one dashboard |
| Zero code changes | Just set an environment variable |
| Single binary | One ~10MB file. No database to install. No Docker required. |
| Local & private | SQLite storage. Nothing leaves your machine. |
| Dark theme | Because you're probably running this at 2am |
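Cost tracking in the table above boils down to token counts times a per-model price sheet, summed over the session. A minimal sketch of that arithmetic in Python (the prices below are illustrative placeholders, not llmview's shipped pricing table):

```python
# Illustrative prices in USD per million tokens (input, output).
# Placeholders only, NOT llmview's actual pricing data.
PRICE_PER_MTOK = {
    "gpt-4o":        (2.50, 10.00),
    "claude-sonnet": (3.00, 15.00),
    "llama3":        (0.00, 0.00),   # local Ollama models are tracked as free
}

def call_cost(model, input_tokens, output_tokens):
    """Cost of one call: tokens times the per-million-token rate."""
    in_rate, out_rate = PRICE_PER_MTOK.get(model, (0.0, 0.0))
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

# The session total is just the sum over recorded calls.
session_total = sum(call_cost(model, i, o) for model, i, o in [
    ("gpt-4o", 12_000, 800),
    ("claude-sonnet", 5_000, 1_200),
    ("llama3", 50_000, 4_000),       # free
])
print(f"${session_total:.4f}")  # → $0.0710
```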
## Supported providers

| Provider | Environment Variable | Works With |
|---|---|---|
| OpenAI | `OPENAI_BASE_URL=http://localhost:4700/proxy/openai/v1` | GPT-4o, o1, o3, any OpenAI model |
| Anthropic | `ANTHROPIC_BASE_URL=http://localhost:4700/proxy/anthropic` | Claude Opus, Sonnet, Haiku |
| Ollama | `OLLAMA_HOST=http://localhost:4700/proxy/ollama` | Llama, Mistral, Qwen, any local model |
Works with any tool that uses these SDKs: Claude Code, Cursor, Aider, LangChain, CrewAI, OpenAI Python/Node SDK, Anthropic SDK, and more.
## How it works

```
Your Agent / IDE / Script
          │
          ▼  (just an env var change)
     ┌─────────┐
     │ llmview │ ← intercepts, logs, calculates cost
     └────┬────┘
          │
          ▼  (forwards to real API)
OpenAI / Anthropic / Ollama
```
llmview is a reverse proxy. It receives the request, records it, forwards it to the real API, records the response, calculates the cost, and pushes everything to the dashboard via WebSocket. Streaming responses are forwarded chunk-by-chunk with zero added latency.
## Configuration

```bash
# Custom port (default: 4700)
llmview --port 8080

# Custom database path (default: ~/.llmview/llmview.db)
llmview --db /path/to/data.db

# Set a budget ceiling (proxy returns 402 when exceeded)
llmview --budget 5.00
```

llmview ships with built-in pricing for popular models (GPT-4o, Claude Sonnet, etc.). Local models (Ollama) are tracked as free. Pricing updates with new releases.
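The `--budget` check amounts to comparing the running session total against the ceiling before forwarding each call. A sketch of that guard (the function name is mine, not llmview's):

```python
def budget_guard(session_cost, budget):
    """Return an HTTP status to short-circuit with, or None to forward.

    Sketch of the --budget behavior: once the session total reaches the
    ceiling, the proxy refuses the call with 402 instead of forwarding it.
    """
    if budget is not None and session_cost >= budget:
        return 402   # Payment Required: the call never reaches the API
    return None

print(budget_guard(5.37, 5.00))  # → 402
print(budget_guard(1.20, 5.00))  # → None
```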
## HTTP API

llmview exposes a JSON API for programmatic access:

```bash
# Current session stats
curl http://localhost:4700/api/session

# List recent calls (quote the URL so "&" isn't treated as a shell operator)
curl 'http://localhost:4700/api/calls?limit=20&offset=0'

# Get full request/response for a specific call
curl http://localhost:4700/api/calls/{id}

# Replay a call with overrides
curl -X POST http://localhost:4700/api/replay \
  -H 'Content-Type: application/json' \
  -d '{"call_id":"abc123","overrides":{"model":"gpt-4o-mini"}}'

# Get config (budget info)
curl http://localhost:4700/api/config

# Health check
curl http://localhost:4700/api/health
```

## Building from source

```bash
git clone https://github.com/uucz/llmview.git
cd llmview
make build   # builds UI + Go binary
make test    # runs all tests
```

Requires: Go 1.25+, Node.js 18+ (for the UI build). No C compiler needed; the server is pure Go.
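The JSON API is easy to script against. As one example, the `/api/replay` call shown earlier is just a POST with a small JSON body; a Python helper for building it might look like this (the helper name and client-side shape are mine; the endpoint and field names come from the curl example):

```python
import json
import urllib.request

def build_replay(base_url, call_id, overrides):
    # Mirrors the `curl -X POST /api/replay` example: same endpoint,
    # same "call_id"/"overrides" fields. Helper name is illustrative.
    body = json.dumps({"call_id": call_id, "overrides": overrides}).encode()
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/replay",
        data=body, method="POST",
        headers={"Content-Type": "application/json"})

req = build_replay("http://localhost:4700", "abc123", {"model": "gpt-4o-mini"})
# Send it with urllib.request.urlopen(req) while llmview is running.
```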
## Troubleshooting

### macOS: "permission denied" when running the binary

Downloaded binaries don't have execute permission by default. Fix:

```bash
chmod +x llmview-darwin-arm64
./llmview-darwin-arm64
```

### macOS: "cannot be opened because the developer cannot be verified"

macOS Gatekeeper blocks unsigned binaries. Go to System Settings > Privacy & Security, scroll down, and click "Open Anyway". Or remove the quarantine attribute:

```bash
xattr -d com.apple.quarantine llmview-darwin-arm64
```

### "command not found: llmview"

If you downloaded the binary, you need to use the full path or `./` prefix:

```bash
# Run from current directory
./llmview-darwin-arm64

# Or move to a directory in your PATH
sudo mv llmview-darwin-arm64 /usr/local/bin/llmview
llmview
```

If you used `go install`, make sure `$GOPATH/bin` (usually `~/go/bin`) is in your PATH:

```bash
export PATH="$HOME/go/bin:$PATH"
```

### Common mistakes

```bash
# WRONG: "go" is for compiling source code, not running binaries
go llmview

# WRONG: "run" is not a system command
run llmview

# WRONG: just the filename doesn't work without PATH setup
llmview-darwin-arm64

# CORRECT:
./llmview-darwin-arm64
```

## Roadmap

- Real-time proxy + dashboard
- Token cost tracking
- Multi-provider support
- Request/response detail viewer with message thread parsing
- Export sessions to JSON/CSV
- Filter & search calls by provider, status, model
- Request replay with parameter overrides
- Budget enforcement with real-time progress bar
- Prompt diff viewer (compare request variations)
- VS Code extension
- Breakpoints (pause before dangerous operations)
- Session history browser
## License

MIT
If llmview saved you from a surprise API bill, consider giving it a star.