A self-hosted LiteLLM proxy that gives every developer on your team Claude Code access — using your existing cloud credits, without handing out API keys.
Two env vars per developer. That's the entire setup.
- Routes Claude Code traffic through a single proxy with weighted load balancing across Vertex AI, Bedrock, and Anthropic Direct
- Tracks per-developer cost, token usage, and model selection in PostgreSQL
- Enforces budget limits via virtual keys
- Enables prompt caching automatically through session affinity
- Provides automatic failover across providers
```
Developer machines (Claude Code CLI)
        │
        ▼
Nginx (TLS + session extraction via Lua)
        │
        ▼
LiteLLM Proxy (routing, auth, cost tracking)
        │
        ├── Vertex AI   (weight: 10)
        ├── AWS Bedrock (weight: 1)
        └── Anthropic   (weight: 1)
```
Everything runs on a single VM.
- A VM with Ubuntu (any cloud provider — AWS, GCP, DigitalOcean, etc.)
- A domain name pointed at your VM (A record)
- API credentials for at least one Claude provider
- Clone and configure:

  ```bash
  git clone https://github.com/your-org/claude-code-proxy.git
  cd claude-code-proxy
  cp env.example .env
  # Edit .env with your credentials
  ```

- Add your GCP credentials (if using Vertex AI):

  ```bash
  # Place your Application Default Credentials file
  cp /path/to/your/adc.json ./gcp-adc.json
  ```

- Deploy:

  ```bash
  chmod +x deploy.sh
  ./deploy.sh your-domain.com you@example.com
  ```

  This installs Docker and Nginx, provisions an SSL certificate via Let's Encrypt, and starts the stack.
- Create virtual keys for your developers via the LiteLLM admin dashboard at https://your-domain.com/ui.
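Keys can also be provisioned programmatically through LiteLLM's key-management API. The sketch below builds the JSON body for a `POST /key/generate` call; the field names follow LiteLLM's documented key-generation schema, but verify them against your LiteLLM version, and the alias, budget, and duration values are illustrative:

```python
import json

def build_key_request(developer: str, max_budget_usd: float, duration: str = "30d") -> dict:
    """Build the JSON body for LiteLLM's POST /key/generate endpoint.

    The proxy enforces max_budget per key, which is how per-developer
    budget limits work here.
    """
    return {
        "key_alias": f"claude-code-{developer}",   # human-readable label per developer
        "max_budget": max_budget_usd,              # hard spend cap in USD
        "duration": duration,                      # key lifetime, e.g. "30d"
        "metadata": {"tool": "claude-code"},
    }

body = build_key_request("alice", 50.0)
print(json.dumps(body))
# Send with: curl -X POST https://your-domain.com/key/generate \
#   -H "Authorization: Bearer $LITELLM_MASTER_KEY" \
#   -H "Content-Type: application/json" -d "$BODY"
```

The returned `sk-...` key is what each developer sets as `ANTHROPIC_AUTH_TOKEN`.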
- Developer setup (2 minutes per person):

  ```bash
  echo 'export ANTHROPIC_BASE_URL=https://your-domain.com/v1' >> ~/.bashrc
  echo 'export ANTHROPIC_AUTH_TOKEN=sk-...' >> ~/.bashrc
  source ~/.bashrc
  ```

  Done. `claude` works as normal.
You need credentials for at least one provider. Configure all three for maximum reliability and credit utilization.
The simplest option. Create an API key at console.anthropic.com:
- Sign up or log in at console.anthropic.com
- Go to API Keys and create a new key
- Add to your `.env`:

  ```bash
  ANTHROPIC_API_KEY=sk-ant-your-key-here
  ```

Use this to route traffic through your GCP cloud credits.
- Enable the Vertex AI API in your GCP project:

  ```bash
  gcloud services enable aiplatform.googleapis.com
  ```

- Enable the Claude models you need. Go to Vertex AI Model Garden and enable Claude Opus, Sonnet, and/or Haiku.

- Create Application Default Credentials:

  ```bash
  # Option A: User credentials (development)
  gcloud auth application-default login
  cp ~/.config/gcloud/application_default_credentials.json ./gcp-adc.json

  # Option B: Service account (production, recommended)
  gcloud iam service-accounts create litellm-proxy
  gcloud projects add-iam-policy-binding YOUR_PROJECT_ID \
    --member="serviceAccount:litellm-proxy@YOUR_PROJECT_ID.iam.gserviceaccount.com" \
    --role="roles/aiplatform.user"
  gcloud iam service-accounts keys create ./gcp-adc.json \
    --iam-account=litellm-proxy@YOUR_PROJECT_ID.iam.gserviceaccount.com
  ```

- Add to your `.env`:

  ```bash
  VERTEX_PROJECT=your-gcp-project-id
  VERTEX_LOCATION=us-east5  # Region where Claude is available
  ```

The `gcp-adc.json` file is mounted into the container automatically by `docker-compose.yml`.
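For reference, that mount typically looks something like the fragment below. The paths and service name here are illustrative, not copied from the repo; check the actual `docker-compose.yml` for the real values:

```yaml
services:
  litellm:
    # ...
    volumes:
      # Mount the ADC file created above into the container (read-only)
      - ./gcp-adc.json:/app/gcp-adc.json:ro
    environment:
      # Point Google client libraries at the mounted credentials
      - GOOGLE_APPLICATION_CREDENTIALS=/app/gcp-adc.json
```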
Use this to route traffic through your AWS cloud credits.
- Enable Claude model access in the AWS console:
  - Go to Amazon Bedrock in your preferred region
  - Navigate to Model access in the left sidebar
  - Request access to the Anthropic Claude models you need
  - Wait for access to be granted (usually instant for on-demand)

- Create an IAM user with Bedrock permissions:

  ```bash
  aws iam create-user --user-name litellm-proxy

  # Attach the Bedrock policy
  aws iam attach-user-policy \
    --user-name litellm-proxy \
    --policy-arn arn:aws:iam::aws:policy/AmazonBedrockFullAccess

  # Create access keys
  aws iam create-access-key --user-name litellm-proxy
  ```

- Add to your `.env`:

  ```bash
  AWS_ACCESS_KEY_ID=AKIA...
  AWS_SECRET_ACCESS_KEY=...
  AWS_REGION=us-east-1  # Region where you enabled Claude
  ```

The `weight` parameter in `litellm-config.yaml` controls what percentage of traffic goes to each provider. Set weights to match your available cloud-credit ratio.
Each model is defined three times (once per provider) under the same `model_name`. The router picks a provider using weighted random selection:
```yaml
# Example: 10x more GCP credits than AWS or Anthropic
- model_name: claude-sonnet-4-6
  litellm_params:
    model: vertex_ai/claude-sonnet-4-6
    weight: 10   # ~83% of traffic

- model_name: claude-sonnet-4-6
  litellm_params:
    model: bedrock/us.anthropic.claude-sonnet-4-6
    weight: 1    # ~8% of traffic

- model_name: claude-sonnet-4-6
  litellm_params:
    model: anthropic/claude-sonnet-4-6
    weight: 1    # ~8% of traffic
```

| Scenario | Vertex | Bedrock | Anthropic | Result |
|---|---|---|---|---|
| Heavy GCP credits | 10 | 1 | 1 | ~83% GCP, ~8% each AWS/Anthropic |
| Equal credits | 1 | 1 | 1 | ~33% each |
| GCP only | 1 | 0 | 0 | 100% GCP (remove other entries) |
| GCP + AWS, no direct | 5 | 5 | 0 | 50/50 (remove Anthropic entries) |
| Anthropic only | 0 | 0 | 1 | 100% direct (remove other entries) |
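Weighted routing amounts to random selection with probability proportional to each deployment's weight. A minimal sketch of the idea (not LiteLLM's actual implementation; the provider names and weights mirror the example above):

```python
import random

def pick_provider(deployments: list[tuple[str, float]], rng: random.Random) -> str:
    """Pick one deployment, with probability proportional to its weight."""
    providers = [name for name, _ in deployments]
    weights = [w for _, w in deployments]
    return rng.choices(providers, weights=weights, k=1)[0]

deployments = [("vertex_ai", 10), ("bedrock", 1), ("anthropic", 1)]
rng = random.Random(0)  # seeded so the demo is reproducible
counts = {name: 0 for name, _ in deployments}
for _ in range(12_000):
    counts[pick_provider(deployments, rng)] += 1

# With weights 10:1:1, vertex_ai receives ~10/12 of traffic (about 83%)
print({k: round(v / 12_000, 2) for k, v in counts.items()})
```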
To change the ratio, edit `litellm-config.yaml` and restart:

```bash
sudo docker compose restart litellm
```

If you only have credentials for one or two providers, simply delete the model entries you don't need from `litellm-config.yaml`. For example, to use only Anthropic Direct, keep only the `anthropic/` entries and remove all `vertex_ai/` and `bedrock/` entries.
The Lua script (`extract_session.lua`) automatically extracts Claude Code's `session_id` from request bodies and pins each session to the same provider for 4 hours. This enables prompt caching with zero developer configuration; prompt caching can reduce costs by up to 90% on cached prefixes.
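Conceptually, the affinity logic works like this. The sketch below is a simplified Python version of what the Lua script does inside Nginx; the exact location of `session_id` in the request body and the hash-based provider choice are assumptions for illustration:

```python
import json

AFFINITY_TTL = 4 * 3600  # pin a session to one provider for 4 hours
_pins: dict[str, tuple[str, float]] = {}  # session_id -> (provider, expiry)

def route(request_body: bytes, providers: list[str], now: float) -> str:
    """Return the provider for this request, reusing the session's pin if valid.

    Pinning every request in a session to one provider keeps the prompt
    cache warm, since cached prefixes live per provider.
    """
    # Assumed body shape for illustration; the real script parses Claude
    # Code's actual request format.
    session_id = json.loads(request_body).get("metadata", {}).get("session_id")
    if session_id is None:
        return providers[0]  # no session: fall back to default routing
    pin = _pins.get(session_id)
    if pin is not None and pin[1] > now:
        return pin[0]  # pin still valid: same provider as before
    provider = providers[hash(session_id) % len(providers)]
    _pins[session_id] = (provider, now + AFFINITY_TTL)
    return provider

providers = ["vertex_ai", "bedrock", "anthropic"]
body = json.dumps({"metadata": {"session_id": "abc-123"}}).encode()
# Requests within the 4-hour window land on the same provider
assert route(body, providers, now=0.0) == route(body, providers, now=3600.0)
```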
| File | Purpose |
|---|---|
| `deploy.sh` | One-command deployment (Docker, Nginx, SSL, containers) |
| `docker-compose.yml` | LiteLLM + PostgreSQL service definitions |
| `litellm-config.yaml` | Model routing, weights, and general settings |
| `nginx.conf` | Reverse proxy with TLS and Lua session extraction |
| `extract_session.lua` | Extracts session ID from request body for routing affinity |
| `setup-claude-session.sh` | Optional shell wrapper for session ID injection |
| `env.example` | Template for required environment variables |
MIT