This document describes the actual behavior of the current Go codebase.
- Basics
- Configuration Best Practice
- Authentication
- Route Index
- Health Endpoints
- OpenAI-Compatible API
- Claude-Compatible API
- Gemini-Compatible API
- Admin API
- Error Payloads
- cURL Examples
| Item | Details |
|---|---|
| Base URL | http://localhost:5001 or your deployment domain |
| Default Content-Type | application/json |
| Health probes | GET /healthz, GET /readyz |
| CORS | Enabled (Access-Control-Allow-Origin: *, allows Content-Type, Authorization, X-API-Key, X-Ds2-Target-Account, X-Vercel-Protection-Bypass) |
Use config.json as the single source of truth:
cp config.example.json config.json
# Edit config.json (keys/accounts)
Use it per deployment mode:
- Local run: read config.json directly
- Docker / Vercel: generate Base64 from config.json, then set DS2API_CONFIG_JSON
DS2API_CONFIG_JSON="$(base64 < config.json | tr -d '\n')"
For Vercel one-click bootstrap, you can set only DS2API_ADMIN_KEY first, then import the config at /admin and sync env vars from the "Vercel Sync" page.
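The shell one-liner above can also be reproduced in Go, which may be handy in a deploy script. This is a minimal sketch: the function name encodeConfig and the tiny sample config are illustrative assumptions, not part of DS2API.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// encodeConfig returns the single-line standard Base64 form of a raw
// config.json payload, equivalent to `base64 < config.json | tr -d '\n'`.
func encodeConfig(raw []byte) string {
	return base64.StdEncoding.EncodeToString(raw)
}

func main() {
	// Hypothetical minimal config; a real script would load the file
	// with os.ReadFile("config.json") instead.
	raw := []byte(`{"keys":[]}`)
	fmt.Println("DS2API_CONFIG_JSON=" + encodeConfig(raw))
}
```

Go's StdEncoding never inserts line breaks, so no extra stripping step is needed.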
Two header formats are accepted:
| Method | Example |
|---|---|
| Bearer Token | Authorization: Bearer <token> |
| API Key Header | x-api-key: <token> (no Bearer prefix) |
Auth behavior:
- Token is in config.keys → Managed account mode: DS2API auto-selects an account via rotation
- Token is not in config.keys → Direct token mode: treated as a DeepSeek token directly
Optional header: X-Ds2-Target-Account: <email_or_mobile> — Pin a specific managed account.
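As an illustration of the header rules above, the following sketch builds a business request with the Bearer form and an optional pinned account. The helper name newBusinessRequest is hypothetical; only the header names and the base URL come from this document.

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
)

// newBusinessRequest builds a POST request carrying the documented auth headers.
// An empty targetAccount means no managed account is pinned.
func newBusinessRequest(baseURL, path, token, targetAccount string, body []byte) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodPost, baseURL+path, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+token)
	req.Header.Set("Content-Type", "application/json")
	if targetAccount != "" {
		req.Header.Set("X-Ds2-Target-Account", targetAccount)
	}
	return req, nil
}

func main() {
	body := []byte(`{"model":"deepseek-chat","messages":[{"role":"user","content":"Hello"}]}`)
	req, err := newBusinessRequest("http://localhost:5001", "/v1/chat/completions",
		"your-api-key", "user@example.com", body)
	if err != nil {
		panic(err)
	}
	// The request is only constructed here; sending it would use http.DefaultClient.Do(req).
	fmt.Println(req.Method, req.URL.String())
	fmt.Println("Authorization:", req.Header.Get("Authorization"))
	fmt.Println("X-Ds2-Target-Account:", req.Header.Get("X-Ds2-Target-Account"))
}
```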
| Endpoint | Auth |
|---|---|
| POST /admin/login | Public |
| GET /admin/verify | Authorization: Bearer <jwt> (JWT only) |
| Other /admin/* | Authorization: Bearer <jwt> or Authorization: Bearer <admin_key> |
| Method | Path | Auth | Description |
|---|---|---|---|
| GET | /healthz | None | Liveness probe |
| GET | /readyz | None | Readiness probe |
| GET | /v1/models | None | OpenAI model list |
| GET | /v1/models/{id} | None | OpenAI single-model query (alias accepted) |
| POST | /v1/chat/completions | Business | OpenAI chat completions |
| POST | /v1/responses | Business | OpenAI Responses API (stream/non-stream) |
| GET | /v1/responses/{response_id} | Business | Query stored response (in-memory TTL) |
| POST | /v1/embeddings | Business | OpenAI Embeddings API |
| GET | /anthropic/v1/models | None | Claude model list |
| POST | /anthropic/v1/messages | Business | Claude messages |
| POST | /anthropic/v1/messages/count_tokens | Business | Claude token counting |
| POST | /v1/messages | Business | Claude shortcut path |
| POST | /messages | Business | Claude shortcut path |
| POST | /v1/messages/count_tokens | Business | Claude token counting shortcut |
| POST | /messages/count_tokens | Business | Claude token counting shortcut |
| POST | /v1beta/models/{model}:generateContent | Business | Gemini non-stream |
| POST | /v1beta/models/{model}:streamGenerateContent | Business | Gemini stream |
| POST | /v1/models/{model}:generateContent | Business | Gemini non-stream compat path |
| POST | /v1/models/{model}:streamGenerateContent | Business | Gemini stream compat path |
| POST | /admin/login | None | Admin login |
| GET | /admin/verify | JWT | Verify admin JWT |
| GET | /admin/vercel/config | Admin | Read preconfigured Vercel creds |
| GET | /admin/config | Admin | Read sanitized config |
| POST | /admin/config | Admin | Update config |
| GET | /admin/settings | Admin | Read runtime settings |
| PUT | /admin/settings | Admin | Update runtime settings (hot reload) |
| POST | /admin/settings/password | Admin | Update admin password and invalidate old JWTs |
| POST | /admin/config/import | Admin | Import config (merge/replace) |
| GET | /admin/config/export | Admin | Export full config (config/json/base64) |
| POST | /admin/keys | Admin | Add API key |
| DELETE | /admin/keys/{key} | Admin | Delete API key |
| GET | /admin/accounts | Admin | Paginated account list |
| POST | /admin/accounts | Admin | Add account |
| DELETE | /admin/accounts/{identifier} | Admin | Delete account |
| GET | /admin/queue/status | Admin | Account queue status |
| POST | /admin/accounts/test | Admin | Test one account |
| POST | /admin/accounts/test-all | Admin | Test all accounts |
| POST | /admin/import | Admin | Batch import keys/accounts |
| POST | /admin/test | Admin | Test API through service |
| POST | /admin/vercel/sync | Admin | Sync config to Vercel |
| GET | /admin/vercel/status | Admin | Vercel sync status |
| GET | /admin/export | Admin | Export config JSON/Base64 |
| GET | /admin/dev/captures | Admin | Read local packet-capture entries |
| DELETE | /admin/dev/captures | Admin | Clear local packet-capture entries |
GET /healthz returns {"status": "ok"} and GET /readyz returns {"status": "ready"}.
GET /v1/models requires no auth and returns the supported models.
Response:
{
"object": "list",
"data": [
{"id": "deepseek-chat", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []},
{"id": "deepseek-reasoner", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []},
{"id": "deepseek-chat-search", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []},
{"id": "deepseek-reasoner-search", "object": "model", "created": 1677610602, "owned_by": "deepseek", "permission": []}
]
}
For chat / responses / embeddings, DS2API follows a wide-input/strict-output policy:
- Match DeepSeek native model IDs first.
- Then match exact keys in model_aliases.
- If still unmatched, fall back by known family heuristics (o*, gpt-*, claude-*, etc.).
- If still unmatched, return invalid_request_error.
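The resolution order above can be sketched as a small lookup chain. This is only an illustration of the ordering: resolveModel, nativeModels, and errInvalidModel are hypothetical names, and the targets chosen for the family heuristics are assumptions, since this document does not say which DeepSeek model each family maps to.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// nativeModels lists the DeepSeek model IDs shown in the /v1/models example.
var nativeModels = map[string]bool{
	"deepseek-chat":            true,
	"deepseek-reasoner":        true,
	"deepseek-chat-search":     true,
	"deepseek-reasoner-search": true,
}

// errInvalidModel stands in for the invalid_request_error response.
var errInvalidModel = errors.New("invalid_request_error: unknown model")

// resolveModel applies the documented order: native ID, exact alias,
// family heuristic, then error.
func resolveModel(name string, aliases map[string]string) (string, error) {
	if nativeModels[name] {
		return name, nil
	}
	if target, ok := aliases[name]; ok {
		return target, nil
	}
	switch {
	case strings.HasPrefix(name, "gpt-"), strings.HasPrefix(name, "claude-"):
		return "deepseek-chat", nil // assumed fallback target
	case len(name) > 1 && name[0] == 'o' && name[1] >= '0' && name[1] <= '9':
		return "deepseek-reasoner", nil // assumed fallback target for o* names
	}
	return "", errInvalidModel
}

func main() {
	// Hypothetical model_aliases entry.
	aliases := map[string]string{"my-alias": "deepseek-reasoner"}
	for _, name := range []string{"deepseek-chat", "my-alias", "gpt-4o", "o3", "unknown-model"} {
		resolved, err := resolveModel(name, aliases)
		fmt.Println(name, "->", resolved, err)
	}
}
```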
Headers:
Authorization: Bearer your-api-key
Content-Type: application/json
Request body:
| Field | Type | Required | Notes |
|---|---|---|---|
| model | string | ✅ | DeepSeek native models + common aliases (gpt-4o, gpt-5-codex, o3, claude-sonnet-4-5, etc.) |
| messages | array | ✅ | OpenAI-style messages |
| stream | boolean | ❌ | Default false |
| tools | array | ❌ | Function calling schema |
| temperature, etc. | any | ❌ | Accepted but final behavior depends on upstream |
{
"id": "<chat_session_id>",
"object": "chat.completion",
"created": 1738400000,
"model": "deepseek-reasoner",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "final response",
"reasoning_content": "reasoning trace (reasoner models)"
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 10,
"completion_tokens": 20,
"total_tokens": 30,
"completion_tokens_details": {
"reasoning_tokens": 5
}
}
}
SSE format: each frame is data: <json>\n\n, terminated by data: [DONE].
data: {"id":"...","object":"chat.completion.chunk","choices":[{"delta":{"role":"assistant"},"index":0}]}
data: {"id":"...","object":"chat.completion.chunk","choices":[{"delta":{"reasoning_content":"..."},"index":0}]}
data: {"id":"...","object":"chat.completion.chunk","choices":[{"delta":{"content":"..."},"index":0}]}
data: {"id":"...","object":"chat.completion.chunk","choices":[{"delta":{},"index":0,"finish_reason":"stop"}],"usage":{...}}
data: [DONE]
Field notes:
- First delta includes role: assistant
- deepseek-reasoner / deepseek-reasoner-search models emit delta.reasoning_content
- Text emits delta.content
- Last chunk includes finish_reason and usage
When tools is present, DS2API performs anti-leak handling:
Non-stream: If a tool call is detected, the response returns message.tool_calls, finish_reason=tool_calls, and message.content=null.
{
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": null,
"tool_calls": [
{
"id": "call_xxx",
"type": "function",
"function": {
"name": "get_weather",
"arguments": "{\"city\":\"beijing\"}"
}
}
]
},
"finish_reason": "tool_calls"
}
]
}
Stream: Once high-confidence tool-call features are matched, DS2API emits delta.tool_calls immediately (without waiting for the full JSON to close), then keeps sending argument deltas; confirmed raw tool JSON is never forwarded as delta.content.
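On the client side, the non-stream tool-call response needs two decoding passes, because function.arguments is itself a JSON-encoded string. A minimal sketch, assuming the response shape shown above; firstToolCall and the struct names are illustrative.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// toolCall mirrors one entry of message.tool_calls.
type toolCall struct {
	ID       string `json:"id"`
	Type     string `json:"type"`
	Function struct {
		Name      string `json:"name"`
		Arguments string `json:"arguments"` // JSON document encoded as a string
	} `json:"function"`
}

// completion mirrors only the fields this sketch reads.
type completion struct {
	Choices []struct {
		Message struct {
			Content   *string    `json:"content"` // null when tool calls are returned
			ToolCalls []toolCall `json:"tool_calls"`
		} `json:"message"`
		FinishReason string `json:"finish_reason"`
	} `json:"choices"`
}

// firstToolCall returns the first tool call's name and decoded arguments.
// ok is false when the response is a plain text completion.
func firstToolCall(body []byte) (name string, args map[string]any, ok bool, err error) {
	var c completion
	if err = json.Unmarshal(body, &c); err != nil {
		return "", nil, false, err
	}
	if len(c.Choices) == 0 || c.Choices[0].FinishReason != "tool_calls" ||
		len(c.Choices[0].Message.ToolCalls) == 0 {
		return "", nil, false, nil
	}
	tc := c.Choices[0].Message.ToolCalls[0]
	if err = json.Unmarshal([]byte(tc.Function.Arguments), &args); err != nil {
		return "", nil, false, err
	}
	return tc.Function.Name, args, true, nil
}

// sampleToolResponse is the documented example collapsed onto one line.
const sampleToolResponse = `{"choices":[{"index":0,"message":{"role":"assistant","content":null,"tool_calls":[{"id":"call_xxx","type":"function","function":{"name":"get_weather","arguments":"{\"city\":\"beijing\"}"}}]},"finish_reason":"tool_calls"}]}`

func main() {
	name, args, ok, err := firstToolCall([]byte(sampleToolResponse))
	if err != nil {
		panic(err)
	}
	fmt.Println(ok, name, args["city"])
}
```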
GET /v1/models/{id}: No auth required. Alias values are accepted as path params (for example gpt-4o), and the returned object is the mapped DeepSeek model.
POST /v1/responses: OpenAI Responses-style endpoint, accepting either input or messages.
| Field | Type | Required | Notes |
|---|---|---|---|
| model | string | ✅ | Supports native models + alias mapping |
| input | string/array/object | ❌ | One of input or messages is required |
| messages | array | ❌ | One of input or messages is required |
| instructions | string | ❌ | Prepended as a system message |
| stream | boolean | ❌ | Default false |
| tools | array | ❌ | Same tool detection/translation policy as chat |
| tool_choice | string/object | ❌ | Supports auto/none/required and forced function selection ({"type":"function","name":"..."}) |
Non-stream: Returns a standard response object with an ID like resp_xxx, and stores it in in-memory TTL cache.
If tool_choice=required and no valid tool call is produced, DS2API returns HTTP 422 (error.code=tool_choice_violation).
Stream (SSE): minimal event sequence:
event: response.created
data: {"type":"response.created","id":"resp_xxx","status":"in_progress",...}
event: response.output_item.added
data: {"type":"response.output_item.added","response_id":"resp_xxx","item":{"type":"message|function_call",...},...}
event: response.content_part.added
data: {"type":"response.content_part.added","response_id":"resp_xxx","part":{"type":"output_text",...},...}
event: response.output_text.delta
data: {"type":"response.output_text.delta","response_id":"resp_xxx","item_id":"msg_xxx","output_index":0,"content_index":0,"delta":"..."}
event: response.function_call_arguments.delta
data: {"type":"response.function_call_arguments.delta","response_id":"resp_xxx","call_id":"call_xxx","delta":"..."}
event: response.function_call_arguments.done
data: {"type":"response.function_call_arguments.done","response_id":"resp_xxx","call_id":"call_xxx","name":"tool","arguments":"{...}"}
event: response.content_part.done
data: {"type":"response.content_part.done","response_id":"resp_xxx",...}
event: response.output_item.done
data: {"type":"response.output_item.done","response_id":"resp_xxx","item":{"type":"message|function_call",...},...}
event: response.completed
data: {"type":"response.completed","response":{...}}
data: [DONE]
If tool_choice=required is violated in stream mode, DS2API emits response.failed then [DONE] (no response.completed).
Unknown tool names (outside declared tools) are rejected and will not be emitted as valid tool calls.
Business auth required. Fetches cached responses created by POST /v1/responses (caller-scoped; only the same key/token can read).
Backed by an in-memory TTL store. Default TTL is 900s (configurable via responses.store_ttl_seconds).
Business auth required. Returns OpenAI-compatible embeddings shape.
| Field | Type | Required | Notes |
|---|---|---|---|
| model | string | ✅ | Supports native models + alias mapping |
| input | string/array | ✅ | Supports string, string array, token array |
Requires embeddings.provider. Currently supported values: mock / deterministic / builtin. If missing or unsupported, returns the standard error shape with HTTP 501.
Besides /anthropic/v1/*, DS2API also supports shortcut paths: /v1/messages, /messages, /v1/messages/count_tokens, /messages/count_tokens.
GET /anthropic/v1/models: No auth required.
Response:
{
"object": "list",
"data": [
{"id": "claude-sonnet-4-5", "object": "model", "created": 1715635200, "owned_by": "anthropic"},
{"id": "claude-haiku-4-5", "object": "model", "created": 1715635200, "owned_by": "anthropic"},
{"id": "claude-opus-4-6", "object": "model", "created": 1715635200, "owned_by": "anthropic"}
],
"first_id": "claude-opus-4-6",
"last_id": "claude-instant-1.0",
"has_more": false
}
Note: the example is partial; the real response includes historical Claude 1.x/2.x/3.x/4.x IDs and common aliases.
Headers:
x-api-key: your-api-key
Content-Type: application/json
anthropic-version: 2023-06-01
anthropic-version is optional; DS2API auto-fills 2023-06-01 when absent.
Request body:
| Field | Type | Required | Notes |
|---|---|---|---|
| model | string | ✅ | For example claude-sonnet-4-5 / claude-opus-4-6 / claude-haiku-4-5 (compatible with claude-3-5-haiku-latest), plus historical Claude model IDs |
| messages | array | ✅ | Claude-style messages |
| max_tokens | number | ❌ | Auto-filled to 8192 when omitted; not strictly enforced by upstream bridge |
| stream | boolean | ❌ | Default false |
| system | string | ❌ | Optional system prompt |
| tools | array | ❌ | Claude tool schema |
{
"id": "msg_1738400000000000000",
"type": "message",
"role": "assistant",
"model": "claude-sonnet-4-5",
"content": [
{"type": "text", "text": "response"}
],
"stop_reason": "end_turn",
"stop_sequence": null,
"usage": {
"input_tokens": 12,
"output_tokens": 34
}
}
If tool use is detected, stop_reason becomes tool_use and content contains tool_use blocks.
SSE uses paired event: + data: lines. Event type is also in JSON type.
event: message_start
data: {"type":"message_start","message":{...}}
event: content_block_start
data: {"type":"content_block_start","index":0,"content_block":{"type":"text","text":""}}
event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"hello"}}
event: ping
data: {"type":"ping"}
event: content_block_stop
data: {"type":"content_block_stop","index":0}
event: message_delta
data: {"type":"message_delta","delta":{"stop_reason":"end_turn","stop_sequence":null},"usage":{"output_tokens":12}}
event: message_stop
data: {"type":"message_stop"}
Notes:
- Models whose names contain opus / reasoner / slow stream thinking_delta
- signature_delta is not emitted (DeepSeek does not provide verifiable thinking signatures)
- In tools mode, the stream avoids leaking raw tool JSON and does not force input_json_delta
Request:
{
"model": "claude-sonnet-4-5",
"messages": [
{"role": "user", "content": "Hello"}
]
}
Response:
{
"input_tokens": 5
}
Supported paths:
- /v1beta/models/{model}:generateContent
- /v1beta/models/{model}:streamGenerateContent
- /v1/models/{model}:generateContent (compat path)
- /v1/models/{model}:streamGenerateContent (compat path)
Authentication is the same as other business routes (Authorization: Bearer <token> or x-api-key).
Request body accepts Gemini-style contents / tools. Model names can use aliases and are mapped to DeepSeek models.
Response uses Gemini-compatible fields, including:
- candidates[].content.parts[].text
- candidates[].content.parts[].functionCall (when a tool call is produced)
- usageMetadata (promptTokenCount / candidatesTokenCount / totalTokenCount)
streamGenerateContent returns SSE (text/event-stream), each chunk as data: <json>:
- regular text: incremental text chunks
- tools mode: buffered and emitted as functionCall at finalize phase
- final chunk: includes finishReason: "STOP" and usageMetadata
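The Gemini-compatible fields listed above can be decoded with a small struct. This is a sketch: the sample payload is invented for illustration, summarize is a hypothetical helper, and the name/args keys inside functionCall follow Gemini's public format, which this document does not spell out.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// geminiResponse mirrors only the fields named in this section.
type geminiResponse struct {
	Candidates []struct {
		Content struct {
			Parts []struct {
				Text         string `json:"text"`
				FunctionCall *struct {
					Name string         `json:"name"` // assumed key
					Args map[string]any `json:"args"` // assumed key
				} `json:"functionCall"`
			} `json:"parts"`
		} `json:"content"`
		FinishReason string `json:"finishReason"`
	} `json:"candidates"`
	UsageMetadata struct {
		PromptTokenCount     int `json:"promptTokenCount"`
		CandidatesTokenCount int `json:"candidatesTokenCount"`
		TotalTokenCount      int `json:"totalTokenCount"`
	} `json:"usageMetadata"`
}

// summarize joins all text parts, lists function-call names, and returns
// the reported total token count.
func summarize(body []byte) (text string, calls []string, totalTokens int, err error) {
	var resp geminiResponse
	if err = json.Unmarshal(body, &resp); err != nil {
		return "", nil, 0, err
	}
	for _, cand := range resp.Candidates {
		for _, part := range cand.Content.Parts {
			text += part.Text
			if part.FunctionCall != nil {
				calls = append(calls, part.FunctionCall.Name)
			}
		}
	}
	return text, calls, resp.UsageMetadata.TotalTokenCount, nil
}

// sampleGemini is a made-up non-stream response in the documented shape.
const sampleGemini = `{"candidates":[{"content":{"parts":[{"text":"Go is "},{"text":"fast."}]},"finishReason":"STOP"}],"usageMetadata":{"promptTokenCount":3,"candidatesTokenCount":4,"totalTokenCount":7}}`

func main() {
	text, calls, total, err := summarize([]byte(sampleGemini))
	if err != nil {
		panic(err)
	}
	fmt.Println(text, len(calls), total)
}
```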
POST /admin/login: Public endpoint.
Request:
{
"admin_key": "admin",
"expire_hours": 24
}
expire_hours is optional; the default is 24.
Response:
{
"success": true,
"token": "<jwt>",
"expires_in": 86400
}
GET /admin/verify requires a JWT: Authorization: Bearer <jwt>
Response:
{
"valid": true,
"expires_at": 1738400000,
"remaining_seconds": 72000
}
GET /admin/vercel/config returns the Vercel preconfiguration status.
{
"has_token": true,
"project_id": "prj_xxx",
"team_id": null
}
GET /admin/config returns the sanitized config.
{
"keys": ["k1", "k2"],
"accounts": [
{
"identifier": "user@example.com",
"email": "user@example.com",
"mobile": "",
"has_password": true,
"has_token": true,
"token_preview": "abcde..."
}
],
"claude_mapping": {
"fast": "deepseek-chat",
"slow": "deepseek-reasoner"
}
}
POST /admin/config updatable fields: keys, accounts, claude_mapping.
Request:
{
"keys": ["k1", "k2"],
"accounts": [
{"email": "user@example.com", "password": "pwd", "token": ""}
],
"claude_mapping": {
"fast": "deepseek-chat",
"slow": "deepseek-reasoner"
}
}
GET /admin/settings reads runtime settings and status, including:
- admin (JWT expiry, default-password warning, etc.)
- runtime (account_max_inflight, account_max_queue, global_max_inflight)
- toolcall / responses / embeddings
- auto_delete (sessions)
- claude_mapping / model_aliases
- env_backed, needs_vercel_sync
PUT /admin/settings hot-updates runtime settings. Supported fields:
- admin.jwt_expire_hours
- runtime.account_max_inflight / runtime.account_max_queue / runtime.global_max_inflight
- toolcall.mode / toolcall.early_emit_confidence
- responses.store_ttl_seconds
- embeddings.provider
- auto_delete.sessions
- claude_mapping
- model_aliases
Updates admin password and invalidates existing JWTs.
Request example:
{"new_password":"your-new-password"}
POST /admin/config/import imports the full config with:
- mode=merge (default)
- mode=replace
The request can send config directly, or wrapped as {"config": {...}, "mode":"merge"}.
Exports full config in three forms: config, json, and base64.
POST /admin/keys request: {"key": "new-api-key"}
Response: {"success": true, "total_keys": 3}
DELETE /admin/keys/{key} response: {"success": true, "total_keys": 2}
Query params:
| Param | Default | Range |
|---|---|---|
| page | 1 | ≥ 1 |
| page_size | 10 | 1–100 |
Response:
{
"items": [
{
"identifier": "user@example.com",
"email": "user@example.com",
"mobile": "",
"has_password": true,
"has_token": true,
"token_preview": "abc..."
}
],
"total": 25,
"page": 1,
"page_size": 10,
"total_pages": 3
}
POST /admin/accounts request: {"email": "user@example.com", "password": "pwd"}
Response: {"success": true, "total_accounts": 6}
For DELETE /admin/accounts/{identifier}, identifier can be email, mobile, or the synthetic id for token-only accounts (token:<hash>).
Response: {"success": true, "total_accounts": 5}
{
"available": 3,
"in_use": 1,
"total": 4,
"available_accounts": ["a@example.com"],
"in_use_accounts": ["b@example.com"],
"max_inflight_per_account": 2,
"recommended_concurrency": 8
}
| Field | Description |
|---|---|
| available | Currently available accounts |
| in_use | Currently in-use accounts |
| total | Total accounts |
| max_inflight_per_account | Per-account inflight limit |
| recommended_concurrency | Suggested concurrency (total × max_inflight_per_account) |
| Field | Required | Notes |
|---|---|---|
| identifier | ✅ | email / mobile / token-only synthetic id |
| model | ❌ | default deepseek-chat |
| message | ❌ | if empty, only session creation is tested |
Response:
{
"account": "user@example.com",
"success": true,
"response_time": 1240,
"message": "API test successful (session creation only)",
"model": "deepseek-chat"
}
POST /admin/accounts/test-all accepts one optional request field: model.
{
"total": 5,
"success": 4,
"failed": 1,
"results": [...]
}
Batch import keys and accounts.
Request:
{
"keys": ["k1", "k2"],
"accounts": [
{"email": "user@example.com", "password": "pwd", "token": ""}
]
}
Response:
{
"success": true,
"imported_keys": 2,
"imported_accounts": 1
}
Test API availability through the service itself.
| Field | Required | Default |
|---|---|---|
| model | ❌ | deepseek-chat |
| message | ❌ | 你好 |
| api_key | ❌ | First key in config |
Response:
{
"success": true,
"status_code": 200,
"response": {"id": "..."}
}
| Field | Required | Notes |
|---|---|---|
| vercel_token | ❌ | If empty or __USE_PRECONFIG__, read env |
| project_id | ❌ | Fallback: VERCEL_PROJECT_ID |
| team_id | ❌ | Fallback: VERCEL_TEAM_ID |
| auto_validate | ❌ | Default true |
| save_credentials | ❌ | Default true |
Success response:
{
"success": true,
"validated_accounts": 3,
"message": "Config synced, redeploying...",
"deployment_url": "https://..."
}
Or manual deploy required:
{
"success": true,
"validated_accounts": 3,
"message": "Config synced to Vercel, please trigger redeploy manually",
"manual_deploy_required": true
}
GET /admin/vercel/status response:
{
"synced": true,
"last_sync_time": 1738400000,
"has_synced_before": true
}
GET /admin/export response:
{
"json": "{...}",
"base64": "ey4uLn0="
}
GET /admin/dev/captures reads local packet-capture status and recent entries (Admin auth required):
- enabled
- limit
- max_body_bytes
- items
DELETE /admin/dev/captures clears packet-capture entries:
{"success":true,"detail":"capture logs cleared"}
Compatible routes (/v1/*, /anthropic/*) use the same error envelope:
{
"error": {
"message": "...",
"type": "invalid_request_error",
"code": "invalid_request",
"param": null
}
}
Admin routes keep {"detail":"..."}.
Gemini routes use Google-style errors:
{
"error": {
"code": 400,
"message": "invalid json",
"status": "INVALID_ARGUMENT"
}
}
Clients should handle the HTTP status code plus the error / detail fields.
Common status codes:
| Code | Meaning |
|---|---|
| 401 | Authentication failed (invalid key/token, or expired admin JWT) |
| 429 | Too many requests (exceeded inflight + queue capacity) |
| 503 | Model unavailable or upstream error |
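Since the three route families return different error shapes, a client may want one normalizing helper. The sketch below reads only the message or detail text and ignores code, whose type differs between the OpenAI-style and Gemini-style payloads. parseError, apiError, and the retry rule are illustrative assumptions, not DS2API behavior.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// apiError is a normalized view over the documented error payload shapes.
type apiError struct {
	Status  int
	Message string
}

// parseError extracts error.message (compatible and Gemini routes) or
// detail (admin routes), falling back to the raw body.
func parseError(status int, body []byte) apiError {
	var p struct {
		Error *struct {
			Message string `json:"message"`
		} `json:"error"`
		Detail string `json:"detail"`
	}
	out := apiError{Status: status, Message: string(body)}
	if err := json.Unmarshal(body, &p); err != nil {
		return out
	}
	switch {
	case p.Error != nil:
		out.Message = p.Error.Message
	case p.Detail != "":
		out.Message = p.Detail
	}
	return out
}

// retryable treats 429 and 503 as worth retrying. This is an assumed
// client-side policy, not something the service mandates.
func retryable(status int) bool {
	return status == 429 || status == 503
}

func main() {
	gemini := []byte(`{"error":{"code":400,"message":"invalid json","status":"INVALID_ARGUMENT"}}`)
	admin := []byte(`{"detail":"unauthorized"}`) // hypothetical admin error text
	fmt.Println(parseError(400, gemini).Message)
	fmt.Println(parseError(401, admin).Message)
	fmt.Println(retryable(429), retryable(401))
}
```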
curl http://localhost:5001/v1/chat/completions \
-H "Authorization: Bearer your-api-key" \
-H "Content-Type: application/json" \
-d '{
"model": "deepseek-chat",
"messages": [{"role": "user", "content": "Hello"}],
"stream": false
}'
curl http://localhost:5001/v1/chat/completions \
-H "Authorization: Bearer your-api-key" \
-H "Content-Type: application/json" \
-d '{
"model": "deepseek-reasoner",
"messages": [{"role": "user", "content": "Explain quantum entanglement"}],
"stream": true
}'
curl http://localhost:5001/v1/responses \
-H "Authorization: Bearer your-api-key" \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-5-codex",
"input": "Write a hello world in golang",
"stream": true
}'
curl http://localhost:5001/v1/embeddings \
-H "Authorization: Bearer your-api-key" \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-4o",
"input": ["first text", "second text"]
}'
curl http://localhost:5001/v1/chat/completions \
-H "Authorization: Bearer your-api-key" \
-H "Content-Type: application/json" \
-d '{
"model": "deepseek-chat-search",
"messages": [{"role": "user", "content": "Latest news today"}],
"stream": true
}'
curl http://localhost:5001/v1/chat/completions \
-H "Authorization: Bearer your-api-key" \
-H "Content-Type: application/json" \
-d '{
"model": "deepseek-chat",
"messages": [{"role": "user", "content": "What is the weather in Beijing?"}],
"tools": [
{
"type": "function",
"function": {
"name": "get_weather",
"description": "Get weather for a city",
"parameters": {
"type": "object",
"properties": {
"city": {"type": "string", "description": "City name"}
},
"required": ["city"]
}
}
}
]
}'
curl "http://localhost:5001/v1beta/models/gemini-2.5-pro:generateContent" \
-H "Authorization: Bearer your-api-key" \
-H "Content-Type: application/json" \
-d '{
"contents": [
{
"role": "user",
"parts": [{"text": "Introduce Go in three sentences"}]
}
]
}'
curl "http://localhost:5001/v1beta/models/gemini-2.5-flash:streamGenerateContent" \
-H "Authorization: Bearer your-api-key" \
-H "Content-Type: application/json" \
-d '{
"contents": [
{
"role": "user",
"parts": [{"text": "Write a short summary"}]
}
]
}'
curl http://localhost:5001/anthropic/v1/messages \
-H "x-api-key: your-api-key" \
-H "Content-Type: application/json" \
-H "anthropic-version: 2023-06-01" \
-d '{
"model": "claude-sonnet-4-5",
"max_tokens": 1024,
"messages": [{"role": "user", "content": "Hello"}]
}'
curl http://localhost:5001/anthropic/v1/messages \
-H "x-api-key: your-api-key" \
-H "Content-Type: application/json" \
-H "anthropic-version: 2023-06-01" \
-d '{
"model": "claude-opus-4-6",
"max_tokens": 1024,
"messages": [{"role": "user", "content": "Explain relativity"}],
"stream": true
}'
curl http://localhost:5001/admin/login \
-H "Content-Type: application/json" \
-d '{"admin_key": "admin"}'
curl http://localhost:5001/v1/chat/completions \
-H "Authorization: Bearer your-api-key" \
-H "X-Ds2-Target-Account: user@example.com" \
-H "Content-Type: application/json" \
-d '{
"model": "deepseek-chat",
"messages": [{"role": "user", "content": "Hello"}]
}'