All endpoints are served by s-proxy (default port 9000).
User-facing authentication endpoints used by cproxy login and the c-proxy daemon.
Authenticate with username and password. Returns a short-lived access token and a long-lived refresh token.
Request
{
"username": "alice",
"password": "my-password"
}
Response 200
{
"access_token": "<JWT>",
"refresh_token": "<UUID>",
"expires_in": 86400,
"token_type": "Bearer",
"username": "alice"
}
Error responses
| Status | error code | Reason |
|---|---|---|
| 400 | invalid_request | Missing or malformed JSON body |
| 401 | invalid_credentials | Wrong username or password |
| 403 | account_disabled | Account has been deactivated |
| 500 | internal_error | Database or token generation error |
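The login exchange can be sketched with the Python standard library. This is a minimal illustration, not the official client: the /auth/login path is taken from the refresh example in this document, while the POST method, the Content-Type header, and the helper names are assumptions.

```python
import json
import urllib.request


def build_login_request(base_url: str, username: str, password: str) -> urllib.request.Request:
    """Build (but do not send) a JSON login request.

    Assumption: the endpoint is POST <base_url>/auth/login.
    """
    body = json.dumps({"username": username, "password": password}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/auth/login",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_login_response(raw: bytes) -> dict:
    """Pick the token fields out of a 200 response body."""
    payload = json.loads(raw)
    return {
        "access_token": payload["access_token"],
        "refresh_token": payload["refresh_token"],
        "expires_in": int(payload["expires_in"]),
    }
```

Passing the request to urllib.request.urlopen would send it; a non-2xx status raises urllib.error.HTTPError, whose body should carry the error envelope.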
Exchange a valid refresh token for a new access token. The refresh token is not rotated.
Request
{
"refresh_token": "<UUID from /auth/login>"
}
Response 200
{
"access_token": "<new JWT>",
"expires_in": 86400,
"token_type": "Bearer"
}
Error responses
| Status | error code | Reason |
|---|---|---|
| 400 | invalid_request | Missing refresh_token |
| 401 | invalid_refresh_token | Token expired, revoked, or unknown |
| 403 | account_disabled | Account deactivated after token was issued |
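A client has to decide when to call the refresh endpoint. One possible approach, sketched below, is to refresh shortly before expires_in elapses. The 60-second safety margin and the function names are assumptions for illustration, not documented behavior.

```python
from datetime import datetime, timedelta, timezone


def refresh_deadline(issued_at: datetime, expires_in: int, safety_margin: int = 60) -> datetime:
    """Return the moment at which the client should request a new access token."""
    return issued_at + timedelta(seconds=expires_in - safety_margin)


def should_refresh(issued_at: datetime, expires_in: int, now: datetime, safety_margin: int = 60) -> bool:
    """True once the access token is within safety_margin seconds of expiring."""
    return now >= refresh_deadline(issued_at, expires_in, safety_margin)
```

Because the refresh token is not rotated, the client can keep reusing the same refresh token for each exchange until it expires or is revoked.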
Invalidate the current session. Blacklists the access token (if provided) and revokes the refresh token.
Headers
Authorization: Bearer <access_token> (optional but recommended)
Request body (optional)
{
"refresh_token": "<UUID>"
}
Response
204 No Content
All admin endpoints (except /api/admin/login) require a valid admin JWT.
Authentication: Bearer token from /api/admin/login, or the pairproxy_admin cookie set by the dashboard login page.
Authorization: Bearer <admin_token>
Authenticate as the admin user using the configured password hash.
Request
{
"password": "admin-password"
}
Response 200
{
"token": "<admin JWT>",
"expires_in": 86400
}
Error responses
| Status | error code | Reason |
|---|---|---|
| 400 | invalid_request | Missing password |
| 401 | unauthorized | Wrong password |
List all users, optionally filtered by group.
Query parameters
| Name | Type | Description |
|---|---|---|
| group_id | string | Filter by group ID (optional) |
Response 200
{
"users": [
{
"id": "uuid",
"username": "alice",
"group_id": "uuid-or-null",
"is_active": true,
"created_at": "2025-01-01T00:00:00Z",
"last_login_at": "2025-06-01T12:00:00Z"
}
]
}
Create a new user.
Request
{
"username": "bob",
"password": "initial-password",
"group_id": "group-uuid"
}
group_id is optional. Omit or set to "" for no group (unlimited quota).
Response 201
{ "id": "new-user-uuid" }
Error responses
| Status | error code | Reason |
|---|---|---|
| 400 | invalid_request | Missing username or password |
| 409 | conflict | Username already exists |
Enable or disable a user account.
Request
{ "active": false }
Response 204 — No Content
Reset a user's password.
Request
{ "password": "new-password" }
Response 204 — No Content
List all groups with their current quota settings.
Response 200
{
"groups": [
{
"id": "uuid",
"name": "engineering",
"daily_token_limit": 100000,
"monthly_token_limit": 2000000,
"requests_per_minute": 20,
"max_tokens_per_request": 4096,
"concurrent_requests": 5,
"created_at": "2025-01-01T00:00:00Z"
}
]
}
All limit fields are null when unlimited.
| Field | Description |
|---|---|
| daily_token_limit | Daily token cap (input + output combined). null = unlimited |
| monthly_token_limit | Monthly token cap. null = unlimited |
| requests_per_minute | Per-user RPM limit. null = unlimited |
| max_tokens_per_request | Maximum max_tokens value allowed in a single request. null = unlimited |
| concurrent_requests | Maximum number of simultaneous in-flight requests per user. null = unlimited |
Create a new group.
Request
{
"name": "trial",
"daily_token_limit": 10000,
"monthly_token_limit": 200000,
"requests_per_minute": 10,
"max_tokens_per_request": 4096,
"concurrent_requests": 2
}
All limit fields are optional. Omit or set to null for unlimited.
Response 201
{ "id": "new-group-uuid" }
Update quota limits for an existing group.
Request
{
"daily_token_limit": 50000,
"monthly_token_limit": null,
"requests_per_minute": 30,
"max_tokens_per_request": 8192,
"concurrent_requests": 3
}
Set a field to null to remove that limit (unlimited). Omitting a field leaves it unchanged.
Response 204 — No Content
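The difference between a null field and an omitted field is easy to get wrong on the client side. The sketch below models the documented semantics on plain dictionaries; apply_group_patch is a hypothetical helper, not part of s-proxy, and Python's None stands in for JSON null.

```python
LIMIT_FIELDS = (
    "daily_token_limit",
    "monthly_token_limit",
    "requests_per_minute",
    "max_tokens_per_request",
    "concurrent_requests",
)


def apply_group_patch(current: dict, patch: dict) -> dict:
    """Return the group's limits after a partial update.

    A key present in the patch overwrites the stored value, even when the new
    value is None (unlimited). A key absent from the patch is left unchanged.
    """
    updated = dict(current)
    for field in LIMIT_FIELDS:
        if field in patch:
            updated[field] = patch[field]
    return updated
```

When building the request body, this means a client should drop keys it does not want to change rather than sending them as null.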
Delete a group. Users currently in the group are moved to no group (unlimited quota).
Response 204 — No Content
Error responses
| Status | error code | Reason |
|---|---|---|
| 404 | not_found | Group not found |
| 409 | conflict | Cannot delete group with active users |
Assign a user to a group (or remove from group by setting group_id to null).
Request
{ "group_id": "group-uuid" }
Set group_id to null to remove the user from their current group.
Response 204 — No Content
Immediately revoke all active tokens (access + refresh) for a user. Use this when disabling a user or responding to a security incident.
Response 204 — No Content
Global token and cost summary for a time range.
Query parameters
| Name | Default | Description |
|---|---|---|
| from | start of today | ISO 8601 timestamp |
| to | now | ISO 8601 timestamp |
Response 200
{
"total_input_tokens": 1234567,
"total_output_tokens": 345678,
"total_tokens": 1580245,
"request_count": 420,
"error_count": 3,
"cost_usd": 12.34
}
Per-user token usage breakdown.
Query parameters
| Name | Default | Description |
|---|---|---|
| from | start of today | ISO 8601 timestamp |
| to | now | ISO 8601 timestamp |
| limit | 50 | Max rows returned |
Response 200
{
"rows": [
{
"user_id": "uuid",
"username": "alice",
"total_tokens": 98000,
"request_count": 42
}
]
}
Paginated request log entries.
Query parameters
| Name | Default | Description |
|---|---|---|
| user_id | (all users) | Filter by user ID |
| limit | 100 | Max rows returned |
Response 200
{
"logs": [
{
"id": 1001,
"request_id": "uuid",
"user_id": "uuid",
"model": "claude-3-5-sonnet-20241022",
"input_tokens": 512,
"output_tokens": 128,
"status_code": 200,
"duration_ms": 1230,
"cost_usd": 0.0032,
"is_streaming": true,
"upstream_url": "https://api.anthropic.com",
"created_at": "2025-06-01T12:00:00Z"
}
]
}
Query current quota usage for a user or group.
Query parameters
| Name | Required | Description |
|---|---|---|
| user | no | Username to query |
| group | no | Group name to query |
Exactly one of user or group must be provided.
Response 200
{
"daily_used": 12345,
"daily_limit": 50000,
"monthly_used": 234567,
"monthly_limit": 1000000,
"rpm_limit": 10
}
Limit fields are null when unlimited.
List recent admin operations (user/group changes, quota updates, etc.).
Query parameters
| Name | Default | Description |
|---|---|---|
| limit | 100 | Max entries returned |
Response 200
{
"entries": [
{
"id": 1,
"admin_user": "admin",
"action": "create_user",
"target": "alice",
"details": "{\"group\":\"trial\"}",
"created_at": "2025-06-01T12:00:00Z"
}
]
}
Purge old request logs.
Query parameters
| Name | Required | Description |
|---|---|---|
| before | no | Delete logs before this date (ISO 8601) |
| days | no | Delete logs older than N days |
Exactly one of before or days must be provided.
Response 204 — No Content
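Since exactly one of before or days is accepted, a client can validate the choice before sending anything. The following sketch only builds the query string; the helper name is hypothetical and the endpoint path is deliberately left out because it is not stated here.

```python
from typing import Optional
from urllib.parse import urlencode


def build_purge_query(before: Optional[str] = None, days: Optional[int] = None) -> str:
    """Return the query string for a purge call, enforcing the exactly-one rule."""
    if (before is None) == (days is None):
        raise ValueError("provide exactly one of 'before' or 'days'")
    if days is not None:
        return urlencode({"days": days})
    # urlencode percent-encodes the colons in an ISO 8601 timestamp.
    return urlencode({"before": before})
```

The same pattern applies to the quota query above, where exactly one of user or group must be provided.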
Export request logs in CSV or JSON format.
Query parameters
| Name | Default | Description |
|---|---|---|
| format | json | Output format: json or csv |
| from | start of today | Start timestamp (ISO 8601) |
| to | now | End timestamp (ISO 8601) |
| user_id | (all) | Filter by user ID |
Response 200 — Returns file download with appropriate Content-Type.
Enter drain mode (reject new requests, allow in-flight to complete).
Response 204 — No Content
Exit drain mode (resume accepting requests).
Response 204 — No Content
Check current drain mode status.
Response 200
{
"draining": false,
"active_requests": 3
}
List all configured API keys (values are masked).
Response 200
{
"keys": [
{
"id": "uuid",
"name": "anthropic-prod",
"provider": "anthropic",
"masked_value": "sk-ant-***xyz",
"is_active": true,
"assigned_to": "user-uuid",
"created_at": "2025-06-01T12:00:00Z"
}
]
}
Create a new API key.
Request
{
"name": "anthropic-backup",
"provider": "anthropic",
"value": "sk-ant-api03-..."
}
Response 201
{
"id": "uuid",
"name": "anthropic-backup"
}
Assign an API key to a specific user or group.
Request
{
"user_id": "uuid"
}
Or:
{
"group_id": "uuid"
}
Response 204 — No Content
Revoke an API key.
Response 204 — No Content
List all configured LLM targets with health status.
Response 200
{
"targets": [
{
"id": "https://api.anthropic.com",
"name": "Primary",
"provider": "anthropic",
"healthy": true,
"weight": 100,
"bound_users": 5,
"bound_groups": 2
}
]
}
v2.7.0+: Dynamically add a new LLM target without restarting the service.
Request
{
"url": "https://api.example.com",
"api_key": "sk-...",
"provider": "anthropic",
"name": "New node",
"weight": 1
}
| Field | Required | Description |
|---|---|---|
| url | ✅ | URL that uniquely identifies the target |
| api_key | ✅ | API key used when calling this upstream |
| provider | ✅ | "anthropic" / "openai" / "ollama" |
| name | ❌ | Human-readable name (the URL is used when empty) |
| weight | ❌ | Load-balancing weight, default 1 |
Response 201
{ "id": "uuid" }
Error responses
| Status | error code | Reason |
|---|---|---|
| 400 | invalid_request | Missing required field or malformed URL |
| 409 | conflict | A target with the same URL already exists |
v2.7.0+: Enable the specified target, returning a previously disabled target to the routing pool.
Path parameter: url is the target URL, which must be URL-encoded (%3A, %2F, and so on).
Response 204 — No Content
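Because the target URL is itself a path segment, every reserved character in it has to be percent-encoded, including the slashes. A minimal sketch using urllib.parse.quote follows; the /api/admin/llm/targets/{url} path comes from the changelog table in this document, and target_path is a hypothetical helper.

```python
from urllib.parse import quote


def target_path(target_url: str, action: str = "") -> str:
    """Build the admin path for a target, percent-encoding the URL segment.

    safe="" makes quote() encode "/" and ":" as well, which the default
    (safe="/") would leave untouched.
    """
    encoded = quote(target_url, safe="")
    path = f"/api/admin/llm/targets/{encoded}"
    return f"{path}/{action}" if action else path
```

For example, target_path("https://api.anthropic.com", "enable") yields /api/admin/llm/targets/https%3A%2F%2Fapi.anthropic.com/enable.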
v2.7.0+: Disable the specified target. Its configuration is kept, but new requests are no longer routed to it.
Response 204 — No Content
v2.7.0+: Update a target's configuration (partial update; send only the fields to change).
Request (all fields optional)
{
"api_key": "sk-new-key",
"weight": 2,
"name": "New name"
}
Response 204 — No Content
Error responses
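| Status | error code | Reason |
|---|---|---|
| 404 | not_found | Target URL does not exist |
v2.7.0+: Delete the specified target. By default, deleting a target that still has bindings is rejected; pass force=true to remove all bindings first and then delete.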
Query parameters
| Name | Default | Description |
|---|---|---|
| force | false | When true, remove all user and group bindings before deleting |
Response 204 — No Content
Error responses
| Status | error code | Reason |
|---|---|---|
| 404 | not_found | Target URL does not exist |
| 409 | conflict | Target still has active bindings and force=true was not set |
List all user/group LLM bindings.
Response 200
{
"bindings": [
{
"id": "uuid",
"user_id": "uuid",
"username": "alice",
"target_url": "https://api.anthropic.com",
"created_at": "2025-06-01T12:00:00Z"
},
{
"id": "uuid",
"group_id": "uuid",
"group_name": "premium",
"target_url": "https://api.anthropic.com",
"created_at": "2025-06-01T12:00:00Z"
}
]
}
Create a new LLM binding for a user or group.
Request
{
"user_id": "uuid",
"target_url": "https://api.anthropic.com"
}
Or:
{
"group_id": "uuid",
"target_url": "https://api.anthropic.com"
}
Response 201
{
"id": "uuid"
}
Delete an LLM binding.
Response 204 — No Content
Automatically distribute all active users evenly across all healthy LLM targets.
Response 200
{
"distributed": 42,
"targets": 3
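}
Bulk-import users and groups from a template file (text/plain). A dry_run preview mode is supported, which reports the result without writing to the database.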
Authentication: Bearer token (admin JWT)
Headers
Content-Type: text/plain
Query parameters
| Name | Default | Description |
|---|---|---|
| dry_run | false | When true, preview the result without writing anything |
Request body: the template file contents (same format as the sproxy admin import CLI)
group:engineering
alice password123
bob password456
group:trial
carol password789
Response 200
{
"groups_created": 2,
"users_created": 5,
"skipped": 1,
"dry_run": false
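}
| Field | Description |
|---|---|
| groups_created | Number of groups created |
| users_created | Number of users created |
| skipped | Number of entries skipped (for example, the username already exists) |
| dry_run | Whether this was a preview run (echoes the request parameter) |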
Error responses
| Status | error code | Reason |
|---|---|---|
| 400 | invalid_request | The template could not be parsed |
| 422 | validation_error | The template content violates a business constraint |
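To check a template locally before uploading it, the format shown in the request body example can be parsed with a few lines of Python. This sketch is inferred only from that example (a group:<name> header followed by "<username> <password>" lines); rejecting user lines that appear before any group header is an assumption, and the real parser may accept more syntax.

```python
def parse_import_template(text: str) -> dict:
    """Parse the import template into {group_name: [(username, password), ...]}."""
    groups: dict = {}
    current = None
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith("group:"):
            current = line[len("group:"):].strip()
            if not current:
                raise ValueError(f"line {lineno}: empty group name")
            groups.setdefault(current, [])
            continue
        parts = line.split(None, 1)  # username, then the rest as the password
        if len(parts) != 2:
            raise ValueError(f"line {lineno}: expected '<username> <password>'")
        if current is None:
            # Assumption: a user entry must follow a group header.
            raise ValueError(f"line {lineno}: user entry before any group header")
        groups[current].append((parts[0], parts[1]))
    return groups
```

Running the server-side import with dry_run=true remains the authoritative check, since only the server knows which usernames already exist.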
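v2.9.0+: Issues sk-pp- API keys to regular users. A user with such a key does not need to run the cproxy daemon and can access PairProxy directly with the API key, in a way that is compatible with the OpenAI and Anthropic SDKs.
v2.15.0: The keygen algorithm was upgraded to HMAC-SHA256, replacing the older fingerprint algorithm. Keys are deterministic: every call for the same user returns the same key.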
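The deterministic property of an HMAC-based key can be illustrated in a few lines. This is not PairProxy's actual derivation: the secret, the message (here just the username), the hex encoding, and the truncation to 32 characters are all assumptions chosen to match the shape of the sample key.

```python
import hashlib
import hmac


def derive_api_key(secret: bytes, username: str) -> str:
    """Illustrative only: derive a stable sk-pp- key from a server secret.

    HMAC-SHA256 is a pure function of (secret, message), so the same user
    always maps to the same key as long as the secret does not change.
    """
    digest = hmac.new(secret, username.encode("utf-8"), hashlib.sha256).hexdigest()
    return "sk-pp-" + digest[:32]  # assumption: 32 hex characters after the prefix
```

A consequence of any scheme like this is that rotating the server-side secret changes every derived key at once.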
Generate (or retrieve again) the sk-pp- prefixed API key for the specified user.
Authentication: Bearer token (admin JWT)
Path parameter: username is the target username (must already exist in the database)
Response 200
{
"api_key": "sk-pp-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
"username": "alice",
"expires_at": null
}
| Field | Description |
|---|---|
| api_key | Generated API key with the sk-pp- prefix, derived with HMAC-SHA256 (v2.15.0+) |
| username | Username the key belongs to |
| expires_at | Expiry time (null means the key never expires) |
Error responses
| Status | error code | Reason |
|---|---|---|
| 404 | not_found | Username does not exist |
| 403 | account_disabled | User account has been disabled |
Verify whether an sk-pp- API key is valid and return the corresponding username.
Request
{
"api_key": "sk-pp-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
}
Response 200
{
"valid": true,
"username": "alice"
}
If the key is invalid or the user has been disabled:
{
"valid": false,
"username": ""
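}
v2.4.0+: Records LLM conversation content (messages plus the assistant reply) per user to JSON files on disk. This feature is CLI only; there is no REST API. Conversation files are stored in the track/ directory next to the database file.
sproxy admin track enable <username>
Enable conversation tracking for the specified user. Once enabled, all of the user's subsequent requests are recorded; earlier requests are not affected. Idempotent (repeated calls have no side effects).
Exit codes: 0 on success, 1 if the username is invalid (contains path traversal characters)
sproxy admin track disable <username>
Disable conversation tracking for the specified user. Records already written are not deleted. Idempotent (also succeeds when the user does not exist).
sproxy admin track list
List the usernames that are currently being tracked.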
Output example:
Tracked users (2):
alice
bob
If no users are being tracked, the output is:
No users are currently being tracked.
sproxy admin track show <username>
List all conversation record files for the specified user (newest first, with file sizes).
Output example:
Conversations for alice [tracking: ENABLED] — 3 record(s):
2026-03-07T13-05-22Z-req-abc123.json (2.1 KB)
2026-03-07T12-31-09Z-req-def456.json (1.8 KB)
2026-03-07T11-47-53Z-req-ghi789.json (3.4 KB)
Record JSON format (contents of a single file):
{
"request_id": "ca0e1b3b-bc75-4a7d-9925-2cda8cf2b318",
"username": "alice",
"timestamp": "2026-03-07T13:05:22Z",
"provider": "anthropic",
"model": "claude-3-opus",
"messages": [
{ "role": "user", "content": "Hello Claude" },
{ "role": "assistant", "content": "Sure" },
{ "role": "user", "content": "Thanks" }
],
"response": "You're welcome!",
"input_tokens": 15,
"output_tokens": 7
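}
| Field | Description |
|---|---|
| request_id | Unique request ID (UUID) |
| username | Username |
| timestamp | Request time (UTC) |
| provider | "anthropic" / "openai" / "ollama" |
| model | Model name specified in the request (may be empty) |
| messages | The request's messages list (content blocks expanded) |
| response | Full assistant reply (accumulated automatically for streaming responses, extracted from the response body otherwise) |
| input_tokens | Input token count (from the usage field of the response) |
| output_tokens | Output token count |
sproxy admin track clear <username>
Delete all conversation record files for the specified user. The tracking state is not affected (a user with tracking enabled keeps recording new conversations).
⚠️ This operation is irreversible. Confirm before deleting, or back up first.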
<db_dir>/track/
├── users/
│ ├── alice # empty marker file; its presence means tracking is enabled
│ └── bob
└── conversations/
├── alice/
│ ├── 2026-03-07T13-05-22Z-<reqID>.json
│ └── 2026-03-07T12-31-09Z-<reqID>.json
└── bob/
└── 2026-03-07T11-00-00Z-<reqID>.json
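File naming format: <UTC timestamp>-<requestID>.json. Lexicographic order matches chronological order, which makes sorting with ls or sort straightforward.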
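The naming scheme means a plain string sort is enough to order records by time, and the timestamp can be recovered from the name when needed. A small sketch, assuming the timestamp part always has the fixed 20-character shape shown in the examples (YYYY-MM-DDTHH-MM-SSZ); the helper names are hypothetical.

```python
from datetime import datetime, timezone

TIMESTAMP_FORMAT = "%Y-%m-%dT%H-%M-%SZ"
TIMESTAMP_LENGTH = len("2026-03-07T13-05-22Z")  # fixed-width prefix


def parse_record_name(filename: str):
    """Split '<UTC timestamp>-<requestID>.json' into (datetime, request_id)."""
    if not filename.endswith(".json"):
        raise ValueError(f"not a record file: {filename}")
    stem = filename[: -len(".json")]
    ts_text = stem[:TIMESTAMP_LENGTH]
    separator = stem[TIMESTAMP_LENGTH:TIMESTAMP_LENGTH + 1]
    request_id = stem[TIMESTAMP_LENGTH + 1:]
    if separator != "-" or not request_id:
        raise ValueError(f"unexpected record name: {filename}")
    when = datetime.strptime(ts_text, TIMESTAMP_FORMAT).replace(tzinfo=timezone.utc)
    return when, request_id


def newest_first(filenames):
    """Order record files newest first using only string comparison."""
    return sorted(filenames, reverse=True)
```

The string sort works because every field in the timestamp is zero-padded and ordered from most to least significant.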
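Dashboard data endpoints used by the trend charts on the overview page. Admin authentication is required.
Authentication: Bearer token (admin JWT) or the pairproxy_admin cookie.
Returns trend data for the charts on the dashboard overview page.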
Query parameters
| Name | Default | Description |
|---|---|---|
| days | 7 | Time range in days, maximum 365 |
Response 200
{
"daily_tokens": [
{
"date": "2025-03-01",
"input_tokens": 123456,
"output_tokens": 34567,
"total_tokens": 158023,
"request_count": 42
}
],
"daily_cost": [
{
"date": "2025-03-01",
"cost_usd": 1.23
}
],
"top_users": [
{
"user_id": "uuid",
"total_input": 50000,
"total_output": 15000,
"request_count": 20
}
]
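}
daily_tokens: token usage aggregated by date
daily_cost: cost (USD) aggregated by date
top_users: top 5 users ranked by total tokens
User self-service endpoints, accessible to regular users (no admin role needed). A valid user JWT is required.
Authentication: Authorization: Bearer <user_jwt>
Returns the current user's quota status (used / limit).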
Response 200
{
"daily_limit": 50000,
"daily_used": 12345,
"monthly_limit": 1000000,
"monthly_used": 234567,
"rpm_limit": 10
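}
| Field | Description |
|---|---|
| daily_limit | Daily quota limit; 0 means unlimited |
| daily_used | Tokens used today |
| monthly_limit | Monthly quota limit; 0 means unlimited |
| monthly_used | Tokens used this month |
| rpm_limit | Requests-per-minute limit; 0 means unlimited |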
Error responses
| Status | error code | Reason |
|---|---|---|
| 401 | unauthorized | Missing or invalid JWT |
| 404 | not_found | User does not exist |
Returns the current user's daily token usage history.
Query parameters
| Name | Default | Description |
|---|---|---|
| days | 30 | Number of days to look back (maximum 365) |
Response 200
{
"history": [
{
"date": "2025-03-01",
"input_tokens": 12345,
"output_tokens": 3456,
"total_tokens": 15801,
"request_count": 12
},
{
"date": "2025-03-02",
"input_tokens": 23456,
"output_tokens": 5678,
"total_tokens": 29134,
"request_count": 18
}
]
}
Error responses
| Status | error code | Reason |
|---|---|---|
| 401 | unauthorized | Missing or invalid JWT |
| 500 | internal_error | Database query failed |
Used by worker s-proxy nodes to communicate with the primary node. Requests must carry the shared secret:
Authorization: Bearer <shared_secret>
⚠️ These endpoints are not intended for external clients.
Worker node heartbeat. The primary registers the peer and updates its routing table.
Request
{
"id": "sp-2",
"addr": "http://sp-2:9000",
"weight": 50,
"source_node": "sp-2"
}
Response 200
{ "status": "ok" }
Worker node batch-uploads usage records collected locally.
Request
{
"source_node": "sp-2",
"records": [
{
"request_id": "uuid",
"user_id": "uuid",
"model": "claude-3-5-sonnet-20241022",
"input_tokens": 512,
"output_tokens": 128,
"status_code": 200,
"duration_ms": 980,
"created_at": "2025-06-01T12:00:00Z"
}
]
}
Response 200
{ "status": "ok" }
Returns the current peer routing table (useful for debugging).
Response 200
{
"peers": [
{
"id": "sp-2",
"addr": "http://sp-2:9000",
"weight": 50,
"last_seen": "2025-06-01T12:00:00Z"
}
]
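}
Streams log events of level WARN and above in real time using SSE (Server-Sent Events). The connection stays open and the server keeps pushing events; the subscription is cleaned up automatically after the client disconnects.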
Authentication: Bearer token (admin JWT)
Response headers
Content-Type: text/event-stream
Cache-Control: no-cache
Connection: keep-alive
Event format (one SSE event per log entry)
data: {"level":"WARN","msg":"upstream latency spike","ts":"2026-03-22T00:00:00Z","fields":{"target":"https://api.anthropic.com","latency_ms":4200}}
data: {"level":"ERROR","msg":"upstream returned 529","ts":"2026-03-22T00:01:05Z","fields":{"target":"https://api.openai.com","status":529}}
| Field | Description |
|---|---|
| level | Log level: "WARN" or "ERROR" |
| msg | Log message body |
| ts | Event timestamp (UTC, ISO 8601) |
| fields | Structured extra fields (vary by event type) |
Error responses
| Status | error code | Reason |
|---|---|---|
| 401 | unauthorized | Missing or invalid admin JWT |
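A consumer of the alert stream needs to turn data: lines back into JSON objects. The sketch below handles only the subset of SSE used by the event format in this section (data: fields, events terminated by a blank line) and ignores id:, event:, and retry: fields. It works on any iterable of text lines, so the HTTP client is left to the reader; iter_alert_events is a hypothetical name.

```python
import json


def iter_alert_events(lines):
    """Yield one decoded JSON object per SSE event from an iterable of lines."""
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            value = line[len("data:"):]
            if value.startswith(" "):
                value = value[1:]  # SSE strips a single leading space
            buffer.append(value)
        elif line == "" and buffer:
            # A blank line terminates the event; multi-line data is joined by "\n".
            yield json.loads("\n".join(buffer))
            buffer = []
    if buffer:
        # Lenient choice: flush a trailing event that was not terminated.
        yield json.loads("\n".join(buffer))
```

Since the server only emits WARN and ERROR events, a consumer can branch on event["level"] to decide whether to page someone or just record the entry.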
Prometheus-format metrics. No authentication required (restrict via network policy).
# HELP pairproxy_tokens_today Total tokens processed today
# TYPE pairproxy_tokens_today counter
pairproxy_tokens_today{type="input"} 1234567
pairproxy_tokens_today{type="output"} 345678
# HELP pairproxy_requests_today Total requests today
# TYPE pairproxy_requests_today gauge
pairproxy_requests_today{status="success"} 417
pairproxy_requests_today{status="error"} 3
# HELP pairproxy_active_users_today Unique users with at least one request today
# TYPE pairproxy_active_users_today gauge
pairproxy_active_users_today 12
# HELP pairproxy_cost_usd_today Estimated LLM cost today in USD
# TYPE pairproxy_cost_usd_today gauge
pairproxy_cost_usd_today 12.345678
# HELP pairproxy_tokens_month Total tokens this calendar month
# TYPE pairproxy_tokens_month gauge
pairproxy_tokens_month{type="input"} 12000000
pairproxy_tokens_month{type="output"} 3500000
# HELP pairproxy_requests_month Total requests this calendar month
# TYPE pairproxy_requests_month gauge
pairproxy_requests_month{status="success"} 8100
pairproxy_requests_month{status="error"} 47
Metrics are cached for 30 seconds to avoid excessive DB queries.
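For quick checks without a Prometheus server, the sample lines above can be read with a small parser. This sketch covers only the simple subset shown in this section (optional labels without escaped quotes, no timestamps) and is not a full implementation of the exposition format; parse_metrics is a hypothetical helper.

```python
import re

_SAMPLE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{([^}]*)\})?\s+(\S+)$')
_LABEL = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="([^"]*)"')


def parse_metrics(text: str) -> dict:
    """Map (metric_name, sorted_label_pairs) to a float value."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = _SAMPLE.match(line)
        if match is None:
            continue  # ignore lines outside the supported subset
        name, label_text, value = match.groups()
        labels = tuple(sorted(_LABEL.findall(label_text or "")))
        samples[(name, labels)] = float(value)
    return samples
```

Because the metrics are cached for 30 seconds, polling more often than that will return the same values.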
Liveness probe. No authentication required.
Response 200 (since v2.7.0 the response includes additional runtime status fields)
{
"status": "ok",
"service": "sproxy",
"version": "2.18.0",
"uptime_seconds": 86400,
"active_requests": 3,
"usage_queue_depth": 12,
"db_type": "sqlite",
"draining": false
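}
| Field | Description |
|---|---|
| status | Always "ok" (the service is alive) |
| service | Service identifier, always "sproxy" |
| version | Version currently running |
| uptime_seconds | Seconds elapsed since the service started |
| active_requests | Number of requests currently being processed |
| usage_queue_depth | Depth of the internal usage-reporting queue |
| db_type | Database type, "sqlite" or "postgres" |
| draining | Whether the service is in drain mode |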
All JSON error responses use the following envelope:
{
"error": "<machine-readable code>",
"message": "<human-readable description>"
}
When a user exceeds their daily or monthly token quota, or request rate limit:
{
"error": "quota_exceeded",
"kind": "daily",
"current": 100500,
"limit": 100000,
"reset_at": "2025-06-02T00:00:00Z"
}
kind is one of "daily", "monthly", or "rate_limit".
reset_at indicates when the quota window resets.
The following headers are also included:
X-RateLimit-Limit: 100000
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1748822400
X-RateLimit-Kind: daily
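A client that receives quota_exceeded can use the reset header to decide how long to back off. The sketch assumes X-RateLimit-Reset is a Unix timestamp in seconds (the sample value 1748822400 corresponds to 2025-06-02T00:00:00Z, matching reset_at in the body above); seconds_until_reset is a hypothetical helper.

```python
from datetime import datetime, timezone


def seconds_until_reset(headers: dict, now: datetime) -> float:
    """Return how long to wait before the quota window resets (never negative)."""
    reset_epoch = int(headers["X-RateLimit-Reset"])  # assumption: epoch seconds, UTC
    reset_at = datetime.fromtimestamp(reset_epoch, tz=timezone.utc)
    return max(0.0, (reset_at - now).total_seconds())
```

For kind "daily" or "monthly" the wait can be hours or days, so surfacing the error to the user is usually more appropriate than sleeping; automatic retry mainly makes sense for "rate_limit".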
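v2.18.0: Semantic routing uses an external classifier service (listening on :9000 by default) to classify the intent of each request, then routes the request to the LLM target specified by the matching rule. Rules are persisted in the database, and higher-priority rules are matched first.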
Authentication: Bearer token (admin JWT)
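The priority rule can be pictured as a simple ordering step: keep the enabled rules and sort them so that larger priority values come first. The sketch below models only that ordering on plain dictionaries; it does not call the classifier, and how s-proxy breaks ties between equal priorities is not specified here (Python's stable sort keeps input order).

```python
def match_order(rules: list) -> list:
    """Return enabled rules in the order they would be considered.

    Assumption: a rule without an "enabled" key counts as enabled, and a rule
    without a "priority" key uses priority 0.
    """
    active = [rule for rule in rules if rule.get("enabled", True)]
    return sorted(active, key=lambda rule: rule.get("priority", 0), reverse=True)
```

With rules of priority 10, 5, and 0, a request is checked against the priority-10 rule first and falls through to the others only if it does not match.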
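List all semantic routing rules.
Response 200
{
"rules": [
{
"id": "uuid",
"name": "code-tasks",
"description": "Programming, debugging, and code review tasks",
"targets": ["https://api.anthropic.com"],
"priority": 10,
"enabled": true,
"source": "db"
}
]
}
| Field | Description |
|---|---|
| id | Unique rule ID (UUID) |
| name | Rule name (unique) |
| description | Rule description, used by the classifier to understand the intent to match |
| targets | List of LLM target URLs to route to on a match (round-robin) |
| priority | Priority; larger values are matched first |
| enabled | Whether the rule is enabled |
| source | "db" (from the database) or "config" (from the config file, read-only) |
Create a new semantic routing rule.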
Request
{
"name": "code-tasks",
"description": "Programming, debugging, and code review tasks",
"targets": ["https://api.anthropic.com"],
"priority": 10
}
| Field | Required | Description |
|---|---|---|
| name | ✅ | Rule name, must be unique |
| description | ✅ | Description text the classifier uses for intent matching |
| targets | ✅ | Must contain at least one target URL |
| priority | ❌ | Priority, default 0 |
Response 201
{ "id": "uuid" }
Error responses
| Status | error code | Reason |
|---|---|---|
| 400 | invalid_request | Missing required field or empty targets |
| 409 | conflict | A rule with the same name already exists |
Update the specified rule (partial update; send only the fields to change).
Path parameter: id is the rule UUID
Request (all fields optional)
{
"description": "Updated description",
"targets": ["https://api.anthropic.com", "https://api.openai.com"],
"priority": 5,
"enabled": false
}
Response 204 — No Content
Error responses
| Status | error code | Reason |
|---|---|---|
| 404 | not_found | Rule ID does not exist |
Delete the specified rule.
Response 204 — No Content
Error responses
| Status | error code | Reason |
|---|---|---|
| 404 | not_found | Rule ID does not exist |
Enable the specified rule (idempotent).
Response 204 — No Content
Disable the specified rule (idempotent). A disabled rule no longer takes part in route matching, but it is not deleted.
Response 204 — No Content
Show the current runtime status of the semantic routing system.
Response 200
{
"enabled": true,
"classifier_url": "http://sproxy.internal:9000",
"rules_count": 5,
"fallback_count": 12,
"last_classification_ms": 45
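}
| Field | Description |
|---|---|
| enabled | Whether semantic routing is enabled globally |
| classifier_url | Address of the classifier service |
| rules_count | Number of enabled rules currently loaded |
| fallback_count | Number of requests since the last restart that used the fallback because no rule matched |
| last_classification_ms | Duration of the most recent classification request (milliseconds) |
v2.18.0: Manage semantic routing rules from the CLI. The commands map one-to-one to the REST API.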
# List all semantic routing rules
sproxy admin semantic-router list
# Create a new rule
sproxy admin semantic-router add \
--name "code-tasks" \
--description "Programming, debugging, and code review tasks" \
--targets "https://api.anthropic.com,https://api.openai.com" \
--priority 10
# Update a rule (partial update; only the fields passed are changed)
sproxy admin semantic-router update <id> --priority 5 --enabled false
# Delete a rule
sproxy admin semantic-router delete <id>
# Enable / disable a rule
sproxy admin semantic-router enable <id>
sproxy admin semantic-router disable <id>
# Show runtime status
sproxy admin semantic-router status
| Subcommand | Description |
|---|---|
| list | List all rules (with ID, name, priority, and enabled state) |
| add | Create a new rule |
| update <id> | Partially update the specified rule |
| delete <id> | Delete a rule |
| enable <id> | Enable a rule |
| disable <id> | Disable a rule |
| status | Show the runtime status of the semantic routing system |
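v2.16.0+: Training corpus collection persists qualifying request/response pairs to disk for later model fine-tuning or classifier training. It can be enabled per group or globally.
# Show the current collection status (global and per group)
sproxy admin corpus status
# Enable corpus collection globally
sproxy admin corpus enable
# Enable collection for one group only
sproxy admin corpus enable --group engineering
# Disable corpus collection globally
sproxy admin corpus disable
# Disable collection for one group only
sproxy admin corpus disable --group trial
# List collected corpus files (with path, size, and time)
sproxy admin corpus list
| Subcommand | Option | Description |
|---|---|---|
| status | (none) | Show the global and per-group collection switches |
| enable | --group <name> | Enable collection; omit --group to enable globally |
| disable | --group <name> | Disable collection; omit --group to disable globally |
| list | (none) | List all corpus files in the corpus directory |
Corpus files are stored by default in the corpus/ directory next to the database file, in JSON format, using the same naming rule as the track/ directory: <UTC timestamp>-<requestID>.json.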
| Version | Added / changed |
|---|---|
| v2.4.0 | sproxy admin track CLI: records conversation content per user to disk |
| v2.7.0 | Dynamic LLM target management: POST /api/admin/llm/targets, PUT /api/admin/llm/targets/{url} (partial update), PUT /api/admin/llm/targets/{url}/enable, PUT /api/admin/llm/targets/{url}/disable, DELETE /api/admin/llm/targets/{url} |
| v2.7.0 | /health response adds the version, uptime_seconds, active_requests, usage_queue_depth, db_type, and draining fields |
| v2.8.0 | GET /api/admin/alerts/stream: real-time SSE alert stream |
| v2.8.0 | POST /api/admin/import: bulk-import users and groups from a template file, with dry_run preview |
| v2.9.0 | GET /keygen/{username}: issues an sk-pp- API key for a user, enabling Direct Proxy mode |
| v2.9.0 | POST /keygen/verify: verifies that an sk-pp- API key is valid |
| v2.15.0 | Keygen algorithm upgraded to HMAC-SHA256; the /keygen/ endpoints produce deterministic keys, replacing the older fingerprint algorithm |
| v2.16.0 | sproxy admin corpus CLI: training corpus collection management (status / enable / disable / list) |
| v2.18.0 | GET/POST /api/admin/semantic-router/rules: semantic routing rule CRUD |
| v2.18.0 | PUT/DELETE /api/admin/semantic-router/rules/{id}: rule update and deletion |
| v2.18.0 | POST /api/admin/semantic-router/rules/{id}/enable and /disable: rule enable and disable |
| v2.18.0 | GET /api/admin/semantic-router/status: semantic routing system status |
| v2.18.0 | sproxy admin semantic-router CLI: full command-line management of semantic routing rules |
| v2.24.0 | POST /api/admin/llm/targets adds the supported_models ([]string, wildcards supported) and auto_model (string) fields for Model-Aware Routing |
| v2.24.0 | PUT /api/admin/llm/targets/{url} adds the optional supported_models and auto_model fields (partial update) |
| v2.24.0 | sproxy admin llm target add/update adds the --supported-models and --auto-model flags |