feat: add local/ routing prefix for model IDs with embedded slashes #3165

Open
petterreinholdtsen wants to merge 1 commit into ultraworkers:main from petterreinholdtsen:local-api-routing

@petterreinholdtsen

Summary

Some backends (e.g., gpt.uio.no running Ollama) have model IDs containing slashes like Qwen/Qwen3.6-27B-FP8. The existing routing prefix logic would strip 'Qwen/' as a provider prefix, sending only the remainder on the wire and causing 403 Forbidden from the server.

The new local/ prefix is an escape hatch: it strips just 'local/' and sends everything after it verbatim, preserving embedded slashes in model IDs.

Usage: --model local/Qwen/Qwen3.6-27B-FP8
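
The prefix handling described above can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: the function name `wire_model` and the `KNOWN_PREFIXES` list are hypothetical, and the real prefix list in the codebase may differ.

```rust
/// Hypothetical list of provider routing prefixes (an assumption for this
/// sketch; the real list in the codebase may differ).
const KNOWN_PREFIXES: &[&str] = &["openai", "qwen"];

/// Returns the model ID that would be sent on the wire.
fn wire_model(model: &str) -> &str {
    // Escape hatch: strip only the leading "local/" and keep the rest verbatim,
    // including any further slashes.
    if let Some(rest) = model.strip_prefix("local/") {
        return rest;
    }
    // Otherwise, strip the first segment when it matches a known provider prefix.
    if let Some((prefix, rest)) = model.split_once('/') {
        if KNOWN_PREFIXES.iter().any(|p| p.eq_ignore_ascii_case(prefix)) {
            return rest;
        }
    }
    // No prefix recognized: send the name unchanged.
    model
}
```

Under these assumptions, `wire_model("Qwen/Qwen3.6-27B-FP8")` loses its `Qwen/` segment, while `wire_model("local/Qwen/Qwen3.6-27B-FP8")` keeps the full `Qwen/Qwen3.6-27B-FP8` ID.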

This patch was developed with assistance from OpenCode and local Qwen 3.6 API server.

Anti-slop triage

I do not understand these fields:

  • Classification:
  • Evidence:
  • Non-destructive review result:

Verification

I do not understand the first two of these checklist points.

  • Targeted tests/docs checks ran, or the gap is explicitly recorded.
  • git diff --check passes.
  • No live secrets, tokens, private logs, or unrelated generated churn are included.

Resolution gate

I do not understand these checklist points:

  • If this PR resolves an issue, the issue number and fix evidence are linked.
  • If this PR should not merge, the rejection/defer rationale is evidence-backed and does not rely on vibes.
  • I did not merge/close remote PRs or issues from an automation lane without owner approval.

I believe this is related to #3036, but I am unsure whether it fixes it.

@petterreinholdtsen
Author

I asked opencode to keep notes on what it was solving, and here is what it ended up with for this patch. I've removed secrets and private DNS names.

Crash: "Access denied to model" on corporate API server (routing prefix stripping) [SOLVED]

  • Error: api returned 403 Forbidden: Access denied to model: Qwen3.6-27B-FP8
  • Root cause: Corporate API server has model IDs containing slashes like Qwen/Qwen3.6-27B-FP8. The existing routing prefix logic in wire_model_for_base_url() stripped Qwen/ as a known provider prefix, sending only Qwen3.6-27B-FP8 on the wire — which the server doesn't recognize (403). Additionally, validate_model_syntax() rejected 3+ parts when split by /, so even if you tried to work around it, the CLI would refuse the model name.
  • Key files:
    • rust/crates/api/src/providers/openai_compat.rs:920-933: strip_routing_prefix() now accepts "local" as a routing prefix
    • rust/crates/api/src/providers/openai_compat.rs:964-975: wire_model_for_base_url() handles the "local/" escape hatch
    • rust/crates/rusty-claude-cli/src/main.rs:1654-1658: validate_model_syntax() now allows embedded slashes for the local/ prefix

What was done

  • Added "local" to the routing prefixes in both strip_routing_prefix() and wire_model_for_base_url(). The local/ prefix is an escape hatch: it strips just local/ and sends everything after it verbatim on the wire, preserving embedded slashes in model IDs.
  • Updated validate_model_syntax() to allow more than 2 parts when split by / for models starting with local/, so local/Qwen/Qwen3.6-27B-FP8 passes validation.
  • Use: --model local/Qwen/Qwen3.6-27B-FP8 sends the full ID on the wire.
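
The validation change in the second point can be sketched as follows. This is a simplified stand-in for illustration only: the body is not the real validate_model_syntax() from main.rs, and the rejection of empty segments is an assumption of this sketch.

```rust
/// Hypothetical stand-in for the CLI's model-name check (not the real code).
fn validate_model_syntax(model: &str) -> Result<(), String> {
    // Assumption for this sketch: empty names and empty segments are rejected.
    if model.is_empty() || model.split('/').any(|part| part.is_empty()) {
        return Err(format!("invalid model name: {model:?}"));
    }
    // Names behind the local/ escape hatch may contain any number of slashes.
    if model.starts_with("local/") {
        return Ok(());
    }
    // Everything else is limited to "name" or "prefix/name".
    if model.split('/').count() > 2 {
        return Err(format!("too many '/' segments in model name: {model:?}"));
    }
    Ok(())
}
```

With this shape, `local/Qwen/Qwen3.6-27B-FP8` passes, while a three-part name without the `local/` prefix is still refused.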

Test command

echo "Say hello." | OPENAI_BASE_URL=https://corporate.example.com/api/v1/chat/completions OPENAI_API_KEY=secret ~/src/claw-code-upstream/rust/target/debug/claw --model local/Qwen/Qwen3.6-27B-FP8
