feat(provider): add Morph as a gateway provider for large open-source models #4245
shreybirmiwalmorph wants to merge 6 commits into
};

export const morph_minimax_m27_model: KiloExclusiveModel = {
  public_id: 'morph/minimax-m2.7',
WARNING: Morph MiniMax models will fail for OpenCode clients
getGatewayOpenCodeSettings() treats any *minimax* model as ai_sdk_provider: 'anthropic', and CustomLlmProviderSchema documents that as the Messages API. Because MORPH.supportedChatApis only exposes chat_completions, requests for both morph/minimax-m2.7 and morph/minimax-m3 are rejected by apiKindNotSupportedResponse before they ever reach Morph. Either override these ids to an OpenAI-compatible provider or add /messages support for the Morph gateway.
Good catch — fixed in 40795bd.
getAiSdkProvider mapped any *minimax* id to the Anthropic Messages API, which the Morph gateway (chat_completions only) rejects via apiKindNotSupportedResponse. Rather than patch just the two MiniMax ids, I pinned every Morph model to openai-compatible at the top of getAiSdkProvider, encoding the gateway invariant and guarding the whole class (also protects future gpt/grok-named ids from being routed to the Responses API). Added regression coverage asserting all six Morph models resolve to a chat_completions-compatible OpenCode provider.
Also verified live against api.morphllm.com: all six models return 200 with correct output + usage, and the two vision-flagged models (qwen3.5-397b, minimax-m3) correctly process image input.
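The fix described in this reply can be illustrated with a minimal sketch. Only getAiSdkProvider, the 'openai-compatible' and 'anthropic' provider values, and the six public ids come from this thread; the AiSdkProvider type, the isMorphModel helper, the 'morph/' prefix check, and the 'openai' value standing in for the Responses API route are assumptions made for illustration, not the actual Kilo source.

```typescript
// Hypothetical sketch, not the real implementation.
type AiSdkProvider = 'openai-compatible' | 'anthropic' | 'openai';

// Assumption: every Morph gateway model has a public id starting with "morph/".
function isMorphModel(modelId: string): boolean {
  return modelId.startsWith('morph/');
}

function getAiSdkProvider(modelId: string): AiSdkProvider {
  // Gateway invariant first: Morph only exposes chat_completions, so none of
  // the name-based heuristics below may reroute a Morph model.
  if (isMorphModel(modelId)) return 'openai-compatible';

  // Simplified name-based heuristics as described in the thread.
  if (modelId.includes('minimax')) return 'anthropic'; // Messages API
  if (modelId.includes('gpt') || modelId.includes('grok')) return 'openai'; // Responses API (value assumed)
  return 'openai-compatible';
}

// The six public ids registered by this PR.
const morphModelIds: string[] = [
  'morph/qwen3.5-397b',
  'morph/qwen3.6-27b',
  'morph/minimax-m2.7',
  'morph/minimax-m3',
  'morph/glm-5.2',
  'morph/deepseek-v4-flash',
];

// Regression-style check: every Morph model must stay chat_completions-compatible.
const allPinned: boolean = morphModelIds.every(
  (id) => getAiSdkProvider(id) === 'openai-compatible',
);
```

Placing the gateway check before the name heuristics means a future Morph id containing "gpt" or "grok" would also stay on chat_completions, which is the whole-class guard the reply refers to.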
Code Review Summary
Status: No Issues Found | Recommendation: Merge | Files Reviewed (1 files)
Previous Review Summaries (6 snapshots, latest commit 40795bd). Current summary above is authoritative. Previous snapshots are kept for context only.
Previous review (commit 40795bd): Status: No Issues Found | Recommendation: Merge | Files Reviewed (2 files)
Previous review (commit aada3de): Status: No Issues Found | Recommendation: Merge | Files Reviewed (2 files)
Previous review (commit f632630): Status: 1 Issues Found (WARNING) | Recommendation: Address before merge | Files Reviewed (1 files)
Previous review (commit aff4c19): Status: No Issues Found | Recommendation: Merge | Files Reviewed (2 files)
Previous review (commit f942d39): Status: 1 Issues Found (WARNING) | Recommendation: Address before merge | Files Reviewed (4 files)
Previous review (commit deb5357): Status: 1 Issues Found (WARNING) | Recommendation: Address before merge | Files Reviewed (4 files)
Reviewed by gpt-5.4-20260305 · Input: 47.3K · Output: 5.1K · Cached: 201.7K
Review guidance: REVIEW.md from base branch
… models
Adds Morph (https://api.morphllm.com/v1, OpenAI-compatible) as a first-party gateway provider and registers the large open-source models Morph serves on its own fleet: Qwen3.5 397B, Qwen3.6 27B, MiniMax M2.7, MiniMax M3, GLM-5.2, and DeepSeek V4 Flash. Excludes Morph's proprietary models (apply/v3, compactor, warp-grep, embeddings).
Routing reuses the existing gateway->provider resolution in get-provider.ts; 'morph' is already a recognized inference-provider id. Kilo holds the key via MORPH_API_KEY. Pricing and context windows mirror Morph's published numbers.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
deb5357 to f942d39
Reconcile Morph gateway pricing against the canonical source (https://www.morphllm.com/api/models/json + landing/src/lib/pricing.ts cost functions):
- GLM-5.2: set input_cache_read to $0.35/1M. Morph's calculateChatGlm52Cost bills cached input at this rate (LMCache prefix reuse); the models JSON currently omits the field, so the prior null under-priced cache reads.
- Verified the other five models: qwen3.5 keeps its $0.30 cache rate; the rest (qwen3.6, minimax m2.7/m3, deepseek-v4-flash) have no cache-read billing in Morph's cost functions, so they correctly stay null. The minimax cache rates present in MODEL_PRICING config are never applied by the cost functions, so they are intentionally not exposed here.
Adds morph.test.ts covering: Morph provider config, the six registered open-source models (and exclusion of proprietary apply/compactor/etc.), gateway->provider resolution to PROVIDERS.MORPH (the same lookup get-provider.ts uses), and per-model pricing.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Match Morph's canonical context (MODEL_CONTEXT_WINDOWS / api/models/json): GLM-5.2 serves a 1,048,576-token window, not 450k.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…l source
Reconcile all six Morph gateway models against the canonical source of truth (https://www.morphllm.com/api/models/json). Per-1M prompt/completion/cache-read rates and context windows already match exactly; no pricing change needed.
The canonical JSON marks qwen3.5-397b and minimax-m3 as image-capable (input_modalities: ["text","image"]), so add the 'vision' flag to those two and assert vision support (and reasoning) for every model in the test matrix.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ents
getAiSdkProvider maps any '*minimax*' id to the Anthropic Messages API (and gpt/grok ids to the OpenAI Responses API). Morph's gateway only exposes chat_completions (MORPH.supportedChatApis), so OpenCode requests for morph/minimax-m2.7 and morph/minimax-m3 were rejected by apiKindNotSupportedResponse before reaching Morph.
Pin every Morph model to the 'openai-compatible' AI SDK provider at the top of getAiSdkProvider, encoding the gateway's real invariant and guarding the whole bug class (not just minimax). Add regression coverage asserting all six Morph models resolve to a chat_completions-compatible OpenCode provider.
Reported-by: kilo-code-bot on PR Kilo-Org#4245.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…t_mUsd
Add cost-calculation coverage proving the stored per-million rates produce the right micro-USD bills, not just that the numbers are present:
- qwen3.5/glm-5.2 cache reads bill at the discounted cache_read rate (< prompt)
- models without a cache_read price (e.g. deepseek-v4-flash) fall back to the prompt rate for cache hits (not free)
- a mixed uncached + cache-hit + output bill matches the expected total
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
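The billing behaviour this commit tests can be sketched roughly as follows. The interfaces, the costMicroUsd function, and the prompt and completion rates are hypothetical; only the null cache-read fallback and the $0.30/1M qwen3.5 cache rate are taken from the commit messages. The sketch relies on the fact that a rate in USD per 1M tokens is numerically the same as micro-USD per token.

```typescript
// Hypothetical shapes; the real Kilo pricing types may differ.
interface PerMillionPricing {
  prompt: number; // USD per 1M uncached input tokens
  completion: number; // USD per 1M output tokens
  input_cache_read: number | null; // USD per 1M cached input tokens, null if no separate rate
}

interface Usage {
  uncachedInputTokens: number;
  cacheHitTokens: number;
  outputTokens: number;
}

function costMicroUsd(pricing: PerMillionPricing, usage: Usage): number {
  // Without a cache_read price, cache hits bill at the prompt rate (not free).
  const cacheRate = pricing.input_cache_read ?? pricing.prompt;
  // USD per 1M tokens == micro-USD per token, so rate * tokens is already micro-USD.
  return Math.round(
    usage.uncachedInputTokens * pricing.prompt +
      usage.cacheHitTokens * cacheRate +
      usage.outputTokens * pricing.completion,
  );
}

// Illustrative rates only: prompt and completion are made up, 0.3 is the qwen3.5 cache rate.
const withCacheRate: PerMillionPricing = { prompt: 1.0, completion: 3.0, input_cache_read: 0.3 };
const withoutCacheRate: PerMillionPricing = { prompt: 1.0, completion: 3.0, input_cache_read: null };

const mixedUsage: Usage = { uncachedInputTokens: 1000, cacheHitTokens: 2000, outputTokens: 500 };

const discountedBill = costMicroUsd(withCacheRate, mixedUsage); // 1000 + 600 + 1500 = 3100
const fallbackBill = costMicroUsd(withoutCacheRate, mixedUsage); // 1000 + 2000 + 1500 = 4500
```

The same mixed usage costs less when a cache_read rate exists, and the null case is strictly more expensive rather than free, which matches the three assertions listed in the commit.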
What
Adds Morph as a first-party gateway provider and registers the large open-source models Morph serves on its own inference fleet, routed through Morph's OpenAI-compatible gateway (https://api.morphllm.com/v1).

Per the discussion with @ari, this is the gateway option (Kilo holds the key via MORPH_API_KEY). A follow-up PR will add the BYOK option (user-supplied Morph key) so both are available.

Models added

| public_id | Morph model id |
| --- | --- |
| morph/qwen3.5-397b | morph-qwen35-397b |
| morph/qwen3.6-27b | morph-qwen36-27b |
| morph/minimax-m2.7 | morph-minimax27-230b |
| morph/minimax-m3 | morph-minimax3-428b |
| morph/glm-5.2 | morph-glm52-744b |
| morph/deepseek-v4-flash | morph-dsv4flash |

Morph's proprietary models (apply/v3, compactor, warp-grep, embeddings) are intentionally excluded.

How

- providers/types.ts: add 'morph' to ProviderId.
- providers/provider-definitions.ts: add the MORPH provider (OpenAI-compatible chat_completions).
- providers/morph.ts (new): the six models as KiloExclusiveModel with gateway: 'morph'.
- models.ts: register them in kiloExclusiveModels.

Routing reuses the existing gateway -> provider resolution in get-provider.ts; 'morph' is already a recognized inference-provider id, so getInferenceProvider() is unaffected.

Deploy requirement

Set MORPH_API_KEY in the gateway environment.

Notes

- MiniMax M3 pricing is provisional pending GA confirmation; happy to update.
- max_completion_tokens currently equals each model's context window; can tighten to real output caps on request.
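
As a rough illustration of the steps under How, the sketch below shows one way a provider definition, a model entry, and the gateway -> provider lookup could fit together. The field names apiUrl, apiKeyEnvVar, internal_id, and context_length, the resolveProvider helper, and the shape of PROVIDERS are assumptions for illustration; only the identifiers quoted in this PR (ProviderId, MORPH, KiloExclusiveModel, public_id, gateway, supportedChatApis, max_completion_tokens) and the GLM-5.2 values come from the text.

```typescript
// Hypothetical, simplified shapes; the real definitions in the repo may differ.
type ProviderId = 'morph' | 'other-provider';

interface ProviderDefinition {
  id: ProviderId;
  apiUrl: string; // assumed field name
  apiKeyEnvVar: string; // assumed field name
  supportedChatApis: readonly string[];
}

const PROVIDERS: Record<string, ProviderDefinition> = {
  MORPH: {
    id: 'morph',
    apiUrl: 'https://api.morphllm.com/v1',
    apiKeyEnvVar: 'MORPH_API_KEY',
    supportedChatApis: ['chat_completions'],
  },
};

interface KiloExclusiveModel {
  public_id: string;
  internal_id: string; // assumed name for the Morph-side model id
  gateway: ProviderId;
  context_length: number; // assumed field name
  max_completion_tokens: number;
}

const morph_glm_52_model: KiloExclusiveModel = {
  public_id: 'morph/glm-5.2',
  internal_id: 'morph-glm52-744b',
  gateway: 'morph',
  context_length: 1_048_576,
  // Per the Notes, currently equal to the context window.
  max_completion_tokens: 1_048_576,
};

// Assumed equivalent of the lookup in get-provider.ts: match the model's
// gateway against the provider ids.
function resolveProvider(model: KiloExclusiveModel): ProviderDefinition | undefined {
  return Object.values(PROVIDERS).find((provider) => provider.id === model.gateway);
}

const resolved = resolveProvider(morph_glm_52_model);
```

Keeping the gateway id on the model and the endpoint details on the provider means adding a seventh Morph model would only need a new model entry, with no routing change.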