BYOK custom models: no documented way to set or auto-detect context/input window size #1267

@tlerbao

Summary

There is no documented, working way to set (or auto-detect) the input / context window size for BYOK custom models in ~/.factory/settings.json. Droid appears to apply an internal default (~239K–250K) to custom models regardless of the model's real context window, and none of the documented settings change this.

This is both a question (what is the intended mechanism?) and a feature request (please make it configurable per custom model, or auto-detect it).

Environment

  • Droid CLI (latest), macOS (darwin 22.4.0)
  • BYOK custom models via an OpenAI/Anthropic-compatible proxy
  • Config: ~/.factory/settings.json using camelCase customModels

What I expected

For a custom model whose upstream really supports a larger context window (e.g. Anthropic 1M, or a proxy that exposes more), Droid would either:

  1. Auto-detect the context window from the provider, or
  2. Let me declare it explicitly via a documented field (e.g. maxInputTokens / contextWindow).

What actually happens

  • The BYOK docs only document these fields for customModels: model, displayName, baseUrl, apiKey, provider, maxOutputTokens, noImageSupport, extraArgs, extraHeaders. There is no input/context-window field documented.
  • /context shows every custom model capped at roughly the same ~239K/250K value, i.e. an internal default, not the model's real window.
  • I tried the only token-threshold settings documented under "Context and compaction" in https://docs.factory.ai/cli/configuration/settings (compactionTokenLimit and compactionTokenLimitPerModel). After restarting Droid, they had no effect on the displayed context window or the compaction meter for custom models.

Example config I tried (no effect on the context window after restart):

{
  "customModels": [
    {
      "model": "claude-opus-4-8",
      "id": "custom:Claude-Opus-4.8-[Sub]-0",
      "provider": "anthropic",
      "baseUrl": "http://<proxy>",
      "apiKey": "...",
      "maxOutputTokens": 32000
    }
  ],
  "compactionTokenLimitPerModel": {
    "custom:Claude-Opus-4.8-[Sub]-0": 900000
  }
}

The compaction meter still showed 35K/239K for this model, i.e. the default was still applied.

This has been asked before with no answer

The exact same question was raised in the widely-referenced BYOK proxy gist and never got an answer:

"Quick question on the Factory side: is maxInputTokens actually supported for custom models in ~/.factory/settings.json? The BYOK docs only list model, displayName, baseUrl, apiKey, provider, maxOutputTokens, noImageSupport, extraArgs, and extraHeaders for customModels. I couldn't find maxInputTokens documented anywhere, but Droid clearly has an internal idea of input/context limits for built-in models. ... what's the exact field name? does it affect model selection / context checks / truncation? or is maxOutputTokens the only token-limit field users are meant to set manually?"

Source: https://gist.github.com/chandika/c4b64c5b8f5e29f6112021d46c159fdd?permalink_comment_id=6084094#gistcomment-6084094

Questions

  1. Is there any field (documented or not) to set the input/context window for a custom model? If so, what is the exact name and does it affect context checks / truncation / compaction timing?
  2. Is compactionTokenLimit / compactionTokenLimitPerModel supposed to work for BYOK custom models? If yes, what is the exact key to use (the custom model id, the model value, or something else)?
  3. If neither is supported, is the ~239K/250K value a hard-coded default for all custom models?
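
To make question 2 concrete, here are the two candidate key formats side by side, using the id and model values from the example config above. This is only a sketch of the ambiguity: it is unknown whether Droid honors either key for custom models, and setting both at once is an assumption about how one might test it, not a documented pattern.

```json
{
  "compactionTokenLimitPerModel": {
    "custom:Claude-Opus-4.8-[Sub]-0": 900000,
    "claude-opus-4-8": 900000
  }
}
```

The first key is the custom model id (the format I already tried without effect); the second is the raw model value.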

Proposed solution / feature request

Please support one of the following for BYOK custom models:

  • A documented maxInputTokens (or contextWindow) field on each customModels entry, or
  • Auto-detection of the context window from the provider/proxy, or
  • At minimum, make compactionTokenLimitPerModel actually govern the effective context window for custom models, and document the exact key format to use.

This matters for users on proxies/subscriptions where the real upstream window is larger (or smaller, e.g. GPT-5.5 at 258K) than Droid's internal default. Right now there is no reliable client-side lever, and users can only guess with undocumented fields.
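
For illustration, the first option could look like the entry below. Note that maxInputTokens is a hypothetical field here: it is not in the BYOK docs, and nothing confirms that Droid reads it today. The value 1000000 simply mirrors the 1M Anthropic window mentioned earlier; the other fields are taken from the example config above.

```json
{
  "customModels": [
    {
      "model": "claude-opus-4-8",
      "id": "custom:Claude-Opus-4.8-[Sub]-0",
      "provider": "anthropic",
      "baseUrl": "http://<proxy>",
      "apiKey": "...",
      "maxOutputTokens": 32000,
      "maxInputTokens": 1000000
    }
  ]
}
```

A per-entry field like this would keep the declared window next to the model it describes, rather than in a separate map keyed by an id whose format is unclear.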
