Allow self-hosted / custom STT endpoint instead of hardcoded OpenAI / ChatGPT #128

@L0GYKAL

Summary

Allow configuring a custom Speech-to-Text endpoint (URL + model + auth header) so users can route voice transcription to a self-hosted Whisper / Faster-Whisper / whisper.cpp server instead of OpenAI / ChatGPT.

Motivation

Voice transcription is currently hardcoded to OpenAI / ChatGPT endpoints:

  • apps/ios/Sources/Litter/Models/VoiceTranscriptionManager.swift — https://chatgpt.com/backend-api/transcribe and https://api.openai.com/v1/audio/transcriptions.
  • apps/android/app/src/main/java/com/litter/android/state/VoiceTranscriptionManager.kt — same two endpoints, model field gpt-4o-mini-transcribe.

For users who:

  • Run their own Whisper/whisper.cpp/Faster-Whisper server on their LAN or VPN,
  • Are subject to data-residency / privacy constraints,
  • Don't want voice audio leaving their network,
  • Or already pay for a different STT vendor (Deepgram, AssemblyAI, etc. with OpenAI-compatible endpoints),

…there is no way to redirect transcription without forking the apps.

Concrete ask

Per-server (or global) STT config with at least:

  • stt.endpoint — full URL of the transcription endpoint.
  • stt.model — model field sent in the multipart form (default gpt-4o-mini-transcribe for OpenAI-compat servers, configurable).
  • stt.auth_header — optional, e.g. Authorization: Bearer <token> for self-hosted servers behind an auth proxy. Empty/None for open LAN deployments.
  • (Optional) stt.disabled for users who want to suppress voice transcription entirely.
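A possible shape for this config (purely illustrative — the key names come from the bullets above, and the endpoint URL, model value, and token are placeholders, not an existing Litter schema):

```json
{
  "stt": {
    "endpoint": "http://whisper.lan:8000/v1/audio/transcriptions",
    "model": "large-v3",
    "auth_header": "Authorization: Bearer <token>",
    "disabled": false
  }
}
```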

When set, the existing transcribe(wav:authMethod:token:) path uses these instead of the hardcoded ChatGPT/OpenAI URLs. The OpenAI multipart contract (form fields file, model, plus standard transcription params) is the same as what whisper.cpp's server example and many self-hosted Whisper wrappers already implement, so for OpenAI-compatible servers no protocol change is needed — only URL/model/auth.

For genuinely non-OpenAI-shaped APIs (e.g. raw whisper.cpp), a small adapter layer per provider could come later; the immediate win is OpenAI-compatible endpoint redirection.
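To make the intended fallback behavior concrete, here is a minimal sketch — in Python for brevity, though the real clients are Swift/Kotlin, and every name in it is an assumption, not existing Litter code. Unset fields fall back to today's hardcoded OpenAI values, and the multipart form keeps the same `file`/`model` fields:

```python
from dataclasses import dataclass
from typing import Optional

# Today's hardcoded defaults (per VoiceTranscriptionManager.swift / .kt).
DEFAULT_ENDPOINT = "https://api.openai.com/v1/audio/transcriptions"
DEFAULT_MODEL = "gpt-4o-mini-transcribe"

@dataclass
class SttConfig:
    """Hypothetical per-server STT settings; all fields optional."""
    endpoint: Optional[str] = None
    model: Optional[str] = None
    auth_header: Optional[str] = None  # e.g. "Authorization: Bearer <token>"
    disabled: bool = False

def build_transcription_request(cfg: SttConfig, wav: bytes):
    """Resolve the config against the hardcoded defaults and return
    (url, headers, multipart form fields) for an OpenAI-compatible server,
    or None when transcription is disabled."""
    if cfg.disabled:
        return None
    url = cfg.endpoint or DEFAULT_ENDPOINT
    headers = {}
    if cfg.auth_header:
        # Split "Name: value" into a header entry.
        name, _, value = cfg.auth_header.partition(":")
        headers[name.strip()] = value.strip()
    # Same multipart contract whisper.cpp's server example accepts:
    # a `file` part plus a `model` form field.
    form = {
        "file": ("audio.wav", wav, "audio/wav"),
        "model": cfg.model or DEFAULT_MODEL,
    }
    return url, headers, form

# Example: a self-hosted Faster-Whisper behind an auth proxy.
req = build_transcription_request(
    SttConfig(endpoint="http://whisper.lan:8000/v1/audio/transcriptions",
              model="large-v3",
              auth_header="Authorization: Bearer s3cr3t"),
    wav=b"RIFF...",
)
```

With no config set, this resolves to exactly the current behavior, which is what makes the change backward compatible.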

Why this is worth solving in Litter

Alternatives considered

  • MITM proxy on the network that rewrites chatgpt.com/api.openai.com to a local server — fragile, requires installing a custom CA on the device, and breaks the moment Litter pins certificates.
  • Fork the apps — defeats the point of using the upstream client.

Happy to test on iOS/Android beta channels once landed.
