Voice-to-text dictation for macOS, powered by AI
Press and hold a key to record. Release to transcribe. Text appears instantly.
TalkFlow is a native macOS menu bar app that transforms your voice into text using OpenAI's Whisper. Choose between cloud-based transcription via the OpenAI API or completely private, on-device transcription using local Whisper models—no internet required. Hold down a trigger key (default: Right Command), speak, and release—your transcription is automatically pasted into whatever app you're using.
- Press-and-hold activation — Hold the trigger key to record, release to transcribe. No clicking required.
- Instant paste — Transcribed text is automatically inserted into your focused input field.
- Local or cloud transcription — Use on-device Whisper models for free, private transcription, or the OpenAI API for cloud-based processing.
- Configurable shortcuts — Remap the trigger to any modifier key, single key, or key combination.
- Smart audio processing — Voice activity detection and silence removal for clean, efficient transcriptions.
- History with search — All transcriptions are saved locally with full-text search.
- Privacy-focused — Audio is never saved to disk. API keys stored securely in macOS Keychain.
- Visual feedback — Floating indicator shows recording, processing, and completion states.
- Multi-monitor support — Indicator appears on whichever display your cursor is on.
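The silence removal mentioned above can be sketched with an energy-based check using Accelerate's vDSP. This is a hypothetical illustration, not TalkFlow's actual code; the function name and threshold value are assumptions.

```swift
import Accelerate

// Hypothetical sketch: classify an audio buffer as speech or silence by
// comparing its RMS energy (computed with vDSP) against a fixed threshold.
// The 0.01 threshold is an illustrative assumption, not the app's setting.
func isSpeech(_ samples: [Float], threshold: Float = 0.01) -> Bool {
    var rms: Float = 0
    vDSP_rmsqv(samples, 1, &rms, vDSP_Length(samples.count))
    return rms > threshold
}

// A near-silent buffer falls below the threshold; a louder tone passes it.
let silence = [Float](repeating: 0.001, count: 1024)
let tone = (0..<1024).map { Float(sin(Double($0) * 0.1)) * 0.5 }
print(isSpeech(silence)) // false
print(isSpeech(tone))    // true
```

A real voice activity detector would also consider spectral features, but RMS gating is a common first pass for trimming leading and trailing silence.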
- macOS 15 (Sequoia) or later
- For cloud transcription: OpenAI API key with access to the Whisper API
- For local transcription: Apple Silicon Mac recommended (Intel Macs supported but slower)
```bash
# Clone the repository
git clone https://github.com/jcampuza/TalkFlow.git
cd TalkFlow

# Build the app
./Scripts/build-app.sh release

# Launch
open .build/release/TalkFlow.app
```

Or build and launch in one step:

```bash
./Scripts/build-app.sh release --run
```

- Launch TalkFlow — The app runs in your menu bar.
- Grant permissions — You'll be prompted for:
- Accessibility: Required for global shortcuts and text insertion
- Microphone: Required for audio capture
- Choose your transcription mode:
- Local (recommended) — Download a Whisper model and transcribe entirely on-device. Free and private.
- Cloud — Enter your OpenAI API key to use the Whisper API.
- Start dictating — Hold Right Command (or your configured trigger), speak, and release.
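The two permission prompts correspond to standard macOS APIs. As a hedged sketch (these are the system calls any such app would use, not necessarily TalkFlow's exact code):

```swift
import AVFoundation
import ApplicationServices

// Accessibility: required for global shortcuts and text insertion.
// AXIsProcessTrusted() reports whether the app is already approved.
let accessibilityGranted = AXIsProcessTrusted()
print("Accessibility granted: \(accessibilityGranted)")

// Microphone: required for audio capture. Triggers the system prompt
// on first call; subsequent calls return the stored decision.
AVCaptureDevice.requestAccess(for: .audio) { granted in
    print("Microphone granted: \(granted)")
}
```

If accessibility is denied, you can re-enable it later under System Settings → Privacy & Security → Accessibility.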
TalkFlow supports on-device transcription using WhisperKit, which runs OpenAI's Whisper models locally via Apple's Core ML. This means:
- Completely free — No API costs, transcribe as much as you want
- Private — Audio never leaves your device
- Works offline — No internet connection required after model download
| Model | Quality | Best For |
|---|---|---|
| Tiny | Basic | Quick notes, simple dictation |
| Small | Good | General use, balanced speed/quality |
| Large v3 Turbo | Best | Accuracy-critical work, complex vocabulary |
Note: The first transcription after downloading a model may take 10-30 seconds as the model compiles and warms up on your specific hardware. This is a one-time process—subsequent transcriptions are fast (typically under 2 seconds for short recordings).
To use local transcription:
- Open Settings → Transcription
- Select Local as your transcription mode
- Choose a model and click Download
- Once downloaded, the model is ready to use
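Under the hood, a WhisperKit transcription call looks roughly like the sketch below. The model name and audio path are placeholders, and the exact initializer varies between WhisperKit versions, so treat this as an approximation of the library's public API rather than TalkFlow's implementation.

```swift
import WhisperKit

// Hedged sketch: load a Whisper model via WhisperKit and transcribe a
// WAV file. "tiny" and "recording.wav" are illustrative placeholders.
func transcribeLocally() async throws -> String {
    let pipe = try await WhisperKit(model: "tiny")
    let results = try await pipe.transcribe(audioPath: "recording.wav")
    return results.map(\.text).joined()
}
```

The first call downloads and compiles the Core ML model, which is why the initial transcription is slow; the loaded pipeline can then be reused across recordings.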
- Apple Silicon Macs deliver the best performance with local models
- Larger models produce better results but require more memory and processing time
- Keep the model loaded — TalkFlow keeps the model in memory between transcriptions for faster response times
| Action | Result |
|---|---|
| Hold trigger key | Start recording (after 300ms) |
| Release trigger key | Stop recording and transcribe |
| Press another key while holding | Cancel recording |
| Click menu bar icon | Open menu with recent transcriptions |
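The press-and-hold behavior above can be approximated with a global `NSEvent` monitor for modifier-key changes. This is a simplified, hypothetical sketch: it watches the Command flag generically (distinguishing Right Command specifically requires checking the event's key code), and the 300 ms delay mirrors the table above.

```swift
import AppKit

// Hypothetical sketch of press-and-hold detection for a modifier key.
// Requires the Accessibility permission to receive global events.
var holdTimer: Timer?

NSEvent.addGlobalMonitorForEvents(matching: .flagsChanged) { event in
    if event.modifierFlags.contains(.command) {
        // Key down: only start recording if the key is held past 300 ms,
        // so a quick tap does not trigger a recording.
        holdTimer = Timer.scheduledTimer(withTimeInterval: 0.3, repeats: false) { _ in
            print("start recording")
        }
    } else {
        // Key up: cancel a pending start, or stop and transcribe.
        holdTimer?.invalidate()
        print("stop recording and transcribe")
    }
}
RunLoop.main.run()
```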
| State | Appearance |
|---|---|
| Recording | Pulsing red/orange |
| Processing | Blue with spinner |
| Success | Green checkmark |
| Error | Red with message |
| No Speech | Yellow/orange |
Access settings from the menu bar icon → Settings:
- Shortcut — Configure your trigger key
- Audio — Select input device, adjust silence threshold
- Transcription — Choose local or cloud mode, download/manage models, set language preference
- Output — Toggle punctuation stripping
- Dictionary — Add custom words and phrases to improve transcription accuracy
- Appearance — Indicator visibility and position
- Transcription history: `~/Library/Application Support/TalkFlow/transcriptions.sqlite`
- Local models: `~/Library/Containers/com.josephcampuzano.TalkFlow/Data/Documents/huggingface/`
- Logs: `~/Library/Logs/TalkFlow/talkflow.log`
- API key: macOS Keychain (never stored in plain text)
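The full-text search over history is backed by GRDB.swift's FTS5 support. As an illustrative sketch (table and column names here are assumptions, not the app's actual schema):

```swift
import GRDB

// Hedged sketch: an FTS5 virtual table for transcriptions, searchable
// with SQLite's MATCH operator. Uses an in-memory database for brevity.
let dbQueue = try DatabaseQueue()
try dbQueue.write { db in
    try db.create(virtualTable: "transcription", using: FTS5()) { t in
        t.column("text")
    }
    try db.execute(
        sql: "INSERT INTO transcription (text) VALUES (?)",
        arguments: ["Meeting notes for the quarterly review"])
}
let matches = try dbQueue.read { db in
    try Row.fetchAll(
        db,
        sql: "SELECT text FROM transcription WHERE transcription MATCH ?",
        arguments: ["quarterly"])
}
print(matches.count) // 1
```

FTS5 tokenizes the stored text, so searches match whole words quickly even across a large history.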
```bash
# Debug build
swift build

# Run tests
swift test

# Build app bundle
./Scripts/build-app.sh

# Build and launch
./Scripts/build-app.sh --run
```

- Swift 6 with modern concurrency (async/await, actors)
- SwiftUI for all UI components
- AVFoundation for audio capture
- Accelerate/vDSP for signal processing
- WhisperKit for on-device Whisper transcription via Core ML
- GRDB.swift for SQLite storage with full-text search
MIT

