## Comparison
| Feature | Vayu | whisper.cpp | faster-whisper | OpenAI Whisper |
|---|---|---|---|---|
| Platform | macOS (Apple Silicon) | Cross-platform | Cross-platform | Cross-platform |
| Backend | MLX | GGML/Metal/CUDA | CTranslate2/CUDA | PyTorch |
| Speed (Apple Silicon) | 3-5x faster | ~2x faster | N/A (CPU only on Mac) | Baseline |
| Speed (NVIDIA GPU) | N/A | ~2x faster | 4-5x faster | Baseline |
| Language | Python | C/C++ | Python | Python |
| Install | uv pip install / pip install | Build from source | pip install | pip install |
| Python API | Yes | Via bindings | Yes | Yes |
| CLI | Yes | Yes | No | Yes |
| Word timestamps | Yes | Yes | Yes | Yes |
| Batched decoding | Yes | No | Yes | No |
| Quantization | 4-bit, 8-bit | 4-bit, 5-bit, 8-bit | 8-bit, 16-bit | No |
| Speculative decoding | Yes (experimental) | No | No | No |
| Output formats | txt, srt, vtt, tsv, json | txt, srt, vtt, csv | Custom | txt, srt, vtt, tsv, json |
| Models | All Whisper + distil | All Whisper | All Whisper + distil | All Whisper |
Choose Vayu if:
- You have an Apple Silicon Mac (M1/M2/M3/M4)
- You want the fastest possible transcription on macOS
- You need a simple pip install / uv pip install + Python API
- You want batched decoding for throughput
Choose whisper.cpp if:
- You need cross-platform support (Windows/Linux/Mac)
- You want minimal dependencies (pure C++)
- You're deploying on edge devices or embedded systems
- You need CoreML or Metal support on Mac
Choose faster-whisper if:
- You have an NVIDIA GPU
- You need the fastest transcription on Linux/Windows
- You want a Python API with CTranslate2 optimization
Choose OpenAI Whisper if:
- You want the reference implementation
- You need compatibility with existing Whisper code
- Platform performance isn't a priority
On Apple Silicon Macs:

```
Fastest ──────────────────────────── Slowest
Vayu (batched)  >  whisper.cpp  >  OpenAI Whisper
     3-5x             ~2x               1x
```

On NVIDIA GPUs:

```
Fastest ──────────────────────────── Slowest
faster-whisper  >  whisper.cpp  >  OpenAI Whisper
     4-5x             ~2x               1x
```
Batched decoding is Vayu's core innovation: it processes multiple 30-second audio segments in a single forward pass instead of one at a time. This is only efficient on hardware with high parallel compute, and Apple Silicon's unified memory architecture makes it particularly effective.
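A minimal sketch of the batching idea in plain NumPy (this is an illustration, not Vayu's actual implementation): the audio is split into 30-second windows, the last window is zero-padded, and the windows are stacked into one batch so the model can run a single forward pass over all of them.

```python
import numpy as np

SAMPLE_RATE = 16_000                 # Whisper operates on 16 kHz audio
SEGMENT_SAMPLES = 30 * SAMPLE_RATE   # one 30-second window

def batch_segments(audio: np.ndarray) -> np.ndarray:
    """Split audio into 30 s windows, zero-pad the tail,
    and stack them into a single (n_segments, samples) batch."""
    n_segments = int(np.ceil(len(audio) / SEGMENT_SAMPLES))
    padded = np.zeros(n_segments * SEGMENT_SAMPLES, dtype=audio.dtype)
    padded[: len(audio)] = audio
    return padded.reshape(n_segments, SEGMENT_SAMPLES)

# 95 seconds of audio -> 4 segments handled in one forward pass
audio = np.zeros(95 * SAMPLE_RATE, dtype=np.float32)
batch = batch_segments(audio)
print(batch.shape)  # (4, 480000)
```

Sequential decoders instead run the model once per window, so wall-clock time grows linearly with audio length; with batching, the parallel hardware amortizes that cost across segments.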
Vayu is built directly on Apple's MLX framework, which is optimized for Apple Silicon's unified memory, Neural Engine, and GPU. There are no translation layers or compatibility shims.
Speculative decoding is unique to Vayu among these options: a small draft model (e.g., tiny) proposes tokens, and a large model (e.g., large-v3) verifies them in a single pass. This yields a potential additional 2-3x speedup when the draft model is accurate.
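The draft-and-verify loop can be sketched with toy stand-ins for the two models (none of this is Vayu's API; the "models" are plain next-token functions, and a real implementation verifies the whole draft in one batched forward pass):

```python
def speculative_decode(draft_step, target_step, prompt, n_draft=4, max_len=12):
    """Toy speculative decoding: draft n_draft tokens with the cheap
    model, keep the longest prefix the large model agrees with, and
    take one corrected token from the target on disagreement."""
    tokens = list(prompt)
    while len(tokens) < max_len:
        # 1. Draft phase: the small model proposes a run of tokens.
        draft, ctx = [], list(tokens)
        for _ in range(n_draft):
            t = draft_step(ctx)
            draft.append(t)
            ctx.append(t)
        # 2. Verify phase: the large model checks each drafted token.
        ctx = list(tokens)
        for t in draft:
            expected = target_step(ctx)
            if expected != t:
                ctx.append(expected)   # target's correction replaces the draft
                break
            ctx.append(t)              # draft token accepted "for free"
        tokens = ctx
    return tokens[:max_len]

# Toy models: the target counts up by 1; the draft is occasionally wrong.
target = lambda ctx: ctx[-1] + 1
draft = lambda ctx: ctx[-1] + (2 if len(ctx) % 5 == 0 else 1)
out = speculative_decode(draft, target, [0], n_draft=3, max_len=8)
print(out)  # [0, 1, 2, 3, 4, 5, 6, 7]
```

The output always matches what the large model alone would produce; the speedup comes from the large model validating several cheap draft tokens per pass instead of generating one token per pass, which is why accuracy of the draft model governs the gain.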
Vayu offers a one-line install, a simple Python API (LightningWhisperMLX), and a full-featured CLI, with no need to compile from source or configure GPU drivers.