An open-source video subtitle generation and translation tool optimized for Apple Silicon.
- Speech Recognition: MLX-optimized Parakeet TDT 0.6B model for high-accuracy transcription
  - v3: 25 languages (recommended)
  - v2: English only
- ASR Model Manager: Built-in model download with pause/resume support
- Batch Translation: Optimized subtitle translation using OpenAI or Google Translate (see the batching sketch after this list)
  - Reduces API calls from thousands to single digits per video
  - Exponential backoff retry with automatic fallback
- Apple Silicon Optimization: Native MLX framework acceleration for 2x faster processing
- Real-time Progress Tracking: Fine-grained progress updates during processing
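The batching idea is easy to see in code. Below is a minimal sketch assuming the OpenAI Python client; the separator, prompt, and model name are illustrative and not the project's actual implementation. All subtitle lines are sent in one request and the response is split back into per-line translations, which is why a whole video needs only a handful of API calls.

```python
# Illustrative sketch of batched subtitle translation; the separator, model name,
# and function are assumptions, not the project's actual API.
from openai import OpenAI

SEP = "@@@"  # hypothetical separator unlikely to appear in subtitle text

def translate_batch(lines: list[str], target_lang: str) -> list[str]:
    """Translate all subtitle lines of a video in a single API call."""
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    prompt = (
        f"Translate each segment into {target_lang}. "
        f"Keep every '{SEP}' separator and the segment order unchanged.\n\n"
        + f"\n{SEP}\n".join(lines)
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model; the app may use a different one
        messages=[{"role": "user", "content": prompt}],
    )
    text = response.choices[0].message.content
    # One request in, one response out: split it back into per-line translations.
    return [segment.strip() for segment in text.split(SEP)]
```

The main risk with a scheme like this is the separator leaking into normal subtitle text; the real implementation may segment the batch differently.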
- macOS (Apple Silicon recommended for MLX acceleration)
- Python 3.13+
- ffmpeg:

```bash
brew install ffmpeg
```
```bash
conda create -n video python=3.13
conda activate video
pip install -r requirements.txt
```

```bash
# Run from source
python src/main.py

# Build .app bundle
bash main.sh
```

- Recommended: Apple Silicon Mac with 16GB+ RAM
- Minimum: 8GB RAM
- Storage: ~1.2GB for ASR model (first-time download)
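The ~1.2GB first-time download is the kind of transfer that benefits from pause/resume. Here is a rough sketch of cache detection plus a resumable download; the URL, cache directory, and file names are placeholders, not the project's actual locations.

```python
# Illustrative sketch of a cached, resumable model download; the URL and cache
# directory are placeholders, not the project's real locations.
import urllib.request
from pathlib import Path

MODEL_URL = "https://example.com/parakeet-tdt-0.6b.tar"  # placeholder URL
CACHE_DIR = Path.home() / "Library/Application Support/videoCaptioner/models"  # assumed cache dir

def download_model(filename: str = "parakeet-tdt-0.6b.tar") -> Path:
    CACHE_DIR.mkdir(parents=True, exist_ok=True)
    dest = CACHE_DIR / filename
    part = CACHE_DIR / (filename + ".part")

    if dest.exists():
        return dest  # cache hit: skip the ~1.2GB download entirely

    resume_from = part.stat().st_size if part.exists() else 0
    request = urllib.request.Request(MODEL_URL)
    if resume_from:
        # The HTTP Range header asks the server to continue where the last
        # attempt stopped (assumes the server honors Range requests).
        request.add_header("Range", f"bytes={resume_from}-")

    with urllib.request.urlopen(request) as response, open(part, "ab") as out:
        while chunk := response.read(1 << 20):  # stream in 1 MiB chunks
            out.write(chunk)

    part.rename(dest)  # only treat the model as cached once fully written
    return dest
```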
- Video Pipeline: Sequential processing with progress callbacks
- Speech Recognition: Parakeet TDT 0.6B model with MLX framework acceleration
- Translation Pipeline: Batch processing with exponential backoff retry and automatic fallback
- ASR Model Manager: Version management, download with pause/resume, cache detection
- Configuration: Platform-aware defaults with persistent user settings
- Exponential backoff retry for transient failures
- Automatic Google Translate fallback when OpenAI fails (see the retry sketch below)
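A minimal sketch of that retry policy follows; the translation backends are passed in as callables because the project's real function names are not shown here. Each failed attempt waits exponentially longer, and once the retries are exhausted the batch is handed to the fallback translator.

```python
# Illustrative retry-with-backoff wrapper, not the project's actual code.
import time
from collections.abc import Callable

def translate_with_retry(
    batch: list[str],
    primary: Callable[[list[str]], list[str]],   # e.g. an OpenAI-backed translator
    fallback: Callable[[list[str]], list[str]],  # e.g. a Google Translate-backed translator
    *,
    max_retries: int = 3,
    enable_google_fallback: bool = True,
) -> list[str]:
    delay = 1.0
    for attempt in range(max_retries):
        try:
            return primary(batch)
        except Exception:
            if attempt == max_retries - 1:
                break
            time.sleep(delay)   # exponential backoff: 1s, 2s, 4s, ...
            delay *= 2
    if enable_google_fallback:
        return fallback(batch)  # automatic Google Translate fallback
    raise RuntimeError("translation failed after retries; fallback disabled")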
Configuration file location: ~/Library/Application Support/videoCaptioner/config.json
Key settings:
- model_version: ASR model version (v3 or v2)
- max_retries: Retry attempts for failed translations (default: 3)
- enable_google_fallback: Automatic fallback to Google Translate (default: true)
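With the defaults listed above, a representative config.json could look like the following (the real file may contain additional keys):

```json
{
  "model_version": "v3",
  "max_retries": 3,
  "enable_google_fallback": true
}
```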
Note: This project was developed as a vibe coding experiment in collaboration with Claude Code.