itchat/videoCaptioner

Video Captioner

An open-source video subtitle generation and translation tool optimized for Apple Silicon.

Features

  • Speech Recognition: MLX-optimized Parakeet TDT 0.6B model for high-accuracy transcription
    • v3: 25 languages (recommended)
    • v2: English only
  • ASR Model Manager: Built-in model download with pause/resume support
  • Batch Translation: Optimized subtitle translation using OpenAI or Google Translate
    • Reduces API calls from thousands to single digits per video
    • Exponential backoff retry with automatic fallback
  • Apple Silicon Optimization: Native MLX framework acceleration for 2x faster processing
  • Real-time Progress Tracking: Fine-grained progress updates during processing
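
To illustrate how batching can cut API calls from one per subtitle line to a handful per video, here is a minimal sketch. It is not the project's actual implementation: the function name `batch_subtitles` and the `max_chars` budget are hypothetical, and the real batching criteria may differ.

```python
from typing import Iterator, List


def batch_subtitles(lines: List[str], max_chars: int = 2000) -> Iterator[List[str]]:
    """Group subtitle lines into batches whose combined length stays within max_chars.

    Hypothetical helper: each yielded batch would be sent as one translation request.
    """
    batch: List[str] = []
    size = 0
    for line in lines:
        # Close the current batch when adding this line would exceed the budget.
        if batch and size + len(line) > max_chars:
            yield batch
            batch, size = [], 0
        batch.append(line)
        size += len(line)
    if batch:
        # Emit whatever is left over after the last full batch.
        yield batch


if __name__ == "__main__":
    subtitles = ["a" * 10] * 5
    for group in batch_subtitles(subtitles, max_chars=25):
        print(len(group), "lines in this request")
```

A line longer than `max_chars` still goes out alone in its own batch, so no subtitle is dropped.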

Installation

Prerequisites

  • macOS (Apple Silicon recommended for MLX acceleration)
  • Python 3.13+
  • ffmpeg (install with Homebrew: brew install ffmpeg)

Setup

conda create -n video python=3.13
conda activate video
pip install -r requirements.txt

Usage

# Run from source
python src/main.py

# Build .app bundle
bash main.sh

System Requirements

  • Recommended: Apple Silicon Mac with 16GB+ RAM
  • Minimum: 8GB RAM
  • Storage: ~1.2GB for ASR model (first-time download)

Architecture

Core Components

  • Video Pipeline: Sequential processing with progress callbacks
  • Speech Recognition: Parakeet TDT 0.6B model with MLX framework acceleration
  • Translation Pipeline: Batch processing with exponential backoff retry and automatic fallback
  • ASR Model Manager: Version management, download with pause/resume, cache detection
  • Configuration: Platform-aware defaults with persistent user settings
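
As a rough illustration of sequential processing with progress callbacks, the sketch below runs a list of stages in order and reports a completion fraction after each one. The names `run_pipeline` and `on_progress`, and the stage names in the usage example, are assumptions made for illustration and are not taken from the project's source.

```python
from typing import Any, Callable, List, Tuple

# A stage is a (name, function) pair; each function takes the previous stage's output.
Stage = Tuple[str, Callable[[Any], Any]]


def run_pipeline(
    data: Any,
    stages: List[Stage],
    on_progress: Callable[[str, float], None],
) -> Any:
    """Run stages one after another, calling on_progress(name, fraction) after each."""
    total = len(stages)
    for index, (name, stage) in enumerate(stages, start=1):
        data = stage(data)
        on_progress(name, index / total)
    return data


if __name__ == "__main__":
    # Placeholder stages standing in for audio extraction, transcription, translation.
    stages = [
        ("extract_audio", lambda path: path + ".wav"),
        ("transcribe", lambda audio: ["hello", "world"]),
        ("translate", lambda lines: [line.upper() for line in lines]),
    ]
    result = run_pipeline(
        "video.mp4", stages, lambda name, frac: print(f"{name}: {frac:.0%}")
    )
    print(result)
```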

Error Handling

  • Exponential backoff retry for transient failures
  • Automatic Google Translate fallback when OpenAI fails
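
The two behaviors above can be combined in one small wrapper. The following is a minimal sketch under stated assumptions: `translate_with_retry`, `primary`, `fallback`, and `base_delay` are hypothetical names, and the project's real retry policy (delays, which errors count as transient) is not documented here.

```python
import time
from typing import Callable, Optional


def translate_with_retry(
    text: str,
    primary: Callable[[str], str],
    fallback: Optional[Callable[[str], str]] = None,
    max_retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Try primary up to max_retries times with exponential backoff, then fall back."""
    last_error: Optional[Exception] = None
    for attempt in range(max_retries):
        try:
            return primary(text)
        except Exception as exc:  # a real implementation would catch narrower errors
            last_error = exc
            if attempt < max_retries - 1:
                # Delay doubles each time: base_delay, 2 * base_delay, 4 * base_delay, ...
                sleep(base_delay * 2 ** attempt)
    if fallback is not None:
        # Primary service exhausted its retries; hand over to the fallback service.
        return fallback(text)
    raise RuntimeError("translation failed after retries") from last_error
```

Here `primary` would wrap the OpenAI call and `fallback` the Google Translate call. Passing `sleep` as a parameter keeps the function easy to test without real delays.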

Configuration

Configuration file location: ~/Library/Application Support/videoCaptioner/config.json

Key settings:

  • model_version: ASR model version (v3 or v2)
  • max_retries: Retry attempts for failed translations (default: 3)
  • enable_google_fallback: Automatic fallback to Google Translate (default: true)
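
For reference, a script could read these settings roughly as follows. This is a sketch, not the application's own loader: `load_config` is a hypothetical helper, and the `model_version` default of `v3` is an assumption based on v3 being the recommended model (the other two defaults are the ones listed above).

```python
import json
from pathlib import Path
from typing import Any, Dict, Union

CONFIG_PATH = (
    Path.home() / "Library" / "Application Support" / "videoCaptioner" / "config.json"
)

DEFAULTS: Dict[str, Any] = {
    "model_version": "v3",  # assumed default; the README only says v3 is recommended
    "max_retries": 3,
    "enable_google_fallback": True,
}


def load_config(path: Union[str, Path] = CONFIG_PATH) -> Dict[str, Any]:
    """Return the defaults overlaid with whatever the config file provides."""
    config = dict(DEFAULTS)
    try:
        config.update(json.loads(Path(path).read_text(encoding="utf-8")))
    except FileNotFoundError:
        # No user settings saved yet: fall back to defaults only.
        pass
    return config


if __name__ == "__main__":
    print(load_config())
```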

Note: This project was developed as a vibe coding experiment in collaboration with Claude Code.

About

A bilingual video subtitle generator for macOS.
