
Output Formats

Behnam Ebrahimi edited this page Mar 29, 2026 · 1 revision

Vayu can write transcription results in five formats: plain text, SRT, WebVTT, TSV, and JSON.

Plain Text (.txt)

Simple text output with one segment per line.

vayu audio.mp3 --output-format txt

Hello, this is a test recording.
The weather today is sunny.

SRT (.srt)

SubRip subtitle format. Widely supported by video players and editors.

vayu audio.mp3 --output-format srt

1
00:00:00,000 --> 00:00:03,200
Hello, this is a test recording.

2
00:00:03,500 --> 00:00:06,100
The weather today is sunny.

With Word Highlighting

vayu audio.mp3 --output-format srt --word-timestamps True --highlight-words True

Words are underlined as they are spoken using <u> tags.
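
To make the highlighting idea concrete, here is a minimal Python sketch that builds one text frame per word, with only the active word wrapped in <u> tags. The function name highlight_frames is hypothetical, and the exact cue layout and timing Vayu emits are not shown on this page, so treat this as an illustration of the tagging only.

```python
def highlight_frames(words):
    """Return one line per word, underlining only the word being spoken.

    Hypothetical helper: it shows the <u> tagging idea and does not
    reproduce Vayu's actual cue timing or layout.
    """
    frames = []
    for active in range(len(words)):
        frames.append(
            " ".join(
                f"<u>{word}</u>" if index == active else word
                for index, word in enumerate(words)
            )
        )
    return frames


for frame in highlight_frames(["Hello,", "this", "is"]):
    print(frame)
```

In a real subtitle file each frame would sit in its own cue whose time range matches the spoken word.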

WebVTT (.vtt)

Web Video Text Tracks format. Used for HTML5 video subtitles.

vayu audio.mp3 --output-format vtt

WEBVTT

00:00:00.000 --> 00:00:03.200
Hello, this is a test recording.

00:00:03.500 --> 00:00:06.100
The weather today is sunny.
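
The SRT and WebVTT samples differ in one detail that is easy to miss: SRT separates milliseconds with a comma, WebVTT with a period. The sketch below formats a time in seconds both ways. The helper name format_timestamp is an assumption made for illustration and is not necessarily how Vayu implements it.

```python
def format_timestamp(seconds: float, decimal_marker: str = ".") -> str:
    """Format a time in seconds as HH:MM:SS<marker>mmm."""
    total_ms = round(seconds * 1000)
    hours, remainder = divmod(total_ms, 3_600_000)
    minutes, remainder = divmod(remainder, 60_000)
    secs, millis = divmod(remainder, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{decimal_marker}{millis:03d}"


# SRT uses a comma before the milliseconds, WebVTT uses a period.
srt_cue = f"{format_timestamp(3.5, ',')} --> {format_timestamp(6.1, ',')}"
vtt_cue = f"{format_timestamp(3.5)} --> {format_timestamp(6.1)}"
print(srt_cue)
print(vtt_cue)
```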

TSV (.tsv)

Tab-separated values. Easy to import into spreadsheets and databases.

vayu audio.mp3 --output-format tsv

start	end	text
0	3200	Hello, this is a test recording.
3500	6100	The weather today is sunny.

Timestamps are in milliseconds.
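
As a sketch of how the TSV output could be consumed, the following reads the sample above with Python's standard csv module and converts the millisecond columns to seconds. The function name read_tsv_segments is hypothetical, and the sample is inlined so the snippet runs without a file.

```python
import csv
import io

# The sample TSV from above, inlined so the example is self-contained.
SAMPLE_TSV = (
    "start\tend\ttext\n"
    "0\t3200\tHello, this is a test recording.\n"
    "3500\t6100\tThe weather today is sunny.\n"
)


def read_tsv_segments(handle):
    """Parse TSV rows into dicts with start/end converted to seconds."""
    # QUOTE_NONE keeps quotation marks in the transcript text untouched.
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    return [
        {
            "start": int(row["start"]) / 1000,
            "end": int(row["end"]) / 1000,
            "text": row["text"],
        }
        for row in reader
    ]


segments = read_tsv_segments(io.StringIO(SAMPLE_TSV))
for segment in segments:
    print(segment["start"], segment["end"], segment["text"])
```

For a real file, pass open("audio.tsv", newline="", encoding="utf-8") instead of the StringIO object.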

JSON (.json)

Full metadata including segments, tokens, and probabilities, plus word timestamps when they are enabled.

vayu audio.mp3 --output-format json

{
  "text": "Hello, this is a test recording. The weather today is sunny.",
  "segments": [
    {
      "id": 0,
      "seek": 0,
      "start": 0.0,
      "end": 3.2,
      "text": " Hello, this is a test recording.",
      "tokens": [50364, 2425, 11],
      "temperature": 0.0,
      "avg_logprob": -0.25,
      "compression_ratio": 1.3,
      "no_speech_prob": 0.02
    }
  ],
  "language": "en"
}
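
One way to use the per-segment fields is to drop segments that are probably silence. The sketch below loads a trimmed copy of the JSON above and keeps segments whose no_speech_prob is under a threshold. The function name speech_segments and the 0.5 cutoff are assumptions chosen for illustration, not Vayu defaults.

```python
import json

# A trimmed copy of the sample output above, inlined for a runnable example.
RAW = """
{
  "text": "Hello, this is a test recording.",
  "segments": [
    {"id": 0, "start": 0.0, "end": 3.2,
     "text": " Hello, this is a test recording.",
     "avg_logprob": -0.25, "no_speech_prob": 0.02}
  ],
  "language": "en"
}
"""


def speech_segments(result, max_no_speech_prob=0.5):
    """Keep segments unlikely to be silence (threshold is illustrative)."""
    return [
        segment
        for segment in result["segments"]
        if segment["no_speech_prob"] <= max_no_speech_prob
    ]


result = json.loads(RAW)
kept = speech_segments(result)
for segment in kept:
    # Segment text carries a leading space in the sample, so strip it.
    print(segment["id"], segment["text"].strip())
```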

All Formats

Generate all formats at once:

vayu audio.mp3 --output-format all --output-dir ./transcripts

This creates audio.txt, audio.srt, audio.vtt, audio.tsv, and audio.json in the output directory.
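
A script that runs the command above may want to confirm that every file was written. The following sketch derives the expected paths from the audio file name, based on the naming described here. The helper expected_outputs is hypothetical.

```python
from pathlib import Path

# The five extensions listed on this page.
FORMATS = ("txt", "srt", "vtt", "tsv", "json")


def expected_outputs(audio_path, output_dir):
    """Return the paths --output-format all is described as creating."""
    stem = Path(audio_path).stem
    return [Path(output_dir) / f"{stem}.{ext}" for ext in FORMATS]


paths = expected_outputs("audio.mp3", "./transcripts")
missing = [path for path in paths if not path.exists()]
print("missing:", [path.name for path in missing])
```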

Subtitle Formatting Options

These options apply to SRT and VTT outputs:

Option                 Description
--max-line-width       Maximum characters per subtitle line
--max-line-count       Maximum number of lines per subtitle entry
--max-words-per-line   Maximum words per line
--highlight-words      Underline the active word (requires --word-timestamps True)

# Compact subtitles for mobile screens
vayu audio.mp3 -f srt --max-line-width 30 --max-line-count 2

# Word-by-word highlighting
vayu audio.mp3 -f vtt --word-timestamps True --highlight-words True
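
To get a feel for how --max-line-width and --max-line-count interact, here is a rough approximation using Python's textwrap module. It is not Vayu's algorithm, which this page does not describe, and it ignores timing. The function name wrap_subtitle is hypothetical.

```python
import textwrap


def wrap_subtitle(text, max_line_width=30, max_line_count=2):
    """Split text into subtitle entries of at most max_line_count lines,
    each line at most max_line_width characters (approximation only)."""
    lines = textwrap.wrap(text.strip(), width=max_line_width)
    return [
        "\n".join(lines[start:start + max_line_count])
        for start in range(0, len(lines), max_line_count)
    ]


entries = wrap_subtitle("Hello, this is a test recording.")
for entry in entries:
    print(entry)
    print("--")
```

With a width of 30, the 32-character sample sentence breaks before "recording." and still fits in a single two-line entry.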

Programmatic Output

In Python, use the writer classes directly:

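# Only WriteSRT is used below; WriteVTT and WriteJSON are imported for reference.
from whisper_mlx import WriteSRT, WriteVTT, WriteJSON

# `whisper` is assumed to be a model object created earlier in your script;
# this page does not show how it is constructed.
result = whisper.transcribe("audio.mp3")

# Write an SRT file; the writer takes the output directory as its argument
writer = WriteSRT("./output")
# The second argument supplies the base name of the output file
writer(result, "my_audio", highlight_words=False)
# Creates: ./output/my_audio.srt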
