Output Formats

Vayu supports five output formats for transcription results.

Plain Text (`.txt`)

Simple text output with one segment per line.

vayu audio.mp3 --output-format txt

Hello, this is a test recording.
The weather today is sunny.

SRT (`.srt`)

SubRip subtitle format. Widely supported by video players and editors.

vayu audio.mp3 --output-format srt

1
00:00:00,000 --> 00:00:03,200
Hello, this is a test recording.

2
00:00:03,500 --> 00:00:06,100
The weather today is sunny.

With Word Highlighting

vayu audio.mp3 --output-format srt --word-timestamps True --highlight-words True

Each word is underlined with `<u>` tags while it is being spoken.
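If a downstream tool cannot render `<u>` tags, they can be removed after the fact. The following is a minimal sketch; the cue text is written by hand for illustration, and the exact layout of the cues Vayu emits is an assumption here.

```python
import re

# Matches the text wrapped in a single <u>...</u> pair.
UNDERLINE = re.compile(r"<u>(.*?)</u>")

def highlighted_word(cue_text):
    """Return the underlined word in a cue, or None if nothing is underlined."""
    match = UNDERLINE.search(cue_text)
    return match.group(1) if match else None

def strip_highlight(cue_text):
    """Remove <u> tags, leaving the plain subtitle text."""
    return UNDERLINE.sub(r"\1", cue_text)

# Hypothetical cue text, not copied from real Vayu output.
cue = "Hello, this <u>is</u> a test recording."
print(highlighted_word(cue))
print(strip_highlight(cue))
```

The two helper names are illustrative only and are not part of Vayu.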

WebVTT (`.vtt`)

Web Video Text Tracks format. Used for HTML5 video subtitles.

vayu audio.mp3 --output-format vtt

WEBVTT

00:00:00.000 --> 00:00:03.200
Hello, this is a test recording.

00:00:03.500 --> 00:00:06.100
The weather today is sunny.
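Comparing the SRT and WebVTT examples, the timestamps differ only in the decimal marker: SRT uses a comma and WebVTT uses a period. The sketch below formats a time in seconds both ways. `format_timestamp` is a hypothetical helper written for this page, not a Vayu function.

```python
def format_timestamp(seconds, decimal_marker):
    """Format a time in seconds as HH:MM:SS<marker>mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{decimal_marker}{ms:03d}"

print(format_timestamp(3.2, ","))  # SRT style:    00:00:03,200
print(format_timestamp(3.2, "."))  # WebVTT style: 00:00:03.200
```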

TSV (`.tsv`)

Tab-separated values. Easy to import into spreadsheets and databases.

vayu audio.mp3 --output-format tsv

start	end	text
0	3200	Hello, this is a test recording.
3500	6100	The weather today is sunny.

The `start` and `end` columns are integer timestamps in milliseconds.
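Because the timestamps are in milliseconds, they usually need converting before they can be compared with the seconds used in the other formats. A minimal sketch using only the standard library follows. The sample data is the example above, and `read_segments` is a hypothetical helper name.

```python
import csv
import io

# Sample content copied from the example above; in practice, open audio.tsv instead.
sample = (
    "start\tend\ttext\n"
    "0\t3200\tHello, this is a test recording.\n"
    "3500\t6100\tThe weather today is sunny.\n"
)

def read_segments(handle):
    """Yield (start_seconds, end_seconds, text) tuples from a TSV file object."""
    # QUOTE_NONE keeps any quotation marks in the transcript text untouched.
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        yield int(row["start"]) / 1000, int(row["end"]) / 1000, row["text"]

segments = list(read_segments(io.StringIO(sample)))
for start, end, text in segments:
    print(f"{start:.1f}s to {end:.1f}s: {text}")
```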

JSON (`.json`)

Full metadata including segments, tokens, probabilities, and word timestamps.

vayu audio.mp3 --output-format json

{
  "text": "Hello, this is a test recording. The weather today is sunny.",
  "segments": [
    {
      "id": 0,
      "seek": 0,
      "start": 0.0,
      "end": 3.2,
      "text": " Hello, this is a test recording.",
      "tokens": [50364, 2425, 11],
      "temperature": 0.0,
      "avg_logprob": -0.25,
      "compression_ratio": 1.3,
      "no_speech_prob": 0.02
    }
  ],
  "language": "en"
}
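The per-segment fields make it possible to post-process a transcript, for example by dropping segments that are probably silence. The sketch below loads a trimmed copy of the example above and filters on `no_speech_prob`. The threshold of 0.5 is chosen for illustration only and is not a Vayu default.

```python
import json

# Trimmed copy of the example above; in practice, use json.load() on audio.json.
raw = """
{
  "text": "Hello, this is a test recording.",
  "segments": [
    {
      "id": 0,
      "start": 0.0,
      "end": 3.2,
      "text": " Hello, this is a test recording.",
      "avg_logprob": -0.25,
      "no_speech_prob": 0.02
    }
  ],
  "language": "en"
}
"""
result = json.loads(raw)

def speech_segments(result, max_no_speech_prob=0.5):
    """Keep segments whose no_speech_prob is at or below the (illustrative) threshold."""
    return [s for s in result["segments"] if s["no_speech_prob"] <= max_no_speech_prob]

for seg in speech_segments(result):
    # Segment text carries a leading space in the example, so strip it for display.
    print(f'[{seg["start"]:.1f} - {seg["end"]:.1f}] {seg["text"].strip()}')
```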

All Formats

Generate all formats at once:

vayu audio.mp3 --output-format all --output-dir ./transcripts

This creates `audio.txt`, `audio.srt`, `audio.vtt`, `audio.tsv`, and `audio.json` in the output directory.
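A script that runs after the command can check for these files by deriving their names from the input file name. This is a small sketch based on the naming shown above. `expected_outputs` is a hypothetical helper, not part of Vayu.

```python
from pathlib import Path

# The five extensions listed on this page.
EXTENSIONS = ("txt", "srt", "vtt", "tsv", "json")

def expected_outputs(audio_path, output_dir):
    """Return the output paths implied by the input file's stem."""
    stem = Path(audio_path).stem
    return [Path(output_dir) / f"{stem}.{ext}" for ext in EXTENSIONS]

for path in expected_outputs("audio.mp3", "./transcripts"):
    status = "found" if path.exists() else "missing"
    print(f"{path}: {status}")
```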

Subtitle Formatting Options

These options apply to SRT and VTT outputs:

Option	Description
`--max-line-width`	Maximum characters per subtitle line
`--max-line-count`	Maximum number of lines per subtitle entry
`--max-words-per-line`	Maximum words per line
`--highlight-words`	Underline the active word (requires `--word-timestamps True`)

# Compact subtitles for mobile screens
vayu audio.mp3 -f srt --max-line-width 30 --max-line-count 2

# Word-by-word highlighting
vayu audio.mp3 -f vtt --word-timestamps True --highlight-words True
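To get a feel for what `--max-line-width 30 --max-line-count 2` means for a given sentence, the limits can be approximated with the standard `textwrap` module. This only illustrates the two constraints. It is not Vayu's line-breaking algorithm, which may split text at different points, and `wrap_into_cues` is a hypothetical name.

```python
import textwrap

def wrap_into_cues(text, max_line_width, max_line_count):
    """Wrap text to a width, then group the lines into cues of limited height."""
    lines = textwrap.wrap(text, width=max_line_width)
    return [
        lines[i:i + max_line_count]
        for i in range(0, len(lines), max_line_count)
    ]

text = "Hello, this is a test recording. The weather today is sunny."
cues = wrap_into_cues(text, max_line_width=30, max_line_count=2)
for number, cue in enumerate(cues, start=1):
    print(number)
    print("\n".join(cue))
    print()
```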

Programmatic Output

In Python, use the writer classes directly:

from whisper_mlx import WriteSRT, WriteVTT, WriteJSON

# After transcription
# Assumes `whisper` is already bound to the module that provides transcribe().
result = whisper.transcribe("audio.mp3")

# Write SRT file
writer = WriteSRT("./output")
writer(result, "my_audio", highlight_words=False)
# Creates: ./output/my_audio.srt
