Skip to content

thatgeeman/whisperx-transcribe

Repository files navigation

Notes

This is still a WIP as of 10.08.2025

WhisperX Transcription for Notetaking maniacs and Planners.

Usage

After cloning the repo and setting up the env: pip install . to install wxt command.

For sample audio placed in assets/sample

wxt assets/sample/audio.mp3

For other supported options see wxt --help.

Models

  • Summarization model: Any available on Ollama (developed with gemma3:4b)
  • Transcription model: WhisperX Large v3
  • Diarization model:

In retrieval mode, based on MTEB ranking:

  • Text embedding model: Qwen3-Embedding-0.6B

Info

Cluster backend at work does not support float16 computation (Quadro P4000)

Environment Setup

Python 3.10.

Install ffmpeg, rust, cudnn=8.9.7 (faster-whisper-large-v3 looks for `libcudnn_ops_infer.so.8).

Setup ollama based on latest instructions from https://github.com/ollama/ollama/tree/main/docs

See pyproject.toml for python dependencies.

HuggingFace

Accept terms for

Add huggingface token to .env file:

MY_TOKEN=hf_xxx

Alternatives

Replicate provides inference. See colab.

Freemium (referral): Otter.ai

Video to mp3

To strip audio (mp3) from video file, you can use the following command:

ffmpeg -i input_video.mp4 -f mp3 -vn -ar 44100 output_audio.mp3
  • -vn disables video recording, -ar sets the audio sample rate.
  • -f mp3 specifies the output format as mp3.

Troubleshooting

  • ctranslate2: ImportError: libctranslate2-d3638643.so.4.4.0: cannot enable executable stack as shared object requires: Invalid argument shared object error. Fixed this issue with this on Manjaro-xfce.

  • If the huggingface_hub download takes longer, found that it was easier to just clone the repo, for example with their cli: hf download Systran/faster-whisper-large-v3 and place it in the cache_dir, model/ in this case. The faster-whisper also depends on an older huggingface_hub version that does not come with xet. The download only appears slower due to an inner loop with tqdm, which sometimes does not appear to update the outer, main progress bar -- probably related to xet's chunked downloads.

About

Experiments with whisperX

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published