This project processes videos by extracting audio, detecting speech segments, transcribing with Whisper, removing duplicate phrases, and compiling a final edited video.
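Transcription relies on Whisper; as a quick illustration of that stage only (a sketch, not the project's actual code), the snippet below transcribes an already-extracted audio file using the openai-whisper package. The `"base"` model size and the `audio.wav` path are placeholders.

```python
# Minimal sketch of the transcription stage; "base" and "audio.wav" are
# placeholders, not necessarily what main.py uses.
import whisper

model = whisper.load_model("base")      # downloads the model weights on first use
result = model.transcribe("audio.wav")  # returns full text plus timestamped segments
for seg in result["segments"]:
    print(f"{seg['start']:.2f}s-{seg['end']:.2f}s: {seg['text'].strip()}")
```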
- Python 3.6 or higher
- ffmpeg must be installed and available on your system PATH (the script will fall back to imageio-ffmpeg if ffmpeg is not found).
- Git for cloning the repository.
- Clone the repository:
  `git clone git@github.com:justinkahrs/video-supercut.git`
- Navigate into the project directory:
  `cd video-supercut`
- Create and activate a virtual environment:
  `python3 -m venv venv`
  `source ./venv/bin/activate`
- Install dependencies:
  `pip install -r requirements.txt`
- `NOISE`: The noise level environment variable (`low`, `medium`, or `high`). The script measures the audio's average RMS in dBFS and adjusts the silence threshold accordingly.
- `MIN_SPEECH_DURATION`: The minimum duration (in milliseconds) a segment must last to be considered speech.
- `MARGIN_DURATION`: An extra margin (in milliseconds) added before and after detected speech segments to ensure that speech is fully captured.
- `RAW_FOLDER`: The directory where raw `.mp4` video files should be placed.
- `EDITED_FOLDER`: The directory where the final edited videos will be saved.
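To see how these variables might be consumed, here is a minimal sketch (not the script's actual code). The default values, the pydub usage, and the per-level offsets are assumptions; `raw` and `edited_videos` match the folder names used elsewhere in this README.

```python
# Illustrative sketch of reading the configuration; defaults and the pydub
# usage are assumptions, not necessarily what main.py does.
import os
from pydub import AudioSegment

NOISE = os.environ.get("NOISE", "medium").lower()
MIN_SPEECH_DURATION = int(os.environ.get("MIN_SPEECH_DURATION", "300"))  # ms
MARGIN_DURATION = int(os.environ.get("MARGIN_DURATION", "200"))          # ms
RAW_FOLDER = os.environ.get("RAW_FOLDER", "raw")
EDITED_FOLDER = os.environ.get("EDITED_FOLDER", "edited_videos")

# Hypothetical offsets (dB below the clip's average RMS) per noise level.
OFFSETS = {"low": 16, "medium": 12, "high": 8}

def silence_threshold(audio: AudioSegment) -> float:
    """Derive a silence threshold from the audio's average RMS in dBFS."""
    return audio.dBFS - OFFSETS.get(NOISE, OFFSETS["medium"])
```

The variables can also be set inline when running the script, for example `NOISE=high MIN_SPEECH_DURATION=500 python3 main.py`.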
- Add Video Files:
  - `mkdir raw`
  - Place your `.mp4` video files in the `raw` folder.
- Run the Script:
  `python3 main.py`
  The script will process the videos and output the edited videos to the `edited_videos` folder.
- The first run may download the necessary Whisper model weights.
- Temporary files are stored in the system's temporary directory.
- Ensure that ffmpeg is installed and configured correctly on your system.
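If you are unsure which ffmpeg binary will be picked up, a quick check like the one below can help; the imageio-ffmpeg fallback mirrors the behavior described in the requirements note, but treat this as a sketch rather than the script's exact logic.

```python
# Check whether ffmpeg is on PATH, falling back to the binary provided by
# imageio-ffmpeg (the same fallback mentioned in the requirements).
import shutil

ffmpeg = shutil.which("ffmpeg")
if ffmpeg is None:
    import imageio_ffmpeg
    ffmpeg = imageio_ffmpeg.get_ffmpeg_exe()  # path to the bundled ffmpeg binary
print(f"Using ffmpeg at: {ffmpeg}")
```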