isurulucky/whisper_based_sinhala_asr

This repo contains sample code for fine-tuning an OpenAI Whisper (https://arxiv.org/abs/2212.04356) model on the Sinhala-language audio and transcriptions available at https://openslr.org/52/.

  • The audio_folder_creation.py script creates a metadata file named metadata.csv in the same location as a single unzipped data directory (e.g. asr_sinhala_0.zip). The file containing the utterances, utt_spk_text.tsv, should be copied to this location; a sketch of this step appears after the list below.
  • The whisper_sinhala_fine_tuning.ipynb Python notebook can be used to fine-tune the model. This notebook has been tested in Google Colab, and expects the resulting data from the above step to be available as a data.zip file in Google Drive.
  • Once unzipped, the notebook expects the data/ directory to contain the following:
    data/
    |---- train
    |        |---- *.flac
    |        |---- metadata.csv
    |---- test
             |---- *.flac
             |---- metadata.csv
  • The metadata.csv file maps each audio file name to its transcription; see the loading sketch after this list:
    file_name,transcription
    010009989d.flac,හෝටල්වල ගිනි මැල හදනවා.
    010062fad4.flac,මරණින් මතු පැවැත්ම
  • This notebook is a modified version of the Hugging Face tutorial on fine-tuning Whisper: https://huggingface.co/blog/fine-tune-whisper. Please follow the original Hugging Face blog post for installing the required Python dependencies.
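A minimal sketch of the metadata-creation step is shown below. It is not the repository script itself; the column order of utt_spk_text.tsv (utterance id, speaker id, transcription) and the create_metadata helper name are assumptions made for illustration:

    # Sketch: build metadata.csv from utt_spk_text.tsv inside one unzipped
    # data directory (e.g. the contents of asr_sinhala_0.zip).
    import csv
    import os
    import sys

    def create_metadata(data_dir: str) -> None:
        tsv_path = os.path.join(data_dir, "utt_spk_text.tsv")
        out_path = os.path.join(data_dir, "metadata.csv")
        with open(tsv_path, encoding="utf-8") as tsv, \
             open(out_path, "w", encoding="utf-8", newline="") as out:
            writer = csv.writer(out)
            writer.writerow(["file_name", "transcription"])
            # Assumed TSV layout: <utterance id>\t<speaker id>\t<transcription>
            for utt_id, _speaker_id, text in csv.reader(tsv, delimiter="\t"):
                flac_name = f"{utt_id}.flac"
                # Keep only rows whose audio file is actually present.
                if os.path.exists(os.path.join(data_dir, flac_name)):
                    writer.writerow([flac_name, text.strip()])

    if __name__ == "__main__":
        create_metadata(sys.argv[1])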
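Given the directory layout and metadata.csv format above, the unzipped data/ directory can be loaded with the Hugging Face datasets "audiofolder" loader. This is a sketch of that loading step, not a copy of the notebook; the 16 kHz resampling follows the linked blog post:

    # Sketch: load data/ (with train/ and test/ splits) as an audiofolder
    # dataset; metadata.csv links each .flac file to its transcription.
    from datasets import Audio, load_dataset

    dataset = load_dataset("audiofolder", data_dir="data")
    # Whisper feature extractors expect 16 kHz audio, so resample on the fly.
    dataset = dataset.cast_column("audio", Audio(sampling_rate=16_000))
    print(dataset["train"][0]["transcription"])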

The losses and WER for a fine-tuning run of 1000 steps, using a single dataset archive (asr_sinhala_0.zip) from https://openslr.org/52/, are shown below:

Screenshot: training loss and WER over the 1000 fine-tuning steps.
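The WER figure reported there can be computed as in the Hugging Face blog post the notebook is based on, using the evaluate library; a minimal illustration with dummy strings:

    # Sketch: word error rate computed with the evaluate library, as in the
    # blog post the notebook follows (strings here are illustrative only).
    import evaluate

    wer_metric = evaluate.load("wer")
    predictions = ["හෝටල්වල ගිනි මැල හදනවා"]
    references = ["හෝටල්වල ගිනි මැල හදනවා."]
    wer = 100 * wer_metric.compute(predictions=predictions, references=references)
    print(f"WER: {wer:.2f}%")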
