Is your feature request related to a problem?
Once STT evaluation results are populated, there are no automated quality metrics. Reviewers must manually compare transcriptions against ground truth, with no quantitative measure of transcription accuracy.
Describe the solution you'd like
Compute automated metrics for each `stt_result` by comparing its `transcription` column against `stt_sample.ground_truth` (linked via the `stt_sample_id` foreign key); see the sketch after this list:
- WER (Word Error Rate) — word-level accuracy
- CER (Character Error Rate) — character-level accuracy
- Lenient WER — WER after script-aware normalization (handles Unicode inconsistencies in Indic scripts)
- WIP (Word Information Preserved) — proportion of correctly recognized words
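
As a minimal sketch of how these four metrics could be computed, assuming `jiwer` (https://github.com/jitsi/jiwer) as the metrics backend — the issue doesn't name a library, and `compute_metrics` / `normalize_script` are hypothetical helper names. NFC composition plus zero-width-joiner stripping stands in for the script-aware normalization step; a production normalizer for Indic scripts would likely need more rules:

```python
# Hypothetical helpers, assuming jiwer provides the metric implementations.
import unicodedata

import jiwer


def normalize_script(text: str) -> str:
    """Approximate script-aware normalization: NFC-compose characters and
    drop zero-width (non-)joiners, which vary across Indic-script inputs."""
    text = unicodedata.normalize("NFC", text)
    return text.replace("\u200c", "").replace("\u200d", "").strip()


def compute_metrics(ground_truth: str, transcription: str) -> dict[str, float]:
    """Score one stt_result transcription against its linked ground truth."""
    return {
        "wer": jiwer.wer(ground_truth, transcription),
        "cer": jiwer.cer(ground_truth, transcription),
        "lenient_wer": jiwer.wer(
            normalize_script(ground_truth), normalize_script(transcription)
        ),
        "wip": jiwer.wip(ground_truth, transcription),
    }
```

Each resulting dict could then be written back onto the corresponding `stt_result` row, joined to its `stt_sample` via `stt_sample_id`.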