Whisper Large V2 Portuguese 🇧🇷🇵🇹

Welcome to Whisper large-v2 for Portuguese transcription 👋🏻

Transcribe Portuguese audio to text with the lowest WER among the comparable models listed in this card.

This model is a fine-tuned version of openai/whisper-large-v2 on the mozilla-foundation/common_voice_11 dataset. If you want a lighter model, consider jlondonobo/whisper-medium-pt, which has roughly half the parameters (769M vs. 1550M) and faster inference, at a cost of about one point of WER (6.579 vs. 5.590).
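
As a minimal sketch of how the model might be used, the snippet below wraps the `transformers` automatic-speech-recognition pipeline in a small helper. The helper name `transcribe`, the 30-second chunk length, and the file name in the comment are illustrative assumptions, not settings taken from this card.

```python
MODEL_ID = "jlondonobo/whisper-large-v2-pt"


def transcribe(audio_path, model_id=MODEL_ID):
    """Transcribe one audio file and return the recognized text.

    Hypothetical helper: it builds a new pipeline on every call, which is
    fine for a demo but wasteful when transcribing many files.
    """
    # Imported lazily so that defining the helper does not require
    # transformers to be installed or the model to be downloaded.
    from transformers import pipeline

    asr = pipeline(
        "automatic-speech-recognition",
        model=model_id,
        chunk_length_s=30,  # assumed value: process long audio in 30 s windows
    )
    return asr(audio_path)["text"]


# Example call ("audio.mp3" is a hypothetical local file):
# text = transcribe("audio.mp3")
```

Decoding an audio file from a path typically requires `ffmpeg` to be available on the system. The first call downloads the model weights, so expect it to be slow.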

Comparable models

Reported WER is based on the evaluation subset of Common Voice.

| Model | WER | # Parameters |
|---|---|---|
| jlondonobo/whisper-large-v2-pt | 5.590 🤗 | 1550M |
| openai/whisper-large-v2 | 6.300 | 1550M |
| jlondonobo/whisper-medium-pt | 6.579 | 769M |
| jonatasgrosman/wav2vec2-large-xlsr-53-portuguese | 11.310 | 317M |
| Edresson/wav2vec2-large-xlsr-coraa-portuguese | 20.080 | 317M |
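
For readers unfamiliar with the metric, here is a rough sketch of how a corpus-level WER like the ones above can be computed: the word-level edit distance between each reference and hypothesis, summed and divided by the total number of reference words. The function names are illustrative, and the whitespace tokenization is an assumption; this card does not state which text normalization was applied before scoring, so this sketch will not necessarily reproduce the reported numbers.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete a reference word
                curr[j - 1] + 1,     # insert a hypothesis word
                prev[j - 1] + cost,  # substitute, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def wer(references, hypotheses):
    """Corpus-level WER in percent: total word errors / total reference words."""
    errors = 0
    words = 0
    for ref, hyp in zip(references, hypotheses):
        ref_tokens = ref.split()  # assumption: plain whitespace tokenization
        hyp_tokens = hyp.split()
        errors += edit_distance(ref_tokens, hyp_tokens)
        words += len(ref_tokens)
    if words == 0:
        raise ValueError("references contain no words")
    return 100.0 * errors / words


def relative_reduction(baseline_wer, new_wer):
    """Relative WER reduction in percent with respect to a baseline."""
    return 100.0 * (baseline_wer - new_wer) / baseline_wer


refs = ["o gato está em casa", "bom dia"]
hyps = ["o gato esta em casa", "bom dia"]
print(f"WER: {wer(refs, hyps):.2f}%")  # one substitution over seven words
```

Applied to the table, `relative_reduction(6.300, 5.590)` shows that fine-tuning lowers WER by about 11% relative to openai/whisper-large-v2.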

Training hyperparameters

We used the following hyperparameters for training:

Training results

| Training Loss | Epoch | Step | Validation Loss | WER |
|---|---|---|---|---|
| 0.0828 | 1.09 | 1000 | 0.1868 | 6.778 |
| 0.0241 | 3.07 | 2000 | 0.2057 | 6.109 |
| 0.0084 | 5.06 | 3000 | 0.2367 | 6.029 |
| 0.0015 | 7.04 | 4000 | 0.2469 | 5.709 |
| 0.0009 | 9.02 | 5000 | 0.2821 | 5.590 🤗 |
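
Note that validation loss rises at every checkpoint while WER keeps falling, so the two metrics would pick different checkpoints. The short sketch below makes that concrete using the rows above. Reading the 🤗 marker as "checkpoint selected by WER" is an assumption; the card does not describe the selection rule.

```python
# (step, validation_loss, wer) rows copied from the training results table
RESULTS = [
    (1000, 0.1868, 6.778),
    (2000, 0.2057, 6.109),
    (3000, 0.2367, 6.029),
    (4000, 0.2469, 5.709),
    (5000, 0.2821, 5.590),
]

# Pick the checkpoint that minimizes each metric
best_by_wer = min(RESULTS, key=lambda row: row[2])
best_by_loss = min(RESULTS, key=lambda row: row[1])

print("best step by WER: ", best_by_wer[0])
print("best step by loss:", best_by_loss[0])
```

Selecting by WER favors the final checkpoint (step 5000), whereas selecting by validation loss would have stopped at step 1000 with a WER more than one point higher.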

Framework versions