Model Card for Whisper Small Fine-Tuned for Portuguese

This model is a fine-tuned version of Whisper Small that aims to improve results on Portuguese. Whisper is a pre-trained model for automatic speech recognition (ASR) and speech translation.

Model Details

Whisper is a Transformer-based encoder-decoder model, also referred to as a sequence-to-sequence model. It was trained on 680k hours of labelled speech data annotated using large-scale weak supervision.

This model was fine-tuned on Common Voice 13.0 to improve results for Portuguese.

Uses

This repository contains a fine-tuned version of the Whisper automatic speech recognition (ASR) system developed by OpenAI. The model has been fine-tuned specifically to improve transcription performance on Portuguese.
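A minimal sketch of how a fine-tuned Whisper checkpoint like this one might be used for transcription through the Transformers `pipeline` API. This card does not state the repository ID, so `MODEL_ID` below is a hypothetical placeholder, and the helper names `build_generate_kwargs` and `transcribe` are illustrative only. The sketch assumes a recent `transformers` release (one whose Whisper generation accepts `language` and `task`) and that `ffmpeg` is available for decoding audio files.

```python
from typing import Any, Dict

# Hypothetical placeholder: this card does not state the repository ID.
MODEL_ID = "your-username/whisper-small-pt"


def build_generate_kwargs(language: str = "portuguese", task: str = "transcribe") -> Dict[str, Any]:
    """Return generation options that pin Whisper to one language and task."""
    return {"language": language, "task": task}


def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Transcribe one audio file with the Transformers ASR pipeline."""
    # Imported lazily so this module can be loaded without transformers installed.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)
    # Forcing the language avoids relying on Whisper's automatic language detection.
    result = asr(audio_path, generate_kwargs=build_generate_kwargs())
    return result["text"]
```

Calling `transcribe("sample.wav")` with a real repository ID in place of the placeholder would return the transcription as a string. Pinning `language` to Portuguese matches the intended use described above.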

Out-of-Scope Use

While this model is powerful and versatile, it's important to understand its limitations and inappropriate uses:

  1. Misuse and Malicious Use: This model should not be used for any illegal activities, including but not limited to eavesdropping, illegal surveillance, or any other form of privacy invasion. It's also not intended for the creation or spread of misinformation, hate speech, or harmful content.

  2. Non-Portuguese Languages: Because this model has been fine-tuned for Portuguese, it may not perform well with other languages. It's not recommended for transcribing multilingual content where languages other than Portuguese are spoken.

  3. Low-Quality Audio: The model's performance can be significantly affected by the quality of the input audio. It may not work well with low-quality audio, background noise, or speakers who are far away from the microphone.

Training Details

Training Procedure

The model was trained using the code from the Hugging Face (HF) Whisper Event.

Training Hyperparameters

The following hyperparameters were used during training:

Evaluation

Word error rate (WER) on Common Voice 13.0 (CV13.0): 10.3
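For readers unfamiliar with the metric, the following is a minimal pure-Python sketch of how WER is commonly defined: the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The evaluation script behind the figure above is not shown in this card, so this is not necessarily how that number was produced. In particular, any text normalization applied before scoring is unknown, and the function name `word_error_rate` is illustrative.

```python
from typing import List


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref: List[str] = reference.split()
    hyp: List[str] = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# One missing word out of four reference words.
print(word_error_rate("o gato preto dorme", "o gato dorme"))  # 0.25
```

The function returns a fraction. The card does not say whether 10.3 is a percentage; if it is, it would correspond to a return value of about 0.103. Scores over a whole test set are usually computed by summing errors and reference words across all utterances rather than averaging per-utterance values.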