# Japanese Fine-Tuned Whisper Model
This model is a fine-tuned version of openai/whisper-tiny on the Japanese subset of the Common Voice dataset. It achieves the following results on the evaluation set:
- Loss: 0.549100
- WER: 225.233037
## Model description
The tiny Whisper model is fine-tuned on Japanese speech samples from the Common Voice dataset, enabling automatic speech recognition (ASR) of Japanese audio; the tiny checkpoint is small enough that users can transcribe speech in near real time. Note that word-level WER is a poor fit for Japanese, which is written without spaces, so scores above 100 are possible; character error rate (CER) is often a more informative metric for this language.
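A minimal inference sketch using the transformers Whisper API. The repository id and audio file name below are placeholders, not names taken from this card:

```python
import torch
import torchaudio
from transformers import WhisperProcessor, WhisperForConditionalGeneration

# Placeholder repo id: substitute the actual Hub id of this fine-tuned model.
model_id = "your-username/whisper-tiny-japanese"
processor = WhisperProcessor.from_pretrained(model_id)
model = WhisperForConditionalGeneration.from_pretrained(model_id)

# Force Japanese transcription (rather than translation or language detection).
forced_decoder_ids = processor.get_decoder_prompt_ids(
    language="japanese", task="transcribe"
)

# Load and resample audio to the 16 kHz rate Whisper expects.
waveform, sample_rate = torchaudio.load("sample.wav")  # placeholder file
waveform = torchaudio.functional.resample(waveform, sample_rate, 16000)

inputs = processor(
    waveform.squeeze().numpy(), sampling_rate=16000, return_tensors="pt"
)
with torch.no_grad():
    predicted_ids = model.generate(
        inputs.input_features, forced_decoder_ids=forced_decoder_ids
    )
print(processor.batch_decode(predicted_ids, skip_special_tokens=True)[0])
```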
## Training hyperparameters
The following hyperparameters were used during training (a configuration sketch reproducing them follows the list):
- learning_rate: 1e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 100
- training_steps: 1000
- mixed_precision_training: Native AMP
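These settings map directly onto transformers' Seq2SeqTrainingArguments. The sketch below is an assumption about how training was configured, with a hypothetical output directory, not a script taken from this card:

```python
from transformers import Seq2SeqTrainingArguments

# Sketch reproducing the listed hyperparameters. The Adam betas and epsilon
# above match the Trainer defaults, so they need no explicit arguments.
training_args = Seq2SeqTrainingArguments(
    output_dir="./whisper-tiny-japanese",  # hypothetical path
    learning_rate=1e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=100,
    max_steps=1000,
    fp16=True,  # "Native AMP" mixed-precision training
    evaluation_strategy="steps",  # assumption: matches the 200-step eval cadence below
    eval_steps=200,
)
```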
## Training results
| Training Loss | Step | Validation Loss | WER |
|---|---|---|---|
| 0.8097 | 200 | 0.801917 | 601.560806 |
| 0.7200 | 400 | 0.783436 | 327.335790 |
| 0.6810 | 600 | 0.759281 | 254.064600 |
| 0.7351 | 800 | 0.747759 | 241.426404 |
| 0.5491 | 1000 | 0.747127 | 225.233037 |
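The WER figures above are word-level error rates in percent; because Japanese text is not whitespace-segmented, each reference contains very few "words" and scores above 100 are expected. A minimal sketch of how such a score is computed with the evaluate library (the example strings are illustrative only):

```python
import evaluate

# Word-level WER: (substitutions + deletions + insertions) / reference word count.
wer_metric = evaluate.load("wer")
wer = 100 * wer_metric.compute(
    predictions=["こんにちは 世界"],  # hypothesis splits into two "words"
    references=["こんにちは世界"],    # unsegmented reference counts as one "word"
)
print(f"WER: {wer:.1f}")  # 200.0: one substitution plus one insertion over one reference word
```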
## Framework versions
- Transformers 4.27.0.dev0
- PyTorch 1.13.1+cu116
- Datasets 2.10.1
- Tokenizers 0.13.2