# dgx1_w2v2_base_finetune_teacher_babble_noise_libri_360_hours_50_epochs_batch_16
This model is a fine-tuned version of [facebook/wav2vec2-base-960h](https://huggingface.co/facebook/wav2vec2-base-960h) on an unspecified dataset (the run name suggests 360 hours of LibriSpeech with babble noise). It achieves the following results on the evaluation set:
- Loss: 13.6162
- Wer: 0.0656
## Model description

More information needed

## Intended uses & limitations

More information needed
## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0005
- train_batch_size: 16
- eval_batch_size: 1
- seed: 42
- gradient_accumulation_steps: 64
- total_train_batch_size: 1024
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.2
- num_epochs: 50
- mixed_precision_training: Native AMP
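The hyperparameters above imply an effective batch size of 16 × 64 = 1024 and a learning rate that warms up linearly for the first 20% of training, then decays along a cosine curve. A minimal sketch of that schedule (an approximation of the shape produced by `transformers.get_cosine_schedule_with_warmup`; the step count of 5000 is taken from the training log below):

```python
import math

# Hyperparameters from the list above.
learning_rate = 5e-4
train_batch_size = 16
grad_accum_steps = 64
warmup_ratio = 0.2

# Effective train batch size: per-device batch size times
# gradient-accumulation steps (times the number of devices, 1 here).
total_train_batch_size = train_batch_size * grad_accum_steps  # 1024

def lr_at(step: int, total_steps: int) -> float:
    """Linear warmup for warmup_ratio * total_steps, then cosine decay to 0."""
    warmup_steps = int(warmup_ratio * total_steps)
    if step < warmup_steps:
        return learning_rate * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return learning_rate * 0.5 * (1.0 + math.cos(math.pi * progress))

total_steps = 5000  # final logged step of the 50-epoch run
print(lr_at(0, total_steps))      # start of warmup: 0.0
print(lr_at(1000, total_steps))   # end of warmup: peak learning rate 0.0005
print(lr_at(5000, total_steps))   # end of training: decayed to 0.0
```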
### Training results
| Training Loss | Epoch | Step | Validation Loss | Wer    |
|:-------------:|:-----:|:----:|:---------------:|:------:|
| 948.9895      | 4.95  | 500  | 15.5681         | 0.0901 |
| 486.9617      | 9.9   | 1000 | 14.6740         | 0.0888 |
| 430.4173      | 14.85 | 1500 | 14.5798         | 0.0876 |
| 370.9503      | 19.8  | 2000 | 14.3314         | 0.0825 |
| 325.1611      | 24.75 | 2500 | 13.1245         | 0.0767 |
| 285.1508      | 29.7  | 3000 | 13.8376         | 0.0734 |
| 246.6912      | 34.65 | 3500 | 13.6600         | 0.0702 |
| 215.6232      | 39.6  | 4000 | 13.6056         | 0.0673 |
| 195.4624      | 44.55 | 4500 | 13.5280         | 0.0662 |
| 186.2427      | 49.5  | 5000 | 13.6162         | 0.0656 |
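The Wer column is the word error rate: the word-level edit distance between reference and hypothesis transcripts, divided by the number of reference words (so 0.0656 means roughly 6.6 words wrong per 100). A minimal self-contained sketch of the metric (not the exact implementation used during evaluation, which typically comes from a library such as `jiwer`):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance over reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = minimum edits to turn ref[:i] into hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                           dp[i][j - 1] + 1,        # insertion
                           dp[i - 1][j - 1] + cost)  # substitution
    return dp[-1][-1] / len(ref)

print(wer("the cat sat", "the cat sat"))  # 0.0
print(wer("the cat sat", "the bat sat"))  # 1 substitution / 3 words ≈ 0.333
```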
### Framework versions
- Transformers 4.25.1
- Pytorch 1.12.1
- Datasets 2.8.0
- Tokenizers 0.13.2