# dgx1_w2v2_base_finetune_teacher_babble_noise_libri_360_hours_50_epochs_batch_16
This model is a fine-tuned version of [facebook/wav2vec2-base-960h](https://huggingface.co/facebook/wav2vec2-base-960h) on the None dataset.
It achieves the following results on the evaluation set:
- Loss: 13.6162
- Wer: 0.0656
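Wer is the word error rate: the word-level edit distance between the model's transcript and the reference, divided by the number of reference words (0.0656 means roughly one word in fifteen is wrong). A minimal plain-Python sketch of the metric, not the evaluation script used to produce this card:

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference word count."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = dp[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            dp[i][j] = min(sub, dp[i - 1][j] + 1, dp[i][j - 1] + 1)
    return dp[len(ref)][len(hyp)] / len(ref)

# One deleted word out of six reference words -> WER of 1/6
print(wer("the cat sat on the mat", "the cat sat on mat"))
```

In practice a library such as `jiwer` (which the `evaluate` WER metric wraps) computes the same quantity.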
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0005
- train_batch_size: 16
- eval_batch_size: 1
- seed: 42
- gradient_accumulation_steps: 64
- total_train_batch_size: 1024
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.2
- num_epochs: 50
- mixed_precision_training: Native AMP
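Two of these values are derived: the total train batch size is `train_batch_size × gradient_accumulation_steps` (16 × 64 = 1024), and the learning rate warms up linearly over the first 20% of optimizer steps before decaying along a cosine curve. A sketch of that schedule; the total step count is a parameter here because it depends on the dataset size (the results table suggests roughly 5,000 steps for this run):

```python
import math

PEAK_LR = 5e-4      # learning_rate above
WARMUP_RATIO = 0.2  # lr_scheduler_warmup_ratio above

def cosine_lr(step: int, total_steps: int) -> float:
    """Linear warmup to PEAK_LR, then cosine decay toward zero."""
    warmup_steps = int(total_steps * WARMUP_RATIO)
    if step < warmup_steps:
        return PEAK_LR * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))

# Effective batch size from the hyperparameters above
assert 16 * 64 == 1024
```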
### Training results
| Training Loss | Epoch | Step | Validation Loss | Wer |
|:-------------:|:-----:|:----:|:---------------:|:------:|
| 948.9895 | 4.95 | 500 | 15.5681 | 0.0901 | 
| 486.9617 | 9.9 | 1000 | 14.6740 | 0.0888 | 
| 430.4173 | 14.85 | 1500 | 14.5798 | 0.0876 | 
| 370.9503 | 19.8 | 2000 | 14.3314 | 0.0825 | 
| 325.1611 | 24.75 | 2500 | 13.1245 | 0.0767 | 
| 285.1508 | 29.7 | 3000 | 13.8376 | 0.0734 | 
| 246.6912 | 34.65 | 3500 | 13.6600 | 0.0702 | 
| 215.6232 | 39.6 | 4000 | 13.6056 | 0.0673 | 
| 195.4624 | 44.55 | 4500 | 13.5280 | 0.0662 | 
| 186.2427 | 49.5 | 5000 | 13.6162 | 0.0656 | 
### Framework versions

- Transformers 4.25.1
- PyTorch 1.12.1
- Datasets 2.8.0
- Tokenizers 0.13.2