timit-distil-kl-alpha-0.75-T-1
This model is a fine-tuned version of a base model that this card does not name, trained on a dataset the Trainer did not record (it is listed as None). It achieves the following results on the evaluation set:
- Loss: 163.0668
- Wer: 0.4560
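The card does not say how the Wer figure was computed or whether the scored units are words or phonemes. As a rough illustration only, the sketch below uses the standard definition of word error rate: the edit distance between reference and hypothesis tokens, divided by the number of reference tokens. The function name `word_error_rate` and the example sentences are hypothetical. Because insertions count as errors, the ratio can exceed 1, which is consistent with the value of 1.1333 logged at step 200 in the training results.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Token-level edit distance divided by the number of reference tokens."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one token")

    # prev[j] is the edit distance between the reference prefix processed
    # so far and the first j hypothesis tokens.
    prev = list(range(len(hyp) + 1))
    for i, ref_token in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_token in enumerate(hyp, start=1):
            cost = 0 if ref_token == hyp_token else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical examples, not taken from the evaluation set.
perfect = word_error_rate("the cat sat", "the cat sat")
one_substitution = word_error_rate("the cat sat", "the dog sat")
many_insertions = word_error_rate("hello", "oh hello there now")
```

In the last example the hypothesis adds three tokens to a one-token reference, so the ratio is above 1 even though the reference token itself was recognized.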
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 14
- eval_batch_size: 14
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 28
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 1000
- num_epochs: 30
- mixed_precision_training: Native AMP
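The hyperparameters above imply two derived quantities: the effective batch size and the learning rate at any given step. The following is a minimal sketch in plain Python, not the training script itself. It assumes a single device (which matches 14 × 2 = 28) and a total of 4950 optimizer steps, a figure inferred from the results table (step 3300 falls at epoch 20.0, so roughly 165 steps per epoch) rather than stated in the card. The schedule follows the usual linear warmup then linear decay shape; the function names are hypothetical.

```python
LEARNING_RATE = 2e-05
TRAIN_BATCH_SIZE = 14
GRAD_ACCUM_STEPS = 2
WARMUP_STEPS = 1000
NUM_EPOCHS = 30

# Assumption: training ran on one device.
NUM_DEVICES = 1
# Assumption: 165 optimizer steps per epoch, inferred from the results table.
STEPS_PER_EPOCH = 165
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH


def effective_batch_size() -> int:
    """Examples contributing to each optimizer update."""
    return TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS * NUM_DEVICES


def linear_lr(step: int) -> float:
    """Linear warmup from 0 to the peak rate, then linear decay to 0."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / WARMUP_STEPS
    remaining = max(0, TOTAL_STEPS - step)
    return LEARNING_RATE * remaining / (TOTAL_STEPS - WARMUP_STEPS)


batch = effective_batch_size()
lr_mid_warmup = linear_lr(500)
lr_peak = linear_lr(WARMUP_STEPS)
lr_end = linear_lr(TOTAL_STEPS)
```

Under these assumptions the learning rate peaks at step 1000, which is around epoch 6 in the results table, and decays for the remaining 24 epochs.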
Training results
Training Loss | Epoch | Step | Validation Loss | Wer |
---|---|---|---|---|
478.744 | 0.61 | 100 | 538.3250 | 0.9915 |
378.7181 | 1.21 | 200 | 369.1372 | 1.1333 |
322.5338 | 1.82 | 300 | 300.0477 | 1.0331 |
273.2232 | 2.42 | 400 | 259.7523 | 0.8517 |
240.6502 | 3.03 | 500 | 232.7382 | 0.7743 |
223.6016 | 3.64 | 600 | 215.9651 | 0.7051 |
201.5882 | 4.24 | 700 | 204.9062 | 0.6621 |
202.3899 | 4.85 | 800 | 196.9740 | 0.6338 |
183.4185 | 5.45 | 900 | 191.3831 | 0.6006 |
179.6837 | 6.06 | 1000 | 186.5637 | 0.5794 |
168.6271 | 6.67 | 1100 | 184.0338 | 0.5780 |
165.3212 | 7.27 | 1200 | 180.1232 | 0.5470 |
162.448 | 7.88 | 1300 | 178.5354 | 0.5453 |
154.0758 | 8.48 | 1400 | 176.6070 | 0.5281 |
160.8933 | 9.09 | 1500 | 174.8729 | 0.5245 |
148.5513 | 9.7 | 1600 | 174.3866 | 0.5165 |
150.4218 | 10.3 | 1700 | 172.3834 | 0.5150 |
146.6692 | 10.91 | 1800 | 171.4406 | 0.5060 |
144.0717 | 11.52 | 1900 | 170.7044 | 0.5053 |
148.1728 | 12.12 | 2000 | 169.8454 | 0.5013 |
134.3326 | 12.73 | 2100 | 169.4328 | 0.4957 |
142.6348 | 13.33 | 2200 | 168.3971 | 0.4943 |
136.7947 | 13.94 | 2300 | 168.1558 | 0.4899 |
137.4703 | 14.55 | 2400 | 167.1046 | 0.4842 |
134.6324 | 15.15 | 2500 | 167.1108 | 0.4789 |
129.9845 | 15.76 | 2600 | 166.7391 | 0.4814 |
137.7542 | 16.36 | 2700 | 166.1870 | 0.4799 |
129.4632 | 16.97 | 2800 | 166.2481 | 0.4745 |
135.0696 | 17.58 | 2900 | 165.3251 | 0.4737 |
128.6716 | 18.18 | 3000 | 165.2547 | 0.4681 |
130.0308 | 18.79 | 3100 | 165.0811 | 0.4694 |
127.9053 | 19.39 | 3200 | 164.8373 | 0.4663 |
124.5187 | 20.0 | 3300 | 164.7788 | 0.4661 |
132.1731 | 20.61 | 3400 | 164.4737 | 0.4665 |
124.8417 | 21.21 | 3500 | 164.2796 | 0.4641 |
129.376 | 21.82 | 3600 | 163.9702 | 0.4638 |
125.4888 | 22.42 | 3700 | 164.0341 | 0.4627 |
126.7772 | 23.03 | 3800 | 163.8773 | 0.4594 |
123.2558 | 23.64 | 3900 | 163.5976 | 0.4584 |
122.6634 | 24.24 | 4000 | 163.5653 | 0.4581 |
128.5773 | 24.85 | 4100 | 163.3437 | 0.4586 |
121.5595 | 25.45 | 4200 | 163.4164 | 0.4579 |
125.9294 | 26.06 | 4300 | 163.3195 | 0.4563 |
122.0572 | 26.67 | 4400 | 163.1707 | 0.4550 |
123.4701 | 27.27 | 4500 | 163.2227 | 0.4572 |
127.0724 | 27.88 | 4600 | 163.1163 | 0.4568 |
120.6483 | 28.48 | 4700 | 163.0764 | 0.4565 |
128.5629 | 29.09 | 4800 | 163.0516 | 0.4560 |
120.0566 | 29.7 | 4900 | 163.0668 | 0.4560 |
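The headline metrics at the top of the card match the final logged step (4900), which is not necessarily the best checkpoint. As a small sketch, the snippet below copies the last six rows of the table and picks the step with the lowest Wer and the step with the lowest validation loss. The variable names are hypothetical, and whether intermediate checkpoints were saved is not stated in the card.

```python
# (step, validation_loss, wer) copied from the last six rows of the table.
rows = [
    (4400, 163.1707, 0.4550),
    (4500, 163.2227, 0.4572),
    (4600, 163.1163, 0.4568),
    (4700, 163.0764, 0.4565),
    (4800, 163.0516, 0.4560),
    (4900, 163.0668, 0.4560),
]

# min() returns the first row with the smallest key, so ties go to the
# earlier step.
best_by_wer = min(rows, key=lambda row: row[2])
best_by_loss = min(rows, key=lambda row: row[1])
final_row = rows[-1]

print(f"lowest Wer: {best_by_wer[2]:.4f} at step {best_by_wer[0]}")
print(f"lowest validation loss: {best_by_loss[1]:.4f} at step {best_by_loss[0]}")
print(f"final step {final_row[0]}: loss {final_row[1]:.4f}, Wer {final_row[2]:.4f}")
```

On these rows the lowest Wer (0.4550) occurs at step 4400 and the lowest validation loss (163.0516) at step 4800, both marginally better than the final step.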
Framework versions
- Transformers 4.25.1
- Pytorch 1.12.1
- Datasets 2.8.0
- Tokenizers 0.13.2