generated_from_trainer

<!-- This model card has been generated automatically according to the information the Trainer had access to. You should probably proofread and complete it, then remove this comment. -->

libri-alpha-0.75-Temp-1-attention-3-layers-distil-with-6-layers-att-take-2

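This model is a fine-tuned version of a base model that this card does not name, and the training dataset is recorded only as None. The evaluation summary is also missing; the last logged evaluation in the training results (step 4400, epoch 39.64) reports a validation loss of 42.0471 and a WER of 0.2540.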

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The hyperparameters used during training were not recorded in this card.

Training results

| Training Loss | Epoch | Step | Validation Loss | Wer    |
|:-------------:|:-----:|:----:|:---------------:|:------:|
| 860.105       | 3.6   | 400  | 37.1616         | 0.4129 |
| 640.8073      | 7.21  | 800  | 38.0915         | 0.3892 |
| 578.6465      | 10.81 | 1200 | 41.3839         | 0.3648 |
| 478.3375      | 14.41 | 1600 | 43.8448         | 0.3231 |
| 400.7667      | 18.02 | 2000 | 35.5516         | 0.3103 |
| 348.2905      | 21.62 | 2400 | 41.2895         | 0.2954 |
| 297.0109      | 25.22 | 2800 | 42.0566         | 0.2889 |
| 262.856       | 28.83 | 3200 | 38.5730         | 0.2779 |
| 227.4767      | 32.43 | 3600 | 42.4243         | 0.2657 |
| 200.5691      | 36.04 | 4000 | 42.0213         | 0.2675 |
| 181.1116      | 39.64 | 4400 | 42.0471         | 0.2540 |
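
The Wer column reports word error rate: the number of word-level substitutions, deletions, and insertions needed to turn the model's transcript into the reference, divided by the number of reference words. The sketch below shows that calculation for a single utterance. The function name `word_error_rate` and the example sentences are illustrative assumptions, not part of this model's training code, and the Trainer most likely aggregated the metric over the whole evaluation set rather than per utterance.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Hypothetical helper for illustration; not taken from this model's code.
    """
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i] + [0] * len(hyp_words)
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref_words)


if __name__ == "__main__":
    # One substitution ("sat" -> "sit") and one deletion ("the"): 2 errors / 6 words.
    print(round(word_error_rate("the cat sat on the mat", "the cat sit on mat"), 4))
```

On this definition, the final logged value of 0.2540 corresponds to roughly one word error for every four reference words.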

Framework versions