automatic-speech-recognition openslr_SLR66 generated_from_trainer robust-speech-event hf-asr-leaderboard

This model is a fine-tuned version of facebook/wav2vec2-xls-r-300m on the OPENSLR_SLR66 - NA dataset. On the evaluation set, the last logged checkpoint (step 10000, see Training results) reaches a validation loss of 0.2681 and a WER of 0.3437.

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

Training results

| Training Loss | Epoch | Step  | Validation Loss | WER    |
|:-------------:|:-----:|:-----:|:---------------:|:------:|
| 3.0304        | 4.81  | 500   | 1.5676          | 1.0554 |
| 1.5263        | 9.61  | 1000  | 0.4693          | 0.8023 |
| 1.5299        | 14.42 | 1500  | 0.4368          | 0.7311 |
| 1.5063        | 19.23 | 2000  | 0.4360          | 0.7302 |
| 1.455         | 24.04 | 2500  | 0.4213          | 0.6692 |
| 1.4755        | 28.84 | 3000  | 0.4329          | 0.5943 |
| 1.352         | 33.65 | 3500  | 0.4074          | 0.5765 |
| 1.3122        | 38.46 | 4000  | 0.3866          | 0.5630 |
| 1.2799        | 43.27 | 4500  | 0.3860          | 0.5480 |
| 1.212         | 48.08 | 5000  | 0.3590          | 0.5317 |
| 1.1645        | 52.88 | 5500  | 0.3283          | 0.4757 |
| 1.0854        | 57.69 | 6000  | 0.3162          | 0.4687 |
| 1.0292        | 62.5  | 6500  | 0.3126          | 0.4416 |
| 0.9607        | 67.31 | 7000  | 0.2990          | 0.4066 |
| 0.9156        | 72.12 | 7500  | 0.2870          | 0.4009 |
| 0.8329        | 76.92 | 8000  | 0.2791          | 0.3909 |
| 0.7979        | 81.73 | 8500  | 0.2770          | 0.3670 |
| 0.7144        | 86.54 | 9000  | 0.2841          | 0.3661 |
| 0.6997        | 91.35 | 9500  | 0.2721          | 0.3485 |
| 0.6568        | 96.15 | 10000 | 0.2681          | 0.3437 |
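
The WER column is a word error rate: substitutions, deletions, and insertions divided by the number of reference words. Because insertions are counted, the value can exceed 1, as it does at step 500 (1.0554). The sketch below shows one common way to compute it. This card does not state how its WER was produced, so the corpus-level aggregation, the whitespace tokenization, and the helper names `edit_distance` and `wer` are assumptions made for illustration, and the sample strings are made up.

```python
def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    # prev[j] holds the distance between the ref tokens seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion: a reference word is missing
                curr[j - 1] + 1,     # insertion: an extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def wer(references, hypotheses):
    """Corpus-level WER: total word errors / total reference words (assumed)."""
    errors = 0
    words = 0
    for ref, hyp in zip(references, hypotheses):
        ref_tokens = ref.split()  # assumption: whitespace tokenization
        errors += edit_distance(ref_tokens, hyp.split())
        words += len(ref_tokens)
    return errors / words


if __name__ == "__main__":
    # Hypothetical transcripts, not taken from the evaluation set.
    print(wer(["a b c d"], ["a x c d"]))  # one substitution in four words
    print(wer(["a b"], ["a b c d e"]))    # three insertions against two words
```

With this definition, a hypothesis that adds more words than the reference contains scores above 1, which is consistent with an early checkpoint emitting long, mostly wrong output.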

Framework versions