wav2vec2-base fine-tuned on 100 hours of Librispeech training data. Results on the "clean" test set are very similar to those of the official model. However, the result on "other" is significantly worse; the model seems to have overfit to the "clean" data.

The model was trained on librispeech-clean-train.100 with the following hyper-parameters:


Results (WER, in %) on the Librispeech test sets:

| "clean" | "other" |
|---------|---------|
| 6.5     | 18.7    |