anuragshas/wav2vec2-xls-r-300m-bn-cv9-with-lm

This model is a fine-tuned version of facebook/wav2vec2-xls-r-300m on the Bengali (bn) subset of the mozilla-foundation/common_voice_9_0 dataset. It achieves the following results on the evaluation set:

Loss: 0.2297
WER (word error rate): 0.2850
CER (character error rate): 0.0660
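
For readers unfamiliar with the two error metrics, the following is a minimal sketch of how word and character error rates are conventionally defined: the edit distance between reference and hypothesis, divided by the reference length. It is an illustration only, not the evaluation script used for this model. The function names `edit_distance`, `wer`, and `cer` are hypothetical, and any text normalization applied before scoring is not described in this card.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference item
                curr[j - 1] + 1,     # insertion of a hypothesis item
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edits divided by the number of reference words."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edits divided by the reference length."""
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


if __name__ == "__main__":
    # Hypothetical transcripts, chosen only to exercise the functions.
    print(wer("a b c d", "a x c d"))
    print(cer("abcd", "abxd"))
    print(wer("a b", "x y z"))
```

Because insertions count as errors, the ratio can exceed 1 when the hypothesis is longer than the reference, which is how an early checkpoint can report a word error rate slightly above 1.0.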

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 7.5e-05
train_batch_size: 64
eval_batch_size: 64
seed: 42
gradient_accumulation_steps: 2
total_train_batch_size: 128
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_ratio: 0.1
training_steps: 8692
mixed_precision_training: Native AMP
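
The relationship between some of these values can be checked with a little arithmetic. The sketch below reproduces the total batch size from the per-device batch size and gradient accumulation, and models a linear schedule with warmup in plain Python. It rests on two assumptions not stated in this card: that training ran on a single device (`NUM_DEVICES = 1`, which is consistent with 64 x 2 = 128), and that the warmup ratio is converted to a step count by rounding up. The function name `linear_lr` is hypothetical.

```python
import math

# Values copied from the hyperparameter list above.
LEARNING_RATE = 7.5e-05
TRAIN_BATCH_SIZE = 64
GRAD_ACCUM_STEPS = 2
WARMUP_RATIO = 0.1
TRAINING_STEPS = 8692

# Assumption: a single device, which matches the reported total of 128.
NUM_DEVICES = 1

# Number of examples contributing to each optimizer update.
total_train_batch_size = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS * NUM_DEVICES

# Assumption: the warmup ratio is turned into a step count by rounding up.
warmup_steps = math.ceil(TRAINING_STEPS * WARMUP_RATIO)


def linear_lr(step):
    """Learning rate at an optimizer step: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return LEARNING_RATE * step / max(1, warmup_steps)
    remaining = max(0, TRAINING_STEPS - step)
    return LEARNING_RATE * remaining / max(1, TRAINING_STEPS - warmup_steps)


if __name__ == "__main__":
    print("total_train_batch_size:", total_train_batch_size)
    print("warmup_steps:", warmup_steps)
    for step in (0, 400, warmup_steps, 4000, TRAINING_STEPS):
        print(step, linear_lr(step))
```

Under these assumptions the learning rate peaks at the end of warmup and falls linearly to zero at the final training step.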

Training results

Training Loss	Epoch	Step	Validation Loss	Wer	Cer
3.675	2.3	400	3.5052	1.0	1.0
3.0446	4.6	800	2.2759	1.0052	0.5215
1.7276	6.9	1200	0.7083	0.6697	0.1969
1.5171	9.2	1600	0.5328	0.5733	0.1568
1.4176	11.49	2000	0.4571	0.5161	0.1381
1.343	13.79	2400	0.3910	0.4522	0.1160
1.2743	16.09	2800	0.3534	0.4137	0.1044
1.2396	18.39	3200	0.3278	0.3877	0.0959
1.2035	20.69	3600	0.3109	0.3741	0.0917
1.1745	22.99	4000	0.2972	0.3618	0.0882
1.1541	25.29	4400	0.2836	0.3427	0.0832
1.1372	27.59	4800	0.2759	0.3357	0.0812
1.1048	29.89	5200	0.2669	0.3284	0.0783
1.0966	32.18	5600	0.2678	0.3249	0.0775
1.0747	34.48	6000	0.2547	0.3134	0.0748
1.0593	36.78	6400	0.2491	0.3077	0.0728
1.0417	39.08	6800	0.2450	0.3012	0.0711
1.024	41.38	7200	0.2402	0.2956	0.0694
1.0106	43.68	7600	0.2351	0.2915	0.0681
1.0014	45.98	8000	0.2328	0.2896	0.0673
0.9999	48.28	8400	0.2318	0.2866	0.0667
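
The Epoch and Step columns can be used to estimate how many optimizer steps make up one epoch, and from that roughly how many epochs the 8692 training steps cover. The sketch below does this with three rows copied from the table. The result is an estimate only, since the epoch values are rounded and the card does not state the epoch count or dataset size directly.

```python
# (epoch, step) pairs copied from three rows of the training results table.
ROWS = [(2.3, 400), (11.49, 2000), (48.28, 8400)]

# Copied from the hyperparameter list.
TRAINING_STEPS = 8692

# Each row gives one estimate of optimizer steps per epoch; average them.
estimates = [step / epoch for epoch, step in ROWS]
steps_per_epoch = sum(estimates) / len(estimates)

# Approximate number of epochs covered by the full run.
estimated_epochs = TRAINING_STEPS / steps_per_epoch

if __name__ == "__main__":
    print("steps per epoch (approx.):", round(steps_per_epoch))
    print("epochs (approx.):", round(estimated_epochs, 1))
```

The estimate suggests the run covered close to 50 passes over the training split, although the card itself only reports the step count.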

Framework versions

Transformers 4.19.0.dev0
PyTorch 1.11.0+cu102
Datasets 2.1.1.dev0
Tokenizers 0.12.1

