libri-alpha-0.75-Temp-1-attention-3-layers-distil-with-6-layers-att-take-2

This model is a fine-tuned version of an unspecified base model on an unspecified dataset (the card names neither; the dataset field reads "None"). It achieves the following results on the evaluation set:

Loss: 42.0471
Wer: 0.2540
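
For readers unfamiliar with the metric, Wer is the word error rate: the number of word substitutions, deletions, and insertions needed to turn the model's transcript into the reference, divided by the number of reference words. The card does not say which implementation produced the figure above, so the following is only a minimal sketch of the conventional definition for a single sentence pair. The function name `word_error_rate` and the example sentences are illustrative, not taken from the card.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference prefix processed
    # so far and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical sentence pair: one substitution (sat -> sit) and one deletion (the).
example_wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
```

Under this definition, a Wer of 0.2540 corresponds to roughly one word-level error for every four reference words. Evaluation tools usually pool errors and reference words over the whole evaluation set rather than averaging per-sentence rates.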

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.002
train_batch_size: 64
eval_batch_size: 8
seed: 42
gradient_accumulation_steps: 2
total_train_batch_size: 128
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_ratio: 0.2
num_epochs: 40
mixed_precision_training: Native AMP
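
Several of the values above are related to one another. The sketch below shows how total_train_batch_size follows from train_batch_size and gradient_accumulation_steps, and what a linear schedule with a 0.2 warmup ratio looks like. It rests on assumptions that the card does not state: training on a single device, and a total of 4440 optimizer steps (estimated from the results table, where 4400 steps correspond to 39.64 epochs, or about 111 steps per epoch over 40 epochs). The function `linear_schedule_lr` is a hypothetical helper that mimics the usual warmup-then-linear-decay shape; it is not the training framework's own code.

```python
# Values copied from the hyperparameter list.
LEARNING_RATE = 0.002
TRAIN_BATCH_SIZE = 64
GRADIENT_ACCUMULATION_STEPS = 2
WARMUP_RATIO = 0.2
NUM_EPOCHS = 40

# Assumption: about 111 optimizer steps per epoch, estimated from the results table.
ESTIMATED_STEPS_PER_EPOCH = 111
estimated_total_steps = ESTIMATED_STEPS_PER_EPOCH * NUM_EPOCHS  # 4440

# Assumption: one device, so the effective batch is batch size x accumulation steps.
effective_batch_size = TRAIN_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS


def linear_schedule_lr(step: int, total_steps: int,
                       peak_lr: float = LEARNING_RATE,
                       warmup_ratio: float = WARMUP_RATIO) -> float:
    """Linear warmup from 0 to peak_lr, then linear decay back to 0."""
    warmup_steps = round(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at a few points of the estimated schedule.
lr_samples = {step: linear_schedule_lr(step, estimated_total_steps)
              for step in (0, 444, 888, 2664, 4440)}
```

With these assumptions the learning rate climbs for the first 888 steps (about 8 epochs), peaks at 0.002, and then decays linearly to zero at the end of training.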

Training results

Training Loss	Epoch	Step	Validation Loss	Wer
860.105	3.6	400	37.1616	0.4129
640.8073	7.21	800	38.0915	0.3892
578.6465	10.81	1200	41.3839	0.3648
478.3375	14.41	1600	43.8448	0.3231
400.7667	18.02	2000	35.5516	0.3103
348.2905	21.62	2400	41.2895	0.2954
297.0109	25.22	2800	42.0566	0.2889
262.856	28.83	3200	38.5730	0.2779
227.4767	32.43	3600	42.4243	0.2657
200.5691	36.04	4000	42.0213	0.2675
181.1116	39.64	4400	42.0471	0.2540
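
The table shows that validation loss and Wer do not move together: Wer falls almost steadily while validation loss fluctuates. A short sketch makes this concrete by locating the best row under each criterion. The rows are copied from the table; the `EvalRow` type and variable names are illustrative only.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    training_loss: float
    epoch: float
    step: int
    validation_loss: float
    wer: float


# Rows copied from the training results table.
rows = [
    EvalRow(860.105, 3.6, 400, 37.1616, 0.4129),
    EvalRow(640.8073, 7.21, 800, 38.0915, 0.3892),
    EvalRow(578.6465, 10.81, 1200, 41.3839, 0.3648),
    EvalRow(478.3375, 14.41, 1600, 43.8448, 0.3231),
    EvalRow(400.7667, 18.02, 2000, 35.5516, 0.3103),
    EvalRow(348.2905, 21.62, 2400, 41.2895, 0.2954),
    EvalRow(297.0109, 25.22, 2800, 42.0566, 0.2889),
    EvalRow(262.856, 28.83, 3200, 38.5730, 0.2779),
    EvalRow(227.4767, 32.43, 3600, 42.4243, 0.2657),
    EvalRow(200.5691, 36.04, 4000, 42.0213, 0.2675),
    EvalRow(181.1116, 39.64, 4400, 42.0471, 0.2540),
]

best_by_wer = min(rows, key=lambda r: r.wer)
best_by_loss = min(rows, key=lambda r: r.validation_loss)

# Steps at which Wer got worse relative to the previous evaluation.
wer_regressions = [curr.step for prev, curr in zip(rows, rows[1:])
                   if curr.wer > prev.wer]
```

The lowest Wer is at the final evaluation (step 4400), whereas the lowest validation loss is at step 2000, so selecting a checkpoint by validation loss would give a different model than selecting by Wer.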

Framework versions

Transformers 4.25.1
PyTorch 1.12.1
Datasets 2.8.0
Tokenizers 0.13.2
