libri-alpha-0.75-Temp-1-attention-3-layers-distil-with-6-layers-loss-att-take-2

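This model was trained from scratch; the training dataset is not recorded (the card lists it as "None"). It achieves the following results on the evaluation set, taken from the final logged evaluation at step 3000: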

Loss: 26.4101
Wer: 0.2791
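
For readers unfamiliar with the metric, word error rate (WER) is conventionally the word-level edit distance between the reference transcript and the model output, divided by the number of reference words. The sketch below is a minimal pure-Python illustration of that standard definition. It is not the evaluation code used for this model, and the function name `word_error_rate` is hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Read this way, and assuming the card reports WER as a fraction rather than a percentage, a value of 0.2791 corresponds to roughly 28 word errors per 100 reference words.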

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0002
train_batch_size: 16
eval_batch_size: 4
seed: 42
gradient_accumulation_steps: 2
total_train_batch_size: 32
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_ratio: 0.2
num_epochs: 40
mixed_precision_training: Native AMP
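
As a rough illustration of how these settings interact, the sketch below derives the effective batch size and evaluates a linear schedule with warmup in plain Python. Several values are assumptions rather than facts from the card: a single training device, and about 446 optimizer steps per epoch (estimated from the results table, where step 3000 falls at epoch 6.73). The helper name `linear_schedule_lr` is hypothetical, and the formula follows the usual linear warmup and decay shape, not code taken from this training run.

```python
import math

LEARNING_RATE = 2e-4
TRAIN_BATCH_SIZE = 16
GRADIENT_ACCUMULATION_STEPS = 2
WARMUP_RATIO = 0.2
NUM_EPOCHS = 40

NUM_DEVICES = 1        # assumption: the card does not state the device count
STEPS_PER_EPOCH = 446  # assumption: estimated as 3000 steps / 6.73 epochs

# Samples consumed per optimizer step.
effective_batch_size = TRAIN_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS * NUM_DEVICES


def linear_schedule_lr(step, total_steps, warmup_steps, base_lr=LEARNING_RATE):
    """Linear warmup from 0 to base_lr, then linear decay back to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    total_steps = STEPS_PER_EPOCH * NUM_EPOCHS
    warmup_steps = math.ceil(total_steps * WARMUP_RATIO)
    print("effective batch size:", effective_batch_size)
    print("estimated warmup steps:", warmup_steps)
    for step in (0, 1000, 3000, warmup_steps, total_steps):
        print(step, linear_schedule_lr(step, total_steps, warmup_steps))
```

Under these assumptions the warmup phase would span roughly 3,570 steps, so a run that stopped logging at step 3000 may still have been in warmup. That is an inference from the estimated step count, not something the card states.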

Training results

Training Loss	Epoch	Step	Validation Loss	Wer
202.4293	0.45	200	26.7777	0.2779
197.6471	0.9	400	25.8300	0.2760
204.8931	1.35	600	25.6774	0.2747
193.3182	1.79	800	25.6049	0.2737
205.2241	2.24	1000	25.5552	0.2739
186.0407	2.69	1200	25.4364	0.2737
191.7055	3.14	1400	25.7949	0.2764
185.0721	3.59	1600	26.1202	0.2753
198.8579	4.04	1800	25.8496	0.2763
185.7877	4.48	2000	27.0753	0.2731
194.9394	4.93	2200	25.6920	0.2775
188.2296	5.38	2400	25.7362	0.2742
188.0202	5.83	2600	25.9170	0.2755
191.5541	6.28	2800	26.8590	0.2771
198.2817	6.73	3000	26.4101	0.2791
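
The headline figures at the top of the card match the last row of this table, not the best one. The sketch below parses the logged rows and picks out the checkpoints with the lowest WER and the lowest validation loss. The `EvalRow` type and `parse_results` helper are hypothetical names introduced for illustration; the numbers are copied from the table above.

```python
from typing import List, NamedTuple

RESULTS = """\
202.4293 0.45 200 26.7777 0.2779
197.6471 0.9 400 25.8300 0.2760
204.8931 1.35 600 25.6774 0.2747
193.3182 1.79 800 25.6049 0.2737
205.2241 2.24 1000 25.5552 0.2739
186.0407 2.69 1200 25.4364 0.2737
191.7055 3.14 1400 25.7949 0.2764
185.0721 3.59 1600 26.1202 0.2753
198.8579 4.04 1800 25.8496 0.2763
185.7877 4.48 2000 27.0753 0.2731
194.9394 4.93 2200 25.6920 0.2775
188.2296 5.38 2400 25.7362 0.2742
188.0202 5.83 2600 25.9170 0.2755
191.5541 6.28 2800 26.8590 0.2771
198.2817 6.73 3000 26.4101 0.2791
"""


class EvalRow(NamedTuple):
    train_loss: float
    epoch: float
    step: int
    val_loss: float
    wer: float


def parse_results(text: str) -> List[EvalRow]:
    """Turn whitespace-separated table rows into typed records."""
    rows = []
    for line in text.strip().splitlines():
        train_loss, epoch, step, val_loss, wer = line.split()
        rows.append(
            EvalRow(float(train_loss), float(epoch), int(step), float(val_loss), float(wer))
        )
    return rows


rows = parse_results(RESULTS)
best_wer = min(rows, key=lambda r: r.wer)        # lowest word error rate
best_loss = min(rows, key=lambda r: r.val_loss)  # lowest validation loss
final = rows[-1]                                 # row reported at the top of the card

if __name__ == "__main__":
    print("best WER:", best_wer.step, best_wer.wer)
    print("best validation loss:", best_loss.step, best_loss.val_loss)
    print("final:", final.step, final.val_loss, final.wer)
```

The differences between rows are small (WER varies by less than 0.01 across all evaluations), so whether an earlier checkpoint is meaningfully better than the final one is a judgment call that this table alone does not settle.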

Framework versions

Transformers 4.24.0
PyTorch 1.12.1
Datasets 2.7.0
Tokenizers 0.11.0
