timit-distil-kl-alpha-0.75-T-1

This model is a fine-tuned version of a base model that this card does not name, trained on a dataset recorded as None. It achieves the following results on the evaluation set:

Loss: 163.0668
Wer: 0.4560
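
Wer here is the word error rate: the number of word substitutions, deletions, and insertions needed to turn the model's transcript into the reference, divided by the number of reference words. The following is a minimal sketch of that definition in plain Python, not the evaluation code used for this card (which is not shown); the function name `word_error_rate` is hypothetical. It also shows why the metric can exceed 1.0, as it does early in the training results table: a hypothesis with many extra words incurs more errors than there are reference words.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    print(word_error_rate("the cat sat", "the bat sat"))      # one substitution in three words
    print(word_error_rate("hello", "hello there big world"))  # three insertions against one word
```

Evaluation scripts usually aggregate errors over the whole evaluation set before dividing, rather than averaging per-utterance rates, so a per-sentence figure from this sketch is not directly comparable to the 0.4560 reported above.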

Model description

More information needed
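
The card gives no description, but the model name (distil, kl, alpha-0.75, T-1) suggests knowledge distillation with a KL-divergence term, a mixing weight of 0.75, and a softmax temperature of 1. The sketch below shows one common form of such a loss under those assumptions. It is not confirmed by the card: the function names, the choice of which term alpha weights, and the per-frame formulation are all hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def distillation_loss(student_logits, teacher_logits, task_loss,
                      alpha=0.75, temperature=1.0):
    """Assumed form: alpha * T^2 * KL(teacher || student) + (1 - alpha) * task_loss.

    alpha=0.75 and temperature=1.0 are read off the model name only.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = kl_divergence(p_teacher, p_student) * temperature ** 2
    return alpha * kl + (1.0 - alpha) * task_loss


if __name__ == "__main__":
    # When the student matches the teacher, only the task term remains.
    print(distillation_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0], task_loss=4.0))
```

With a temperature of 1 the T^2 factor has no effect and the teacher distribution is used unsoftened; some implementations instead apply alpha to the task loss, so the weighting should be checked against the actual training script.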

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 14
eval_batch_size: 14
seed: 42
gradient_accumulation_steps: 2
total_train_batch_size: 28
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 1000
num_epochs: 30
mixed_precision_training: Native AMP
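
Two of these values follow from the others, and a short sketch makes the arithmetic explicit. The total batch size is the per-device batch size times the gradient accumulation steps times the device count; a single device is an assumption here, consistent with 14 × 2 = 28. The linear schedule ramps the learning rate from 0 to 2e-05 over the 1000 warmup steps, then decays it linearly to 0 at the final step. The figure of 165 steps per epoch is inferred from the results table (step 3300 is logged at epoch 20.0), not stated in the card, and the function names are hypothetical.

```python
BASE_LR = 2e-05
WARMUP_STEPS = 1000
PER_DEVICE_BATCH = 14
GRAD_ACCUM_STEPS = 2
NUM_DEVICES = 1          # assumption: consistent with total_train_batch_size = 28
STEPS_PER_EPOCH = 165    # inferred: step 3300 corresponds to epoch 20.0
NUM_EPOCHS = 30
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 4950 optimizer steps


def effective_batch_size() -> int:
    """Examples contributing to each optimizer update."""
    return PER_DEVICE_BATCH * GRAD_ACCUM_STEPS * NUM_DEVICES


def linear_lr(step: int) -> float:
    """Learning rate at a given optimizer step: linear warmup, then linear decay."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size())
    for step in (0, 500, 1000, 2975, 4950):
        print(step, linear_lr(step))
```

Under these assumptions the peak learning rate is reached around epoch 6 (step 1000), so roughly the first fifth of training runs below the nominal 2e-05.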

Training results

Training Loss	Epoch	Step	Validation Loss	Wer
478.744	0.61	100	538.3250	0.9915
378.7181	1.21	200	369.1372	1.1333
322.5338	1.82	300	300.0477	1.0331
273.2232	2.42	400	259.7523	0.8517
240.6502	3.03	500	232.7382	0.7743
223.6016	3.64	600	215.9651	0.7051
201.5882	4.24	700	204.9062	0.6621
202.3899	4.85	800	196.9740	0.6338
183.4185	5.45	900	191.3831	0.6006
179.6837	6.06	1000	186.5637	0.5794
168.6271	6.67	1100	184.0338	0.5780
165.3212	7.27	1200	180.1232	0.5470
162.448	7.88	1300	178.5354	0.5453
154.0758	8.48	1400	176.6070	0.5281
160.8933	9.09	1500	174.8729	0.5245
148.5513	9.7	1600	174.3866	0.5165
150.4218	10.3	1700	172.3834	0.5150
146.6692	10.91	1800	171.4406	0.5060
144.0717	11.52	1900	170.7044	0.5053
148.1728	12.12	2000	169.8454	0.5013
134.3326	12.73	2100	169.4328	0.4957
142.6348	13.33	2200	168.3971	0.4943
136.7947	13.94	2300	168.1558	0.4899
137.4703	14.55	2400	167.1046	0.4842
134.6324	15.15	2500	167.1108	0.4789
129.9845	15.76	2600	166.7391	0.4814
137.7542	16.36	2700	166.1870	0.4799
129.4632	16.97	2800	166.2481	0.4745
135.0696	17.58	2900	165.3251	0.4737
128.6716	18.18	3000	165.2547	0.4681
130.0308	18.79	3100	165.0811	0.4694
127.9053	19.39	3200	164.8373	0.4663
124.5187	20.0	3300	164.7788	0.4661
132.1731	20.61	3400	164.4737	0.4665
124.8417	21.21	3500	164.2796	0.4641
129.376	21.82	3600	163.9702	0.4638
125.4888	22.42	3700	164.0341	0.4627
126.7772	23.03	3800	163.8773	0.4594
123.2558	23.64	3900	163.5976	0.4584
122.6634	24.24	4000	163.5653	0.4581
128.5773	24.85	4100	163.3437	0.4586
121.5595	25.45	4200	163.4164	0.4579
125.9294	26.06	4300	163.3195	0.4563
122.0572	26.67	4400	163.1707	0.4550
123.4701	27.27	4500	163.2227	0.4572
127.0724	27.88	4600	163.1163	0.4568
120.6483	28.48	4700	163.0764	0.4565
128.5629	29.09	4800	163.0516	0.4560
120.0566	29.7	4900	163.0668	0.4560
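
The reported results come from the final logged step, which is not the best row on either metric. The sketch below parses the last seven rows of the table above (copied verbatim) and picks the best step by validation loss and by Wer. The parsing helper `parse_rows` is hypothetical and assumes whitespace-separated columns in the order shown in the table header.

```python
# Last seven rows of the results table:
# Training Loss, Epoch, Step, Validation Loss, Wer
LOG_EXCERPT = """
125.9294 26.06 4300 163.3195 0.4563
122.0572 26.67 4400 163.1707 0.4550
123.4701 27.27 4500 163.2227 0.4572
127.0724 27.88 4600 163.1163 0.4568
120.6483 28.48 4700 163.0764 0.4565
128.5629 29.09 4800 163.0516 0.4560
120.0566 29.7 4900 163.0668 0.4560
"""


def parse_rows(text):
    """Turn whitespace-separated log lines into a list of dicts."""
    rows = []
    for line in text.strip().splitlines():
        train_loss, epoch, step, val_loss, wer = line.split()
        rows.append({
            "train_loss": float(train_loss),
            "epoch": float(epoch),
            "step": int(step),
            "val_loss": float(val_loss),
            "wer": float(wer),
        })
    return rows


rows = parse_rows(LOG_EXCERPT)
best_by_wer = min(rows, key=lambda r: r["wer"])
best_by_val_loss = min(rows, key=lambda r: r["val_loss"])

if __name__ == "__main__":
    print("lowest Wer:", best_by_wer["step"], best_by_wer["wer"])
    print("lowest validation loss:", best_by_val_loss["step"], best_by_val_loss["val_loss"])
```

In this excerpt the lowest Wer (0.4550) is logged at step 4400 and the lowest validation loss (163.0516) at step 4800, both slightly better than the final step 4900. The differences are small, so they may be within evaluation noise.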

Framework versions

Transformers 4.25.1
Pytorch 1.12.1
Datasets 2.8.0
Tokenizers 0.13.2
