mt5-base-gramatika-final-e8-b16

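This model is a fine-tuned version of google/mt5-base; the training dataset is not specified. On the evaluation set it achieves the following results, which match the checkpoint at step 3000 (the lowest validation loss in the training results):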

Loss: 0.2117
Rouge1: 66.7567
Rouge2: 59.3343
Rougel: 66.4993
Rougelsum: 66.5275
Gen Len: 18.5566

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.001
train_batch_size: 16
eval_batch_size: 16
seed: 42
optimizer: Adafactor
lr_scheduler_type: linear
num_epochs: 8
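
As a rough illustration of what the linear scheduler implies for these settings, the sketch below computes the learning rate at a given optimizer step. It assumes no warmup (the card lists none) and estimates the total step count from the training results (step 6300 at epoch 7.68). The names `linear_lr`, `STEPS_PER_EPOCH`, and `TOTAL_STEPS` are illustrative and do not come from the original training script.

```python
BASE_LR = 1e-3   # learning_rate from the card
NUM_EPOCHS = 8   # num_epochs from the card

# Assumption: steps per epoch estimated from the log (step 6300 at epoch 7.68).
STEPS_PER_EPOCH = round(6300 / 7.68)
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH


def linear_lr(step: int, warmup_steps: int = 0) -> float:
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return BASE_LR * step / max(1, warmup_steps)
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - warmup_steps)


if __name__ == "__main__":
    # Print the scheduled learning rate at a few logged steps.
    for step in (0, 3000, 6300):
        print(step, linear_lr(step))
```

If training ran on a single device without gradient accumulation (neither is stated in the card), the estimated steps per epoch and a batch size of 16 would put the training set at roughly 13,000 examples.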

Training results

Training Loss	Epoch	Step	Validation Loss	Rouge1	Rouge2	Rougel	Rougelsum	Gen Len
0.9122	0.37	300	0.3395	63.1315	53.1537	62.8285	62.8152	18.5833
0.4611	0.73	600	0.2870	64.8744	56.0545	64.604	64.6011	18.5676
0.3866	1.1	900	0.2690	65.2446	56.534	64.9389	64.9484	18.5414
0.2833	1.46	1200	0.2424	65.6718	57.2619	65.4044	65.4076	18.5566
0.2633	1.83	1500	0.2240	65.7057	57.6829	65.4464	65.4601	18.5524
0.2126	2.2	1800	0.2350	66.1634	58.4004	65.9254	65.9147	18.5582
0.1787	2.56	2100	0.2176	66.4508	58.8845	66.1886	66.199	18.5571
0.175	2.93	2400	0.2151	66.1987	58.632	65.9844	65.995	18.5603
0.1231	3.29	2700	0.2227	66.6365	59.1886	66.4067	66.4293	18.5571
0.1195	3.66	3000	0.2117	66.7567	59.3343	66.4993	66.5275	18.5566
0.1146	4.02	3300	0.2197	66.9385	59.8666	66.7575	66.7651	18.5556
0.0757	4.39	3600	0.2235	66.8918	59.768	66.7208	66.7282	18.5608
0.0772	4.76	3900	0.2270	67.0955	59.9474	66.8681	66.8905	18.5566
0.0688	5.12	4200	0.2431	67.2444	60.2703	67.0501	67.0676	18.5550
0.0512	5.49	4500	0.2439	67.198	60.2026	67.0128	67.0433	18.5535
0.0523	5.85	4800	0.2362	67.3463	60.4479	67.1385	67.1792	18.5592
0.0408	6.22	5100	0.2587	67.4973	60.7533	67.305	67.3418	18.5624
0.0324	6.59	5400	0.2502	67.6102	60.905	67.428	67.4547	18.5566
0.0336	6.95	5700	0.2583	67.531	60.7718	67.355	67.3762	18.5587
0.0236	7.32	6000	0.2710	67.5641	60.7633	67.3445	67.3835	18.5603
0.0222	7.68	6300	0.2729	67.5898	60.8587	67.3926	67.4234	18.5608
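
The table shows validation loss and ROUGE disagreeing about which checkpoint is best, so the selection criterion matters. The sketch below copies the Step, Validation Loss, and Rouge1 columns from the table and picks the best row under each criterion. The variable names are illustrative, and this is not necessarily how the reported checkpoint was chosen.

```python
# (step, validation_loss, rouge1) copied from the training results table.
LOG = [
    (300, 0.3395, 63.1315), (600, 0.2870, 64.8744), (900, 0.2690, 65.2446),
    (1200, 0.2424, 65.6718), (1500, 0.2240, 65.7057), (1800, 0.2350, 66.1634),
    (2100, 0.2176, 66.4508), (2400, 0.2151, 66.1987), (2700, 0.2227, 66.6365),
    (3000, 0.2117, 66.7567), (3300, 0.2197, 66.9385), (3600, 0.2235, 66.8918),
    (3900, 0.2270, 67.0955), (4200, 0.2431, 67.2444), (4500, 0.2439, 67.198),
    (4800, 0.2362, 67.3463), (5100, 0.2587, 67.4973), (5400, 0.2502, 67.6102),
    (5700, 0.2583, 67.531), (6000, 0.2710, 67.5641), (6300, 0.2729, 67.5898),
]

# Checkpoint with the lowest validation loss.
best_by_loss = min(LOG, key=lambda row: row[1])

# Checkpoint with the highest Rouge1 score.
best_by_rouge1 = max(LOG, key=lambda row: row[2])

if __name__ == "__main__":
    print("lowest validation loss:", best_by_loss)
    print("highest Rouge1:", best_by_rouge1)
```

Validation loss is lowest at step 3000, which matches the evaluation results reported at the top of the card, while Rouge1 peaks later at step 5400. Training loss keeps falling as validation loss rises after step 3000, a pattern that is consistent with overfitting on the loss objective even though the ROUGE scores still improve slightly.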

Framework versions

Transformers 4.30.1
PyTorch 1.11.0a0+b6df043
Datasets 2.12.0
Tokenizers 0.13.3
