mt5-large-gramatika-final-e8-b16

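This model is a fine-tuned version of google/mt5-large. The auto-generated card does not name the training dataset (it is recorded as None). The model achieves the following results on the evaluation set; these figures match the step 1500 checkpoint (epoch 1.83) in the training results table, which has the lowest validation loss: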

Loss: 0.1999
Rouge1: 66.308
Rouge2: 58.8739
Rougel: 66.1027
Rougelsum: 66.1039
Gen Len: 18.5592
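
For readers unfamiliar with the ROUGE figures above, the following is a simplified sketch of how an n-gram overlap F-measure on a 0 to 100 scale can be computed. It is an illustration only, not the scorer used to produce the numbers in this card: it assumes lowercasing and whitespace tokenization, and the function names and the example sentence pair are hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_f1(reference: str, prediction: str, n: int = 1) -> float:
    """Simplified ROUGE-N F-measure (0-100) using whitespace tokens.

    Assumption: real scorers apply their own tokenization and optional
    stemming, so values will differ from official implementations.
    """
    ref = ngrams(reference.lower().split(), n)
    pred = ngrams(prediction.lower().split(), n)
    overlap = sum((ref & pred).values())  # clipped n-gram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred.values())
    recall = overlap / sum(ref.values())
    return 100 * 2 * precision * recall / (precision + recall)


# Hypothetical sentence pair, not taken from the evaluation set.
reference = "she goes to school every day"
prediction = "she go to school every day"
print(f"ROUGE-1: {rouge_n_f1(reference, prediction, 1):.2f}")
print(f"ROUGE-2: {rouge_n_f1(reference, prediction, 2):.2f}")
```

A single wrong token lowers the bigram score more than the unigram score, which is consistent with Rouge2 being lower than Rouge1 in the results above.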

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.001
train_batch_size: 16
eval_batch_size: 16
seed: 42
optimizer: Adafactor
lr_scheduler_type: linear
num_epochs: 8
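
To make the linear scheduler setting concrete, here is a minimal sketch of a learning rate that decays linearly from 0.001 to zero. It assumes no warmup (none is listed above) and a total of 6560 optimizer steps, a figure estimated from the epoch and step columns of the training log rather than one stated in the card. The function name `linear_lr` is hypothetical.

```python
def linear_lr(step: int, total_steps: int, base_lr: float = 1e-3,
              warmup_steps: int = 0) -> float:
    """Learning rate at `step`: optional linear warmup, then linear decay to zero."""
    if warmup_steps > 0 and step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Assumption: about 820 steps per epoch over 8 epochs, estimated from the log.
TOTAL_STEPS = 6560

for step in (0, 1500, 3280, 6300, 6560):
    print(step, f"{linear_lr(step, TOTAL_STEPS):.6f}")
```

Under these assumptions the learning rate is roughly half its initial value midway through training and close to zero by the final logged step.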

Training results

Training Loss	Epoch	Step	Validation Loss	Rouge1	Rouge2	Rougel	Rougelsum	Gen Len
0.8846	0.37	300	0.2954	64.6179	55.294	64.2807	64.2792	18.5597
0.3711	0.73	600	0.2474	65.6388	57.2663	65.3219	65.3365	18.5592
0.2874	1.1	900	0.2193	65.8689	57.6871	65.5424	65.5719	18.5603
0.1953	1.46	1200	0.2131	66.0438	57.8166	65.7565	65.7705	18.5409
0.1919	1.83	1500	0.1999	66.308	58.8739	66.1027	66.1039	18.5592
0.1487	2.2	1800	0.2034	66.5939	59.0628	66.3361	66.3475	18.5592
0.1132	2.56	2100	0.2010	67.0441	59.8117	66.8455	66.8562	18.5487
0.1087	2.93	2400	0.2001	67.0048	59.7807	66.7885	66.7972	18.5535
0.0681	3.29	2700	0.2143	67.2327	60.2527	67.0047	67.0106	18.5556
0.0621	3.66	3000	0.2093	67.357	60.51	67.1561	67.1709	18.5466
0.062	4.02	3300	0.2157	67.4353	60.7193	67.2526	67.2554	18.5624
0.036	4.39	3600	0.2208	67.5469	60.8111	67.3457	67.3472	18.5503
0.0351	4.76	3900	0.2282	67.3835	60.4009	67.138	67.1612	18.5561
0.0297	5.12	4200	0.2370	67.4004	60.5787	67.2004	67.2087	18.5603
0.0193	5.49	4500	0.2446	67.5339	60.6808	67.3484	67.3737	18.5577
0.0185	5.85	4800	0.2483	67.5055	60.8104	67.3217	67.3443	18.5566
0.0134	6.22	5100	0.2563	67.5748	60.9475	67.3996	67.4081	18.5597
0.0114	6.59	5400	0.2585	67.6337	61.0146	67.4553	67.472	18.5482
0.0099	6.95	5700	0.2622	67.6613	61.037	67.4761	67.4843	18.5498
0.0067	7.32	6000	0.2728	67.7996	61.2206	67.6194	67.6282	18.5561
0.0052	7.68	6300	0.2802	67.8009	61.2862	67.6178	67.6357	18.5545
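
The table shows validation loss and ROUGE moving in different directions after the second epoch, so the choice of selection metric changes which checkpoint looks best. The sketch below copies five columns from the table and picks the best row under each criterion. It also estimates steps per epoch and the approximate training set size, assuming a single device and no gradient accumulation, neither of which the card states.

```python
# (epoch, step, validation_loss, rouge1, rouge2) copied from the table above.
ROWS = [
    (0.37, 300, 0.2954, 64.6179, 55.294),
    (0.73, 600, 0.2474, 65.6388, 57.2663),
    (1.1, 900, 0.2193, 65.8689, 57.6871),
    (1.46, 1200, 0.2131, 66.0438, 57.8166),
    (1.83, 1500, 0.1999, 66.308, 58.8739),
    (2.2, 1800, 0.2034, 66.5939, 59.0628),
    (2.56, 2100, 0.2010, 67.0441, 59.8117),
    (2.93, 2400, 0.2001, 67.0048, 59.7807),
    (3.29, 2700, 0.2143, 67.2327, 60.2527),
    (3.66, 3000, 0.2093, 67.357, 60.51),
    (4.02, 3300, 0.2157, 67.4353, 60.7193),
    (4.39, 3600, 0.2208, 67.5469, 60.8111),
    (4.76, 3900, 0.2282, 67.3835, 60.4009),
    (5.12, 4200, 0.2370, 67.4004, 60.5787),
    (5.49, 4500, 0.2446, 67.5339, 60.6808),
    (5.85, 4800, 0.2483, 67.5055, 60.8104),
    (6.22, 5100, 0.2563, 67.5748, 60.9475),
    (6.59, 5400, 0.2585, 67.6337, 61.0146),
    (6.95, 5700, 0.2622, 67.6613, 61.037),
    (7.32, 6000, 0.2728, 67.7996, 61.2206),
    (7.68, 6300, 0.2802, 67.8009, 61.2862),
]

best_by_loss = min(ROWS, key=lambda r: r[2])    # lowest validation loss
best_by_rouge1 = max(ROWS, key=lambda r: r[3])  # highest Rouge1

# Estimate steps per epoch from the last logged row.
last_epoch, last_step = ROWS[-1][0], ROWS[-1][1]
steps_per_epoch = last_step / last_epoch

# Assumption: one device, no gradient accumulation, train_batch_size of 16.
approx_examples = round(steps_per_epoch) * 16

print("best by loss:  step", best_by_loss[1], "loss", best_by_loss[2])
print("best by Rouge1: step", best_by_rouge1[1], "Rouge1", best_by_rouge1[3])
print("steps per epoch (approx):", round(steps_per_epoch))
print("training examples (approx):", approx_examples)
```

Selecting by validation loss picks step 1500, while selecting by Rouge1 picks step 6300. Training loss keeps falling as validation loss rises from about step 2700 onward, a pattern usually read as overfitting, although ROUGE continues to improve slightly.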

Framework versions

Transformers 4.30.1
PyTorch 1.11.0a0+b6df043
Datasets 2.12.0
Tokenizers 0.13.3
