# flan-t5-base-fce-e8-b16
This model is a fine-tuned version of [google/flan-t5-base](https://huggingface.co/google/flan-t5-base) on an unspecified dataset. It achieves the following results on the evaluation set (a usage sketch follows the metrics):
- Loss: 0.3114
- Rouge1: 86.9035
- Rouge2: 79.2645
- Rougel: 86.4197
- Rougelsum: 86.4231
- Gen Len: 14.8850
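
A minimal usage sketch, assuming the model is meant for sentence correction (the `fce` in the model name hints at the FCE error-correction corpus, though the card does not say) and assuming the hypothetical repository id below:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Hypothetical repo id; substitute the actual Hub path of this model.
model_id = "flan-t5-base-fce-e8-b16"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# The expected input format is an assumption; the card does not document it.
inputs = tokenizer("She no went to the market yesterday.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```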
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a reproduction sketch follows the list):
- learning_rate: 0.001
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adafactor
- lr_scheduler_type: linear
- num_epochs: 8
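
A minimal sketch of how these hyperparameters map onto `Seq2SeqTrainingArguments` in Transformers 4.28. The `train_dataset`/`eval_dataset` variables are placeholders (the training data is not specified in this card), the batch size is assumed to be per device, and the `compute_metrics` callback that produced the ROUGE scores is omitted for brevity:

```python
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

model_name = "google/flan-t5-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-base-fce-e8-b16",
    learning_rate=1e-3,
    per_device_train_batch_size=16,  # assumption: the card only logs a batch size of 16
    per_device_eval_batch_size=16,
    seed=42,
    optim="adafactor",
    lr_scheduler_type="linear",
    num_train_epochs=8,
    evaluation_strategy="steps",
    eval_steps=400,                  # inferred from the 400-step cadence in the results table
    predict_with_generate=True,      # required so ROUGE is computed over generated text
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,  # placeholder: a tokenized seq2seq dataset
    eval_dataset=eval_dataset,    # placeholder
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```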
### Training results
| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| 0.4128 | 0.23 | 400 | 0.3457 | 86.8983 | 79.1632 | 86.3755 | 86.3944 | 14.8435 |
| 0.3783 | 0.45 | 800 | 0.3469 | 86.8995 | 78.8428 | 86.3368 | 86.3283 | 14.8955 |
| 0.3627 | 0.68 | 1200 | 0.3114 | 86.9035 | 79.2645 | 86.4197 | 86.4231 | 14.8850 |
| 0.3484 | 0.9 | 1600 | 0.3239 | 87.2292 | 79.8056 | 86.7218 | 86.7237 | 14.8759 |
| 0.2696 | 1.13 | 2000 | 0.3419 | 87.15 | 79.6016 | 86.6082 | 86.6241 | 14.8959 |
| 0.22 | 1.35 | 2400 | 0.3270 | 87.0232 | 79.4806 | 86.5137 | 86.5173 | 14.8868 |
| 0.2327 | 1.58 | 2800 | 0.3185 | 87.1028 | 79.6758 | 86.5985 | 86.6221 | 14.9005 |
| 0.2354 | 1.81 | 3200 | 0.3125 | 87.143 | 79.786 | 86.6545 | 86.6788 | 14.9010 |
| 0.2177 | 2.03 | 3600 | 0.3292 | 87.0858 | 79.5707 | 86.5451 | 86.5456 | 14.9133 |
| 0.1347 | 2.26 | 4000 | 0.3342 | 87.1768 | 79.9161 | 86.6402 | 86.6666 | 14.9142 |
| 0.1411 | 2.48 | 4400 | 0.3456 | 87.1049 | 79.9438 | 86.6152 | 86.6265 | 14.9110 |
| 0.1487 | 2.71 | 4800 | 0.3393 | 86.5182 | 78.468 | 86.0005 | 86.0283 | 14.8813 |
| 0.1498 | 2.93 | 5200 | 0.3347 | 87.2024 | 79.7098 | 86.6782 | 86.6904 | 14.8859 |
| 0.1055 | 3.16 | 5600 | 0.4027 | 87.1281 | 79.799 | 86.5714 | 86.5965 | 14.9105 |
| 0.0862 | 3.39 | 6000 | 0.4046 | 87.2721 | 79.8755 | 86.6838 | 86.6956 | 14.9073 |
| 0.0894 | 3.61 | 6400 | 0.3776 | 87.1508 | 79.865 | 86.6178 | 86.6424 | 14.8946 |
| 0.0942 | 3.84 | 6800 | 0.3781 | 87.2854 | 80.0876 | 86.7694 | 86.7867 | 14.8927 |
| 0.0816 | 4.06 | 7200 | 0.4300 | 87.3854 | 80.1162 | 86.8398 | 86.8446 | 14.8978 |
| 0.0582 | 4.29 | 7600 | 0.4201 | 87.2594 | 80.1824 | 86.7653 | 86.7807 | 14.9019 |
| 0.0588 | 4.51 | 8000 | 0.4129 | 87.3373 | 80.1802 | 86.8332 | 86.8414 | 14.9014 |
| 0.0571 | 4.74 | 8400 | 0.4437 | 87.2985 | 80.0215 | 86.8171 | 86.8238 | 14.8946 |
| 0.0587 | 4.97 | 8800 | 0.4019 | 87.2321 | 80.0933 | 86.6888 | 86.6931 | 14.9105 |
| 0.0381 | 5.19 | 9200 | 0.4822 | 87.2798 | 80.1822 | 86.7799 | 86.7886 | 14.9014 |
| 0.0378 | 5.42 | 9600 | 0.4831 | 87.409 | 80.3418 | 86.8845 | 86.8844 | 14.8927 |
| 0.0368 | 5.64 | 10000 | 0.4809 | 87.2276 | 79.9415 | 86.6776 | 86.6833 | 14.9105 |
| 0.0359 | 5.87 | 10400 | 0.4964 | 87.2916 | 80.1468 | 86.7693 | 86.7704 | 14.9028 |
| 0.0311 | 6.09 | 10800 | 0.5266 | 87.3443 | 80.1762 | 86.7852 | 86.7825 | 14.8991 |
| 0.0225 | 6.32 | 11200 | 0.5550 | 87.3142 | 80.2689 | 86.7856 | 86.7884 | 14.9037 |
| 0.0239 | 6.55 | 11600 | 0.5308 | 87.4003 | 80.2637 | 86.8373 | 86.8356 | 14.9023 |
| 0.0236 | 6.77 | 12000 | 0.5490 | 87.3865 | 80.3184 | 86.8563 | 86.8626 | 14.9037 |
| 0.0223 | 7.0 | 12400 | 0.5454 | 87.3842 | 80.2875 | 86.8109 | 86.8293 | 14.9055 |
| 0.0164 | 7.22 | 12800 | 0.5818 | 87.4641 | 80.3669 | 86.8908 | 86.9062 | 14.8964 |
| 0.0155 | 7.45 | 13200 | 0.5927 | 87.4191 | 80.3356 | 86.8541 | 86.8718 | 14.9014 |
| 0.0152 | 7.67 | 13600 | 0.5990 | 87.4257 | 80.2974 | 86.8481 | 86.8589 | 14.9005 |
| 0.0144 | 7.9 | 14000 | 0.6084 | 87.4754 | 80.3558 | 86.9086 | 86.9184 | 14.9014 |
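
Note that the headline metrics at the top of this card correspond to the step-1200 checkpoint (epoch 0.68), which has the lowest validation loss of the run (0.3114). Validation loss rises steadily from roughly epoch 3 onward even as ROUGE scores inch upward, so the best-loss checkpoint appears to have been retained rather than the final one.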
### Framework versions
- Transformers 4.28.1
- Pytorch 1.11.0a0+b6df043
- Datasets 2.12.0
- Tokenizers 0.13.3