flan-t5-base-gecfirst-e8-b16

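This model is a fine-tuned version of google/flan-t5-base. The training dataset is not specified. On the evaluation set it achieves the following results, which match the checkpoint at step 1110 (epoch 3.74) in the training results table: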

Loss: 0.2233
Rouge1: 42.0185
Rouge2: 34.0704
Rougel: 42.0403
Rougelsum: 41.8957
Gen Len: 18.9865

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.001
train_batch_size: 16
eval_batch_size: 16
seed: 42
optimizer: Adafactor
lr_scheduler_type: linear
num_epochs: 8
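
As a rough illustration, the hyperparameters above could be expressed as keyword arguments for `Seq2SeqTrainingArguments` from Transformers. This is a sketch under assumptions, not the original training script: the mapping of the batch sizes to per-device values, the `output_dir` name, and the evaluation settings (every 74 steps, inferred from the spacing of the training results) are not stated in this card.

```python
# Values taken from the hyperparameter list; key names follow
# transformers.Seq2SeqTrainingArguments.
TRAINING_KWARGS = {
    "output_dir": "flan-t5-base-gecfirst-e8-b16",  # assumed directory name
    "learning_rate": 1e-3,
    "per_device_train_batch_size": 16,  # assumes a single device
    "per_device_eval_batch_size": 16,   # assumes a single device
    "seed": 42,
    "optim": "adafactor",
    "lr_scheduler_type": "linear",
    "num_train_epochs": 8,
    # Assumptions inferred from the results table, not stated in the card:
    "evaluation_strategy": "steps",
    "eval_steps": 74,
    "predict_with_generate": True,  # needed to compute ROUGE on generated text
}


def build_training_args():
    """Create the arguments object; transformers is imported lazily."""
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(**TRAINING_KWARGS)
```

Calling `build_training_args()` requires Transformers to be installed; the dictionary itself can be inspected or adapted without it.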

Training results

Training Loss	Epoch	Step	Validation Loss	Rouge1	Rouge2	Rougel	Rougelsum	Gen Len
0.9817	0.25	74	0.4035	38.3182	28.427	38.3215	38.2591	18.9882
0.6805	0.5	148	0.3467	39.9659	30.9972	40.04	39.9619	18.9831
0.5885	0.75	222	0.3205	40.4848	31.33	40.4959	40.3976	18.9797
0.5476	1.0	296	0.2869	40.2589	31.6407	40.3592	40.223	18.9831
0.4504	1.25	370	0.2754	40.7626	31.8985	40.755	40.6406	18.9831
0.4463	1.49	444	0.2650	40.908	32.2358	40.9207	40.8062	18.9831
0.4155	1.74	518	0.2561	41.056	32.4906	41.029	40.9401	18.9831
0.3948	1.99	592	0.2493	41.1813	32.7917	41.2183	41.1256	18.9831
0.329	2.24	666	0.2413	41.8005	33.7235	41.88	41.7556	18.9831
0.3195	2.49	740	0.2390	41.5207	33.2502	41.5599	41.4291	18.9848
0.3148	2.74	814	0.2344	41.5913	33.398	41.614	41.4909	18.9831
0.316	2.99	888	0.2266	41.6858	33.7369	41.731	41.6293	18.9831
0.2498	3.24	962	0.2353	41.7077	33.3652	41.7111	41.6256	18.9848
0.2534	3.49	1036	0.2299	41.8645	33.9926	41.9435	41.8168	18.9848
0.2435	3.74	1110	0.2233	42.0185	34.0704	42.0403	41.8957	18.9865
0.2514	3.99	1184	0.2292	41.9069	33.8917	41.9112	41.7937	18.9831
0.193	4.24	1258	0.2462	41.9671	34.0261	42.024	41.9178	18.9831
0.1927	4.48	1332	0.2322	42.2226	34.6158	42.3306	42.1946	18.9848
0.1984	4.73	1406	0.2278	41.9762	34.0828	41.999	41.9107	18.9848
0.1929	4.98	1480	0.2299	41.8244	33.831	41.8673	41.7786	18.9848
0.1522	5.23	1554	0.2432	41.9142	33.9634	41.9635	41.859	18.9848
0.1509	5.48	1628	0.2408	41.707	33.6909	41.7144	41.6345	18.9831
0.1457	5.73	1702	0.2426	42.1729	34.2971	42.2318	42.119	18.9848
0.1497	5.98	1776	0.2386	42.1408	34.3303	42.148	42.0599	18.9865
0.1195	6.23	1850	0.2627	41.897	34.0092	41.9336	41.8262	18.9865
0.1145	6.48	1924	0.2560	41.8456	33.7951	41.8578	41.7709	18.9865
0.1198	6.73	1998	0.2525	41.8393	33.6033	41.8313	41.7505	18.9831
0.114	6.98	2072	0.2524	41.8194	33.7992	41.8752	41.775	18.9848
0.1	7.23	2146	0.2690	41.9724	33.9339	42.0248	41.9444	18.9848
0.0948	7.47	2220	0.2715	41.8806	33.9232	41.9392	41.8432	18.9848
0.0947	7.72	2294	0.2722	41.8981	33.8642	41.9622	41.877	18.9848
0.0904	7.97	2368	0.2705	41.8596	33.9216	41.915	41.8342	18.9848
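
The table shows validation loss bottoming out well before the final epoch while training loss keeps falling. The short sketch below makes that concrete by picking the best rows by validation loss and by Rouge1. The `ROWS` list is a hand-copied subset of columns (step, validation loss, Rouge1) from the table above; the variable names are illustrative only.

```python
# (step, validation_loss, rouge1) copied from the training results table.
ROWS = [
    (74, 0.4035, 38.3182), (148, 0.3467, 39.9659), (222, 0.3205, 40.4848),
    (296, 0.2869, 40.2589), (370, 0.2754, 40.7626), (444, 0.2650, 40.9080),
    (518, 0.2561, 41.0560), (592, 0.2493, 41.1813), (666, 0.2413, 41.8005),
    (740, 0.2390, 41.5207), (814, 0.2344, 41.5913), (888, 0.2266, 41.6858),
    (962, 0.2353, 41.7077), (1036, 0.2299, 41.8645), (1110, 0.2233, 42.0185),
    (1184, 0.2292, 41.9069), (1258, 0.2462, 41.9671), (1332, 0.2322, 42.2226),
    (1406, 0.2278, 41.9762), (1480, 0.2299, 41.8244), (1554, 0.2432, 41.9142),
    (1628, 0.2408, 41.7070), (1702, 0.2426, 42.1729), (1776, 0.2386, 42.1408),
    (1850, 0.2627, 41.8970), (1924, 0.2560, 41.8456), (1998, 0.2525, 41.8393),
    (2072, 0.2524, 41.8194), (2146, 0.2690, 41.9724), (2220, 0.2715, 41.8806),
    (2294, 0.2722, 41.8981), (2368, 0.2705, 41.8596),
]

# Row with the lowest validation loss.
best_by_loss = min(ROWS, key=lambda row: row[1])
# Row with the highest Rouge1 score.
best_by_rouge1 = max(ROWS, key=lambda row: row[2])
# Last evaluated row, for comparison with the best checkpoint.
final_row = ROWS[-1]

print("lowest validation loss:", best_by_loss)
print("highest Rouge1:", best_by_rouge1)
print("final step:", final_row)
```

The lowest validation loss (0.2233, step 1110) coincides with the headline evaluation results, while the final step (2368) has a higher validation loss of 0.2705. This is consistent with the best checkpoint having been kept rather than the last one, although the card does not say so explicitly.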

Framework versions

Transformers 4.28.1
PyTorch 1.11.0a0+b6df043
Datasets 2.12.0
Tokenizers 0.13.3
