speller-t5-4

This model is a fine-tuned version of sberbank-ai/ruT5-base. The training dataset is not specified in this card. The model achieves the following results on the evaluation set:

Loss: 0.1871
Rouge1: 17.2619
Rouge2: 7.5893
Rougel: 17.5595
Rougelsum: 17.5595
Gen Len: 42.25
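
The ROUGE scores above measure n-gram overlap between generated text and reference text. The card does not say which implementation or tokenizer produced them, so the following is only a minimal sketch of a ROUGE-1 F-measure, assuming lowercased whitespace tokenization. The function name rouge1_f1 is hypothetical, and the sketch returns a value in the range 0 to 1, whereas the figures above appear to be on a 0 to 100 scale.

```python
from collections import Counter


def rouge1_f1(prediction: str, reference: str) -> float:
    """Unigram-overlap F1 between a prediction and a reference.

    Assumption: tokens are lowercased and split on whitespace. The
    evaluation behind this card may have tokenized differently.
    """
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0

    # Clipped overlap: each token counts at most as often as it
    # appears in both the prediction and the reference.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

For a spelling corrector, most tokens of a good prediction match the reference exactly, so a score computed this way is dominated by the unchanged words rather than by the corrected ones.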

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 1
mixed_precision_training: Native AMP
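
As a rough illustration, the hyperparameters above can be written as keyword arguments in the form accepted by Seq2SeqTrainingArguments in Transformers. This is a sketch, not the author's actual training script: mapping train_batch_size to per_device_train_batch_size assumes a single device, and the linear_lr helper is hypothetical and assumes zero warmup steps, since the card lists none.

```python
# Keyword arguments mirroring the hyperparameters listed in this card.
# Assumption: training ran on one device, so the reported batch sizes
# equal the per-device batch sizes.
training_kwargs = {
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 1,
    "fp16": True,  # "Native AMP" mixed precision
}

# With Transformers installed, these could be passed along as follows
# (the output_dir value is a placeholder):
#   from transformers import Seq2SeqTrainingArguments
#   args = Seq2SeqTrainingArguments(output_dir="out", **training_kwargs)


def linear_lr(step: int, total_steps: int, base_lr: float = 5e-5) -> float:
    """Learning rate under linear decay to zero, assuming no warmup."""
    remaining = max(0, total_steps - step)
    return base_lr * remaining / total_steps
```

Under these assumptions the learning rate starts at 5e-05 and falls linearly to zero at the final optimizer step, so halfway through the single epoch it would be 2.5e-05.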

Training results

Training Loss	Epoch	Step	Validation Loss	Rouge1	Rouge2	Rougel	Rougelsum	Gen Len
0.9773	0.04	500	0.5651	14.7321	5.2264	14.7863	14.8471	47.2321
0.8463	0.07	1000	0.4230	16.3628	5.6052	16.3158	16.4325	47.9018
0.6458	0.11	1500	0.3528	16.2099	5.5195	16.2034	16.3225	47.5179
0.6147	0.14	2000	0.3269	16.313	5.7216	16.313	16.4242	47.2232
0.5102	0.18	2500	0.3012	16.6071	6.0119	16.6239	16.5792	43.1696
0.4585	0.21	3000	0.2823	16.6295	6.0714	16.6741	16.6071	47.25
0.4801	0.25	3500	0.2748	16.8779	6.3885	16.8779	16.8779	44.5268
0.4721	0.29	4000	0.2605	17.1947	7.4353	17.3867	17.3867	42.7054
0.4132	0.32	4500	0.2530	17.2619	7.5605	17.5054	17.5054	42.9286
0.4255	0.36	5000	0.2495	17.1503	7.4107	17.3363	17.3363	42.5625
0.3952	0.39	5500	0.2424	17.2619	7.4702	17.4479	17.4479	42.5089
0.3229	0.43	6000	0.2354	17.2619	7.5605	17.5054	17.5054	44.0268
0.4474	0.47	6500	0.2310	17.2619	7.5335	17.4545	17.4545	42.5625
0.3736	0.5	7000	0.2300	17.2619	7.5335	17.4545	17.4545	42.4286
0.332	0.54	7500	0.2133	17.2619	7.5622	17.5085	17.5085	42.4732
0.3347	0.57	8000	0.2148	17.2619	7.5605	17.5054	17.5054	42.5
0.4257	0.61	8500	0.2093	17.2619	7.5605	17.5054	17.5054	42.3482
0.3072	0.64	9000	0.2009	17.2619	7.5893	17.5595	17.5595	42.3661
0.3184	0.68	9500	0.2028	17.2619	7.5893	17.5595	17.5595	42.4464
0.3013	0.72	10000	0.2083	17.2619	7.5893	17.5595	17.5595	42.2589
0.3202	0.75	10500	0.2056	17.2619	7.5893	17.5595	17.5595	42.4911
0.2689	0.79	11000	0.2020	17.2619	7.5893	17.5595	17.5595	42.8304
0.4168	0.82	11500	0.1962	17.2619	7.5893	17.5595	17.5595	42.2054
0.287	0.86	12000	0.1930	17.2619	7.5893	17.5595	17.5595	42.1875
0.3515	0.9	12500	0.1899	17.2619	7.5893	17.5595	17.5595	42.1875
0.2713	0.93	13000	0.1868	17.2619	7.5893	17.5595	17.5595	42.3304
0.2914	0.97	13500	0.1871	17.2619	7.5893	17.5595	17.5595	42.25
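
The Epoch and Step columns allow a rough estimate of the size of the training set, which the card does not state. The sketch below uses the last row (step 13500 at epoch 0.97) and the training batch size of 8. It assumes a single device and no gradient accumulation, and it treats the epoch value as rounded to two decimals, so the result is a range rather than an exact count.

```python
# Values taken from the last row of the table and the hyperparameter list.
last_step = 13500
epoch_rounded = 0.97
train_batch_size = 8

# The epoch column is rounded to two decimals, so the true value lies
# roughly in [0.965, 0.975).
steps_per_epoch_low = last_step / 0.975
steps_per_epoch_high = last_step / 0.965

# Assumption: one device and no gradient accumulation, so each
# optimizer step consumes exactly train_batch_size examples.
examples_low = steps_per_epoch_low * train_batch_size
examples_high = steps_per_epoch_high * train_batch_size

print(f"steps per epoch: {steps_per_epoch_low:.0f} to {steps_per_epoch_high:.0f}")
print(f"training examples: {examples_low:.0f} to {examples_high:.0f}")
```

Under these assumptions the training set would contain on the order of 110,000 examples. Multiple devices or gradient accumulation would raise that figure proportionally.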

Framework versions

Transformers 4.26.0
Pytorch 1.13.1+cu116
Datasets 2.9.0
Tokenizers 0.13.2
