# speller-t5-big-2
This model is a fine-tuned version of [sberbank-ai/ruT5-base](https://huggingface.co/sberbank-ai/ruT5-base) on an unspecified dataset.
It achieves the following results on the evaluation set:
- Loss: 0.1711
- Rouge1: 22.619
- Rouge2: 10.523
- RougeL: 22.619
- RougeLsum: 22.619
- Gen Len: 42.9107
## Model description
More information needed
## Intended uses & limitations
More information needed
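No usage example is provided in the card. A minimal inference sketch with the `transformers` library is shown below; the Hub id `your-username/speller-t5-big-2` is a hypothetical placeholder, and the input sentence is only an illustration of a misspelled Russian phrase:

```python
from transformers import AutoTokenizer, T5ForConditionalGeneration

# Hypothetical Hub id -- replace with the actual repository path.
model_name = "your-username/speller-t5-big-2"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)

text = "Я пшел домой"  # misspelled Russian input
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_length=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Generation length should be chosen with the evaluation `Gen Len` (~43 tokens) in mind.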
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 1
- mixed_precision_training: Native AMP
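The `linear` scheduler decays the learning rate from its initial value to zero over the total number of training steps, after an optional warmup ramp. A minimal sketch of that schedule (the step counts below are illustrative, not the actual training length):

```python
def linear_lr(step: int, total_steps: int,
              base_lr: float = 5e-5, warmup_steps: int = 0) -> float:
    """Linear LR schedule: ramp up during warmup, then decay linearly to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0.0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

# At step 0 the rate equals learning_rate; halfway through it is half that;
# at the final step it reaches 0.
```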
### Training results
| Training Loss | Epoch | Step  | Validation Loss | Rouge1  | Rouge2  | RougeL  | RougeLsum | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 1.244         | 0.04  | 500   | 0.5814          | 18.4902 | 6.4123  | 18.3883 | 18.5119   | 48.8214 |
| 0.6967        | 0.07  | 1000  | 0.4315          | 20.0    | 7.2173  | 20.0744 | 19.9702   | 47.0357 |
| 0.6362        | 0.11  | 1500  | 0.3721          | 21.1905 | 8.514   | 21.131  | 21.1607   | 47.3929 |
| 0.5561        | 0.14  | 2000  | 0.3265          | 22.0238 | 9.29    | 21.9643 | 21.994    | 45.6696 |
| 0.5094        | 0.18  | 2500  | 0.3049          | 22.0238 | 9.29    | 21.9643 | 21.994    | 46.0    |
| 0.429         | 0.21  | 3000  | 0.2858          | 22.0238 | 9.29    | 21.9643 | 21.994    | 44.9464 |
| 0.4557        | 0.25  | 3500  | 0.2696          | 22.1726 | 9.4388  | 22.0238 | 22.0982   | 45.2054 |
| 0.4268        | 0.29  | 4000  | 0.2565          | 22.1726 | 9.4388  | 22.0238 | 22.0982   | 44.5268 |
| 0.3955        | 0.32  | 4500  | 0.2480          | 22.1726 | 9.4388  | 22.0238 | 22.0982   | 44.2589 |
| 0.3672        | 0.36  | 5000  | 0.2387          | 22.619  | 10.523  | 22.619  | 22.619    | 44.2946 |
| 0.4059        | 0.39  | 5500  | 0.2268          | 22.619  | 10.523  | 22.619  | 22.619    | 44.1429 |
| 0.4005        | 0.43  | 6000  | 0.2216          | 22.619  | 10.523  | 22.619  | 22.619    | 44.4911 |
| 0.4176        | 0.47  | 6500  | 0.2187          | 22.619  | 10.523  | 22.619  | 22.619    | 44.1339 |
| 0.3413        | 0.5   | 7000  | 0.2115          | 22.619  | 10.523  | 22.619  | 22.619    | 43.9732 |
| 0.3618        | 0.54  | 7500  | 0.2068          | 22.619  | 10.523  | 22.619  | 22.619    | 43.9821 |
| 0.3157        | 0.57  | 8000  | 0.2037          | 22.619  | 10.523  | 22.619  | 22.619    | 43.0714 |
| 0.3502        | 0.61  | 8500  | 0.1956          | 22.619  | 10.523  | 22.619  | 22.619    | 42.8214 |
| 0.353         | 0.64  | 9000  | 0.1932          | 22.619  | 10.523  | 22.619  | 22.619    | 42.8393 |
| 0.3516        | 0.68  | 9500  | 0.1891          | 22.619  | 10.523  | 22.619  | 22.619    | 42.2589 |
| 0.3225        | 0.72  | 10000 | 0.1836          | 22.619  | 10.523  | 22.619  | 22.619    | 42.1964 |
| 0.2993        | 0.75  | 10500 | 0.1818          | 22.619  | 10.523  | 22.619  | 22.619    | 43.6607 |
| 0.3353        | 0.79  | 11000 | 0.1814          | 22.619  | 10.523  | 22.619  | 22.619    | 42.4018 |
| 0.3325        | 0.82  | 11500 | 0.1807          | 22.619  | 10.523  | 22.619  | 22.619    | 43.1786 |
| 0.3181        | 0.86  | 12000 | 0.1752          | 22.619  | 10.523  | 22.619  | 22.619    | 43.25   |
| 0.3337        | 0.9   | 12500 | 0.1729          | 22.619  | 10.523  | 22.619  | 22.619    | 42.3929 |
| 0.281         | 0.93  | 13000 | 0.1737          | 22.619  | 10.523  | 22.619  | 22.619    | 43.8214 |
| 0.45          | 0.97  | 13500 | 0.1711          | 22.619  | 10.523  | 22.619  | 22.619    | 42.9107 |
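The Rouge1 and Rouge2 columns report n-gram overlap F1 between generated and reference text (the actual evaluation most likely used the `rouge_score` or `evaluate` packages). For intuition, a self-contained sketch of ROUGE-N F1 on whitespace-tokenized strings:

```python
from collections import Counter

def ngrams(tokens: list[str], n: int) -> Counter:
    """Multiset of n-grams in a token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def rouge_n(candidate: str, reference: str, n: int) -> float:
    """ROUGE-N F1: harmonic mean of n-gram precision and recall."""
    cand = ngrams(candidate.split(), n)
    ref = ngrams(reference.split(), n)
    if not cand or not ref:
        return 0.0
    overlap = sum((cand & ref).values())  # clipped n-gram matches
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

Note this toy version omits the stemming and tokenization details of the reference implementation, so its scores will not exactly match the table above.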
### Framework versions
- Transformers 4.26.0
- Pytorch 1.13.1+cu116
- Datasets 2.9.0
- Tokenizers 0.13.2