# speller-t5-909_both_
This model is a fine-tuned version of [sberbank-ai/ruT5-large](https://huggingface.co/sberbank-ai/ruT5-large) on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 0.0771
- Rouge1: 20.0565
- Rouge2: 7.9096
- Rougel: 20.1271
- Rougelsum: 20.1977
- Gen Len: 41.2712
## Model description
More information needed
## Intended uses & limitations
More information needed
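The model name suggests a T5-based speller (spelling corrector) for Russian. As a usage sketch only: the checkpoint path below is a placeholder (the published repo id for this fine-tune is not stated in this card), and the classes follow the base `sberbank-ai/ruT5-large` architecture.

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

# Placeholder: replace with the actual Hub repo id or a local directory
# containing this fine-tuned checkpoint.
model_name = "path/to/speller-t5-909_both_"

tokenizer = T5Tokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)

# Feed a misspelled sentence and decode the corrected output.
text = "превет как дила"
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Whether the model expects a task prefix on the input (as some T5 fine-tunes do) is not documented here; check the training script before relying on the raw-text call above.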
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 1
- mixed_precision_training: Native AMP
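The hyperparameters above can be expressed as a `Seq2SeqTrainingArguments` configuration, a hedged reconstruction rather than the original training script; `output_dir` is a placeholder, and the default Adam betas/epsilon listed above match the `transformers` defaults, so they need no explicit arguments.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="speller-t5-909_both_",  # placeholder output directory
    learning_rate=5e-05,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=1,
    fp16=True,                   # "Native AMP" mixed-precision training
    predict_with_generate=True,  # needed so ROUGE is computed on generated text
)
```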
### Training results
| Training Loss | Epoch | Step  | Validation Loss | Rouge1  | Rouge2 | Rougel  | Rougelsum | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:------:|:-------:|:---------:|:-------:|
| 0.1653        | 0.1   | 1500  | 0.1176          | 19.8446 | 7.4011 | 19.8446 | 19.9153   | 41.2712 |
| 0.2083        | 0.2   | 3000  | 0.1023          | 19.7034 | 8.7571 | 19.7034 | 19.774    | 41.1186 |
| 0.1617        | 0.31  | 4500  | 0.0975          | 19.2797 | 7.9096 | 19.2797 | 19.209    | 41.2797 |
| 0.17          | 0.41  | 6000  | 0.0949          | 20.5508 | 8.7571 | 20.5862 | 20.6215   | 41.2712 |
| 0.1416        | 0.51  | 7500  | 0.0871          | 20.0565 | 7.9096 | 20.1271 | 20.1977   | 41.1017 |
| 0.1409        | 0.61  | 9000  | 0.0807          | 20.0565 | 7.9096 | 20.1271 | 20.1977   | 41.1695 |
| 0.1094        | 0.72  | 10500 | 0.0746          | 19.9859 | 7.6271 | 19.9506 | 19.9859   | 41.2627 |
| 0.1256        | 0.82  | 12000 | 0.0754          | 19.9859 | 7.6271 | 19.9506 | 19.9859   | 41.2119 |
| 0.1206        | 0.92  | 13500 | 0.0771          | 20.0565 | 7.9096 | 20.1271 | 20.1977   | 41.2712 |
### Framework versions
- Transformers 4.26.0
- Pytorch 1.13.1+cu116
- Datasets 2.9.0
- Tokenizers 0.13.2