speller-t5-900
This model is a fine-tuned version of sberbank-ai/ruT5-base. The training dataset is not specified. It achieves the following results on the evaluation set:
- Loss: 0.1758
- Rouge1: 19.3503
- Rouge2: 8.3333
- RougeL: 19.3503
- RougeLsum: 19.3503
- Gen Len: 41.4153
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 1
- mixed_precision_training: Native AMP
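The linear schedule listed above can be sketched in plain Python. This is a minimal sketch, not the training script: it assumes no warmup steps (none are listed) and a hypothetical total of 14,650 optimizer steps, which is roughly what one epoch amounts to if step 14500 corresponds to epoch 0.99. The names `linear_lr`, `TOTAL_STEPS`, and `WARMUP_STEPS` are illustrative.

```python
BASE_LR = 5e-05        # learning_rate from the list above
TOTAL_STEPS = 14_650   # assumption: approximate number of optimizer steps in one epoch
WARMUP_STEPS = 0       # assumption: no warmup is listed


def linear_lr(step: int,
              base_lr: float = BASE_LR,
              total_steps: int = TOTAL_STEPS,
              warmup_steps: int = WARMUP_STEPS) -> float:
    """Learning rate at `step`: linear warmup (if any), then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Print the learning rate at a few points along the run.
    for step in (0, 5_000, 10_000, TOTAL_STEPS):
        print(step, linear_lr(step))
```

Under these assumptions the rate starts at the full 5e-05 and reaches zero at the last step, so the later rows of a training log are produced with progressively smaller updates.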
Training results
Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum | Gen Len |
---|---|---|---|---|---|---|---|---|
1.0227 | 0.03 | 500 | 0.5411 | 17.6201 | 7.1186 | 17.6554 | 17.5847 | 45.5424 |
0.7224 | 0.07 | 1000 | 0.4269 | 18.1497 | 7.1186 | 18.1497 | 17.9732 | 42.7797 |
0.7101 | 0.1 | 1500 | 0.3542 | 18.9972 | 7.9661 | 18.9972 | 18.9619 | 42.3983 |
0.5962 | 0.14 | 2000 | 0.3283 | 18.9972 | 7.9661 | 18.9972 | 18.9619 | 42.2542 |
0.535 | 0.17 | 2500 | 0.3104 | 18.9972 | 7.9661 | 18.9972 | 18.9619 | 42.2627 |
0.6124 | 0.2 | 3000 | 0.2843 | 18.9972 | 7.9661 | 18.9972 | 18.9619 | 42.4915 |
0.491 | 0.24 | 3500 | 0.2706 | 18.9972 | 7.9661 | 18.9972 | 18.9619 | 42.4322 |
0.5028 | 0.27 | 4000 | 0.2647 | 19.5429 | 8.5876 | 19.5429 | 19.5621 | 42.3898 |
0.4547 | 0.31 | 4500 | 0.2548 | 18.9972 | 7.9661 | 18.9972 | 18.9619 | 42.178 |
0.4335 | 0.34 | 5000 | 0.2448 | 19.5429 | 8.5876 | 19.5429 | 19.5621 | 42.178 |
0.4511 | 0.38 | 5500 | 0.2377 | 19.4915 | 8.5876 | 19.4915 | 19.4915 | 42.3305 |
0.4765 | 0.41 | 6000 | 0.2337 | 19.5429 | 8.5876 | 19.5429 | 19.5621 | 41.4237 |
0.4355 | 0.44 | 6500 | 0.2233 | 19.4915 | 8.5876 | 19.4915 | 19.4915 | 41.7881 |
0.3924 | 0.48 | 7000 | 0.2172 | 19.4915 | 8.5876 | 19.4915 | 19.4915 | 40.9492 |
0.3898 | 0.51 | 7500 | 0.2153 | 19.4915 | 8.5876 | 19.4915 | 19.4915 | 41.6356 |
0.4236 | 0.55 | 8000 | 0.2102 | 19.4915 | 8.5876 | 19.4915 | 19.4915 | 41.0254 |
0.3484 | 0.58 | 8500 | 0.2116 | 19.4915 | 8.5876 | 19.4915 | 19.4915 | 41.8305 |
0.5514 | 0.61 | 9000 | 0.2017 | 19.6328 | 8.7571 | 19.5975 | 19.6328 | 41.1864 |
0.3298 | 0.65 | 9500 | 0.1945 | 19.6328 | 8.7571 | 19.5975 | 19.6328 | 41.2966 |
0.3807 | 0.68 | 10000 | 0.1966 | 19.6328 | 8.7571 | 19.5975 | 19.6328 | 41.6525 |
0.3177 | 0.72 | 10500 | 0.1918 | 19.3503 | 8.3333 | 19.3503 | 19.3503 | 41.2627 |
0.3374 | 0.75 | 11000 | 0.1903 | 19.6328 | 8.7571 | 19.5975 | 19.6328 | 41.2373 |
0.3123 | 0.78 | 11500 | 0.1900 | 19.6328 | 8.7571 | 19.5975 | 19.6328 | 41.2203 |
0.3377 | 0.82 | 12000 | 0.1847 | 19.6328 | 8.7571 | 19.5975 | 19.6328 | 41.2712 |
0.3138 | 0.85 | 12500 | 0.1814 | 19.6328 | 8.7571 | 19.5975 | 19.6328 | 41.1864 |
0.335 | 0.89 | 13000 | 0.1784 | 19.6328 | 8.7571 | 19.5975 | 19.6328 | 41.1695 |
0.3142 | 0.92 | 13500 | 0.1768 | 19.6328 | 8.7571 | 19.5975 | 19.6328 | 41.2542 |
0.3245 | 0.95 | 14000 | 0.1753 | 19.6328 | 8.7571 | 19.5975 | 19.6328 | 41.2034 |
0.3277 | 0.99 | 14500 | 0.1758 | 19.3503 | 8.3333 | 19.3503 | 19.3503 | 41.4153 |
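The table can be scanned for the evaluation step with the lowest validation loss. The sketch below copies the (step, validation loss) pairs from the last four rows only. The helper name `best_checkpoint` is illustrative, and whether intermediate checkpoints were saved is not stated in this card.

```python
# (step, validation loss) pairs copied from the last four rows of the table
EVAL_LOG = [
    (13000, 0.1784),
    (13500, 0.1768),
    (14000, 0.1753),
    (14500, 0.1758),
]


def best_checkpoint(log):
    """Return the (step, loss) pair with the lowest validation loss."""
    return min(log, key=lambda row: row[1])


best_step, best_loss = best_checkpoint(EVAL_LOG)
print(best_step, best_loss)
```

The reported evaluation results match the step-14500 row, while the lowest validation loss in the table occurs one evaluation earlier, at step 14000; the difference (0.1753 versus 0.1758) is small.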
Framework versions
- Transformers 4.26.0
- PyTorch 1.7.1+cu110
- Datasets 2.9.0
- Tokenizers 0.13.2
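To compare a local environment against the versions listed above, something like the following could be used. It is a sketch that relies only on the standard library's `importlib.metadata`; the helper names `installed_versions` and `mismatches` are illustrative, and the dictionary keys assume the usual PyPI distribution names for these libraries.

```python
from importlib import metadata

# Versions listed above, keyed by PyPI distribution name (an assumption of this sketch).
EXPECTED = {
    "transformers": "4.26.0",
    "torch": "1.7.1+cu110",
    "datasets": "2.9.0",
    "tokenizers": "0.13.2",
}


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is not installed."""
    found = {}
    for name in packages:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def mismatches(expected, found):
    """Return {name: (expected_version, found_version)} for every package that differs."""
    return {
        name: (want, found.get(name))
        for name, want in expected.items()
        if found.get(name) != want
    }


if __name__ == "__main__":
    # An empty dict means the environment matches the listed versions exactly.
    print(mismatches(EXPECTED, installed_versions(EXPECTED)))
```

An exact string match is deliberately strict; newer versions may work as well, but that is not something this card states.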