# mt5-small-finetuned-26feb-1

This model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small) on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 2.4486
- Rouge1: 20.86
- Rouge2: 6.45
- Rougel: 20.49
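The ROUGE scores above are reported on a 0–100 scale, as the `Seq2SeqTrainer` logs them. As a rough illustration of what Rouge1 measures, here is a minimal pure-Python sketch of ROUGE-1 F1 (unigram overlap between a generated summary and a reference); the actual evaluation uses a full ROUGE implementation, which adds stemming and other normalization:

```python
from collections import Counter

def rouge1_f(pred: str, ref: str) -> float:
    """ROUGE-1 F1: harmonic mean of unigram precision and recall."""
    p_tok, r_tok = pred.split(), ref.split()
    # Clipped overlap: each unigram counts at most as often as in the reference.
    overlap = sum((Counter(p_tok) & Counter(r_tok)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(p_tok)
    recall = overlap / len(r_tok)
    return 2 * precision * recall / (precision + recall)
```

For example, `rouge1_f("the cat sat", "the cat ran")` shares two of three unigrams with the reference, giving an F1 of about 0.67 (reported as 67 on the 0–100 scale used above).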
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 12
- eval_batch_size: 12
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 60
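These settings map directly onto `transformers.Seq2SeqTrainingArguments`. A sketch of the configuration, assuming single-device training (the `output_dir` name is illustrative):

```python
from transformers import Seq2SeqTrainingArguments

# Reconstruction of the hyperparameters listed above; output_dir is illustrative.
training_args = Seq2SeqTrainingArguments(
    output_dir="mt5-small-finetuned-26feb-1",
    learning_rate=1e-4,
    per_device_train_batch_size=12,
    per_device_eval_batch_size=12,
    seed=42,
    num_train_epochs=60,
    lr_scheduler_type="linear",  # linear decay, as logged above
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```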
### Training results
| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel |
|:-------------:|:-----:|:-----:|:---------------:|:------:|:------:|:------:|
| 5.4839 | 1.93 | 500 | 2.5990 | 16.14 | 5.18 | 15.98 |
| 3.1051 | 3.86 | 1000 | 2.4754 | 18.71 | 5.68 | 18.41 |
| 2.8659 | 5.79 | 1500 | 2.4006 | 18.22 | 5.54 | 18.06 |
| 2.71 | 7.72 | 2000 | 2.3848 | 19.91 | 6.0 | 19.65 |
| 2.5845 | 9.65 | 2500 | 2.3956 | 18.72 | 5.72 | 18.4 |
| 2.4895 | 11.58 | 3000 | 2.3719 | 19.9 | 6.1 | 19.54 |
| 2.402 | 13.51 | 3500 | 2.3691 | 19.86 | 5.79 | 19.51 |
| 2.3089 | 15.44 | 4000 | 2.3747 | 20.22 | 6.74 | 19.88 |
| 2.2681 | 17.37 | 4500 | 2.3754 | 19.44 | 5.53 | 19.03 |
| 2.1927 | 19.31 | 5000 | 2.3419 | 20.02 | 5.91 | 19.69 |
| 2.1278 | 21.24 | 5500 | 2.3496 | 20.26 | 6.21 | 19.79 |
| 2.0928 | 23.17 | 6000 | 2.3756 | 19.9 | 6.04 | 19.48 |
| 2.0658 | 25.1 | 6500 | 2.3615 | 19.61 | 6.04 | 19.28 |
| 2.0063 | 27.03 | 7000 | 2.3516 | 20.38 | 6.52 | 20.14 |
| 1.9581 | 28.96 | 7500 | 2.3743 | 20.61 | 6.26 | 20.24 |
| 1.941 | 30.89 | 8000 | 2.3726 | 19.73 | 5.8 | 19.31 |
| 1.9172 | 32.82 | 8500 | 2.3891 | 19.73 | 5.98 | 19.51 |
| 1.8764 | 34.75 | 9000 | 2.3782 | 20.1 | 6.15 | 19.74 |
| 1.8453 | 36.68 | 9500 | 2.3851 | 19.96 | 6.0 | 19.61 |
| 1.845 | 38.61 | 10000 | 2.4046 | 20.66 | 6.32 | 20.24 |
| 1.7919 | 40.54 | 10500 | 2.4169 | 20.65 | 6.25 | 20.38 |
| 1.7945 | 42.47 | 11000 | 2.4206 | 20.68 | 5.74 | 20.37 |
| 1.7689 | 44.4 | 11500 | 2.4246 | 20.69 | 6.09 | 20.4 |
| 1.7215 | 46.33 | 12000 | 2.4237 | 20.49 | 6.43 | 20.21 |
| 1.7306 | 48.26 | 12500 | 2.4217 | 20.55 | 6.49 | 20.18 |
| 1.7035 | 50.19 | 13000 | 2.4389 | 20.81 | 6.55 | 20.48 |
| 1.6934 | 52.12 | 13500 | 2.4377 | 20.75 | 6.85 | 20.35 |
| 1.7 | 54.05 | 14000 | 2.4486 | 20.86 | 6.45 | 20.49 |
| 1.6909 | 55.98 | 14500 | 2.4451 | 20.5 | 6.55 | 20.12 |
| 1.6804 | 57.92 | 15000 | 2.4457 | 20.21 | 6.5 | 19.84 |
| 1.6693 | 59.85 | 15500 | 2.4473 | 20.35 | 6.6 | 19.96 |
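Note that the validation loss bottoms out at 2.3419 around epoch 19 (step 5000) and drifts back up over the remaining 40 epochs, a typical sign of overfitting, while the ROUGE scores continue to improve slightly. If intermediate checkpoints were saved, one could select by validation loss rather than taking the final epoch. A minimal sketch using a few rows from the table above:

```python
# (epoch, step, validation loss) for a few checkpoints from the table above
results = [
    (1.93, 500, 2.5990),
    (19.31, 5000, 2.3419),
    (54.05, 14000, 2.4486),
    (59.85, 15500, 2.4473),
]

# Pick the checkpoint with the lowest validation loss.
best_epoch, best_step, best_loss = min(results, key=lambda r: r[2])
```

With the full table, this selects the step-5000 checkpoint; `load_best_model_at_end` in `Seq2SeqTrainingArguments` automates the same selection during training.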
### Framework versions
- Transformers 4.26.1
- Pytorch 1.13.1+cu116
- Datasets 2.10.0
- Tokenizers 0.13.2