# mt5-small-finetuned-29jan-1
This model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small) on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 2.4883
- Rouge1: 19.5044
- Rouge2: 6.2046
- Rougel: 19.3543
- Rougelsum: 19.381
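For reference, Rouge1 above is the unigram-overlap F1 score (scaled to 0–100) between generated and reference summaries; Rouge2 and RougeL use bigrams and longest common subsequences respectively. A minimal pure-Python sketch of ROUGE-1 F1, using a hypothetical sentence pair (not drawn from the evaluation set):

```python
from collections import Counter

def rouge1_f1(candidate: str, reference: str) -> float:
    """ROUGE-1 F1: harmonic mean of unigram precision and recall."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Each unigram contributes at most min(candidate count, reference count).
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

# Hypothetical example: 5 of 6 unigrams overlap on each side,
# so precision = recall = F1 = 5/6, i.e. a score of ~83.33.
score = 100 * rouge1_f1("the cat sat on the mat",
                        "the cat lay on the mat")
print(round(score, 2))  # → 83.33
```

The reported scores come from a tokenized, stemmed variant of this computation (the `rouge_score` package), so this sketch only illustrates the metric's shape, not its exact value.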
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 10
- eval_batch_size: 10
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 30
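The step counts in the training log follow from these settings: 217 optimizer steps per epoch at batch size 10 implies roughly 2,170 training examples (assuming no gradient accumulation; the last batch may be partial), and 30 epochs gives 6,510 total steps. With a linear scheduler and no warmup, the learning rate decays from 1e-4 toward 0 over those steps. A sketch of the arithmetic:

```python
# Values taken from the hyperparameter list above.
learning_rate = 1e-4
train_batch_size = 10
num_epochs = 30
steps_per_epoch = 217          # from the training log

# Implied dataset size, assuming no gradient accumulation
# (an upper bound if the final batch is partial).
approx_examples = steps_per_epoch * train_batch_size
total_steps = steps_per_epoch * num_epochs

def linear_lr(step: int, base_lr: float = learning_rate,
              total: int = total_steps) -> float:
    """Linear decay from base_lr to 0, matching transformers'
    linear schedule with num_warmup_steps=0."""
    return base_lr * max(0.0, 1.0 - step / total)

print(approx_examples)   # → 2170
print(total_steps)       # → 6510
print(linear_lr(3255))   # halfway through training → 5e-05
```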
### Training results
| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|
| 6.4829 | 1.0 | 217 | 2.7590 | 12.7914 | 3.3267 | 12.493 | 12.4137 |
| 3.4814 | 2.0 | 434 | 2.7229 | 16.7805 | 4.8009 | 16.4908 | 16.5233 |
| 3.2161 | 3.0 | 651 | 2.6422 | 18.3488 | 5.0629 | 18.1397 | 18.1976 |
| 3.045 | 4.0 | 868 | 2.6008 | 18.1363 | 5.7597 | 17.9056 | 17.9882 |
| 2.9475 | 5.0 | 1085 | 2.6061 | 18.9355 | 6.0803 | 18.6355 | 18.7673 |
| 2.8547 | 6.0 | 1302 | 2.5628 | 17.904 | 5.8618 | 17.7818 | 17.8446 |
| 2.7685 | 7.0 | 1519 | 2.5311 | 18.9128 | 5.9625 | 18.7142 | 18.842 |
| 2.705 | 8.0 | 1736 | 2.5371 | 19.6663 | 6.0395 | 19.3416 | 19.408 |
| 2.6438 | 9.0 | 1953 | 2.5427 | 19.1516 | 6.0007 | 18.9663 | 19.0156 |
| 2.6086 | 10.0 | 2170 | 2.5211 | 19.0945 | 6.4325 | 18.918 | 18.9664 |
| 2.5394 | 11.0 | 2387 | 2.5226 | 18.9019 | 6.3004 | 18.7281 | 18.8082 |
| 2.5004 | 12.0 | 2604 | 2.5136 | 18.9701 | 6.1868 | 18.7234 | 18.8098 |
| 2.4666 | 13.0 | 2821 | 2.4958 | 18.155 | 6.1513 | 18.0758 | 18.1362 |
| 2.4255 | 14.0 | 3038 | 2.5101 | 18.7561 | 6.2634 | 18.6477 | 18.7123 |
| 2.3856 | 15.0 | 3255 | 2.4860 | 19.2239 | 6.4539 | 19.1162 | 19.1403 |
| 2.3594 | 16.0 | 3472 | 2.4905 | 19.0075 | 6.1541 | 18.9106 | 18.9616 |
| 2.3301 | 17.0 | 3689 | 2.4970 | 18.7102 | 6.2065 | 18.4881 | 18.5588 |
| 2.3032 | 18.0 | 3906 | 2.4744 | 19.3199 | 6.6458 | 19.1365 | 19.1733 |
| 2.2825 | 19.0 | 4123 | 2.4907 | 18.9608 | 6.3074 | 18.8124 | 18.8502 |
| 2.2609 | 20.0 | 4340 | 2.4772 | 19.2785 | 6.4725 | 19.0379 | 19.0556 |
| 2.2384 | 21.0 | 4557 | 2.4874 | 18.9376 | 6.2922 | 18.7618 | 18.8442 |
| 2.2176 | 22.0 | 4774 | 2.4853 | 18.9962 | 6.2231 | 18.7551 | 18.7958 |
| 2.2095 | 23.0 | 4991 | 2.4960 | 18.6517 | 5.8114 | 18.4809 | 18.4811 |
| 2.1958 | 24.0 | 5208 | 2.4911 | 18.9743 | 6.2245 | 18.7692 | 18.869 |
| 2.1777 | 25.0 | 5425 | 2.4788 | 18.9623 | 6.0877 | 18.7591 | 18.7917 |
| 2.1645 | 26.0 | 5642 | 2.4883 | 19.2814 | 6.2264 | 19.1407 | 19.1835 |
| 2.1575 | 27.0 | 5859 | 2.4910 | 19.4592 | 6.3513 | 19.2842 | 19.3017 |
| 2.142 | 28.0 | 6076 | 2.4815 | 19.3045 | 6.2179 | 19.1271 | 19.1084 |
| 2.1396 | 29.0 | 6293 | 2.4858 | 19.4159 | 6.275 | 19.2582 | 19.2731 |
| 2.1438 | 30.0 | 6510 | 2.4883 | 19.5044 | 6.2046 | 19.3543 | 19.381 |
### Framework versions
- Transformers 4.26.0
- Pytorch 1.13.1+cu116
- Datasets 2.9.0
- Tokenizers 0.13.2
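To reproduce this environment, the pinned versions above can be installed with pip. This is a sketch, not a tested command set: the `+cu116` version tag suggests the CUDA 11.6 PyTorch wheel index, but the right index depends on your hardware.

```shell
pip install transformers==4.26.0 datasets==2.9.0 tokenizers==0.13.2
# CUDA 11.6 build assumed from the "+cu116" version tag above.
pip install torch==1.13.1+cu116 --extra-index-url https://download.pytorch.org/whl/cu116
```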