# mt5-small-finetuned-19jan-5
This model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small) on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 2.6411
- Rouge1: 7.6385
- Rouge2: 0.3333
- Rougel: 7.4817
- Rougelsum: 7.4859
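For reference, a minimal inference sketch with the `transformers` library. It loads the `google/mt5-small` base checkpoint named above; to use this fine-tuned model, substitute the path where the fine-tuned weights are stored (the exact Hub id for this checkpoint is not stated in the card):

```python
# Minimal summarization sketch. Loads the base checkpoint; swap in the
# fine-tuned model's local path or Hub id to reproduce the card's results.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "google/mt5-small"  # replace with the fine-tuned checkpoint path
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

text = "Your input document goes here."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
summary_ids = model.generate(**inputs, max_new_tokens=64, num_beams=4)
summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
print(summary)
```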
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 10
- eval_batch_size: 10
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 60
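The hyperparameters above map onto a standard `Seq2SeqTrainingArguments` configuration. A hedged reconstruction follows (a config fragment, not the exact training script; `output_dir` is a placeholder):

```python
# Reconstruction of the hyperparameters listed above; output_dir is a
# placeholder. Evaluation ran once per epoch, per the results table.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="mt5-small-finetuned-19jan-5",  # placeholder path
    learning_rate=2e-5,
    per_device_train_batch_size=10,
    per_device_eval_batch_size=10,
    seed=42,
    # Adam with betas=(0.9, 0.999) and epsilon=1e-08, as listed above
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=60,
    predict_with_generate=True,  # assumed, needed for ROUGE during eval
)
```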
### Training results
| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|
19.2402 | 1.0 | 60 | 8.2701 | 2.1815 | 0.1429 | 2.2246 | 2.2081 |
12.7954 | 2.0 | 120 | 5.3510 | 3.3524 | 0.3929 | 3.3843 | 3.3893 |
8.8288 | 3.0 | 180 | 3.5929 | 4.3158 | 0.4242 | 4.2947 | 4.2986 |
6.9994 | 4.0 | 240 | 3.2479 | 4.1515 | 0.5195 | 4.1991 | 4.1535 |
5.7594 | 5.0 | 300 | 3.0701 | 4.4127 | 0.4838 | 4.4044 | 4.4096 |
5.075 | 6.0 | 360 | 3.0252 | 5.6953 | 0.925 | 5.6925 | 5.6771 |
4.6336 | 7.0 | 420 | 2.9917 | 5.8009 | 1.1576 | 5.8699 | 5.871 |
4.3993 | 8.0 | 480 | 2.9676 | 5.8763 | 1.1953 | 5.9074 | 5.8808 |
4.1863 | 9.0 | 540 | 2.9213 | 6.2006 | 1.3455 | 6.2031 | 6.1713 |
4.0672 | 10.0 | 600 | 2.9115 | 5.3167 | 1.2394 | 5.3518 | 5.3606 |
3.9671 | 11.0 | 660 | 2.8743 | 5.2749 | 1.2394 | 5.3117 | 5.2936 |
3.86 | 12.0 | 720 | 2.8472 | 5.8311 | 1.1505 | 5.9026 | 5.8415 |
3.8103 | 13.0 | 780 | 2.8158 | 6.3536 | 1.1505 | 6.3989 | 6.3321 |
3.7412 | 14.0 | 840 | 2.7794 | 6.4438 | 1.1505 | 6.4702 | 6.4715 |
3.6757 | 15.0 | 900 | 2.7632 | 6.3778 | 0.9616 | 6.4342 | 6.417 |
3.643 | 16.0 | 960 | 2.7335 | 6.2346 | 0.9616 | 6.2724 | 6.2393 |
3.5952 | 17.0 | 1020 | 2.7152 | 5.9718 | 0.7727 | 6.0017 | 5.9683 |
3.585 | 18.0 | 1080 | 2.6998 | 8.8466 | 0.3333 | 8.7787 | 8.7648 |
3.493 | 19.0 | 1140 | 2.6982 | 8.1089 | 0.3333 | 7.95 | 7.9352 |
3.4807 | 20.0 | 1200 | 2.6911 | 7.9967 | 0.3333 | 7.8437 | 7.843 |
3.451 | 21.0 | 1260 | 2.6885 | 7.9967 | 0.3333 | 7.8437 | 7.843 |
3.4368 | 22.0 | 1320 | 2.6945 | 8.2061 | 0.3333 | 8.0333 | 8.0097 |
3.4044 | 23.0 | 1380 | 2.6909 | 8.6753 | 0.3333 | 8.5901 | 8.4835 |
3.3862 | 24.0 | 1440 | 2.6899 | 8.4263 | 0.3333 | 8.2222 | 8.1901 |
3.3421 | 25.0 | 1500 | 2.6897 | 8.2061 | 0.3333 | 8.0333 | 8.0097 |
3.3414 | 26.0 | 1560 | 2.6801 | 8.2061 | 0.3333 | 8.0333 | 8.0097 |
3.3354 | 27.0 | 1620 | 2.6772 | 8.2061 | 0.3333 | 8.0333 | 8.0097 |
3.299 | 28.0 | 1680 | 2.6780 | 8.2061 | 0.3333 | 8.0333 | 8.0097 |
3.3058 | 29.0 | 1740 | 2.6711 | 8.0944 | 0.3333 | 7.9019 | 7.8787 |
3.2678 | 30.0 | 1800 | 2.6693 | 8.0944 | 0.3333 | 7.9019 | 7.8787 |
3.2538 | 31.0 | 1860 | 2.6661 | 8.0944 | 0.3333 | 7.9019 | 7.8787 |
3.2361 | 32.0 | 1920 | 2.6687 | 8.0944 | 0.3333 | 7.9019 | 7.8787 |
3.2326 | 33.0 | 1980 | 2.6625 | 8.0944 | 0.3333 | 7.9019 | 7.8787 |
3.2142 | 34.0 | 2040 | 2.6648 | 8.0526 | 0.3333 | 7.9026 | 7.8801 |
3.1875 | 35.0 | 2100 | 2.6634 | 8.5204 | 0.3333 | 8.3199 | 8.3352 |
3.1717 | 36.0 | 2160 | 2.6611 | 8.5083 | 0.3333 | 8.3228 | 8.3359 |
3.1706 | 37.0 | 2220 | 2.6641 | 8.5083 | 0.3333 | 8.3228 | 8.3359 |
3.1541 | 38.0 | 2280 | 2.6573 | 8.5083 | 0.3333 | 8.3228 | 8.3359 |
3.1468 | 39.0 | 2340 | 2.6626 | 8.5083 | 0.3333 | 8.3228 | 8.3359 |
3.1376 | 40.0 | 2400 | 2.6602 | 8.5083 | 0.3333 | 8.3228 | 8.3359 |
3.1572 | 41.0 | 2460 | 2.6539 | 7.9385 | 0.3333 | 7.8019 | 7.8519 |
3.147 | 42.0 | 2520 | 2.6527 | 7.9385 | 0.3333 | 7.8019 | 7.8519 |
3.1199 | 43.0 | 2580 | 2.6487 | 7.9385 | 0.3333 | 7.8019 | 7.8519 |
3.1286 | 44.0 | 2640 | 2.6493 | 7.6385 | 0.3333 | 7.4817 | 7.4859 |
3.1042 | 45.0 | 2700 | 2.6519 | 8.1885 | 0.3333 | 7.9894 | 8.0292 |
3.099 | 46.0 | 2760 | 2.6525 | 8.1885 | 0.3333 | 7.9894 | 8.0292 |
3.1106 | 47.0 | 2820 | 2.6514 | 8.1885 | 0.3333 | 7.9894 | 8.0292 |
3.1036 | 48.0 | 2880 | 2.6501 | 7.6385 | 0.3333 | 7.4817 | 7.4859 |
3.0934 | 49.0 | 2940 | 2.6501 | 7.6385 | 0.3333 | 7.4817 | 7.4859 |
3.0822 | 50.0 | 3000 | 2.6435 | 7.6385 | 0.3333 | 7.4817 | 7.4859 |
3.0858 | 51.0 | 3060 | 2.6479 | 7.6385 | 0.3333 | 7.4817 | 7.4859 |
3.0825 | 52.0 | 3120 | 2.6455 | 7.6385 | 0.3333 | 7.4817 | 7.4859 |
3.063 | 53.0 | 3180 | 2.6437 | 7.6385 | 0.3333 | 7.4817 | 7.4859 |
3.0641 | 54.0 | 3240 | 2.6429 | 7.6385 | 0.3333 | 7.4817 | 7.4859 |
3.0703 | 55.0 | 3300 | 2.6430 | 7.6385 | 0.3333 | 7.4817 | 7.4859 |
3.0554 | 56.0 | 3360 | 2.6413 | 7.6385 | 0.3333 | 7.4817 | 7.4859 |
3.0498 | 57.0 | 3420 | 2.6415 | 7.6385 | 0.3333 | 7.4817 | 7.4859 |
3.0668 | 58.0 | 3480 | 2.6411 | 7.6385 | 0.3333 | 7.4817 | 7.4859 |
3.0657 | 59.0 | 3540 | 2.6409 | 7.6385 | 0.3333 | 7.4817 | 7.4859 |
3.0591 | 60.0 | 3600 | 2.6411 | 7.6385 | 0.3333 | 7.4817 | 7.4859 |
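The ROUGE columns above are F-measures reported on a 0–100 scale. As an illustration of what ROUGE-1 measures, here is a deliberately simplified single-reference sketch (unigram overlap F1, no stemming or the other normalizations the real `rouge_score` package applies):

```python
# Toy ROUGE-1 F1: unigram-overlap F-measure between a candidate summary and
# one reference. Simplified for illustration; not the exact metric used above.
from collections import Counter

def rouge1_f1(candidate: str, reference: str) -> float:
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

score = rouge1_f1("the cat sat on the mat", "the cat is on the mat")
print(round(100 * score, 2))  # scaled to 0-100 as in the table above
```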
### Framework versions
- Transformers 4.25.1
- Pytorch 1.13.1+cu116
- Datasets 2.8.0
- Tokenizers 0.13.2