t5-small-finetuned-xsum
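This model is a fine-tuned version of t5-small. The training dataset was not recorded (the Trainer logged its name as `None`). On the evaluation set, the model achieves the following results, which match the final epoch (150) in the training results table: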
- Loss: 0.5039
- Rouge1: 80.8002
- Rouge2: 77.6863
- Rougel: 80.6746
- Rougelsum: 80.7336
- Gen Len: 14.1393
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 150
- mixed_precision_training: Native AMP
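To make the schedule implied by these settings concrete, here is a minimal sketch that derives the total step count and the learning rate at a given step. It takes 113 steps per epoch from the training results table. It assumes a single device, no gradient accumulation, and zero warmup steps, none of which are stated in this card; `linear_lr` and `train_set_size_bounds` are hypothetical helper names used only for illustration.

```python
from typing import Tuple

# Values taken from the hyperparameter list and the results table.
LEARNING_RATE = 2e-05
TRAIN_BATCH_SIZE = 16
NUM_EPOCHS = 150
STEPS_PER_EPOCH = 113  # step count logged at epoch 1.0

# Assumption (not stated in the card): one device, no gradient
# accumulation, and zero warmup steps.
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS


def linear_lr(step: int) -> float:
    """Learning rate after `step` optimizer updates, decaying linearly to zero."""
    remaining = max(0, TOTAL_STEPS - step)
    return LEARNING_RATE * remaining / TOTAL_STEPS


def train_set_size_bounds() -> Tuple[int, int]:
    """Training-set sizes consistent with the logged steps per epoch,
    assuming the last partial batch is kept."""
    smallest = (STEPS_PER_EPOCH - 1) * TRAIN_BATCH_SIZE + 1
    largest = STEPS_PER_EPOCH * TRAIN_BATCH_SIZE
    return smallest, largest


if __name__ == "__main__":
    print("total steps:", TOTAL_STEPS)
    print("lr at midpoint:", linear_lr(TOTAL_STEPS // 2))
    print("plausible training-set size:", train_set_size_bounds())
```

Under these assumptions the total of 16950 steps agrees with the last row of the training results table, and the training set would hold between 1793 and 1808 examples.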
Training results
Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
---|---|---|---|---|---|---|---|---|
No log | 1.0 | 113 | 0.7526 | 80.8695 | 77.9379 | 80.7636 | 80.7858 | 15.6577 |
No log | 2.0 | 226 | 0.6720 | 81.3067 | 78.4208 | 81.2158 | 81.2418 | 15.5721 |
No log | 3.0 | 339 | 0.6472 | 81.5539 | 78.6973 | 81.4666 | 81.5086 | 15.4799 |
No log | 4.0 | 452 | 0.6217 | 81.4038 | 78.5499 | 81.3074 | 81.3618 | 15.5369 |
0.7253 | 5.0 | 565 | 0.5931 | 81.6448 | 78.8248 | 81.5691 | 81.5943 | 15.4698 |
0.7253 | 6.0 | 678 | 0.5696 | 81.7034 | 78.8675 | 81.6142 | 81.64 | 15.453 |
0.7253 | 7.0 | 791 | 0.5646 | 81.7875 | 78.9862 | 81.727 | 81.7454 | 15.4329 |
0.7253 | 8.0 | 904 | 0.5531 | 81.7309 | 78.905 | 81.6487 | 81.6755 | 15.4346 |
0.5664 | 9.0 | 1017 | 0.5409 | 81.5895 | 78.7604 | 81.4983 | 81.5348 | 15.4178 |
0.5664 | 10.0 | 1130 | 0.5467 | 81.5002 | 78.6385 | 81.3915 | 81.4247 | 15.448 |
0.5664 | 11.0 | 1243 | 0.5290 | 81.622 | 78.7775 | 81.5339 | 81.5417 | 15.4128 |
0.5664 | 12.0 | 1356 | 0.5247 | 81.5904 | 78.7444 | 81.4944 | 81.5451 | 15.3641 |
0.5664 | 13.0 | 1469 | 0.5164 | 81.6429 | 78.8116 | 81.5581 | 81.5926 | 15.3574 |
0.499 | 14.0 | 1582 | 0.5182 | 81.5478 | 78.6836 | 81.455 | 81.5058 | 15.3607 |
0.499 | 15.0 | 1695 | 0.5058 | 81.4959 | 78.6553 | 81.4129 | 81.4608 | 15.2433 |
0.499 | 16.0 | 1808 | 0.4985 | 81.7445 | 78.971 | 81.6587 | 81.7184 | 15.1913 |
0.499 | 17.0 | 1921 | 0.4953 | 81.6383 | 78.8217 | 81.5419 | 81.6169 | 15.2131 |
0.456 | 18.0 | 2034 | 0.4914 | 81.6345 | 78.8308 | 81.506 | 81.5956 | 15.1695 |
0.456 | 19.0 | 2147 | 0.4918 | 81.4352 | 78.6371 | 81.319 | 81.4134 | 15.0923 |
0.456 | 20.0 | 2260 | 0.4950 | 81.4993 | 78.6787 | 81.401 | 81.466 | 15.1661 |
0.456 | 21.0 | 2373 | 0.4839 | 81.7538 | 78.9501 | 81.655 | 81.7017 | 15.0453 |
0.456 | 22.0 | 2486 | 0.4917 | 81.445 | 78.6264 | 81.3378 | 81.4072 | 15.1376 |
0.4233 | 23.0 | 2599 | 0.4790 | 81.6857 | 78.8888 | 81.5719 | 81.6151 | 15.0084 |
0.4233 | 24.0 | 2712 | 0.4801 | 81.5146 | 78.7124 | 81.3956 | 81.4622 | 15.047 |
0.4233 | 25.0 | 2825 | 0.4833 | 81.4409 | 78.6415 | 81.3185 | 81.4137 | 15.0587 |
0.4233 | 26.0 | 2938 | 0.4846 | 81.4886 | 78.6912 | 81.3758 | 81.4498 | 15.0487 |
0.396 | 27.0 | 3051 | 0.4782 | 81.4474 | 78.6254 | 81.3355 | 81.4372 | 14.8674 |
0.396 | 28.0 | 3164 | 0.4767 | 81.5577 | 78.7259 | 81.4474 | 81.4906 | 14.7919 |
0.396 | 29.0 | 3277 | 0.4745 | 81.648 | 78.7893 | 81.5163 | 81.5654 | 14.9262 |
0.396 | 30.0 | 3390 | 0.4679 | 81.4867 | 78.6765 | 81.3659 | 81.4438 | 14.7399 |
0.3738 | 31.0 | 3503 | 0.4722 | 81.8409 | 79.0512 | 81.7436 | 81.7752 | 15.0 |
0.3738 | 32.0 | 3616 | 0.4698 | 81.6546 | 78.867 | 81.5383 | 81.6146 | 14.9279 |
0.3738 | 33.0 | 3729 | 0.4629 | 81.5223 | 78.6672 | 81.3989 | 81.4953 | 14.6342 |
0.3738 | 34.0 | 3842 | 0.4689 | 81.3717 | 78.4879 | 81.2359 | 81.3315 | 14.7534 |
0.3738 | 35.0 | 3955 | 0.4714 | 81.4911 | 78.6664 | 81.3923 | 81.4424 | 14.8842 |
0.3585 | 36.0 | 4068 | 0.4615 | 81.2428 | 78.3459 | 81.1102 | 81.1907 | 14.4581 |
0.3585 | 37.0 | 4181 | 0.4709 | 81.4039 | 78.4899 | 81.2677 | 81.3251 | 14.7299 |
0.3585 | 38.0 | 4294 | 0.4723 | 81.4856 | 78.6036 | 81.3627 | 81.4373 | 14.8154 |
0.3585 | 39.0 | 4407 | 0.4704 | 81.3924 | 78.5321 | 81.3295 | 81.3834 | 14.6158 |
0.3377 | 40.0 | 4520 | 0.4649 | 81.4455 | 78.5817 | 81.3635 | 81.4353 | 14.5906 |
0.3377 | 41.0 | 4633 | 0.4646 | 81.2935 | 78.4222 | 81.1972 | 81.2504 | 14.4983 |
0.3377 | 42.0 | 4746 | 0.4692 | 81.2082 | 78.3097 | 81.1218 | 81.18 | 14.5872 |
0.3377 | 43.0 | 4859 | 0.4673 | 81.3108 | 78.426 | 81.2192 | 81.2795 | 14.5822 |
0.3377 | 44.0 | 4972 | 0.4763 | 81.1511 | 78.272 | 81.0806 | 81.1195 | 14.6174 |
0.327 | 45.0 | 5085 | 0.4703 | 81.1883 | 78.2662 | 81.0751 | 81.1556 | 14.5017 |
0.327 | 46.0 | 5198 | 0.4701 | 81.3512 | 78.497 | 81.2556 | 81.2958 | 14.5638 |
0.327 | 47.0 | 5311 | 0.4675 | 81.3069 | 78.4365 | 81.2282 | 81.2661 | 14.5872 |
0.327 | 48.0 | 5424 | 0.4713 | 81.3203 | 78.4676 | 81.2704 | 81.2963 | 14.6074 |
0.3132 | 49.0 | 5537 | 0.4685 | 81.4914 | 78.6478 | 81.4025 | 81.4461 | 14.5319 |
0.3132 | 50.0 | 5650 | 0.4704 | 81.4722 | 78.6349 | 81.401 | 81.4166 | 14.5185 |
0.3132 | 51.0 | 5763 | 0.4746 | 81.5848 | 78.7389 | 81.4983 | 81.5228 | 14.5487 |
0.3132 | 52.0 | 5876 | 0.4736 | 81.6095 | 78.7901 | 81.5205 | 81.5296 | 14.5101 |
0.3132 | 53.0 | 5989 | 0.4738 | 81.4913 | 78.6738 | 81.3957 | 81.4394 | 14.5956 |
0.3043 | 54.0 | 6102 | 0.4733 | 81.3817 | 78.4843 | 81.2916 | 81.3222 | 14.6795 |
0.3043 | 55.0 | 6215 | 0.4747 | 81.4791 | 78.5596 | 81.3719 | 81.391 | 14.6023 |
0.3043 | 56.0 | 6328 | 0.4755 | 81.3834 | 78.4865 | 81.3217 | 81.334 | 14.5537 |
0.3043 | 57.0 | 6441 | 0.4687 | 81.0989 | 78.1685 | 81.0179 | 81.0713 | 14.3742 |
0.3 | 58.0 | 6554 | 0.4746 | 81.5527 | 78.6533 | 81.4764 | 81.5086 | 14.4933 |
0.3 | 59.0 | 6667 | 0.4717 | 81.4294 | 78.5367 | 81.3555 | 81.393 | 14.3842 |
0.3 | 60.0 | 6780 | 0.4797 | 81.2638 | 78.347 | 81.1945 | 81.2258 | 14.5906 |
0.3 | 61.0 | 6893 | 0.4740 | 81.2198 | 78.3272 | 81.1701 | 81.1926 | 14.5151 |
0.2835 | 62.0 | 7006 | 0.4742 | 81.1578 | 78.2664 | 81.0988 | 81.1239 | 14.5218 |
0.2835 | 63.0 | 7119 | 0.4766 | 81.0438 | 78.1511 | 80.9756 | 81.0192 | 14.4312 |
0.2835 | 64.0 | 7232 | 0.4716 | 80.967 | 78.013 | 80.8988 | 80.9291 | 14.2936 |
0.2835 | 65.0 | 7345 | 0.4729 | 81.0833 | 78.16 | 81.0197 | 81.0557 | 14.3221 |
0.2835 | 66.0 | 7458 | 0.4788 | 81.1509 | 78.2452 | 81.0636 | 81.1082 | 14.4396 |
0.2753 | 67.0 | 7571 | 0.4802 | 81.2766 | 78.3662 | 81.1998 | 81.2395 | 14.4362 |
0.2753 | 68.0 | 7684 | 0.4764 | 80.986 | 78.0241 | 80.907 | 80.9182 | 14.4295 |
0.2753 | 69.0 | 7797 | 0.4783 | 81.2485 | 78.3734 | 81.1851 | 81.232 | 14.4178 |
0.2753 | 70.0 | 7910 | 0.4794 | 81.1969 | 78.2884 | 81.1091 | 81.1547 | 14.3943 |
0.2671 | 71.0 | 8023 | 0.4746 | 81.1233 | 78.1151 | 81.0474 | 81.1019 | 14.307 |
0.2671 | 72.0 | 8136 | 0.4750 | 81.0761 | 78.1707 | 80.99 | 81.0476 | 14.3758 |
0.2671 | 73.0 | 8249 | 0.4769 | 81.1251 | 78.1659 | 81.0612 | 81.1 | 14.3876 |
0.2671 | 74.0 | 8362 | 0.4809 | 81.1335 | 78.2075 | 81.0582 | 81.1128 | 14.3909 |
0.2671 | 75.0 | 8475 | 0.4819 | 81.1342 | 78.2138 | 81.0627 | 81.0962 | 14.4866 |
0.2611 | 76.0 | 8588 | 0.4779 | 81.1457 | 78.2469 | 81.0921 | 81.1249 | 14.4228 |
0.2611 | 77.0 | 8701 | 0.4849 | 81.3577 | 78.4481 | 81.2961 | 81.3106 | 14.4916 |
0.2611 | 78.0 | 8814 | 0.4854 | 81.2655 | 78.3525 | 81.2021 | 81.2204 | 14.4379 |
0.2611 | 79.0 | 8927 | 0.4779 | 81.209 | 78.2182 | 81.1394 | 81.2018 | 14.3104 |
0.2582 | 80.0 | 9040 | 0.4792 | 81.1094 | 78.109 | 81.0498 | 81.0597 | 14.443 |
0.2582 | 81.0 | 9153 | 0.4798 | 81.2118 | 78.2388 | 81.158 | 81.1979 | 14.2836 |
0.2582 | 82.0 | 9266 | 0.4785 | 81.1646 | 78.1741 | 81.0892 | 81.1318 | 14.2819 |
0.2582 | 83.0 | 9379 | 0.4894 | 81.0246 | 78.0386 | 80.9799 | 81.0082 | 14.5352 |
0.2582 | 84.0 | 9492 | 0.4852 | 80.8181 | 77.748 | 80.7271 | 80.7857 | 14.307 |
0.2502 | 85.0 | 9605 | 0.4853 | 80.8607 | 77.8081 | 80.7819 | 80.826 | 14.2819 |
0.2502 | 86.0 | 9718 | 0.4821 | 81.0753 | 78.0443 | 80.977 | 81.0585 | 14.1628 |
0.2502 | 87.0 | 9831 | 0.4836 | 81.0502 | 78.0125 | 80.9437 | 81.0227 | 14.1829 |
0.2502 | 88.0 | 9944 | 0.4867 | 81.0832 | 78.0581 | 80.9885 | 81.0629 | 14.2232 |
0.2474 | 89.0 | 10057 | 0.4885 | 80.8361 | 77.8075 | 80.7644 | 80.8113 | 14.245 |
0.2474 | 90.0 | 10170 | 0.4926 | 80.9648 | 77.9656 | 80.8858 | 80.9429 | 14.3758 |
0.2474 | 91.0 | 10283 | 0.4862 | 81.0499 | 77.9909 | 80.9529 | 81.0251 | 14.1594 |
0.2474 | 92.0 | 10396 | 0.4885 | 80.9683 | 77.9313 | 80.8599 | 80.9175 | 14.2735 |
0.2381 | 93.0 | 10509 | 0.4858 | 80.964 | 77.9419 | 80.8997 | 80.9494 | 14.3272 |
0.2381 | 94.0 | 10622 | 0.4891 | 80.7479 | 77.6563 | 80.6516 | 80.7074 | 14.3993 |
0.2381 | 95.0 | 10735 | 0.4909 | 80.7166 | 77.5902 | 80.5996 | 80.6605 | 14.2651 |
0.2381 | 96.0 | 10848 | 0.4845 | 81.174 | 78.0743 | 81.0369 | 81.0988 | 14.099 |
0.2381 | 97.0 | 10961 | 0.4886 | 80.8086 | 77.7071 | 80.7341 | 80.7905 | 14.2131 |
0.2387 | 98.0 | 11074 | 0.4932 | 80.8045 | 77.7326 | 80.7178 | 80.767 | 14.4245 |
0.2387 | 99.0 | 11187 | 0.4890 | 80.8664 | 77.7815 | 80.7817 | 80.8381 | 14.2886 |
0.2387 | 100.0 | 11300 | 0.4843 | 80.742 | 77.6378 | 80.636 | 80.6881 | 13.9463 |
0.2387 | 101.0 | 11413 | 0.4900 | 80.5951 | 77.4218 | 80.4692 | 80.5166 | 14.2131 |
0.2343 | 102.0 | 11526 | 0.4933 | 80.9176 | 77.8813 | 80.7994 | 80.8637 | 14.2047 |
0.2343 | 103.0 | 11639 | 0.4936 | 80.8901 | 77.8476 | 80.7945 | 80.8447 | 14.1812 |
0.2343 | 104.0 | 11752 | 0.4955 | 80.9422 | 77.9055 | 80.8414 | 80.8666 | 14.2634 |
0.2343 | 105.0 | 11865 | 0.4971 | 80.9296 | 77.8819 | 80.8107 | 80.8588 | 14.2617 |
0.2343 | 106.0 | 11978 | 0.4968 | 80.8841 | 77.8315 | 80.7829 | 80.8081 | 14.2735 |
0.2309 | 107.0 | 12091 | 0.4949 | 80.8845 | 77.831 | 80.766 | 80.8197 | 14.1862 |
0.2309 | 108.0 | 12204 | 0.4979 | 80.817 | 77.77 | 80.7009 | 80.7623 | 14.3104 |
0.2309 | 109.0 | 12317 | 0.4970 | 80.9426 | 77.884 | 80.8245 | 80.8849 | 14.1862 |
0.2309 | 110.0 | 12430 | 0.4951 | 80.8217 | 77.7436 | 80.7137 | 80.7638 | 14.2181 |
0.225 | 111.0 | 12543 | 0.4971 | 80.9686 | 77.908 | 80.8479 | 80.8945 | 14.1846 |
0.225 | 112.0 | 12656 | 0.4998 | 80.7005 | 77.6143 | 80.5708 | 80.6458 | 14.1711 |
0.225 | 113.0 | 12769 | 0.5010 | 80.6586 | 77.5568 | 80.5494 | 80.6194 | 14.0872 |
0.225 | 114.0 | 12882 | 0.5020 | 80.5762 | 77.4938 | 80.4571 | 80.5458 | 14.2013 |
0.225 | 115.0 | 12995 | 0.4950 | 80.7756 | 77.6717 | 80.6326 | 80.7221 | 14.0671 |
0.2248 | 116.0 | 13108 | 0.4974 | 80.6474 | 77.5174 | 80.5385 | 80.5867 | 14.1527 |
0.2248 | 117.0 | 13221 | 0.5002 | 80.6698 | 77.5889 | 80.5657 | 80.6265 | 14.2567 |
0.2248 | 118.0 | 13334 | 0.4998 | 80.5252 | 77.4293 | 80.4052 | 80.4697 | 14.2466 |
0.2248 | 119.0 | 13447 | 0.4994 | 80.4945 | 77.381 | 80.4047 | 80.4778 | 14.1527 |
0.2244 | 120.0 | 13560 | 0.5007 | 80.687 | 77.5655 | 80.5506 | 80.6345 | 14.0889 |
0.2244 | 121.0 | 13673 | 0.5003 | 80.6725 | 77.5569 | 80.5444 | 80.6284 | 14.0889 |
0.2244 | 122.0 | 13786 | 0.5012 | 80.5519 | 77.4681 | 80.4309 | 80.5167 | 14.2064 |
0.2244 | 123.0 | 13899 | 0.4984 | 80.5439 | 77.4156 | 80.4356 | 80.5053 | 14.1091 |
0.2142 | 124.0 | 14012 | 0.5003 | 80.506 | 77.3663 | 80.3909 | 80.4609 | 14.1258 |
0.2142 | 125.0 | 14125 | 0.5012 | 80.5929 | 77.47 | 80.4856 | 80.5471 | 14.1057 |
0.2142 | 126.0 | 14238 | 0.5015 | 80.8261 | 77.7005 | 80.7021 | 80.7576 | 14.1846 |
0.2142 | 127.0 | 14351 | 0.4989 | 80.7286 | 77.6025 | 80.6101 | 80.6584 | 14.1309 |
0.2142 | 128.0 | 14464 | 0.4995 | 80.7703 | 77.6574 | 80.6209 | 80.7013 | 14.2047 |
0.2239 | 129.0 | 14577 | 0.5030 | 80.7682 | 77.6334 | 80.6326 | 80.7075 | 14.2685 |
0.2239 | 130.0 | 14690 | 0.5017 | 80.8383 | 77.7289 | 80.7019 | 80.7832 | 14.1997 |
0.2239 | 131.0 | 14803 | 0.5007 | 80.8551 | 77.7449 | 80.7238 | 80.7909 | 14.1359 |
0.2239 | 132.0 | 14916 | 0.5004 | 80.9042 | 77.8065 | 80.762 | 80.8351 | 14.1191 |
0.2139 | 133.0 | 15029 | 0.5001 | 80.9345 | 77.8301 | 80.784 | 80.8658 | 14.1107 |
0.2139 | 134.0 | 15142 | 0.5023 | 80.8865 | 77.7869 | 80.7474 | 80.8159 | 14.1174 |
0.2139 | 135.0 | 15255 | 0.5036 | 80.8305 | 77.7027 | 80.709 | 80.7658 | 14.1812 |
0.2139 | 136.0 | 15368 | 0.5027 | 80.8305 | 77.7027 | 80.709 | 80.7658 | 14.1812 |
0.2139 | 137.0 | 15481 | 0.5029 | 80.801 | 77.6711 | 80.6703 | 80.7443 | 14.1862 |
0.2152 | 138.0 | 15594 | 0.5022 | 80.9639 | 77.8473 | 80.8252 | 80.8959 | 14.1544 |
0.2152 | 139.0 | 15707 | 0.5033 | 80.9216 | 77.8092 | 80.7903 | 80.8622 | 14.1594 |
0.2152 | 140.0 | 15820 | 0.5034 | 80.8977 | 77.7878 | 80.7642 | 80.8317 | 14.1628 |
0.2152 | 141.0 | 15933 | 0.5030 | 80.8977 | 77.7878 | 80.7642 | 80.8317 | 14.1628 |
0.2187 | 142.0 | 16046 | 0.5024 | 80.6562 | 77.5245 | 80.5202 | 80.6065 | 14.0923 |
0.2187 | 143.0 | 16159 | 0.5031 | 80.8168 | 77.7025 | 80.686 | 80.7553 | 14.1326 |
0.2187 | 144.0 | 16272 | 0.5031 | 80.6336 | 77.4867 | 80.4892 | 80.5611 | 14.1023 |
0.2187 | 145.0 | 16385 | 0.5038 | 80.8002 | 77.6863 | 80.6746 | 80.7336 | 14.1393 |
0.2187 | 146.0 | 16498 | 0.5040 | 80.8977 | 77.7878 | 80.7642 | 80.8317 | 14.1628 |
0.2166 | 147.0 | 16611 | 0.5039 | 80.8977 | 77.7878 | 80.7642 | 80.8317 | 14.1628 |
0.2166 | 148.0 | 16724 | 0.5038 | 80.8002 | 77.6863 | 80.6746 | 80.7336 | 14.1393 |
0.2166 | 149.0 | 16837 | 0.5039 | 80.8002 | 77.6863 | 80.6746 | 80.7336 | 14.1393 |
0.2166 | 150.0 | 16950 | 0.5039 | 80.8002 | 77.6863 | 80.6746 | 80.7336 | 14.1393 |
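Validation loss and ROUGE do not peak at the final epoch in the table above, so it can help to pick a checkpoint programmatically. The sketch below does this over a handful of rows copied from the table; `ROWS` and `best_by` are hypothetical names for illustration, and this is not necessarily how the published weights were selected.

```python
from typing import Dict, List

# A few rows copied from the results table above.
ROWS: List[Dict[str, float]] = [
    {"epoch": 1, "val_loss": 0.7526, "rouge1": 80.8695},
    {"epoch": 31, "val_loss": 0.4722, "rouge1": 81.8409},
    {"epoch": 36, "val_loss": 0.4615, "rouge1": 81.2428},
    {"epoch": 100, "val_loss": 0.4843, "rouge1": 80.742},
    {"epoch": 150, "val_loss": 0.5039, "rouge1": 80.8002},
]


def best_by(rows: List[Dict[str, float]], key: str, higher_is_better: bool) -> Dict[str, float]:
    """Return the row with the best value for `key`."""
    chooser = max if higher_is_better else min
    return chooser(rows, key=lambda row: row[key])


if __name__ == "__main__":
    print("lowest validation loss:", best_by(ROWS, "val_loss", higher_is_better=False))
    print("highest Rouge1:", best_by(ROWS, "rouge1", higher_is_better=True))
```

Across the full table, the lowest validation loss (0.4615) occurs at epoch 36 and the highest Rouge1 (81.8409) at epoch 31, while training loss keeps falling through epoch 150. That pattern is consistent with overfitting in the later epochs.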
Framework versions
- Transformers 4.26.0
- Pytorch 1.13.1+cu116
- Datasets 2.9.0
- Tokenizers 0.13.2