

mt5-large-gecfirst-e8-b16

This model is a fine-tuned version of google/mt5-large. The training dataset is not specified in this card. Per-checkpoint results on the evaluation set are listed in the table under Training results.

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

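The hyperparameter list is not included in this card. The results table below shows that training ran for about 8 epochs (2368 steps, roughly 296 steps per epoch), with evaluation every 74 steps.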

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|---|---|---|---|---|---|---|---|---|
| 1.8204 | 0.25 | 74 | 0.4021 | 61.4087 | 52.3887 | 61.2674 | 61.3674 | 18.7804 |
| 0.7246 | 0.5 | 148 | 0.3252 | 63.347 | 55.3862 | 63.1874 | 63.2961 | 18.7652 |
| 0.6142 | 0.75 | 222 | 0.3028 | 63.725 | 56.2856 | 63.5597 | 63.6491 | 18.7838 |
| 0.5472 | 1.0 | 296 | 0.2919 | 63.8647 | 56.6097 | 63.7525 | 63.8544 | 18.7973 |
| 0.3687 | 1.25 | 370 | 0.2777 | 64.0686 | 56.686 | 63.883 | 63.9804 | 18.7703 |
| 0.3907 | 1.49 | 444 | 0.2870 | 64.0517 | 56.6668 | 63.9062 | 64.0017 | 18.7838 |
| 0.3466 | 1.74 | 518 | 0.2726 | 64.2559 | 57.4463 | 64.1045 | 64.2199 | 18.7770 |
| 0.3341 | 1.99 | 592 | 0.2672 | 64.1391 | 56.9117 | 64.0719 | 64.1665 | 18.7753 |
| 0.2036 | 2.24 | 666 | 0.2834 | 64.5476 | 57.8246 | 64.3771 | 64.5255 | 18.7804 |
| 0.2091 | 2.49 | 740 | 0.2897 | 64.1422 | 56.9715 | 64.0481 | 64.1689 | 18.7432 |
| 0.2002 | 2.74 | 814 | 0.2703 | 64.6648 | 57.707 | 64.4805 | 64.5948 | 18.7804 |
| 0.204 | 2.99 | 888 | 0.2824 | 64.0966 | 56.9705 | 63.9888 | 64.073 | 18.7551 |
| 0.1185 | 3.24 | 962 | 0.3022 | 64.4346 | 57.6011 | 64.3542 | 64.4615 | 18.7939 |
| 0.117 | 3.49 | 1036 | 0.2870 | 64.455 | 57.3607 | 64.2925 | 64.3963 | 18.7669 |
| 0.1135 | 3.74 | 1110 | 0.2890 | 64.7671 | 58.0409 | 64.5938 | 64.6987 | 18.7669 |
| 0.1175 | 3.99 | 1184 | 0.2977 | 64.8082 | 58.0379 | 64.6993 | 64.7849 | 18.7652 |
| 0.0726 | 4.24 | 1258 | 0.3135 | 64.5297 | 57.6752 | 64.4134 | 64.5109 | 18.7736 |
| 0.0654 | 4.48 | 1332 | 0.3298 | 64.5051 | 57.6982 | 64.3561 | 64.4885 | 18.7787 |
| 0.0719 | 4.73 | 1406 | 0.3139 | 64.8793 | 58.1936 | 64.749 | 64.8532 | 18.7720 |
| 0.0665 | 4.98 | 1480 | 0.3174 | 64.9015 | 58.1975 | 64.786 | 64.907 | 18.7703 |
| 0.0452 | 5.23 | 1554 | 0.3272 | 64.5715 | 58.067 | 64.4336 | 64.5425 | 18.7889 |
| 0.0395 | 5.48 | 1628 | 0.3337 | 64.7712 | 58.1058 | 64.6351 | 64.7423 | 18.7703 |
| 0.0367 | 5.73 | 1702 | 0.3422 | 64.9298 | 58.4592 | 64.8188 | 64.8927 | 18.7787 |
| 0.0393 | 5.98 | 1776 | 0.3394 | 64.8953 | 58.162 | 64.7892 | 64.8822 | 18.7787 |
| 0.0247 | 6.23 | 1850 | 0.3532 | 64.9207 | 58.2827 | 64.8053 | 64.8903 | 18.7872 |
| 0.0222 | 6.48 | 1924 | 0.3543 | 64.902 | 58.3086 | 64.793 | 64.8973 | 18.7736 |
| 0.0203 | 6.73 | 1998 | 0.3628 | 65.1022 | 58.7138 | 64.9734 | 65.0891 | 18.7720 |
| 0.0218 | 6.98 | 2072 | 0.3599 | 64.9409 | 58.387 | 64.7925 | 64.9157 | 18.7720 |
| 0.0156 | 7.23 | 2146 | 0.3802 | 65.1242 | 58.8116 | 64.9962 | 65.1097 | 18.7736 |
| 0.013 | 7.47 | 2220 | 0.3845 | 64.9358 | 58.4528 | 64.8099 | 64.925 | 18.7703 |
| 0.0114 | 7.72 | 2294 | 0.3913 | 64.9827 | 58.6449 | 64.863 | 64.9661 | 18.7720 |
| 0.0125 | 7.97 | 2368 | 0.3886 | 65.0031 | 58.5507 | 64.8805 | 64.9845 | 18.7720 |
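
The results above show validation loss and Rouge1 disagreeing about which checkpoint is best: validation loss bottoms out early while Rouge1 keeps creeping up as training loss approaches zero. The following is a minimal sketch of how a reader might compare the two criteria, using a hand-copied subset of rows from the table. The `Checkpoint` type and its field names are illustrative assumptions, not part of the Trainer output or of this model's training code.

```python
from collections import namedtuple

# Illustrative record type (an assumption, not a Trainer API).
Checkpoint = namedtuple(
    "Checkpoint", ["step", "epoch", "train_loss", "val_loss", "rouge1"]
)

# Subset of rows copied by hand from the training results table.
ROWS = [
    Checkpoint(74, 0.25, 1.8204, 0.4021, 61.4087),
    Checkpoint(296, 1.0, 0.5472, 0.2919, 63.8647),
    Checkpoint(592, 1.99, 0.3341, 0.2672, 64.1391),
    Checkpoint(888, 2.99, 0.204, 0.2824, 64.0966),
    Checkpoint(1184, 3.99, 0.1175, 0.2977, 64.8082),
    Checkpoint(1480, 4.98, 0.0665, 0.3174, 64.9015),
    Checkpoint(1776, 5.98, 0.0393, 0.3394, 64.8953),
    Checkpoint(2146, 7.23, 0.0156, 0.3802, 65.1242),
    Checkpoint(2368, 7.97, 0.0125, 0.3886, 65.0031),
]

# Criterion 1: lowest validation loss.
lowest_loss = min(ROWS, key=lambda r: r.val_loss)

# Criterion 2: highest Rouge1.
highest_rouge1 = max(ROWS, key=lambda r: r.rouge1)

# Gap between validation and training loss at the final checkpoint,
# a rough indicator of how far the model has fit the training set.
final = ROWS[-1]
final_gap = round(final.val_loss - final.train_loss, 4)

if __name__ == "__main__":
    print("lowest validation loss:", lowest_loss.step, lowest_loss.val_loss)
    print("highest Rouge1:", highest_rouge1.step, highest_rouge1.rouge1)
    print("final val/train loss gap:", final_gap)
```

On these rows the two criteria select different checkpoints (step 592 by validation loss, step 2146 by Rouge1), so which one to keep depends on whether the downstream use cares more about likelihood or about overlap with the reference corrections.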

Framework versions