mt5-base-gecfirst-e8-b16

This model is a fine-tuned version of google/mt5-base on an unspecified dataset. Evaluation-set metrics for each logged checkpoint are listed under Training results.

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The hyperparameters used during training are not listed in this card. The training log was evaluated every 74 steps and ends at step 2368 (epoch 7.97).
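Part of the schedule can be recovered from the Epoch and Step columns logged under Training results. The sketch below is a rough illustration, not the author's procedure: it assumes the logged epoch is simply `step / steps_per_epoch` rounded to two decimals, and it searches for every integer `steps_per_epoch` consistent with a handful of rows. The names `PAIRS` and `consistent_steps_per_epoch` are hypothetical helpers introduced here.

```python
# (epoch, step) pairs copied from the training log; a subset is enough.
PAIRS = [
    (0.25, 74),
    (1.0, 296),
    (1.49, 444),
    (1.99, 592),
    (4.48, 1332),
    (7.97, 2368),
]


def consistent_steps_per_epoch(pairs, lo=1, hi=1000):
    """Return every integer n in [lo, hi] for which round(step / n, 2)
    reproduces each logged epoch value.

    Assumption: the logger reports epoch = step / n rounded to 2 decimals.
    """
    return [
        n
        for n in range(lo, hi + 1)
        if all(round(step / n, 2) == epoch for epoch, step in pairs)
    ]


candidates = consistent_steps_per_epoch(PAIRS)
print(candidates)  # prints [297]
```

Under that assumption the log is consistent with 297 optimizer steps per epoch, so the final step 2368 falls just short of 8 full epochs. Converting this to a dataset size would also need the batch size and any gradient accumulation setting, neither of which is stated in this card.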

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|---------------|-------|------|-----------------|---------|---------|---------|-----------|---------|
| 3.409         | 0.25  | 74   | 0.6899          | 58.0459 | 46.7233 | 57.9944 | 57.9576   | 18.7669 |
| 1.0497        | 0.5   | 148  | 0.4335          | 61.3353 | 51.8804 | 61.174  | 61.1541   | 18.7703 |
| 0.8355        | 0.75  | 222  | 0.3734          | 62.5279 | 54.5952 | 62.4436 | 62.4377   | 18.7720 |
| 0.7339        | 1.0   | 296  | 0.3814          | 62.8071 | 54.8468 | 62.7075 | 62.6933   | 18.7770 |
| 0.5946        | 1.25  | 370  | 0.3418          | 63.1523 | 55.3752 | 62.9987 | 62.9879   | 18.7770 |
| 0.5746        | 1.49  | 444  | 0.3234          | 62.9253 | 55.1955 | 62.821  | 62.7592   | 18.7905 |
| 0.5278        | 1.74  | 518  | 0.3252          | 63.3056 | 55.6505 | 63.1271 | 63.0661   | 18.7804 |
| 0.4886        | 1.99  | 592  | 0.3265          | 63.1652 | 55.0909 | 62.979  | 62.9613   | 18.7753 |
| 0.366         | 2.24  | 666  | 0.3126          | 63.8131 | 56.5685 | 63.7303 | 63.6682   | 18.7703 |
| 0.3553        | 2.49  | 740  | 0.3192          | 63.6195 | 55.9276 | 63.4796 | 63.4692   | 18.7703 |
| 0.3558        | 2.74  | 814  | 0.3009          | 63.8499 | 56.2662 | 63.73   | 63.6591   | 18.7736 |
| 0.353         | 2.99  | 888  | 0.3014          | 63.7417 | 56.241  | 63.6192 | 63.5985   | 18.7686 |
| 0.2398        | 3.24  | 962  | 0.3119          | 63.999  | 56.8854 | 63.88   | 63.8705   | 18.7804 |
| 0.2459        | 3.49  | 1036 | 0.3222          | 64.0299 | 56.5581 | 63.9247 | 63.8934   | 18.7686 |
| 0.2423        | 3.74  | 1110 | 0.3125          | 63.6601 | 56.1864 | 63.4956 | 63.4819   | 18.7686 |
| 0.243         | 3.99  | 1184 | 0.3174          | 63.6676 | 56.1724 | 63.5183 | 63.4947   | 18.7736 |
| 0.1696        | 4.24  | 1258 | 0.3353          | 63.9905 | 56.3781 | 63.7979 | 63.7802   | 18.7652 |
| 0.1643        | 4.48  | 1332 | 0.3386          | 64.0219 | 56.7311 | 63.8823 | 63.8654   | 18.7703 |
| 0.1728        | 4.73  | 1406 | 0.3306          | 64.0261 | 56.7331 | 63.8978 | 63.8731   | 18.7720 |
| 0.1657        | 4.98  | 1480 | 0.3269          | 63.9735 | 56.4556 | 63.8514 | 63.8168   | 18.7703 |
| 0.1186        | 5.23  | 1554 | 0.3390          | 63.9831 | 56.6624 | 63.8953 | 63.8717   | 18.7703 |
| 0.1129        | 5.48  | 1628 | 0.3521          | 63.8674 | 56.528  | 63.7626 | 63.7362   | 18.7770 |
| 0.1061        | 5.73  | 1702 | 0.3539          | 63.9886 | 56.5753 | 63.881  | 63.8615   | 18.7703 |
| 0.1179        | 5.98  | 1776 | 0.3490          | 63.9949 | 56.7369 | 63.8929 | 63.8516   | 18.7736 |
| 0.0793        | 6.23  | 1850 | 0.3704          | 64.1527 | 57.0111 | 64.0496 | 63.9953   | 18.7686 |
| 0.0779        | 6.48  | 1924 | 0.3723          | 64.1833 | 57.0654 | 64.0686 | 64.0317   | 18.7669 |
| 0.0827        | 6.73  | 1998 | 0.3663          | 64.2185 | 56.9382 | 64.1096 | 64.0743   | 18.7736 |
| 0.0807        | 6.98  | 2072 | 0.3691          | 64.2298 | 56.9752 | 64.0957 | 64.0777   | 18.7686 |
| 0.0633        | 7.23  | 2146 | 0.3865          | 64.4729 | 57.5503 | 64.3733 | 64.3509   | 18.7652 |
| 0.0603        | 7.47  | 2220 | 0.3919          | 64.3001 | 57.2684 | 64.1693 | 64.1391   | 18.7635 |
| 0.0565        | 7.72  | 2294 | 0.3946          | 64.4077 | 57.3413 | 64.2825 | 64.2491   | 18.7635 |
| 0.0583        | 7.97  | 2368 | 0.3923          | 64.4078 | 57.3672 | 64.2775 | 64.2367   | 18.7652 |
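
The log above can be scanned for the checkpoint that scores best on a given metric. The following is a minimal sketch, with the Step, Validation Loss, and Rouge1 columns copied by hand into a list; `LOG` and `best_checkpoints` are hypothetical names introduced for illustration, and nothing here is claimed to be how the released weights were chosen.

```python
# (step, validation_loss, rouge1) copied from the training results log.
LOG = [
    (74, 0.6899, 58.0459), (148, 0.4335, 61.3353), (222, 0.3734, 62.5279),
    (296, 0.3814, 62.8071), (370, 0.3418, 63.1523), (444, 0.3234, 62.9253),
    (518, 0.3252, 63.3056), (592, 0.3265, 63.1652), (666, 0.3126, 63.8131),
    (740, 0.3192, 63.6195), (814, 0.3009, 63.8499), (888, 0.3014, 63.7417),
    (962, 0.3119, 63.999), (1036, 0.3222, 64.0299), (1110, 0.3125, 63.6601),
    (1184, 0.3174, 63.6676), (1258, 0.3353, 63.9905), (1332, 0.3386, 64.0219),
    (1406, 0.3306, 64.0261), (1480, 0.3269, 63.9735), (1554, 0.3390, 63.9831),
    (1628, 0.3521, 63.8674), (1702, 0.3539, 63.9886), (1776, 0.3490, 63.9949),
    (1850, 0.3704, 64.1527), (1924, 0.3723, 64.1833), (1998, 0.3663, 64.2185),
    (2072, 0.3691, 64.2298), (2146, 0.3865, 64.4729), (2220, 0.3919, 64.3001),
    (2294, 0.3946, 64.4077), (2368, 0.3923, 64.4078),
]


def best_checkpoints(log):
    """Return the row with the lowest validation loss and the row with the
    highest Rouge1 score."""
    lowest_loss = min(log, key=lambda row: row[1])
    highest_rouge1 = max(log, key=lambda row: row[2])
    return lowest_loss, highest_rouge1


lowest_loss, highest_rouge1 = best_checkpoints(LOG)
print("lowest validation loss:", lowest_loss)   # step 814, loss 0.3009
print("highest Rouge1:", highest_rouge1)        # step 2146, Rouge1 64.4729
```

The two criteria disagree: validation loss bottoms out at step 814 and then rises while training loss keeps falling, whereas Rouge1 peaks much later at step 2146. Which checkpoint counts as "best" therefore depends on the metric chosen, for example through the `metric_for_best_model` setting in `TrainingArguments` if the Trainer is configured to keep the best model.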

Framework versions