---
tags:
- generated_from_trainer
---


# mt5-large-fce-e8-b16

This model is a fine-tuned version of [google/mt5-large](https://huggingface.co/google/mt5-large) on an unspecified dataset (the Trainer did not record it). The evaluation results logged during training are listed in the Training results table below.

## Model description

More information needed

## Intended uses & limitations

More information needed
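The card does not document usage. Since the base model is google/mt5-large, a sequence-to-sequence checkpoint, inference presumably follows the standard `transformers` text2text pattern. A minimal sketch — the model identifier and the `max_new_tokens` value are assumptions, not documented in this card:

```python
def generate_text(source: str, model_dir: str = "mt5-large-fce-e8-b16",
                  max_new_tokens: int = 32) -> str:
    """Run the fine-tuned seq2seq checkpoint on a single input string.

    model_dir is a local path or Hub id for this checkpoint (assumed name);
    max_new_tokens=32 is a guess informed by the ~15.5-token Gen Len in the log.
    """
    # Lazy import so the sketch is readable without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_dir)
    inputs = tokenizer(source, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Whether the input needs a task prefix (as some T5-family fine-tunes do) is not recorded here.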

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The hyperparameters used during training were not recorded in this card. The model name suffix `e8-b16` and the log below (which reaches epoch 7.9) are consistent with 8 training epochs and suggest a batch size of 16, but this is inferred, not confirmed.

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 1.2105        | 0.23  | 400   | 0.4344          | 84.6268 | 76.3447 | 84.0402 | 84.0182   | 15.4564 |
| 0.4664        | 0.45  | 800   | 0.4256          | 84.3821 | 75.6104 | 83.8113 | 83.8303   | 15.4404 |
| 0.434         | 0.68  | 1200  | 0.3839          | 84.0212 | 75.7319 | 83.4232 | 83.431    | 15.4952 |
| 0.406         | 0.9   | 1600  | 0.3713          | 84.7743 | 76.7805 | 84.2379 | 84.2352   | 15.4514 |
| 0.3193        | 1.13  | 2000  | 0.3665          | 84.634  | 76.5132 | 84.0604 | 84.0755   | 15.4774 |
| 0.2693        | 1.35  | 2400  | 0.3718          | 84.6587 | 76.7057 | 84.099  | 84.1045   | 15.4619 |
| 0.2815        | 1.58  | 2800  | 0.3617          | 84.5181 | 76.6792 | 83.9922 | 83.9976   | 15.4820 |
| 0.2776        | 1.81  | 3200  | 0.3526          | 84.5329 | 76.3656 | 83.9027 | 83.9238   | 15.4614 |
| 0.2551        | 2.03  | 3600  | 0.3720          | 84.504  | 76.6676 | 83.9957 | 84.0108   | 15.4801 |
| 0.1617        | 2.26  | 4000  | 0.3648          | 84.4385 | 76.3684 | 83.8585 | 83.8657   | 15.4897 |
| 0.1711        | 2.48  | 4400  | 0.3671          | 84.5241 | 76.6518 | 83.9862 | 83.9987   | 15.4902 |
| 0.1771        | 2.71  | 4800  | 0.3607          | 84.6437 | 76.6682 | 84.103  | 84.1174   | 15.4683 |
| 0.1803        | 2.93  | 5200  | 0.3582          | 84.479  | 76.6205 | 83.9509 | 83.9504   | 15.4715 |
| 0.1199        | 3.16  | 5600  | 0.3971          | 84.6367 | 76.7872 | 84.0191 | 84.0534   | 15.4715 |
| 0.1005        | 3.39  | 6000  | 0.4085          | 84.5153 | 76.6564 | 83.9365 | 83.9506   | 15.4820 |
| 0.1033        | 3.61  | 6400  | 0.4007          | 84.3191 | 76.399  | 83.8183 | 83.8142   | 15.4728 |
| 0.1067        | 3.84  | 6800  | 0.4014          | 84.5289 | 76.5335 | 83.9706 | 83.9967   | 15.4674 |
| 0.09          | 4.06  | 7200  | 0.4328          | 84.3978 | 76.6231 | 83.8654 | 83.8728   | 15.4783 |
| 0.0574        | 4.29  | 7600  | 0.4305          | 84.4476 | 76.7198 | 83.8943 | 83.9     | 15.4820 |
| 0.0579        | 4.51  | 8000  | 0.4510          | 84.5536 | 76.7635 | 83.977  | 83.9745   | 15.4719 |
| 0.061         | 4.74  | 8400  | 0.4447          | 84.5632 | 76.9892 | 84.0419 | 84.0501   | 15.4815 |
| 0.0608        | 4.97  | 8800  | 0.4353          | 84.6004 | 76.8883 | 84.0518 | 84.0596   | 15.4788 |
| 0.0362        | 5.19  | 9200  | 0.4853          | 84.7169 | 77.1321 | 84.1485 | 84.1486   | 15.4760 |
| 0.0333        | 5.42  | 9600  | 0.5053          | 84.851  | 77.4661 | 84.307  | 84.3106   | 15.4829 |
| 0.0325        | 5.64  | 10000 | 0.5066          | 84.7412 | 77.3031 | 84.2107 | 84.2006   | 15.4948 |
| 0.0335        | 5.87  | 10400 | 0.4947          | 84.7596 | 77.2636 | 84.2156 | 84.224    | 15.4906 |
| 0.0269        | 6.09  | 10800 | 0.5306          | 84.7484 | 77.2693 | 84.1824 | 84.1962   | 15.4811 |
| 0.0184        | 6.32  | 11200 | 0.5535          | 84.8066 | 77.3749 | 84.2765 | 84.2989   | 15.4756 |
| 0.0177        | 6.55  | 11600 | 0.5555          | 84.7335 | 77.2108 | 84.1917 | 84.2084   | 15.4865 |
| 0.0168        | 6.77  | 12000 | 0.5538          | 84.7053 | 77.2902 | 84.184  | 84.1929   | 15.4792 |
| 0.0165        | 7.0   | 12400 | 0.5614          | 84.7332 | 77.3098 | 84.2055 | 84.2055   | 15.4879 |
| 0.0092        | 7.22  | 12800 | 0.6222          | 84.7668 | 77.3059 | 84.2235 | 84.2397   | 15.4724 |
| 0.0086        | 7.45  | 13200 | 0.6485          | 84.8211 | 77.4247 | 84.2857 | 84.2996   | 15.4751 |
| 0.0098        | 7.67  | 13600 | 0.6417          | 84.7854 | 77.4226 | 84.2457 | 84.2652   | 15.4865 |
| 0.0088        | 7.9   | 14000 | 0.6445          | 84.7809 | 77.4171 | 84.2396 | 84.2591   | 15.4852 |
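One readable pattern in this log: validation loss bottoms out around step 3200 and then climbs steadily while training loss keeps falling, a typical overfitting signature, yet ROUGE-1 peaks much later, around step 9600. A small sketch that extracts both turning points from a few rows copied out of the table:

```python
# (step, validation_loss, rouge1) rows copied from the table above (a subset).
log = [
    (400, 0.4344, 84.6268),
    (1600, 0.3713, 84.7743),
    (3200, 0.3526, 84.5329),
    (5200, 0.3582, 84.479),
    (9600, 0.5053, 84.851),
    (12400, 0.5614, 84.7332),
    (14000, 0.6445, 84.7809),
]

best_loss_step = min(log, key=lambda row: row[1])[0]   # lowest validation loss
best_rouge_step = max(log, key=lambda row: row[2])[0]  # highest ROUGE-1

print(best_loss_step)   # -> 3200
print(best_rouge_step)  # -> 9600
```

Which checkpoint to prefer depends on the selection criterion: by validation loss the run should have stopped near epoch 2, while the task metric favors a checkpoint around epoch 5.4.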

### Framework versions