mt5-large-gramatika-final-e8-b16

This model is a fine-tuned version of google/mt5-large. The training dataset was not recorded (the Trainer reported it as None). Evaluation-set results for each logged checkpoint are listed under Training results.

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 0.8846        | 0.37  | 300  | 0.2954          | 64.6179 | 55.294  | 64.2807 | 64.2792   | 18.5597 |
| 0.3711        | 0.73  | 600  | 0.2474          | 65.6388 | 57.2663 | 65.3219 | 65.3365   | 18.5592 |
| 0.2874        | 1.1   | 900  | 0.2193          | 65.8689 | 57.6871 | 65.5424 | 65.5719   | 18.5603 |
| 0.1953        | 1.46  | 1200 | 0.2131          | 66.0438 | 57.8166 | 65.7565 | 65.7705   | 18.5409 |
| 0.1919        | 1.83  | 1500 | 0.1999          | 66.308  | 58.8739 | 66.1027 | 66.1039   | 18.5592 |
| 0.1487        | 2.2   | 1800 | 0.2034          | 66.5939 | 59.0628 | 66.3361 | 66.3475   | 18.5592 |
| 0.1132        | 2.56  | 2100 | 0.2010          | 67.0441 | 59.8117 | 66.8455 | 66.8562   | 18.5487 |
| 0.1087        | 2.93  | 2400 | 0.2001          | 67.0048 | 59.7807 | 66.7885 | 66.7972   | 18.5535 |
| 0.0681        | 3.29  | 2700 | 0.2143          | 67.2327 | 60.2527 | 67.0047 | 67.0106   | 18.5556 |
| 0.0621        | 3.66  | 3000 | 0.2093          | 67.357  | 60.51   | 67.1561 | 67.1709   | 18.5466 |
| 0.062         | 4.02  | 3300 | 0.2157          | 67.4353 | 60.7193 | 67.2526 | 67.2554   | 18.5624 |
| 0.036         | 4.39  | 3600 | 0.2208          | 67.5469 | 60.8111 | 67.3457 | 67.3472   | 18.5503 |
| 0.0351        | 4.76  | 3900 | 0.2282          | 67.3835 | 60.4009 | 67.138  | 67.1612   | 18.5561 |
| 0.0297        | 5.12  | 4200 | 0.2370          | 67.4004 | 60.5787 | 67.2004 | 67.2087   | 18.5603 |
| 0.0193        | 5.49  | 4500 | 0.2446          | 67.5339 | 60.6808 | 67.3484 | 67.3737   | 18.5577 |
| 0.0185        | 5.85  | 4800 | 0.2483          | 67.5055 | 60.8104 | 67.3217 | 67.3443   | 18.5566 |
| 0.0134        | 6.22  | 5100 | 0.2563          | 67.5748 | 60.9475 | 67.3996 | 67.4081   | 18.5597 |
| 0.0114        | 6.59  | 5400 | 0.2585          | 67.6337 | 61.0146 | 67.4553 | 67.472    | 18.5482 |
| 0.0099        | 6.95  | 5700 | 0.2622          | 67.6613 | 61.037  | 67.4761 | 67.4843   | 18.5498 |
| 0.0067        | 7.32  | 6000 | 0.2728          | 67.7996 | 61.2206 | 67.6194 | 67.6282   | 18.5561 |
| 0.0052        | 7.68  | 6300 | 0.2802          | 67.8009 | 61.2862 | 67.6178 | 67.6357   | 18.5545 |
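
The logged results show validation loss and ROUGE moving in different directions after the second epoch, so the "best" checkpoint depends on which metric is used for selection. The sketch below is illustrative only: it copies the Step, Validation Loss, and Rouge1 columns from the results above, and the names `ROWS`, `best_by_loss`, and `best_by_rouge1` are hypothetical helpers, not part of the training code.

```python
# (step, validation loss, ROUGE-1), copied from the training results above.
ROWS = [
    (300, 0.2954, 64.6179),
    (600, 0.2474, 65.6388),
    (900, 0.2193, 65.8689),
    (1200, 0.2131, 66.0438),
    (1500, 0.1999, 66.3080),
    (1800, 0.2034, 66.5939),
    (2100, 0.2010, 67.0441),
    (2400, 0.2001, 67.0048),
    (2700, 0.2143, 67.2327),
    (3000, 0.2093, 67.3570),
    (3300, 0.2157, 67.4353),
    (3600, 0.2208, 67.5469),
    (3900, 0.2282, 67.3835),
    (4200, 0.2370, 67.4004),
    (4500, 0.2446, 67.5339),
    (4800, 0.2483, 67.5055),
    (5100, 0.2563, 67.5748),
    (5400, 0.2585, 67.6337),
    (5700, 0.2622, 67.6613),
    (6000, 0.2728, 67.7996),
    (6300, 0.2802, 67.8009),
]


def best_by_loss(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row[1])


def best_by_rouge1(rows):
    """Return the row with the highest ROUGE-1 score."""
    return max(rows, key=lambda row: row[2])


if __name__ == "__main__":
    step, loss, rouge1 = best_by_loss(ROWS)
    print(f"lowest validation loss: step {step} (loss={loss}, rouge1={rouge1})")
    step, loss, rouge1 = best_by_rouge1(ROWS)
    print(f"highest ROUGE-1:        step {step} (loss={loss}, rouge1={rouge1})")
```

With these numbers, validation loss is lowest at step 1500 while ROUGE-1 is highest at the last logged step, 6300. Which checkpoint is preferable depends on whether loss or generation quality matters more for the intended use, which this card does not state.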

Framework versions