# t5-v1_1-large-gramatika-e8-b16

This model is a fine-tuned version of [google/t5-v1_1-large](https://huggingface.co/google/t5-v1_1-large) on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 0.2989
- Rouge1: 37.1965
- Rouge2: 24.2827
- Rougel: 36.5452
- Rougelsum: 36.5445
- Gen Len: 18.9586
## Model description
More information needed
## Intended uses & limitations
More information needed
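Since the intended use is not documented, the sketch below shows one plausible way to run inference with the 🤗 Transformers API. Everything here is an assumption: the repo id `t5-v1_1-large-gramatika-e8-b16` is taken from the title and should be replaced with the actual checkpoint path, the grammar-correction framing is inferred from "gramatika" in the name, and `max_length=20` merely mirrors the evaluation Gen Len of roughly 19 tokens.

```python
# Hypothetical inference sketch -- adjust the checkpoint path to wherever the
# fine-tuned weights actually live; this repo id is assumed from the title.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

def correct(text, model_dir="t5-v1_1-large-gramatika-e8-b16"):
    """Run one correction pass; max_length=20 mirrors the ~19-token Gen Len."""
    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_dir)
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    outputs = model.generate(**inputs, max_length=20)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

if __name__ == "__main__":
    print(correct("she go to school yesterday"))
```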
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.001
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adafactor
- lr_scheduler_type: linear
- num_epochs: 8
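The linear scheduler decays the learning rate from its peak to zero over the whole run. The sketch below illustrates this with the values above; the total step count of 6,760 is an assumption inferred from the results table, which logs roughly 845 optimizer steps per epoch over 8 epochs, and warmup is taken to be zero since none is listed.

```python
# Linear LR schedule with the hyperparameters above.
# Assumption: ~845 steps/epoch x 8 epochs = 6760 total steps (from the table).
PEAK_LR = 1e-3
TOTAL_STEPS = 845 * 8  # 6760

def linear_lr(step, peak=PEAK_LR, total=TOTAL_STEPS):
    """Decay linearly from `peak` at step 0 to 0 at `total` (no warmup)."""
    return peak * max(0.0, 1.0 - step / total)

print(linear_lr(0))     # 0.001 (full rate at the start)
print(linear_lr(3380))  # 0.0005 (half rate at the midpoint)
print(linear_lr(6760))  # 0.0 (fully decayed at the end)
```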
### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|
7.3244 | 0.09 | 74 | 3.8419 | 0.3537 | 0.0 | 0.3598 | 0.3573 | 6.9757 |
2.7066 | 0.18 | 148 | 0.9252 | 32.7398 | 19.8083 | 31.7966 | 31.7841 | 18.9313 |
0.9934 | 0.26 | 222 | 0.7273 | 30.368 | 17.9104 | 29.4136 | 29.3998 | 18.9065 |
0.7874 | 0.35 | 296 | 0.6207 | 31.8458 | 18.3572 | 30.0515 | 30.0257 | 18.7863 |
0.7069 | 0.44 | 370 | 0.6075 | 32.1758 | 19.8468 | 31.0615 | 31.0736 | 18.4541 |
0.6554 | 0.53 | 444 | 0.5267 | 30.604 | 17.5666 | 29.302 | 29.2871 | 18.8514 |
0.6254 | 0.61 | 518 | 0.5145 | 31.0439 | 18.0458 | 29.9822 | 29.9589 | 18.9183 |
0.5855 | 0.7 | 592 | 0.4891 | 31.0492 | 18.0405 | 29.8537 | 29.821 | 18.9574 |
0.5556 | 0.79 | 666 | 0.4685 | 29.6069 | 16.7684 | 28.5009 | 28.4625 | 18.9657 |
0.5386 | 0.88 | 740 | 0.4521 | 30.9245 | 18.2542 | 29.7053 | 29.6875 | 18.9426 |
0.516 | 0.96 | 814 | 0.4458 | 29.137 | 16.6893 | 28.2723 | 28.2606 | 18.9556 |
0.4669 | 1.05 | 888 | 0.4507 | 29.4366 | 17.1905 | 28.6334 | 28.6032 | 18.9378 |
0.4567 | 1.14 | 962 | 0.4331 | 27.3096 | 15.2103 | 26.5713 | 26.5633 | 18.9603 |
0.4424 | 1.23 | 1036 | 0.4349 | 29.3283 | 16.9229 | 28.5846 | 28.5672 | 18.9580 |
0.4589 | 1.31 | 1110 | 0.4120 | 32.624 | 19.88 | 31.6376 | 31.6099 | 18.9509 |
0.4314 | 1.4 | 1184 | 0.3658 | 36.1436 | 22.72 | 35.2657 | 35.23 | 18.9633 |
0.3823 | 1.49 | 1258 | 0.3540 | 36.9872 | 23.5963 | 36.1081 | 36.0763 | 18.9580 |
0.3782 | 1.58 | 1332 | 0.3522 | 37.4911 | 24.4598 | 36.6052 | 36.5828 | 18.9568 |
0.3724 | 1.66 | 1406 | 0.3459 | 37.4042 | 24.1839 | 36.4717 | 36.4641 | 18.9538 |
0.3624 | 1.75 | 1480 | 0.3387 | 37.1286 | 23.9806 | 36.2797 | 36.2668 | 18.9562 |
0.3617 | 1.84 | 1554 | 0.3359 | 37.6265 | 24.457 | 36.7569 | 36.7362 | 18.9562 |
0.3541 | 1.93 | 1628 | 0.3243 | 37.2929 | 24.17 | 36.4118 | 36.3794 | 18.9591 |
0.3343 | 2.01 | 1702 | 0.3295 | 36.2905 | 22.9738 | 35.5576 | 35.5383 | 18.9562 |
0.2963 | 2.1 | 1776 | 0.3232 | 37.0949 | 24.0381 | 36.3987 | 36.3927 | 18.9574 |
0.293 | 2.19 | 1850 | 0.3236 | 37.3034 | 24.4663 | 36.5368 | 36.5221 | 18.9515 |
0.2807 | 2.28 | 1924 | 0.3253 | 37.1597 | 23.9664 | 36.392 | 36.3799 | 18.9532 |
0.2833 | 2.36 | 1998 | 0.3229 | 37.3155 | 24.5448 | 36.5408 | 36.5315 | 18.9562 |
0.2811 | 2.45 | 2072 | 0.3197 | 37.0244 | 24.0061 | 36.2442 | 36.2437 | 18.9580 |
0.2897 | 2.54 | 2146 | 0.3262 | 37.7875 | 24.8509 | 36.9713 | 36.9418 | 18.9556 |
0.281 | 2.63 | 2220 | 0.3170 | 37.3213 | 24.4874 | 36.5776 | 36.578 | 18.9556 |
0.283 | 2.71 | 2294 | 0.3191 | 37.7782 | 24.9701 | 37.0462 | 37.0249 | 18.9520 |
0.2789 | 2.8 | 2368 | 0.3100 | 37.3532 | 24.4451 | 36.6026 | 36.6134 | 18.9550 |
0.2852 | 2.89 | 2442 | 0.3074 | 37.7314 | 24.8536 | 36.8982 | 36.8855 | 18.9568 |
0.2769 | 2.98 | 2516 | 0.3050 | 37.8349 | 24.9791 | 37.0464 | 37.0394 | 18.9562 |
0.2232 | 3.07 | 2590 | 0.3182 | 37.3663 | 24.6293 | 36.6626 | 36.6556 | 18.9586 |
0.2136 | 3.15 | 2664 | 0.3161 | 37.442 | 24.7267 | 36.7524 | 36.7319 | 18.9591 |
0.2111 | 3.24 | 2738 | 0.3166 | 37.5901 | 25.033 | 36.9297 | 36.9232 | 18.9597 |
0.2263 | 3.33 | 2812 | 0.3116 | 37.2114 | 24.4805 | 36.5848 | 36.5715 | 18.9580 |
0.226 | 3.42 | 2886 | 0.3135 | 37.2795 | 24.5095 | 36.5465 | 36.5392 | 18.9615 |
0.2196 | 3.5 | 2960 | 0.3082 | 37.4243 | 24.7474 | 36.7057 | 36.6987 | 18.9597 |
0.2336 | 3.59 | 3034 | 0.3110 | 37.3997 | 24.7691 | 36.7111 | 36.676 | 18.9586 |
0.2252 | 3.68 | 3108 | 0.3062 | 37.5415 | 24.708 | 36.8383 | 36.8225 | 18.9556 |
0.2215 | 3.77 | 3182 | 0.3060 | 37.8759 | 25.2347 | 37.1689 | 37.1637 | 18.9586 |
0.2199 | 3.85 | 3256 | 0.3071 | 37.8181 | 25.203 | 37.1552 | 37.1284 | 18.9597 |
0.2198 | 3.94 | 3330 | 0.2989 | 37.1965 | 24.2827 | 36.5452 | 36.5445 | 18.9586 |
0.1995 | 4.03 | 3404 | 0.3188 | 37.4228 | 24.7797 | 36.7334 | 36.7136 | 18.9615 |
0.1644 | 4.12 | 3478 | 0.3319 | 37.8745 | 25.0742 | 37.1756 | 37.1481 | 18.9586 |
0.1658 | 4.2 | 3552 | 0.3237 | 37.6372 | 24.8896 | 36.9718 | 36.9736 | 18.9580 |
0.1659 | 4.29 | 3626 | 0.3246 | 37.5971 | 24.7016 | 36.8816 | 36.8483 | 18.9639 |
0.166 | 4.38 | 3700 | 0.3199 | 37.4797 | 24.6671 | 36.8323 | 36.8183 | 18.9556 |
0.1695 | 4.47 | 3774 | 0.3182 | 37.341 | 24.5427 | 36.6646 | 36.6467 | 18.9639 |
0.1686 | 4.55 | 3848 | 0.3184 | 37.7272 | 25.1173 | 37.0446 | 37.023 | 18.9586 |
0.173 | 4.64 | 3922 | 0.3169 | 37.6081 | 25.0347 | 36.9833 | 36.9654 | 18.9556 |
0.1663 | 4.73 | 3996 | 0.3195 | 37.649 | 24.9763 | 36.9835 | 36.9837 | 18.9544 |
0.1708 | 4.82 | 4070 | 0.3163 | 37.5371 | 24.8374 | 36.8298 | 36.81 | 18.9538 |
0.1672 | 4.9 | 4144 | 0.3083 | 37.7726 | 25.0021 | 37.0716 | 37.0516 | 18.9556 |
0.1717 | 4.99 | 4218 | 0.3074 | 37.3131 | 24.4449 | 36.6688 | 36.6471 | 18.9580 |
0.1247 | 5.08 | 4292 | 0.3454 | 37.2492 | 24.3823 | 36.5979 | 36.5877 | 18.9556 |
0.1213 | 5.17 | 4366 | 0.3413 | 37.6422 | 24.7825 | 36.932 | 36.9114 | 18.9532 |
0.1188 | 5.25 | 4440 | 0.3422 | 37.0801 | 24.3343 | 36.4899 | 36.4784 | 18.9603 |
0.1232 | 5.34 | 4514 | 0.3399 | 37.3245 | 24.5098 | 36.6559 | 36.6598 | 18.9603 |
0.124 | 5.43 | 4588 | 0.3353 | 37.5291 | 24.8469 | 36.9393 | 36.9157 | 18.9603 |
0.1229 | 5.52 | 4662 | 0.3375 | 37.5804 | 24.8558 | 36.9588 | 36.9544 | 18.9550 |
0.1224 | 5.6 | 4736 | 0.3386 | 37.413 | 24.546 | 36.7393 | 36.7255 | 18.9550 |
0.1255 | 5.69 | 4810 | 0.3353 | 37.6581 | 24.9656 | 36.9669 | 36.9585 | 18.9574 |
0.1247 | 5.78 | 4884 | 0.3335 | 37.4876 | 24.8124 | 36.8152 | 36.8063 | 18.9591 |
0.1222 | 5.87 | 4958 | 0.3406 | 37.6636 | 25.0828 | 37.0048 | 36.9894 | 18.9586 |
0.1295 | 5.96 | 5032 | 0.3401 | 37.4384 | 24.6793 | 36.7714 | 36.7632 | 18.9580 |
0.1028 | 6.04 | 5106 | 0.3799 | 37.5785 | 24.9752 | 36.9199 | 36.9153 | 18.9580 |
0.0853 | 6.13 | 5180 | 0.3847 | 37.5618 | 24.8706 | 36.8649 | 36.8567 | 18.9591 |
0.0846 | 6.22 | 5254 | 0.3817 | 37.4679 | 24.6632 | 36.837 | 36.832 | 18.9550 |
0.0878 | 6.31 | 5328 | 0.3797 | 37.4487 | 24.706 | 36.8022 | 36.7943 | 18.9574 |
0.0837 | 6.39 | 5402 | 0.3878 | 37.3141 | 24.3938 | 36.6554 | 36.6488 | 18.9532 |
0.0869 | 6.48 | 5476 | 0.3777 | 37.4747 | 24.7014 | 36.8219 | 36.817 | 18.9526 |
0.0884 | 6.57 | 5550 | 0.3833 | 37.4056 | 24.6101 | 36.7594 | 36.7579 | 18.9568 |
0.0818 | 6.66 | 5624 | 0.3888 | 37.3717 | 24.5535 | 36.7294 | 36.7492 | 18.9538 |
0.0863 | 6.74 | 5698 | 0.3751 | 37.366 | 24.5767 | 36.7547 | 36.7647 | 18.9532 |
0.0874 | 6.83 | 5772 | 0.3786 | 37.514 | 24.7148 | 36.8546 | 36.8595 | 18.9580 |
0.0888 | 6.92 | 5846 | 0.3799 | 37.4945 | 24.8184 | 36.8356 | 36.8387 | 18.9580 |
0.0818 | 7.01 | 5920 | 0.3936 | 37.5272 | 24.7012 | 36.8834 | 36.8741 | 18.9568 |
0.0607 | 7.09 | 5994 | 0.4133 | 37.396 | 24.4797 | 36.7453 | 36.74 | 18.9580 |
0.0607 | 7.18 | 6068 | 0.4209 | 37.3158 | 24.3938 | 36.6666 | 36.6663 | 18.9568 |
0.0618 | 7.27 | 6142 | 0.4209 | 37.2297 | 24.3193 | 36.5939 | 36.5864 | 18.9574 |
0.0634 | 7.36 | 6216 | 0.4211 | 37.4652 | 24.5617 | 36.8308 | 36.8168 | 18.9550 |
0.0605 | 7.44 | 6290 | 0.4222 | 37.3595 | 24.5574 | 36.7481 | 36.7443 | 18.9562 |
0.0625 | 7.53 | 6364 | 0.4201 | 37.4266 | 24.6134 | 36.7968 | 36.7865 | 18.9556 |
0.0621 | 7.62 | 6438 | 0.4200 | 37.335 | 24.5693 | 36.7204 | 36.7125 | 18.9562 |
0.061 | 7.71 | 6512 | 0.4191 | 37.4042 | 24.6454 | 36.7709 | 36.7707 | 18.9562 |
0.0584 | 7.79 | 6586 | 0.4189 | 37.4168 | 24.6259 | 36.8094 | 36.7893 | 18.9568 |
0.0584 | 7.88 | 6660 | 0.4236 | 37.4738 | 24.7008 | 36.8525 | 36.8305 | 18.9568 |
0.0588 | 7.97 | 6734 | 0.4238 | 37.4491 | 24.6411 | 36.8082 | 36.8053 | 18.9568 |
### Framework versions
- Transformers 4.30.1
- Pytorch 1.11.0a0+b6df043
- Datasets 2.12.0
- Tokenizers 0.13.3
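To approximate this environment, the pinned library versions above can be installed with pip. Note the PyTorch build shown (`1.11.0a0+b6df043`) is a pre-release/container build that is not on PyPI, so a standard `torch` install of a nearby release is substituted here as an approximation.

```shell
# Pin the libraries listed above; torch 1.11.0 approximates the 1.11.0a0 build.
pip install transformers==4.30.1 datasets==2.12.0 tokenizers==0.13.3 torch==1.11.0
```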