# flan-t5-large-extraction-cnndm_4000-all-ep20
This model is a fine-tuned version of [google/flan-t5-large](https://huggingface.co/google/flan-t5-large). The dataset field was not recorded by the Trainer; the model name suggests a 4,000-example extraction set derived from CNN/DailyMail. It achieves the following results on the evaluation set:
- Loss: 1.7262
- Rouge1: 34.7376
- Rouge2: 15.0654
- Rougel: 29.8906
- Rougelsum: 29.9217
- Gen Len: 19.0
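
For reference, below is a minimal inference sketch with 🤗 Transformers. The checkpoint path and the input format are assumptions, since neither is documented in this card:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Hypothetical checkpoint path: the card does not give a Hub repo id,
# so substitute the actual location of the fine-tuned weights.
checkpoint = "flan-t5-large-extraction-cnndm_4000-all-ep20"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

# The input/prompt format is an assumption; the card does not document it.
article = "The city council approved the new transit plan on Tuesday ..."
inputs = tokenizer(article, return_tensors="pt", truncation=True, max_length=512)

# Gen Len is ~19 tokens on the eval set, so short outputs are expected;
# max_new_tokens=20 is an assumption, not a recorded generation setting.
summary_ids = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```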
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 24
- seed: 1799
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 20
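
For readers reproducing the run, the list above maps onto `Seq2SeqTrainingArguments` roughly as sketched below. This is a reconstruction, not the original training script; the output directory and evaluation cadence are assumptions inferred from the model name and the 200-step rows in the results table:

```python
from transformers import Seq2SeqTrainingArguments

# Reconstruction of the reported hyperparameters. The Adam betas/epsilon
# listed above are the transformers defaults, so they need no override.
training_args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-large-extraction-cnndm_4000-all-ep20",  # assumed
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=24,
    seed=1799,
    lr_scheduler_type="linear",
    num_train_epochs=20,
    evaluation_strategy="steps",  # assumed from the 200-step eval rows
    eval_steps=200,               # assumed
    predict_with_generate=True,   # required to compute ROUGE during eval
)
```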
### Training results
| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|
| 2.1456 | 0.4 | 200 | 1.8334 | 34.4069 | 14.6208 | 29.4432 | 29.4556 | 18.986 |
| 1.9791 | 0.8 | 400 | 1.7728 | 34.2402 | 14.6376 | 29.5349 | 29.5363 | 18.986 |
| 1.8807 | 1.2 | 600 | 1.7601 | 34.5439 | 15.1529 | 29.8476 | 29.803 | 18.986 |
| 1.8082 | 1.6 | 800 | 1.7609 | 34.6764 | 15.2571 | 30.0381 | 29.9641 | 18.99 |
| 1.8193 | 2.0 | 1000 | 1.7262 | 34.7376 | 15.0654 | 29.8906 | 29.9217 | 19.0 |
| 1.7098 | 2.4 | 1200 | 1.7295 | 35.1726 | 15.5835 | 30.3994 | 30.4092 | 18.998 |
| 1.702 | 2.8 | 1400 | 1.7391 | 34.3418 | 15.311 | 29.6477 | 29.6509 | 19.0 |
| 1.6659 | 3.2 | 1600 | 1.7512 | 34.5394 | 15.5073 | 30.1215 | 30.1061 | 18.996 |
| 1.6216 | 3.6 | 1800 | 1.7485 | 34.9325 | 15.5266 | 30.0803 | 30.0892 | 19.0 |
| 1.6217 | 4.0 | 2000 | 1.7286 | 34.8255 | 15.328 | 29.7924 | 29.8133 | 19.0 |
| 1.5645 | 4.4 | 2200 | 1.7570 | 35.0673 | 15.3704 | 29.9302 | 29.9333 | 19.0 |
| 1.5454 | 4.8 | 2400 | 1.7453 | 34.7481 | 15.1067 | 29.9309 | 29.873 | 19.0 |
| 1.5012 | 5.2 | 2600 | 1.7661 | 35.0534 | 15.4418 | 30.0802 | 30.0423 | 19.0 |
| 1.4867 | 5.6 | 2800 | 1.7647 | 35.5583 | 15.5157 | 30.4281 | 30.4351 | 19.0 |
| 1.4914 | 6.0 | 3000 | 1.7587 | 35.0749 | 15.5551 | 29.9916 | 30.0154 | 19.0 |
| 1.4307 | 6.4 | 3200 | 1.7922 | 34.8942 | 15.248 | 29.6589 | 29.6454 | 19.0 |
| 1.432 | 6.8 | 3400 | 1.7761 | 34.5489 | 15.0722 | 29.585 | 29.5869 | 19.0 |
| 1.4142 | 7.2 | 3600 | 1.7966 | 34.4441 | 15.3042 | 29.8935 | 29.8639 | 19.0 |
| 1.3605 | 7.6 | 3800 | 1.8043 | 34.511 | 15.4382 | 29.5617 | 29.564 | 19.0 |
| 1.3846 | 8.0 | 4000 | 1.7951 | 35.1945 | 15.5203 | 29.8749 | 29.879 | 19.0 |
| 1.3311 | 8.4 | 4200 | 1.8232 | 35.4176 | 15.8548 | 30.4327 | 30.3956 | 19.0 |
| 1.3329 | 8.8 | 4400 | 1.8223 | 35.4969 | 15.6317 | 30.2432 | 30.222 | 19.0 |
| 1.3214 | 9.2 | 4600 | 1.8441 | 35.4614 | 15.8153 | 29.9465 | 29.9406 | 19.0 |
| 1.2979 | 9.6 | 4800 | 1.8392 | 34.9863 | 15.5971 | 30.003 | 30.0149 | 19.0 |
| 1.2782 | 10.0 | 5000 | 1.8395 | 35.1628 | 15.8039 | 30.1611 | 30.1363 | 19.0 |
| 1.2708 | 10.4 | 5200 | 1.8701 | 34.9822 | 15.5759 | 29.7916 | 29.7935 | 19.0 |
| 1.2659 | 10.8 | 5400 | 1.8575 | 35.3481 | 15.5923 | 29.9914 | 29.9391 | 19.0 |
| 1.2302 | 11.2 | 5600 | 1.8695 | 35.4173 | 15.8071 | 30.3649 | 30.3185 | 19.0 |
| 1.229 | 11.6 | 5800 | 1.8685 | 35.2428 | 15.795 | 29.9917 | 29.9481 | 18.996 |
| 1.2185 | 12.0 | 6000 | 1.8837 | 34.8893 | 15.6796 | 29.9493 | 29.9607 | 19.0 |
| 1.1934 | 12.4 | 6200 | 1.8790 | 34.8951 | 15.4768 | 29.9075 | 29.9236 | 19.0 |
| 1.1922 | 12.8 | 6400 | 1.9001 | 35.1576 | 15.7902 | 29.9736 | 29.9386 | 19.0 |
| 1.2022 | 13.2 | 6600 | 1.8984 | 34.7087 | 15.3546 | 29.5658 | 29.5201 | 18.998 |
| 1.173 | 13.6 | 6800 | 1.9125 | 34.9104 | 15.3827 | 29.7326 | 29.7116 | 19.0 |
| 1.1873 | 14.0 | 7000 | 1.8984 | 35.3422 | 15.639 | 30.1525 | 30.1138 | 19.0 |
| 1.1561 | 14.4 | 7200 | 1.9096 | 35.2554 | 15.747 | 30.0491 | 29.9829 | 19.0 |
| 1.154 | 14.8 | 7400 | 1.9043 | 35.0523 | 15.5168 | 29.974 | 29.9442 | 18.992 |
| 1.1351 | 15.2 | 7600 | 1.9202 | 35.2258 | 15.8262 | 30.4862 | 30.4702 | 18.996 |
| 1.1528 | 15.6 | 7800 | 1.9253 | 35.2228 | 15.7183 | 30.1423 | 30.1282 | 18.996 |
| 1.1413 | 16.0 | 8000 | 1.9251 | 35.2414 | 15.5881 | 30.0468 | 30.0212 | 18.998 |
| 1.1375 | 16.4 | 8200 | 1.9256 | 34.8926 | 15.451 | 29.8252 | 29.8018 | 18.992 |
| 1.1025 | 16.8 | 8400 | 1.9359 | 35.3022 | 15.7612 | 30.0581 | 30.0105 | 19.0 |
| 1.1045 | 17.2 | 8600 | 1.9452 | 35.2059 | 15.7584 | 30.0534 | 30.0004 | 19.0 |
| 1.1052 | 17.6 | 8800 | 1.9353 | 35.2928 | 15.7749 | 30.1623 | 30.1519 | 19.0 |
| 1.1261 | 18.0 | 9000 | 1.9412 | 35.5988 | 16.0219 | 30.4918 | 30.4493 | 19.0 |
| 1.1044 | 18.4 | 9200 | 1.9454 | 35.4386 | 15.7284 | 30.2409 | 30.194 | 19.0 |
| 1.0999 | 18.8 | 9400 | 1.9449 | 35.2472 | 15.7148 | 30.1136 | 30.119 | 18.992 |
| 1.1044 | 19.2 | 9600 | 1.9466 | 35.2895 | 15.7895 | 30.1843 | 30.1468 | 19.0 |
| 1.1021 | 19.6 | 9800 | 1.9474 | 35.2796 | 15.7082 | 30.0929 | 30.0662 | 19.0 |
| 1.0954 | 20.0 | 10000 | 1.9488 | 35.319 | 15.7079 | 30.1428 | 30.1129 | 19.0 |
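
The ROUGE columns above are F-measure scores reported as percentages. The exact metric pipeline is not recorded in this card; the sketch below shows one common way to compute comparable scores with the `evaluate` library:

```python
import evaluate

# Common ROUGE setup; whether the original run used this library is an assumption.
rouge = evaluate.load("rouge")

predictions = ["the council approved the transit plan on tuesday"]      # model outputs
references = ["the city council approved a new transit plan tuesday"]  # gold summaries

scores = rouge.compute(predictions=predictions, references=references)
# evaluate returns fractions; multiply by 100 to match the table's scale.
print({name: round(value * 100, 4) for name, value in scores.items()})
```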
### Framework versions
- Transformers 4.18.0
- Pytorch 1.10.0+cu111
- Datasets 2.5.1
- Tokenizers 0.12.1