# flan-t5-base-clang8-e8-b16

This model is a fine-tuned version of [google/flan-t5-base](https://huggingface.co/google/flan-t5-base) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 0.3295
- Rouge1: 81.7467
- Rouge2: 75.6203
- Rougel: 81.3074
- Rougelsum: 81.3474
- Gen Len: 16.7252
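The ROUGE values above are F-measures based on n-gram overlap between generated and reference text. As a minimal illustration of what ROUGE-1 measures (a pure-Python sketch, not the `evaluate`/`rouge_score` implementation actually used during training, which also applies its own tokenization and stemming):

```python
from collections import Counter

def rouge1_f1(prediction: str, reference: str) -> float:
    """ROUGE-1 F1: harmonic mean of unigram precision and recall.

    Simplified sketch using whitespace tokenization; the real
    rouge_score package tokenizes and stems more carefully.
    """
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    # Clipped unigram overlap: each reference token matches at most once.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)

print(rouge1_f1("the cat sat", "the cat sat"))      # → 1.0
print(rouge1_f1("the cat", "the cat sat on mats"))  # 2 shared unigrams
```

Rouge2 applies the same F-measure to bigrams, and RougeL/RougeLsum to longest common subsequences.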
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.001
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adafactor
- lr_scheduler_type: linear
- num_epochs: 8
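The hyperparameters above can be sketched as a `Seq2SeqTrainingArguments` configuration. This is an illustrative reconstruction, not the exact script used: `output_dir` is a placeholder, and arguments not recorded in this card (evaluation/save strategy, generation settings) are assumptions.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch of the listed hyperparameters, assuming the standard
# Hugging Face Seq2SeqTrainer setup; unrecorded arguments below
# (output_dir, predict_with_generate) are illustrative assumptions.
training_args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-base-clang8-e8-b16",  # placeholder
    learning_rate=1e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    optim="adafactor",
    lr_scheduler_type="linear",
    num_train_epochs=8,
    predict_with_generate=True,  # required to compute ROUGE at eval time
)
```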
### Training results
| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|:-------------:|:-----:|:-------:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 0.2375 | 0.34 | 50000 | 0.3787 | 79.3834 | 72.8125 | 78.8797 | 78.9476 | 16.4064 |
| 0.193 | 0.68 | 100000 | 0.3648 | 80.5738 | 74.213 | 80.1251 | 80.1723 | 16.6474 |
| 0.1741 | 1.02 | 150000 | 0.3528 | 80.3142 | 73.9623 | 79.8468 | 79.9051 | 16.5495 |
| 0.1472 | 1.36 | 200000 | 0.3404 | 81.6263 | 75.4272 | 81.2208 | 81.2709 | 16.5801 |
| 0.1438 | 1.7 | 250000 | 0.3295 | 81.7467 | 75.6203 | 81.3074 | 81.3474 | 16.7252 |
| 0.1366 | 2.04 | 300000 | 0.3335 | 82.2013 | 76.1543 | 81.8099 | 81.8542 | 16.6817 |
| 0.1149 | 2.38 | 350000 | 0.3388 | 82.0549 | 76.0367 | 81.612 | 81.6705 | 16.6349 |
| 0.1158 | 2.72 | 400000 | 0.3380 | 81.8668 | 75.7396 | 81.3624 | 81.4208 | 16.6718 |
| 0.1096 | 3.06 | 450000 | 0.3355 | 82.2566 | 76.2443 | 81.8557 | 81.8799 | 16.6314 |
| 0.091 | 3.4 | 500000 | 0.3563 | 81.6493 | 75.5706 | 81.192 | 81.2568 | 16.6118 |
| 0.0929 | 3.74 | 550000 | 0.3457 | 81.8688 | 75.9296 | 81.3592 | 81.4311 | 16.5591 |
| 0.0866 | 4.08 | 600000 | 0.3640 | 82.1338 | 76.0786 | 81.7067 | 81.7521 | 16.6241 |
| 0.071 | 4.42 | 650000 | 0.3623 | 82.1396 | 76.1338 | 81.7451 | 81.7737 | 16.6887 |
| 0.0726 | 4.76 | 700000 | 0.3504 | 82.2745 | 76.2648 | 81.875 | 81.9091 | 16.6860 |
| 0.0663 | 5.1 | 750000 | 0.3918 | 82.0424 | 76.092 | 81.6495 | 81.6595 | 16.6449 |
| 0.0535 | 5.44 | 800000 | 0.3818 | 82.1967 | 76.2488 | 81.7716 | 81.807 | 16.6835 |
| 0.0544 | 5.78 | 850000 | 0.3857 | 81.966 | 75.9981 | 81.5936 | 81.6119 | 16.6449 |
| 0.0483 | 6.13 | 900000 | 0.4260 | 82.3128 | 76.3527 | 81.908 | 81.9394 | 16.6730 |
| 0.0388 | 6.47 | 950000 | 0.4271 | 82.2929 | 76.3275 | 81.8769 | 81.9049 | 16.7122 |
| 0.0387 | 6.81 | 1000000 | 0.4203 | 82.3302 | 76.4099 | 81.9579 | 81.9744 | 16.6992 |
| 0.0338 | 7.15 | 1050000 | 0.4709 | 82.1839 | 76.2599 | 81.7796 | 81.8151 | 16.6737 |
| 0.0277 | 7.49 | 1100000 | 0.4756 | 82.2163 | 76.2606 | 81.7974 | 81.8287 | 16.6691 |
| 0.0271 | 7.83 | 1150000 | 0.4760 | 82.1215 | 76.1794 | 81.7053 | 81.7387 | 16.6965 |

The best validation loss (0.3295) was reached at step 250000 (epoch 1.7), matching the evaluation results reported above; validation loss rises from around epoch 5 onward even as ROUGE stays roughly flat.
### Framework versions
- Transformers 4.27.4
- Pytorch 1.11.0a0+b6df043
- Datasets 2.11.0
- Tokenizers 0.13.2