# t5-small-finetuned-t5
This model is a fine-tuned version of [t5-small](https://huggingface.co/t5-small) on an unspecified dataset. It achieves the following results on the evaluation set (a usage sketch follows the metrics):
- Loss: 0.7570
- Rouge1: 70.2787
- Rouge2: 55.5377
- Rougel: 63.9121
- Rougelsum: 64.3555
- Gen Len: 17.2031
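Because the training data is not documented, the exact downstream task is unclear; the ROUGE metrics and the short generation length (~17 tokens) suggest a summarization-style task. The following is a minimal inference sketch under that assumption; the checkpoint identifier is a hypothetical placeholder.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Hypothetical checkpoint id; replace with the actual Hub repository or local path.
checkpoint = "your-username/t5-small-finetuned-t5"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

# T5 models are usually prompted with a task prefix; "summarize: " is assumed here.
text = "summarize: " + "Your input document goes here."
inputs = tokenizer(text, return_tensors="pt", truncation=True)

# max_length roughly matches the observed generation length (~17 tokens).
outputs = model.generate(**inputs, max_length=32, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```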
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a hedged reproduction sketch follows the list):
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 20
- mixed_precision_training: Native AMP
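These settings map onto `Seq2SeqTrainingArguments` roughly as sketched below, assuming the model was trained with `Seq2SeqTrainer`. The dataset variables are placeholders, and the Adam betas/epsilon and the linear scheduler are Trainer defaults.

```python
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

# Hyperparameters taken from the list above; Adam betas/epsilon and the linear
# learning-rate schedule are Trainer defaults, so they need no explicit arguments.
training_args = Seq2SeqTrainingArguments(
    output_dir="t5-small-finetuned-t5",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    num_train_epochs=20,
    fp16=True,                    # Native AMP mixed precision
    evaluation_strategy="epoch",  # assumption: metrics above are logged once per epoch
    predict_with_generate=True,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,  # placeholder: tokenized training split
    eval_dataset=eval_dataset,    # placeholder: tokenized validation split
    tokenizer=tokenizer,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
# trainer.train()
```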
### Training results
| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| No log        | 1.0   | 12   | 0.9687          | 65.8469 | 50.6135 | 60.4571 | 60.6727   | 16.8281 |
| No log        | 2.0   | 24   | 0.9260          | 65.8803 | 50.712  | 60.4633 | 60.6409   | 16.8125 |
| No log        | 3.0   | 36   | 0.8922          | 66.4957 | 50.843  | 60.8577 | 61.1762   | 17.0156 |
| No log        | 4.0   | 48   | 0.8663          | 67.4976 | 52.3429 | 61.5901 | 61.8548   | 16.8906 |
| No log        | 5.0   | 60   | 0.8471          | 67.7331 | 52.4301 | 61.4319 | 61.6517   | 16.9062 |
| No log        | 6.0   | 72   | 0.8304          | 68.0701 | 52.916  | 61.8953 | 62.0532   | 16.7812 |
| No log        | 7.0   | 84   | 0.8151          | 68.5051 | 53.9372 | 62.9052 | 63.0314   | 16.7969 |
| No log        | 8.0   | 96   | 0.8033          | 68.7841 | 54.2968 | 63.019  | 63.1808   | 16.9375 |
| No log        | 9.0   | 108  | 0.7928          | 68.9694 | 54.605  | 63.1564 | 63.4155   | 16.9844 |
| No log        | 10.0  | 120  | 0.7851          | 69.2494 | 54.8377 | 63.4448 | 63.7369   | 17.0312 |
| No log        | 11.0  | 132  | 0.7802          | 69.2075 | 54.9086 | 63.6303 | 63.8855   | 17.0312 |
| No log        | 12.0  | 144  | 0.7756          | 69.2319 | 54.8675 | 63.4849 | 63.928    | 17.0312 |
| No log        | 13.0  | 156  | 0.7716          | 69.0732 | 54.472  | 63.0335 | 63.4969   | 17.1562 |
| No log        | 14.0  | 168  | 0.7657          | 68.993  | 54.5342 | 62.9104 | 63.2738   | 17.0312 |
| No log        | 15.0  | 180  | 0.7632          | 70.1458 | 55.4883 | 63.8544 | 64.3299   | 17.2031 |
| No log        | 16.0  | 192  | 0.7614          | 69.8971 | 54.8277 | 63.4274 | 63.842    | 17.2031 |
| No log        | 17.0  | 204  | 0.7595          | 70.2733 | 55.4028 | 63.8387 | 64.2672   | 17.2031 |
| No log        | 18.0  | 216  | 0.7581          | 70.2787 | 55.5377 | 63.9121 | 64.3555   | 17.2031 |
| No log        | 19.0  | 228  | 0.7573          | 70.2787 | 55.5377 | 63.9121 | 64.3555   | 17.2031 |
| No log        | 20.0  | 240  | 0.7570          | 70.2787 | 55.5377 | 63.9121 | 64.3555   | 17.2031 |
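The ROUGE figures above can be reproduced with the `evaluate` library. A minimal sketch, assuming decoded predictions and references as plain strings (the `compute_metrics` wiring into the trainer is omitted):

```python
import evaluate

rouge = evaluate.load("rouge")

# Hypothetical decoded outputs; in practice these come from trainer.predict(...).
predictions = ["the cat sat on the mat"]
references = ["a cat was sitting on the mat"]

scores = rouge.compute(predictions=predictions, references=references, use_stemmer=True)
# Scores are F-measures in [0, 1]; the table above reports them scaled to percentages.
print({k: round(v * 100, 4) for k, v in scores.items()})
```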
### Framework versions
- Transformers 4.28.1
- Pytorch 2.0.0+cu118
- Datasets 2.12.0
- Tokenizers 0.13.3