# t5-small-finetuned-xsum

This model is a fine-tuned version of [t5-small](https://huggingface.co/t5-small) on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 0.7508
- Rouge1: 80.2235
- Rouge2: 65.3985
- Rougel: 72.688
- Rougelsum: 72.6198
- Gen Len: 22.875
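
Below is a minimal inference sketch using the `transformers` library. The repo id `t5-small-finetuned-xsum` is a placeholder; substitute the actual hub path or local directory where this checkpoint lives.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Placeholder path; point this at the actual checkpoint location.
checkpoint = "t5-small-finetuned-xsum"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

# T5 summarization checkpoints are conventionally prompted with a task prefix.
text = "summarize: " + "Your long input document goes here."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
summary_ids = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```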
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 4e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 20
- mixed_precision_training: Native AMP
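
The Adam betas and epsilon above are the optimizer defaults. A sketch of the corresponding `Seq2SeqTrainingArguments` is shown below; the dataset, preprocessing, and metric function are not recorded in this card, so only the listed hyperparameters are reproduced.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="t5-small-finetuned-xsum",
    learning_rate=4e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=20,
    fp16=True,                    # mixed precision via native AMP
    evaluation_strategy="epoch",  # assumption: the table logs one eval per epoch
    predict_with_generate=True,   # required to decode summaries for ROUGE
)
```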
### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|
| No log | 1.0 | 12 | 1.6130 | 70.4837 | 52.7814 | 63.3244 | 63.3287 | 20.3281 |
| No log | 2.0 | 24 | 1.4112 | 72.9304 | 54.9574 | 64.9972 | 65.0999 | 21.125 |
| No log | 3.0 | 36 | 1.2749 | 73.2378 | 55.3429 | 65.6596 | 65.6389 | 21.4844 |
| No log | 4.0 | 48 | 1.1673 | 73.2812 | 55.4138 | 65.6078 | 65.606 | 21.4688 |
| No log | 5.0 | 60 | 1.0788 | 73.4664 | 56.0973 | 66.1987 | 66.2205 | 21.4375 |
| No log | 6.0 | 72 | 1.0031 | 72.9477 | 55.5372 | 65.7097 | 65.6621 | 21.375 |
| No log | 7.0 | 84 | 0.9399 | 74.12 | 56.6822 | 66.7108 | 66.7065 | 21.6562 |
| No log | 8.0 | 96 | 0.8896 | 74.1359 | 56.5629 | 66.9081 | 66.8847 | 21.875 |
| No log | 9.0 | 108 | 0.8495 | 74.2343 | 57.1158 | 67.0403 | 67.057 | 21.8906 |
| No log | 10.0 | 120 | 0.8210 | 75.9254 | 59.5339 | 69.0056 | 68.979 | 22.1094 |
| No log | 11.0 | 132 | 0.8033 | 77.9804 | 62.5401 | 70.9546 | 70.9545 | 22.2344 |
| No log | 12.0 | 144 | 0.7897 | 78.0975 | 62.6435 | 70.8877 | 70.8848 | 22.2969 |
| No log | 13.0 | 156 | 0.7794 | 78.5314 | 63.3776 | 71.1166 | 70.9786 | 22.4531 |
| No log | 14.0 | 168 | 0.7729 | 79.2929 | 64.5595 | 72.1205 | 72.0081 | 22.4375 |
| No log | 15.0 | 180 | 0.7658 | 79.3893 | 64.524 | 72.1437 | 72.0392 | 22.4844 |
| No log | 16.0 | 192 | 0.7603 | 79.3798 | 64.6518 | 72.1035 | 71.9902 | 22.7031 |
| No log | 17.0 | 204 | 0.7568 | 79.7509 | 64.9921 | 72.2505 | 72.1764 | 22.7656 |
| No log | 18.0 | 216 | 0.7534 | 79.7509 | 64.9921 | 72.2505 | 72.1764 | 22.7656 |
| No log | 19.0 | 228 | 0.7514 | 80.2235 | 65.3985 | 72.688 | 72.6198 | 22.875 |
| No log | 20.0 | 240 | 0.7508 | 80.2235 | 65.3985 | 72.688 | 72.6198 | 22.875 |
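
The ROUGE columns follow the usual 0-100 convention. A sketch of how such scores are typically computed with the `evaluate` library (the exact `compute_metrics` function used in this run is not recorded):

```python
import evaluate

rouge = evaluate.load("rouge")
predictions = ["the cat sat on the mat"]         # decoded model outputs
references = ["the cat was sitting on the mat"]  # reference summaries
scores = rouge.compute(predictions=predictions, references=references)
# Scale from [0, 1] to the 0-100 convention used in the table above.
print({k: round(v * 100, 4) for k, v in scores.items()})
```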
### Framework versions

- Transformers 4.28.1
- Pytorch 2.0.0+cu118
- Datasets 2.11.0
- Tokenizers 0.13.3