# mt5-small-finetuned-xlsum-pt

This model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small) on the xlsum dataset. It achieves the following results on the evaluation set:
- Loss: 0.0605
- Rouge1: 32.3494
- Rouge2: 30.3905
- Rougel: 32.3618
- Rougelsum: 32.3812
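
The scores above are ROUGE F-measures. As a minimal sketch (not the exact evaluation code used for this run), they can be reproduced with the `evaluate` library, which reports the same four keys; `evaluate` and its `rouge_score` backend are assumed to be installed:

```python
def rouge_scores(predictions, references):
    """Compute ROUGE-1/2/L/Lsum F-measures for paired prediction/reference texts.

    `evaluate` (with its `rouge_score` backend) is imported lazily because it
    is a heavy, optional dependency.
    """
    import evaluate

    rouge = evaluate.load("rouge")
    # Returns a dict with keys "rouge1", "rouge2", "rougeL", "rougeLsum".
    return rouge.compute(predictions=predictions, references=references)
```

For example, `rouge_scores(["o gato dormiu"], ["o gato dormiu no sofá"])` returns a dict of the four ROUGE scores for that single pair.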
## Model description
More information needed
## Intended uses & limitations
More information needed
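
No usage details were provided. As a hedged sketch, the checkpoint can be loaded with the `transformers` summarization pipeline; the model id below is an assumption, so point it at the actual local directory or Hub repo path:

```python
MODEL_ID = "mt5-small-finetuned-xlsum-pt"  # assumed path; adjust to the real checkpoint


def summarize(text: str, model_id: str = MODEL_ID, max_length: int = 64) -> str:
    """Summarize a (Portuguese) article with the fine-tuned mT5 checkpoint."""
    from transformers import pipeline  # lazy import of a heavy dependency

    summarizer = pipeline("summarization", model=model_id)
    return summarizer(text, max_length=max_length)[0]["summary_text"]
```

Note that mT5-small is a small multilingual model fine-tuned here on one summarization dataset, so quality outside that domain and language should not be assumed.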
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5.6e-05
- train_batch_size: 10
- eval_batch_size: 10
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 20
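
At 250 optimizer steps per epoch, the 20 epochs total 5,000 steps. A minimal sketch of the linear schedule, assuming zero warmup steps (the warmup setting was not recorded):

```python
LEARNING_RATE = 5.6e-5
TOTAL_STEPS = 20 * 250  # num_epochs * steps per epoch = 5000


def linear_lr(step: int, base_lr: float = LEARNING_RATE,
              total_steps: int = TOTAL_STEPS) -> float:
    """Linearly decay the learning rate from base_lr at step 0 to 0 at the
    final step (no warmup assumed)."""
    return base_lr * max(0.0, 1.0 - step / total_steps)
```

For instance, the learning rate is 5.6e-05 at step 0, 2.8e-05 at the halfway point (step 2,500), and 0 at step 5,000.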
### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|
| 6.6937        | 1.0   | 250  | 0.3756          | 18.9876 | 12.3642 | 17.2829 | 18.0448   |
| 0.8443        | 2.0   | 500  | 0.2099          | 30.25   | 26.014  | 28.5706 | 29.7488   |
| 0.4789        | 3.0   | 750  | 0.1865          | 32.1825 | 28.5039 | 30.5814 | 31.7753   |
| 0.3354        | 4.0   | 1000 | 0.1681          | 32.3097 | 28.7547 | 30.7607 | 31.9049   |
| 0.276         | 5.0   | 1250 | 0.1478          | 32.1616 | 28.9927 | 31.0027 | 31.8607   |
| 0.2201        | 6.0   | 1500 | 0.1230          | 32.0846 | 29.0709 | 31.1033 | 31.8126   |
| 0.1839        | 7.0   | 1750 | 0.1069          | 32.248  | 29.5883 | 31.5348 | 32.0254   |
| 0.1564        | 8.0   | 2000 | 0.0893          | 32.1162 | 29.825  | 31.8022 | 32.0306   |
| 0.1333        | 9.0   | 2250 | 0.0834          | 32.172  | 30.1658 | 32.1334 | 32.1825   |
| 0.1204        | 10.0  | 2500 | 0.0744          | 32.3183 | 30.3614 | 32.3315 | 32.354    |
| 0.1126        | 11.0  | 2750 | 0.0757          | 32.336  | 30.376  | 32.348  | 32.3675   |
| 0.0978        | 12.0  | 3000 | 0.0618          | 32.336  | 30.376  | 32.348  | 32.3675   |
| 0.0915        | 13.0  | 3250 | 0.0677          | 32.3494 | 30.3905 | 32.3618 | 32.3812   |
| 0.0863        | 14.0  | 3500 | 0.0657          | 32.3494 | 30.3905 | 32.3618 | 32.3812   |
| 0.0837        | 15.0  | 3750 | 0.0583          | 32.3494 | 30.3905 | 32.3618 | 32.3812   |
| 0.0804        | 16.0  | 4000 | 0.0644          | 32.3494 | 30.3905 | 32.3618 | 32.3812   |
| 0.0755        | 17.0  | 4250 | 0.0630          | 32.3494 | 30.3905 | 32.3618 | 32.3812   |
| 0.0755        | 18.0  | 4500 | 0.0604          | 32.3494 | 30.3905 | 32.3618 | 32.3812   |
| 0.0738        | 19.0  | 4750 | 0.0627          | 32.3494 | 30.3905 | 32.3618 | 32.3812   |
| 0.0735        | 20.0  | 5000 | 0.0605          | 32.3494 | 30.3905 | 32.3618 | 32.3812   |
### Framework versions
- Transformers 4.34.1
- Pytorch 2.1.0+cu118
- Datasets 2.14.6
- Tokenizers 0.14.1