# mt5-large-gecid-e8-b8
This model is a fine-tuned version of [google/mt5-large](https://huggingface.co/google/mt5-large) on an unknown dataset. It achieves the following results on the evaluation set (these figures match the epoch 2.83 / step 2500 checkpoint in the training results table below, which has the lowest validation loss):
- Loss: 0.3000
- Rouge1: 64.4729
- Rouge2: 57.8072
- Rougel: 64.3868
- Rougelsum: 64.3569
- Gen Len: 18.7495
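A minimal inference sketch is shown below. The checkpoint identifier `mt5-large-gecid-e8-b8` is a placeholder (substitute the actual local path or Hub repo id), and since the training data is undocumented, the expected input format is an assumption:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Placeholder: substitute the actual checkpoint directory or Hub repo id.
model_name = "mt5-large-gecid-e8-b8"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Example input; the task and input format are not documented in this card,
# so this sentence is only illustrative.
text = "This is an example input sentence."
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```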
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.001
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adafactor
- lr_scheduler_type: linear
- num_epochs: 8
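
These hyperparameters map onto `Seq2SeqTrainingArguments` roughly as in the sketch below. This is a reconstruction, not the original training script; the `output_dir`, `predict_with_generate`, and evaluation settings are assumptions:

```python
from transformers import Seq2SeqTrainingArguments

# Reconstruction of the listed hyperparameters. output_dir and
# predict_with_generate are assumptions, not taken from the original run.
training_args = Seq2SeqTrainingArguments(
    output_dir="mt5-large-gecid-e8-b8",
    learning_rate=1e-3,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adafactor",           # Adafactor optimizer, as listed above
    lr_scheduler_type="linear",
    num_train_epochs=8,
    predict_with_generate=True,  # needed to compute ROUGE during evaluation
)
```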
### Training results
| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 0.8319        | 0.57  | 500  | 0.4310          | 61.7619 | 53.4157 | 61.6684 | 61.6628   | 18.7567 |
| 0.4258        | 1.13  | 1000 | 0.3541          | 62.8056 | 55.1747 | 62.7392 | 62.7231   | 18.7601 |
| 0.2777        | 1.7   | 1500 | 0.3231          | 63.4739 | 56.1433 | 63.366  | 63.3544   | 18.7546 |
| 0.2023        | 2.26  | 2000 | 0.3068          | 64.1314 | 57.5343 | 64.0453 | 64.024    | 18.7546 |
| 0.1432        | 2.83  | 2500 | 0.3000          | 64.4729 | 57.8072 | 64.3868 | 64.3569   | 18.7495 |
| 0.0976        | 3.39  | 3000 | 0.3257          | 64.7215 | 58.3266 | 64.6223 | 64.5957   | 18.7601 |
| 0.0811        | 3.96  | 3500 | 0.3112          | 64.7518 | 58.4888 | 64.6487 | 64.6454   | 18.7648 |
| 0.0472        | 4.52  | 4000 | 0.3389          | 64.9658 | 58.822  | 64.8741 | 64.8621   | 18.7592 |
| 0.0413        | 5.09  | 4500 | 0.3557          | 64.9468 | 58.8286 | 64.8609 | 64.8501   | 18.7592 |
| 0.0248        | 5.66  | 5000 | 0.3452          | 65.2004 | 59.2566 | 65.0876 | 65.0889   | 18.7605 |
| 0.0195        | 6.22  | 5500 | 0.3719          | 65.1043 | 59.083  | 65.0369 | 65.026    | 18.7541 |
| 0.013         | 6.79  | 6000 | 0.3947          | 65.3124 | 59.486  | 65.2434 | 65.2324   | 18.7571 |
| 0.0084        | 7.35  | 6500 | 0.4056          | 65.4053 | 59.6589 | 65.3249 | 65.3115   | 18.7580 |
| 0.0055        | 7.92  | 7000 | 0.4216          | 65.3303 | 59.5344 | 65.2475 | 65.2284   | 18.7567 |
### Framework versions
- Transformers 4.28.1
- Pytorch 1.11.0a0+b6df043
- Datasets 2.12.0
- Tokenizers 0.13.2