# mt5-small-finetuned-ceb-to-en

This model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small) on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 2.7918
- BLEU: 6.6526
- Gen Len: 18.6833
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
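The card does not yet show how to run the model. Below is a minimal inference sketch using the standard `transformers` seq2seq API. The model ID is a placeholder (the base `google/mt5-small` checkpoint, which will not produce useful translations) — substitute the actual Hub path of this fine-tuned checkpoint; the example sentence is illustrative only:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Placeholder: swap in the Hub ID or local directory of the
# fine-tuned "mt5-small-finetuned-ceb-to-en" checkpoint.
model_name = "google/mt5-small"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

def translate(text: str) -> str:
    """Translate a Cebuano sentence to English."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    # The eval Gen Len above is ~19 tokens, so a small generation
    # budget is sufficient for this task.
    outputs = model.generate(**inputs, max_new_tokens=32, num_beams=4)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

print(translate("Maayong buntag!"))
```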
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 30
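For reference, these hyperparameters map onto `Seq2SeqTrainingArguments` roughly as sketched below (the Adam betas and epsilon listed above are the optimizer defaults). The `output_dir`, `evaluation_strategy`, and `predict_with_generate` values are assumptions, not recorded in this card:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="mt5-small-finetuned-ceb-to-en",  # assumed
    learning_rate=1e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=30,
    # Assumed: the results table reports one eval per epoch.
    evaluation_strategy="epoch",
    # Assumed: required for the BLEU / Gen Len metrics above.
    predict_with_generate=True,
)
```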
### Training results

| Training Loss | Epoch | Step | Validation Loss | BLEU   | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:------:|:-------:|
| No log        | 1.0   | 200  | 3.5253          | 0.8512 | 14.7241 |
| No log        | 2.0   | 400  | 3.2010          | 2.4929 | 18.2537 |
| 5.2995        | 3.0   | 600  | 3.0428          | 3.4365 | 18.5389 |
| 5.2995        | 4.0   | 800  | 2.9507          | 3.8069 | 18.5981 |
| 3.4982        | 5.0   | 1000 | 2.9166          | 4.3253 | 18.6185 |
| 3.4982        | 6.0   | 1200 | 2.8733          | 4.4634 | 18.6389 |
| 3.4982        | 7.0   | 1400 | 2.8533          | 4.6691 | 18.7056 |
| 3.033         | 8.0   | 1600 | 2.8196          | 5.0742 | 18.7241 |
| 3.033         | 9.0   | 1800 | 2.8147          | 5.1342 | 18.6963 |
| 2.7548        | 10.0  | 2000 | 2.8037          | 5.1828 | 18.7185 |
| 2.7548        | 11.0  | 2200 | 2.8060          | 5.7308 | 18.7204 |
| 2.7548        | 12.0  | 2400 | 2.7843          | 5.5475 | 18.7611 |
| 2.5246        | 13.0  | 2600 | 2.7785          | 5.8638 | 18.6648 |
| 2.5246        | 14.0  | 2800 | 2.7880          | 5.9661 | 18.65   |
| 2.3599        | 15.0  | 3000 | 2.7751          | 5.8611 | 18.6796 |
| 2.3599        | 16.0  | 3200 | 2.7864          | 6.0026 | 18.6759 |
| 2.3599        | 17.0  | 3400 | 2.7718          | 5.9613 | 18.6815 |
| 2.2156        | 18.0  | 3600 | 2.7731          | 6.0885 | 18.6852 |
| 2.2156        | 19.0  | 3800 | 2.7813          | 5.9766 | 18.7019 |
| 2.1247        | 20.0  | 4000 | 2.7832          | 6.2296 | 18.6481 |
| 2.1247        | 21.0  | 4200 | 2.7895          | 6.0762 | 18.6815 |
| 2.1247        | 22.0  | 4400 | 2.7939          | 6.3456 | 18.7333 |
| 2.0354        | 23.0  | 4600 | 2.7812          | 6.634  | 18.7241 |
| 2.0354        | 24.0  | 4800 | 2.7832          | 6.7509 | 18.6944 |
| 1.9776        | 25.0  | 5000 | 2.7918          | 6.7466 | 18.7204 |
| 1.9776        | 26.0  | 5200 | 2.7892          | 6.6911 | 18.6778 |
| 1.9776        | 27.0  | 5400 | 2.7877          | 6.6905 | 18.7167 |
| 1.9273        | 28.0  | 5600 | 2.7914          | 6.7334 | 18.6944 |
| 1.9273        | 29.0  | 5800 | 2.7926          | 6.7418 | 18.7296 |
| 1.9107        | 30.0  | 6000 | 2.7918          | 6.6526 | 18.6833 |
### Framework versions
- Transformers 4.33.3
- Pytorch 2.0.1+cu118
- Datasets 2.14.5
- Tokenizers 0.13.3