nlewins/mt5-small-finetuned-ceb-to-en-tfZ
This model is a fine-tuned version of google/mt5-small on an unknown dataset. It reports the following results after the final training epoch (epoch 39, counting from 0):
- Train Loss: 2.3206
- Validation Loss: 3.1352
- Train Bleu: 6.9732
- Train Gen Len: 33.6093
- Epoch: 39
Model description
More information needed
Intended uses & limitations
More information needed
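Until this section is filled in, the following is a minimal inference sketch rather than a documented usage recipe. It assumes the checkpoint is published with TensorFlow weights and loads through the standard `transformers` seq2seq classes, and that the task is Cebuano-to-English translation, as the model name suggests. The function name `translate` and its parameters are hypothetical. Whether a task prefix was used during fine-tuning is not documented, so none is added here.

```python
def translate(texts, model_id, max_new_tokens=64):
    """Translate a list of sentences with a fine-tuned mT5 checkpoint.

    `model_id` is the Hub repository ID of the checkpoint (assumption:
    the ID shown in this card's title; verify it on the Hub first).
    """
    # Imported lazily so this module can be loaded without TensorFlow installed.
    from transformers import AutoTokenizer, TFAutoModelForSeq2SeqLM

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = TFAutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Pad to the longest sentence in the batch and return TensorFlow tensors.
    inputs = tokenizer(texts, return_tensors="tf", padding=True, truncation=True)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)
```

Given the final BLEU score of about 7 reported in this card, outputs should be treated as rough drafts, not reliable translations.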
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- optimizer: {'name': 'AdamWeightDecay', 'learning_rate': 1e-04, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-07, 'amsgrad': False, 'weight_decay_rate': 0.01}
- training_precision: float32
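The optimizer entry above can be turned back into an optimizer object roughly as follows. This is a sketch that assumes the `AdamWeightDecay` class exported by `transformers` for TensorFlow is the one that was used; the helper name `build_optimizer` is hypothetical. The `'decay': 0.0` entry is left out because it appears to be the legacy Keras learning-rate decay argument, which has no effect at 0.0.

```python
# Values copied from the optimizer entry in this card.
OPTIMIZER_CONFIG = {
    "learning_rate": 1e-4,
    "beta_1": 0.9,
    "beta_2": 0.999,
    "epsilon": 1e-7,
    "amsgrad": False,
    "weight_decay_rate": 0.01,
}


def build_optimizer(config=OPTIMIZER_CONFIG):
    """Rebuild the optimizer from the logged hyperparameters."""
    # Imported lazily so the config can be inspected without TensorFlow installed.
    from transformers import AdamWeightDecay

    return AdamWeightDecay(**config)
```

The result can be passed to `model.compile(optimizer=build_optimizer())` on a TensorFlow `transformers` model. No learning-rate schedule is logged in this card, so the sketch uses a constant rate.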
Training results
Train Loss | Validation Loss | Train Bleu | Train Gen Len | Epoch |
---|---|---|---|---|
8.3663 | 4.9326 | 0.0133 | 508.5833 | 0 |
6.4712 | 4.3528 | 0.0497 | 197.7093 | 1 |
5.8897 | 4.1185 | 0.2051 | 122.2426 | 2 |
5.5178 | 3.9618 | 0.3938 | 95.8593 | 3 |
5.2559 | 3.8786 | 0.6506 | 57.1111 | 4 |
5.0615 | 3.8403 | 0.5691 | 51.1630 | 5 |
4.9009 | 3.7958 | 0.8084 | 39.2685 | 6 |
4.7622 | 3.7542 | 0.8647 | 51.6370 | 7 |
4.6515 | 3.7048 | 0.9399 | 55.9722 | 8 |
4.5334 | 3.6699 | 1.1787 | 52.1241 | 9 |
4.4349 | 3.6267 | 1.5369 | 43.9111 | 10 |
4.3203 | 3.5959 | 1.5818 | 40.9870 | 11 |
4.2457 | 3.5649 | 1.4775 | 50.0574 | 12 |
4.1571 | 3.5386 | 1.4422 | 54.0852 | 13 |
4.0728 | 3.5110 | 1.7008 | 48.4907 | 14 |
3.9757 | 3.4783 | 1.9308 | 45.6537 | 15 |
3.9063 | 3.4515 | 2.3652 | 41.0037 | 16 |
3.8218 | 3.4235 | 2.7974 | 41.2296 | 17 |
3.7604 | 3.3947 | 2.4012 | 43.0667 | 18 |
3.6770 | 3.3680 | 2.9153 | 39.3056 | 19 |
3.6081 | 3.3500 | 3.2060 | 40.7574 | 20 |
3.5273 | 3.3267 | 3.0406 | 41.6926 | 21 |
3.4594 | 3.3039 | 3.0793 | 40.8852 | 22 |
3.3859 | 3.2710 | 2.6225 | 53.7741 | 23 |
3.3230 | 3.2544 | 3.0726 | 46.8148 | 24 |
3.2496 | 3.2400 | 4.0928 | 38.8981 | 25 |
3.1771 | 3.2242 | 4.2043 | 39.2056 | 26 |
3.1167 | 3.2031 | 3.9126 | 42.8056 | 27 |
3.0427 | 3.1927 | 4.2052 | 39.0981 | 28 |
2.9833 | 3.1833 | 5.0329 | 34.4593 | 29 |
2.9094 | 3.1710 | 4.7041 | 37.4611 | 30 |
2.8395 | 3.1557 | 5.6423 | 35.4833 | 31 |
2.7686 | 3.1503 | 5.3470 | 36.2185 | 32 |
2.7053 | 3.1500 | 5.5612 | 36.6574 | 33 |
2.6525 | 3.1370 | 5.8202 | 35.2241 | 34 |
2.5782 | 3.1341 | 5.7995 | 38.1556 | 35 |
2.5070 | 3.1295 | 6.8106 | 32.2759 | 36 |
2.4419 | 3.1287 | 6.8896 | 34.6815 | 37 |
2.3756 | 3.1295 | 6.8859 | 36.5444 | 38 |
2.3206 | 3.1352 | 6.9732 | 33.6093 | 39 |
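In the table above, validation loss reaches its minimum of 3.1287 at epoch 37 and rises slightly over the last two epochs, while Train Bleu peaks at epoch 39. Training loss drops below validation loss from epoch 26 onward. The sketch below shows one way to pick a checkpoint from such a log. It uses only the last six rows, copied from the table (all earlier rows have higher validation loss and lower BLEU); the helper name `best_epoch` is hypothetical.

```python
# Last six rows of the training results table:
# (train_loss, val_loss, train_bleu, train_gen_len, epoch)
ROWS = [
    (2.6525, 3.1370, 5.8202, 35.2241, 34),
    (2.5782, 3.1341, 5.7995, 38.1556, 35),
    (2.5070, 3.1295, 6.8106, 32.2759, 36),
    (2.4419, 3.1287, 6.8896, 34.6815, 37),
    (2.3756, 3.1295, 6.8859, 36.5444, 38),
    (2.3206, 3.1352, 6.9732, 33.6093, 39),
]

VAL_LOSS, BLEU, EPOCH = 1, 2, 4  # column indices into each row


def best_epoch(rows, column, mode):
    """Return the epoch whose value in `column` is smallest ("min") or largest ("max")."""
    pick = min if mode == "min" else max
    return pick(rows, key=lambda row: row[column])[EPOCH]


best_by_val_loss = best_epoch(ROWS, VAL_LOSS, "min")
best_by_bleu = best_epoch(ROWS, BLEU, "max")
print(f"lowest validation loss: epoch {best_by_val_loss}")  # epoch 37
print(f"highest BLEU: epoch {best_by_bleu}")  # epoch 39
```

The two criteria disagree here, so which epoch counts as "best" depends on whether loss or BLEU matters more for the intended use.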
Framework versions
- Transformers 4.33.3
- TensorFlow 2.14.0
- Datasets 2.14.5
- Tokenizers 0.13.3
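To check a local environment against the versions listed above, something like the following sketch can be used. It relies only on the standard library; the helper name `version_mismatches` is hypothetical, and the distribution names are assumed to match the usual PyPI package names.

```python
from importlib import metadata

# Versions copied from the list above. Assumption: packages are installed
# under these PyPI names (e.g. not as a platform-specific TensorFlow build).
PINNED = {
    "transformers": "4.33.3",
    "tensorflow": "2.14.0",
    "datasets": "2.14.5",
    "tokenizers": "0.13.3",
}


def version_mismatches(pinned=PINNED, lookup=metadata.version):
    """Return {package: (expected, installed)} for every package that differs.

    `installed` is None when the package is not installed at all.
    """
    mismatches = {}
    for name, expected in pinned.items():
        try:
            installed = lookup(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches
```

Calling `version_mismatches()` with no arguments inspects the current environment; an empty dictionary means every listed package matches.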