nlewins/mt5-small-finetuned-ceb-to-en-tfF
This model is a fine-tuned version of google/mt5-small on an unknown dataset. After the final training epoch (epoch 14), it reports the following training and validation results:
- Train Loss: 2.8310
- Validation Loss: 2.4015
- Train Bleu: 8.1285
- Train Gen Len: 24.9738
- Epoch: 14
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- optimizer: {'name': 'AdamWeightDecay', 'learning_rate': {'module': 'keras.optimizers.schedules', 'class_name': 'ExponentialDecay', 'config': {'initial_learning_rate': 0.0001, 'decay_steps': 10000, 'decay_rate': 0.6, 'staircase': False, 'name': None}, 'registered_name': None}, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-07, 'amsgrad': False, 'weight_decay_rate': 0.0001}
- training_precision: float32
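To make the learning-rate schedule above easier to read, here is a minimal sketch of what the listed `ExponentialDecay` config evaluates to at a given optimizer step. It assumes the standard non-staircase formula, `initial_learning_rate * decay_rate ** (step / decay_steps)`, and uses plain Python rather than Keras. The function name `exponential_decay_lr` and the sample steps are illustrative choices, not part of the original training code.

```python
def exponential_decay_lr(
    step,
    initial_learning_rate=1e-4,  # 'initial_learning_rate' from the config above
    decay_steps=10_000,          # 'decay_steps' from the config above
    decay_rate=0.6,              # 'decay_rate' from the config above
):
    """Continuous (staircase=False) exponential decay of the learning rate."""
    return initial_learning_rate * decay_rate ** (step / decay_steps)


# Sample steps are arbitrary; the card does not state the total number of steps.
for step in (0, 5_000, 10_000, 20_000):
    print(f"step {step:>6}: lr = {exponential_decay_lr(step):.3e}")
```

With these values the learning rate is multiplied by 0.6 every 10,000 optimizer steps, shrinking smoothly in between rather than in discrete jumps.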
Training results
Train Loss | Validation Loss | Train Bleu | Train Gen Len | Epoch |
---|---|---|---|---|
5.5533 | 3.4696 | 0.3245 | 83.2584 | 0 |
4.9562 | 3.3422 | 0.4530 | 67.5953 | 1 |
4.5932 | 3.2308 | 0.6581 | 69.0826 | 2 |
4.3586 | 3.1293 | 0.8691 | 52.5993 | 3 |
4.1536 | 3.0517 | 0.9126 | 54.0998 | 4 |
3.9791 | 2.9735 | 1.1463 | 46.1946 | 5 |
3.8239 | 2.8983 | 1.5638 | 43.0474 | 6 |
3.6819 | 2.8297 | 3.5463 | 27.3876 | 7 |
3.5491 | 2.7644 | 3.7140 | 26.6296 | 8 |
3.4196 | 2.6981 | 3.7475 | 28.8782 | 9 |
3.2935 | 2.6319 | 4.5706 | 29.1112 | 10 |
3.1766 | 2.5738 | 4.6794 | 29.0564 | 11 |
3.0598 | 2.5163 | 6.6200 | 24.2396 | 12 |
2.9397 | 2.4572 | 6.8257 | 25.4146 | 13 |
2.8310 | 2.4015 | 8.1285 | 24.9738 | 14 |
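One way to read the validation losses in the table is as a per-token perplexity. The sketch below assumes the reported loss is a mean token-level cross-entropy in nats, which the card does not confirm, so treat the result as a rough guide only. The dictionary `val_loss_by_epoch` and the helper `perplexity` are illustrative names; the loss values are copied from the first and last rows of the table.

```python
import math

# Validation losses copied from epochs 0 and 14 of the table above.
val_loss_by_epoch = {0: 3.4696, 14: 2.4015}


def perplexity(loss):
    """Perplexity under the assumption that loss is mean cross-entropy in nats."""
    return math.exp(loss)


ppl = {epoch: perplexity(loss) for epoch, loss in val_loss_by_epoch.items()}
for epoch, value in ppl.items():
    print(f"epoch {epoch:>2}: validation perplexity ~ {value:.2f}")
```

Under that assumption, validation perplexity falls by roughly a factor of three over the 15 epochs, while both loss columns are still decreasing at the last epoch.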
Framework versions
- Transformers 4.34.0
- TensorFlow 2.14.0
- Datasets 2.14.5
- Tokenizers 0.14.1