nlewins/mt5-small-finetuned-ceb-to-en-tfB
This model is a fine-tuned version of google/mt5-small on an unknown dataset. After the final training epoch (epoch 14, counting from 0) it achieves the following results on the training and evaluation sets:
- Train Loss: 2.1878
- Validation Loss: 2.8672
- Train Bleu: 9.2653
- Train Gen Len: 36.1537
- Epoch: 14
Model description
More information needed
Intended uses & limitations
More information needed
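As a starting point, the checkpoint can presumably be loaded like any other TensorFlow seq2seq model in Transformers. The sketch below rests on several assumptions: the repository ID matches the title of this card, the task is Cebuano-to-English translation (inferred only from `ceb-to-en` in the name), and no task prefix is required. The helper `translate` and the `max_new_tokens` value are illustrative choices, not documented settings.

```python
MODEL_ID = "nlewins/mt5-small-finetuned-ceb-to-en-tfB"  # assumed: taken from this card's title


def translate(texts, max_new_tokens=64):
    """Translate a list of strings with the fine-tuned checkpoint (hypothetical helper)."""
    # Imported lazily so the module can be inspected without TensorFlow installed.
    from transformers import AutoTokenizer, TFAutoModelForSeq2SeqLM

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = TFAutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # Tokenize the inputs as one padded batch of TensorFlow tensors.
    inputs = tokenizer(texts, return_tensors="tf", padding=True, truncation=True)

    # max_new_tokens is an assumed cap; the card does not state generation settings.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)
```

Calling `translate(["Maayong buntag."])` downloads the weights on first use and needs `transformers`, `tensorflow`, and `sentencepiece` installed. Given the BLEU score of about 9 reported on this card, rough output should be expected.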
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- optimizer: {'name': 'AdamWeightDecay', 'learning_rate': {'module': 'keras.optimizers.schedules', 'class_name': 'ExponentialDecay', 'config': {'initial_learning_rate': 0.0001, 'decay_steps': 10000, 'decay_rate': 0.6, 'staircase': False, 'name': None}, 'registered_name': None}, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-07, 'amsgrad': False, 'weight_decay_rate': 0.001}
- training_precision: float32
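The learning-rate entry above describes a Keras `ExponentialDecay` schedule with `staircase` set to `False`, which decays the rate continuously as `initial_learning_rate * decay_rate ** (step / decay_steps)`. The following is a minimal pure-Python sketch of that formula using the values from the optimizer config; the function name `exponential_decay` is an illustrative choice, and the step counts in the loop are arbitrary samples, not the actual length of this training run.

```python
def exponential_decay(step, initial_learning_rate=1e-4, decay_steps=10_000, decay_rate=0.6):
    """Continuous (non-staircase) exponential decay, mirroring the config above."""
    return initial_learning_rate * decay_rate ** (step / decay_steps)


# Sample the schedule at a few arbitrary optimizer steps.
for step in (0, 5_000, 10_000, 20_000):
    print(f"step {step:>6}: lr = {exponential_decay(step):.3e}")
```

Under this schedule the rate falls to 0.6 times its initial value (6e-5) after 10,000 optimizer steps. In the actual run the schedule was passed to `AdamWeightDecay` together with a `weight_decay_rate` of 0.001.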
Training results
| Train Loss | Validation Loss | Train Bleu | Train Gen Len | Epoch |
|---|---|---|---|---|
| 4.4999 | 3.4628 | 1.2745 | 60.2167 | 0 |
| 3.9760 | 3.2352 | 3.6864 | 36.5093 | 1 |
| 3.6750 | 3.1059 | 3.9249 | 47.7481 | 2 |
| 3.4297 | 3.0283 | 4.1365 | 48.5130 | 3 |
| 3.2527 | 2.9850 | 5.4970 | 43.9037 | 4 |
| 3.0913 | 2.9621 | 6.0053 | 38.5148 | 5 |
| 2.9751 | 2.9220 | 5.2995 | 49.2481 | 6 |
| 2.8262 | 2.9061 | 6.3456 | 42.9852 | 7 |
| 2.7195 | 2.9013 | 7.2433 | 40.0963 | 8 |
| 2.6230 | 2.8939 | 7.5359 | 39.3241 | 9 |
| 2.5184 | 2.8758 | 8.3874 | 37.6593 | 10 |
| 2.4252 | 2.8733 | 8.4245 | 38.2185 | 11 |
| 2.3377 | 2.8657 | 7.3978 | 42.1981 | 12 |
| 2.2623 | 2.8664 | 7.5845 | 43.0315 | 13 |
| 2.1878 | 2.8672 | 9.2653 | 36.1537 | 14 |
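The table above can be checked for the best epoch per metric with a few lines of Python. This is only a sketch for reading the logged numbers: the values are copied from the table, while the helper `best_epoch` and the variable names are hypothetical and not part of the training code.

```python
# (train_loss, val_loss, bleu, gen_len) per epoch, copied from the table above.
RESULTS = [
    (4.4999, 3.4628, 1.2745, 60.2167),
    (3.9760, 3.2352, 3.6864, 36.5093),
    (3.6750, 3.1059, 3.9249, 47.7481),
    (3.4297, 3.0283, 4.1365, 48.5130),
    (3.2527, 2.9850, 5.4970, 43.9037),
    (3.0913, 2.9621, 6.0053, 38.5148),
    (2.9751, 2.9220, 5.2995, 49.2481),
    (2.8262, 2.9061, 6.3456, 42.9852),
    (2.7195, 2.9013, 7.2433, 40.0963),
    (2.6230, 2.8939, 7.5359, 39.3241),
    (2.5184, 2.8758, 8.3874, 37.6593),
    (2.4252, 2.8733, 8.4245, 38.2185),
    (2.3377, 2.8657, 7.3978, 42.1981),
    (2.2623, 2.8664, 7.5845, 43.0315),
    (2.1878, 2.8672, 9.2653, 36.1537),
]


def best_epoch(results, column, lowest=True):
    """Return the epoch index whose value in `column` is lowest (or highest)."""
    pick = min if lowest else max
    return pick(range(len(results)), key=lambda epoch: results[epoch][column])


best_val_epoch = best_epoch(RESULTS, column=1)                 # lowest validation loss
best_bleu_epoch = best_epoch(RESULTS, column=2, lowest=False)  # highest BLEU
final_gap = RESULTS[-1][1] - RESULTS[-1][0]                    # val loss minus train loss

print(f"lowest validation loss at epoch {best_val_epoch}")
print(f"highest BLEU at epoch {best_bleu_epoch}")
print(f"final validation/train loss gap: {final_gap:.4f}")
```

Validation loss reaches its minimum at epoch 12 (2.8657) and then drifts up slightly while training loss keeps falling, which may indicate the onset of overfitting. BLEU is noisier and peaks at the final epoch.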
Framework versions
- Transformers 4.33.3
- TensorFlow 2.14.0
- Datasets 2.14.5
- Tokenizers 0.13.3