# fugumt-en-ja-finetuned-en-to-ja-21939

This model is a fine-tuned version of [staka/fugumt-en-ja](https://huggingface.co/staka/fugumt-en-ja) on an unspecified dataset. It achieves the following results on the evaluation set:

- Loss: 0.0581
- Bleu: 89.7632
- Gen Len: 14.3064
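Since the base model is a MarianMT-style translation model, the fine-tuned checkpoint should load through the standard `translation` pipeline. A minimal inference sketch; the checkpoint path is a placeholder for wherever this model is actually hosted:

```python
from transformers import pipeline

# Placeholder path: substitute the Hub repo or local directory that
# holds this fine-tuned checkpoint.
translator = pipeline(
    "translation",
    model="your-namespace/fugumt-en-ja-finetuned-en-to-ja-21939",
)

print(translator("This is a test sentence.")[0]["translation_text"])
```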
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a configuration sketch follows the list):
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 25
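For reference, here is a sketch of `Seq2SeqTrainingArguments` matching the hyperparameters above (Transformers 4.34.0). Dataset preparation and the `Seq2SeqTrainer` wiring are omitted; `output_dir` and the per-epoch evaluation strategy are assumptions inferred from the results table, not confirmed details of this run:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="fugumt-en-ja-finetuned-en-to-ja-21939",  # hypothetical
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    num_train_epochs=25,
    lr_scheduler_type="linear",
    # Adam betas=(0.9, 0.999) and epsilon=1e-08 are the library defaults.
    evaluation_strategy="epoch",   # assumed: the table reports one eval per epoch
    predict_with_generate=True,    # required to compute Bleu / Gen Len
)
```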
### Training results
| Training Loss | Epoch | Step  | Validation Loss | Bleu    | Gen Len |
|---------------|-------|-------|-----------------|---------|---------|
| 1.383         | 1.0   | 1372  | 1.0017          | 34.4203 | 14.7159 |
| 0.9732        | 2.0   | 2744  | 0.7704          | 45.7145 | 14.4503 |
| 0.7675        | 3.0   | 4116  | 0.6137          | 52.183  | 14.4216 |
| 0.6561        | 4.0   | 5488  | 0.5079          | 56.5552 | 14.5663 |
| 0.5517        | 5.0   | 6860  | 0.4237          | 61.6533 | 14.3675 |
| 0.4903        | 6.0   | 8232  | 0.3585          | 64.3794 | 14.4424 |
| 0.4332        | 7.0   | 9604  | 0.3038          | 67.2954 | 14.3566 |
| 0.3888        | 8.0   | 10976 | 0.2617          | 70.2939 | 14.4263 |
| 0.3368        | 9.0   | 12348 | 0.2272          | 72.0016 | 14.4322 |
| 0.3111        | 10.0  | 13720 | 0.1968          | 75.4577 | 14.3623 |
| 0.2758        | 11.0  | 15092 | 0.1691          | 76.7411 | 14.4518 |
| 0.2423        | 12.0  | 16464 | 0.1477          | 79.1824 | 14.3626 |
| 0.2253        | 13.0  | 17836 | 0.1319          | 81.0754 | 14.3106 |
| 0.2119        | 14.0  | 19208 | 0.1168          | 82.8484 | 14.3082 |
| 0.1946        | 15.0  | 20580 | 0.1033          | 84.1828 | 14.2985 |
| 0.1766        | 16.0  | 21952 | 0.0941          | 84.3156 | 14.3937 |
| 0.1663        | 17.0  | 23324 | 0.0851          | 86.1691 | 14.3255 |
| 0.1537        | 18.0  | 24696 | 0.0788          | 86.1466 | 14.3622 |
| 0.1465        | 19.0  | 26068 | 0.0727          | 88.1811 | 14.311  |
| 0.1385        | 20.0  | 27440 | 0.0674          | 89.3273 | 14.2689 |
| 0.1327        | 21.0  | 28812 | 0.0644          | 88.8862 | 14.3216 |
| 0.127         | 22.0  | 30184 | 0.0618          | 88.6946 | 14.3412 |
| 0.1214        | 23.0  | 31556 | 0.0596          | 89.6609 | 14.3072 |
| 0.1174        | 24.0  | 32928 | 0.0583          | 89.8021 | 14.3028 |
| 0.1142        | 25.0  | 34300 | 0.0581          | 89.7632 | 14.3064 |
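The Bleu column is presumably the sacrebleu corpus score (0-100 scale) and Gen Len the mean generated-sequence length in tokens, as computed in the standard Transformers translation examples; a minimal sketch under that assumption, with illustrative data only:

```python
import evaluate

# sacrebleu expects a list of prediction strings and a list of
# reference lists (one list of references per prediction).
bleu = evaluate.load("sacrebleu")

predictions = ["これはテスト文です。"]
references = [["これはテスト文です。"]]

result = bleu.compute(predictions=predictions, references=references)
print(round(result["score"], 4))  # comparable to the Bleu column above
```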
### Framework versions
- Transformers 4.34.0
- Pytorch 2.1.0+cu118
- Datasets 2.14.5
- Tokenizers 0.14.1