# opus-mt-en-zh-hk
This model is a fine-tuned version of steve-tong/opus-mt-en-zh-tw on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 5.7483
- Bleu: 2.0939
- Gen Len: 8.8344
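Since the base model is a MarianMT English-to-Chinese checkpoint, the fine-tuned model should be loadable with the standard `transformers` translation pipeline. A minimal sketch, assuming the checkpoint is published under the Hub id `steve-tong/opus-mt-en-zh-hk` (an assumption inferred from the card title and the base model's namespace):

```python
# Assumed Hub repository id; substitute the actual path if it differs.
MODEL_ID = "steve-tong/opus-mt-en-zh-hk"

def translate(texts, model_id=MODEL_ID):
    """Translate English sentences with the fine-tuned MarianMT checkpoint."""
    from transformers import pipeline  # lazy import keeps the sketch cheap to load
    translator = pipeline("translation", model=model_id)
    return [out["translation_text"] for out in translator(texts, max_length=64)]

if __name__ == "__main__":
    print(translate(["Good morning!"]))
```

Note that with a BLEU of ~2.1 on the evaluation set, translations from this checkpoint should be treated as experimental.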
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 50
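The hyperparameters above map directly onto `transformers.Seq2SeqTrainingArguments`. A hedged sketch of that mapping (the `output_dir` value and `predict_with_generate` flag are assumptions, since the training script is not part of this card):

```python
# The listed hyperparameters expressed as Seq2SeqTrainingArguments keywords.
# Argument names follow the Hugging Face Trainer API; the exact script used
# for this run is not included in the card.
HYPERPARAMS = {
    "learning_rate": 2e-05,
    "per_device_train_batch_size": 16,
    "per_device_eval_batch_size": 16,
    "seed": 42,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 50,
    "predict_with_generate": True,  # assumption: needed to report Bleu / Gen Len
}

def build_training_args(output_dir="opus-mt-en-zh-hk"):
    from transformers import Seq2SeqTrainingArguments  # lazy import
    return Seq2SeqTrainingArguments(output_dir=output_dir, **HYPERPARAMS)
```

The Adam betas (0.9, 0.999) and epsilon (1e-08) listed above match the Trainer defaults, so they need no explicit arguments.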
### Training results
| Training Loss | Epoch | Step | Bleu | Gen Len | Validation Loss |
|:-------------:|:-----:|:----:|:----:|:-------:|:---------------:|
| 6.1985 | 1.0 | 3204 | 0.0368 | 15.7821 | 5.5151 |
| 5.1515 | 2.0 | 6408 | 0.0795 | 19.0206 | 4.8442 |
| 4.4578 | 3.0 | 9612 | 0.1236 | 15.8192 | 4.5900 |
| 4.0205 | 4.0 | 12816 | 0.2263 | 11.7562 | 4.3855 |
| 3.6807 | 5.0 | 16020 | 0.3763 | 10.0861 | 4.2938 |
| 3.3622 | 6.0 | 19224 | 0.8981 | 9.1685 | 4.2150 |
| 3.1207 | 7.0 | 22428 | 0.9003 | 8.7014 | 4.3173 |
| 2.8693 | 8.0 | 25632 | 1.2798 | 8.6273 | 4.2797 |
| 2.7172 | 9.0 | 28836 | 1.3176 | 8.4922 | 4.2541 |
| 2.5925 | 10.0 | 32040 | 1.2774 | 8.6812 | 4.2033 |
| 2.4255 | 11.0 | 35244 | 1.3112 | 8.5317 | 4.3955 |
| 2.3242 | 12.0 | 38448 | 1.4831 | 8.7599 | 4.4269 |
| 2.1889 | 13.0 | 41652 | 1.5538 | 8.6474 | 4.3731 |
| 2.0876 | 14.0 | 44856 | 1.45 | 8.5721 | 4.4453 |
| 2.0078 | 15.0 | 48060 | 1.4117 | 8.6339 | 4.5300 |
| 1.9271 | 16.0 | 51264 | 1.546 | 8.7039 | 4.5676 |
| 1.8291 | 17.0 | 54468 | 1.406 | 8.6009 | 4.6800 |
| 1.7886 | 18.0 | 57672 | 1.2522 | 8.549 | 4.6512 |
| 1.6894 | 19.0 | 60876 | 1.6497 | 8.6231 | 4.8486 |
| 1.6176 | 20.0 | 64080 | 1.5496 | 8.6013 | 4.7852 |
| 1.5721 | 21.0 | 67284 | 1.5994 | 8.7434 | 4.8427 |
| 1.5352 | 22.0 | 70488 | 1.4812 | 8.6895 | 4.8117 |
| 1.4536 | 23.0 | 73692 | 1.527 | 8.7088 | 4.9496 |
| 1.3996 | 24.0 | 76896 | 1.596 | 8.7047 | 5.0385 |
| 1.3619 | 25.0 | 80100 | 1.4476 | 8.9811 | 5.0234 |
| 1.3395 | 26.0 | 83304 | 1.4646 | 8.7657 | 5.0767 |
| 1.2822 | 27.0 | 86508 | 1.3204 | 8.8608 | 5.1034 |
| 1.254 | 28.0 | 89712 | 1.8617 | 8.9263 | 5.1776 |
| 1.1714 | 29.0 | 92916 | 1.3892 | 8.7879 | 5.1935 |
| 1.1895 | 30.0 | 96120 | 1.4488 | 8.7516 | 5.2259 |
| 1.1355 | 31.0 | 99324 | 1.4837 | 8.6726 | 5.3575 |
| 1.114 | 32.0 | 102528 | 1.4092 | 8.6701 | 5.3746 |
| 1.0678 | 33.0 | 105732 | 1.6906 | 8.79 | 5.3924 |
| 1.0689 | 34.0 | 108936 | 1.7832 | 8.8237 | 5.4634 |
| 1.0323 | 35.0 | 112140 | 2.0318 | 8.8081 | 5.4653 |
| 0.9952 | 36.0 | 115344 | 1.9861 | 8.832 | 5.5036 |
| 0.9845 | 37.0 | 118548 | 1.6519 | 8.7566 | 5.5411 |
| 0.9545 | 38.0 | 121752 | 1.6037 | 8.8245 | 5.5439 |
| 0.9143 | 39.0 | 124956 | 2.0811 | 8.8068 | 5.6464 |
| 0.9264 | 40.0 | 128160 | 1.7974 | 9.0354 | 5.6386 |
| 0.8856 | 41.0 | 131364 | 2.0425 | 8.8093 | 5.6490 |
| 0.8818 | 42.0 | 134568 | 2.1628 | 8.7829 | 5.6748 |
| 0.8592 | 43.0 | 137772 | 2.0719 | 8.825 | 5.6744 |
| 0.8536 | 44.0 | 140976 | 1.6899 | 8.8377 | 5.6870 |
| 0.8428 | 45.0 | 144180 | 2.128 | 8.8241 | 5.7233 |
| 0.8315 | 46.0 | 147384 | 2.0585 | 8.8151 | 5.7139 |
| 0.8185 | 47.0 | 150588 | 2.0572 | 8.8299 | 5.7853 |
| 0.8142 | 48.0 | 153792 | 2.0756 | 8.8427 | 5.7462 |
| 0.7832 | 49.0 | 156996 | 2.1042 | 8.8381 | 5.7406 |
| 0.7934 | 50.0 | 160200 | 2.0939 | 8.8344 | 5.7483 |
### Framework versions
- Transformers 4.31.0
- Pytorch 2.0.1+cu117
- Datasets 2.14.0
- Tokenizers 0.13.3