MIX2_en-ja_helsinki
This model is a fine-tuned version of Helsinki-NLP/opus-mt-en-jap. The fine-tuning dataset is not recorded in this card. It achieves the following results on the evaluation set:
- Loss: 1.6703
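To make the loss figure easier to interpret, it can be converted to perplexity. The sketch below assumes the reported loss is a mean per-token cross-entropy in nats, which is the usual convention for sequence-to-sequence models trained with the Trainer but is not stated in this card. The helper name `perplexity` is illustrative only.

```python
import math


def perplexity(cross_entropy_loss: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    return math.exp(cross_entropy_loss)


# Final evaluation loss reported above.
final_eval_loss = 1.6703
final_ppl = perplexity(final_eval_loss)
print(f"eval perplexity (assuming loss is in nats): {final_ppl:.2f}")
```

Perplexity computed this way is only comparable between models that share the same tokenizer and evaluation set.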
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0003
- train_batch_size: 96
- eval_batch_size: 64
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 4
- mixed_precision_training: Native AMP
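As a rough illustration, the settings above can be written as a dictionary and the linear schedule sketched in plain Python. The key names follow the argument names of `transformers.TrainingArguments`; mapping `train_batch_size` to a per-device value assumes single-device training without gradient accumulation, which this card does not confirm. The `linear_lr` helper, the zero warmup default, and the `total_steps` value in the usage line are assumptions for illustration, not the Trainer's own code.

```python
# Hyperparameters copied from the list above. Key names mirror
# transformers.TrainingArguments; the per-device mapping is an assumption.
HPARAMS = {
    "learning_rate": 3e-4,
    "per_device_train_batch_size": 96,  # assumes one device, no accumulation
    "per_device_eval_batch_size": 64,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 4,
    "fp16": True,  # listed above as "Native AMP"
}


def linear_lr(step: int, total_steps: int,
              base_lr: float = HPARAMS["learning_rate"],
              warmup_steps: int = 0) -> float:
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Hypothetical run length, used only to show the shape of the schedule.
total_steps = 1000
for step in (0, 250, 500, 750, 1000):
    print(step, linear_lr(step, total_steps))
```

With no warmup, the learning rate falls in a straight line from 0.0003 at step 0 to zero at the final step, so the last epoch trains at a small fraction of the initial rate.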
Training results
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
3.5357 | 0.02 | 4000 | 2.9519 |
2.8601 | 0.04 | 8000 | 2.6962 |
2.6183 | 0.06 | 12000 | 2.5156 |
2.4731 | 0.08 | 16000 | 2.4312 |
2.3731 | 0.1 | 20000 | 2.3575 |
2.2964 | 0.11 | 24000 | 2.3319 |
2.238 | 0.13 | 28000 | 2.2802 |
2.1919 | 0.15 | 32000 | 2.2552 |
2.1479 | 0.17 | 36000 | 2.2354 |
2.1104 | 0.19 | 40000 | 2.2210 |
2.0788 | 0.21 | 44000 | 2.1835 |
2.0552 | 0.23 | 48000 | 2.1391 |
2.0228 | 0.25 | 52000 | 2.1338 |
2.0062 | 0.27 | 56000 | 2.1115 |
1.9868 | 0.29 | 60000 | 2.1025 |
1.9628 | 0.31 | 64000 | 2.1334 |
1.9474 | 0.32 | 68000 | 2.0935 |
1.9318 | 0.34 | 72000 | 2.1030 |
1.9187 | 0.36 | 76000 | 2.0605 |
1.9019 | 0.38 | 80000 | 2.0388 |
1.8916 | 0.4 | 84000 | 2.0360 |
1.8775 | 0.42 | 88000 | 2.0356 |
1.8689 | 0.44 | 92000 | 2.0315 |
1.8558 | 0.46 | 96000 | 2.0169 |
1.8431 | 0.48 | 100000 | 2.0213 |
1.8373 | 0.5 | 104000 | 2.0071 |
1.8224 | 0.52 | 108000 | 2.0093 |
1.8181 | 0.53 | 112000 | 1.9952 |
1.8087 | 0.55 | 116000 | 1.9927 |
1.7998 | 0.57 | 120000 | 1.9726 |
1.7947 | 0.59 | 124000 | 1.9817 |
1.7874 | 0.61 | 128000 | 1.9650 |
1.7781 | 0.63 | 132000 | 1.9688 |
1.7712 | 0.65 | 136000 | 1.9655 |
1.7631 | 0.67 | 140000 | 1.9561 |
1.7577 | 0.69 | 144000 | 1.9529 |
1.7528 | 0.71 | 148000 | 1.9447 |
1.746 | 0.73 | 152000 | 1.9700 |
1.7386 | 0.74 | 156000 | 1.9413 |
1.7329 | 0.76 | 160000 | 1.9329 |
1.7285 | 0.78 | 164000 | 1.9289 |
1.7227 | 0.8 | 168000 | 1.9337 |
1.7186 | 0.82 | 172000 | 1.9263 |
1.7116 | 0.84 | 176000 | 1.9407 |
1.7072 | 0.86 | 180000 | 1.9059 |
1.7032 | 0.88 | 184000 | 1.9380 |
1.6932 | 0.9 | 188000 | 1.9183 |
1.6921 | 0.92 | 192000 | 1.9131 |
1.6875 | 0.94 | 196000 | 1.9180 |
1.6846 | 0.96 | 200000 | 1.9040 |
1.6797 | 0.97 | 204000 | 1.9089 |
1.6725 | 0.99 | 208000 | 1.9024 |
1.6589 | 1.01 | 212000 | 1.8909 |
1.6507 | 1.03 | 216000 | 1.8837 |
1.6441 | 1.05 | 220000 | 1.8906 |
1.6445 | 1.07 | 224000 | 1.8914 |
1.6394 | 1.09 | 228000 | 1.8833 |
1.6382 | 1.11 | 232000 | 1.8837 |
1.6376 | 1.13 | 236000 | 1.8869 |
1.6329 | 1.15 | 240000 | 1.8829 |
1.6294 | 1.17 | 244000 | 1.8845 |
1.6273 | 1.18 | 248000 | 1.8888 |
1.6243 | 1.2 | 252000 | 1.8709 |
1.6226 | 1.22 | 256000 | 1.8418 |
1.6177 | 1.24 | 260000 | 1.8587 |
1.6151 | 1.26 | 264000 | 1.8526 |
1.6111 | 1.28 | 268000 | 1.8494 |
1.6084 | 1.3 | 272000 | 1.8781 |
1.6043 | 1.32 | 276000 | 1.8390 |
1.6011 | 1.34 | 280000 | 1.8603 |
1.5999 | 1.36 | 284000 | 1.8515 |
1.5954 | 1.38 | 288000 | 1.8356 |
1.5936 | 1.39 | 292000 | 1.8530 |
1.5916 | 1.41 | 296000 | 1.8475 |
1.5886 | 1.43 | 300000 | 1.8410 |
1.5883 | 1.45 | 304000 | 1.8153 |
1.5828 | 1.47 | 308000 | 1.8254 |
1.582 | 1.49 | 312000 | 1.8139 |
1.578 | 1.51 | 316000 | 1.8366 |
1.5723 | 1.53 | 320000 | 1.8353 |
1.5705 | 1.55 | 324000 | 1.8230 |
1.5691 | 1.57 | 328000 | 1.8194 |
1.5656 | 1.59 | 332000 | 1.8069 |
1.566 | 1.6 | 336000 | 1.8204 |
1.5604 | 1.62 | 340000 | 1.8307 |
1.5573 | 1.64 | 344000 | 1.8209 |
1.5547 | 1.66 | 348000 | 1.8320 |
1.5545 | 1.68 | 352000 | 1.8179 |
1.5519 | 1.7 | 356000 | 1.8323 |
1.545 | 1.72 | 360000 | 1.8005 |
1.5483 | 1.74 | 364000 | 1.8034 |
1.5454 | 1.76 | 368000 | 1.7997 |
1.5393 | 1.78 | 372000 | 1.8078 |
1.5381 | 1.8 | 376000 | 1.8204 |
1.5347 | 1.81 | 380000 | 1.8071 |
1.5327 | 1.83 | 384000 | 1.7997 |
1.529 | 1.85 | 388000 | 1.8012 |
1.5287 | 1.87 | 392000 | 1.8028 |
1.5273 | 1.89 | 396000 | 1.8103 |
1.5194 | 1.91 | 400000 | 1.8008 |
1.5197 | 1.93 | 404000 | 1.8004 |
1.5218 | 1.95 | 408000 | 1.8024 |
1.514 | 1.97 | 412000 | 1.7852 |
1.5146 | 1.99 | 416000 | 1.7908 |
1.5045 | 2.01 | 420000 | 1.7864 |
1.4876 | 2.02 | 424000 | 1.7813 |
1.4846 | 2.04 | 428000 | 1.7822 |
1.4865 | 2.06 | 432000 | 1.7737 |
1.4857 | 2.08 | 436000 | 1.7668 |
1.4825 | 2.1 | 440000 | 1.7681 |
1.4828 | 2.12 | 444000 | 1.7685 |
1.4821 | 2.14 | 448000 | 1.7636 |
1.4778 | 2.16 | 452000 | 1.7778 |
1.4803 | 2.18 | 456000 | 1.7834 |
1.4766 | 2.2 | 460000 | 1.7801 |
1.4741 | 2.22 | 464000 | 1.7601 |
1.4705 | 2.23 | 468000 | 1.7665 |
1.4739 | 2.25 | 472000 | 1.7604 |
1.4694 | 2.27 | 476000 | 1.7803 |
1.4665 | 2.29 | 480000 | 1.7835 |
1.4668 | 2.31 | 484000 | 1.7670 |
1.4605 | 2.33 | 488000 | 1.7629 |
1.4626 | 2.35 | 492000 | 1.7612 |
1.4627 | 2.37 | 496000 | 1.7612 |
1.4569 | 2.39 | 500000 | 1.7557 |
1.455 | 2.41 | 504000 | 1.7599 |
1.4547 | 2.43 | 508000 | 1.7569 |
1.453 | 2.44 | 512000 | 1.7589 |
1.4515 | 2.46 | 516000 | 1.7679 |
1.4501 | 2.48 | 520000 | 1.7574 |
1.4446 | 2.5 | 524000 | 1.7526 |
1.4456 | 2.52 | 528000 | 1.7506 |
1.4445 | 2.54 | 532000 | 1.7484 |
1.4428 | 2.56 | 536000 | 1.7447 |
1.439 | 2.58 | 540000 | 1.7468 |
1.441 | 2.6 | 544000 | 1.7609 |
1.4358 | 2.62 | 548000 | 1.7498 |
1.4318 | 2.64 | 552000 | 1.7592 |
1.4276 | 2.65 | 556000 | 1.7452 |
1.4317 | 2.67 | 560000 | 1.7500 |
1.4277 | 2.69 | 564000 | 1.7392 |
1.4259 | 2.71 | 568000 | 1.7351 |
1.4239 | 2.73 | 572000 | 1.7385 |
1.4191 | 2.75 | 576000 | 1.7487 |
1.4204 | 2.77 | 580000 | 1.7392 |
1.4176 | 2.79 | 584000 | 1.7372 |
1.4147 | 2.81 | 588000 | 1.7347 |
1.4154 | 2.83 | 592000 | 1.7085 |
1.4134 | 2.85 | 596000 | 1.7103 |
1.4091 | 2.87 | 600000 | 1.7124 |
1.4091 | 2.88 | 604000 | 1.7369 |
1.406 | 2.9 | 608000 | 1.7142 |
1.4028 | 2.92 | 612000 | 1.7376 |
1.4019 | 2.94 | 616000 | 1.7201 |
1.4018 | 2.96 | 620000 | 1.7230 |
1.3959 | 2.98 | 624000 | 1.7206 |
1.3985 | 3.0 | 628000 | 1.7183 |
1.3681 | 3.02 | 632000 | 1.7283 |
1.3668 | 3.04 | 636000 | 1.7330 |
1.3687 | 3.06 | 640000 | 1.7187 |
1.3681 | 3.08 | 644000 | 1.7163 |
1.3687 | 3.09 | 648000 | 1.7249 |
1.364 | 3.11 | 652000 | 1.7283 |
1.364 | 3.13 | 656000 | 1.7091 |
1.3652 | 3.15 | 660000 | 1.7030 |
1.3623 | 3.17 | 664000 | 1.7058 |
1.3604 | 3.19 | 668000 | 1.7101 |
1.3598 | 3.21 | 672000 | 1.7104 |
1.3577 | 3.23 | 676000 | 1.7028 |
1.3574 | 3.25 | 680000 | 1.7023 |
1.3546 | 3.27 | 684000 | 1.7197 |
1.3549 | 3.29 | 688000 | 1.7045 |
1.3534 | 3.3 | 692000 | 1.6990 |
1.3511 | 3.32 | 696000 | 1.6971 |
1.3504 | 3.34 | 700000 | 1.6894 |
1.346 | 3.36 | 704000 | 1.6820 |
1.3467 | 3.38 | 708000 | 1.6920 |
1.3461 | 3.4 | 712000 | 1.6897 |
1.3425 | 3.42 | 716000 | 1.6962 |
1.34 | 3.44 | 720000 | 1.6864 |
1.3408 | 3.46 | 724000 | 1.6860 |
1.3387 | 3.48 | 728000 | 1.6924 |
1.3377 | 3.5 | 732000 | 1.6919 |
1.3378 | 3.51 | 736000 | 1.6858 |
1.334 | 3.53 | 740000 | 1.6816 |
1.3347 | 3.55 | 744000 | 1.6867 |
1.3307 | 3.57 | 748000 | 1.6859 |
1.3316 | 3.59 | 752000 | 1.6896 |
1.3257 | 3.61 | 756000 | 1.6824 |
1.3222 | 3.63 | 760000 | 1.6819 |
1.3247 | 3.65 | 764000 | 1.6809 |
1.3207 | 3.67 | 768000 | 1.6775 |
1.3227 | 3.69 | 772000 | 1.6807 |
1.3203 | 3.71 | 776000 | 1.6750 |
1.3203 | 3.72 | 780000 | 1.6758 |
1.316 | 3.74 | 784000 | 1.6787 |
1.3147 | 3.76 | 788000 | 1.6747 |
1.3146 | 3.78 | 792000 | 1.6718 |
1.3137 | 3.8 | 796000 | 1.6744 |
1.3143 | 3.82 | 800000 | 1.6733 |
1.3123 | 3.84 | 804000 | 1.6754 |
1.3069 | 3.86 | 808000 | 1.6734 |
1.3122 | 3.88 | 812000 | 1.6742 |
1.3074 | 3.9 | 816000 | 1.6742 |
1.3006 | 3.92 | 820000 | 1.6709 |
1.308 | 3.93 | 824000 | 1.6714 |
1.3063 | 3.95 | 828000 | 1.6727 |
1.3036 | 3.97 | 832000 | 1.6711 |
1.3048 | 3.99 | 836000 | 1.6703 |
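Because the dataset is not documented, the log itself is the only hint at its size. The sketch below estimates steps per epoch from the last row of the table and multiplies by the training batch size. It assumes one optimizer step per batch of 96 examples (single device, no gradient accumulation), and the epoch column is rounded to two decimals, so the result is approximate.

```python
# Values taken from the last row of the table and the hyperparameter list.
last_step = 836_000
last_epoch = 3.99        # rounded in the log, so treat the result as approximate
train_batch_size = 96    # assumes one batch of 96 examples per optimizer step

# Steps needed to complete one pass over the training data.
est_steps_per_epoch = last_step / last_epoch

# Approximate number of training examples seen per epoch.
est_examples_per_epoch = est_steps_per_epoch * train_batch_size

print(f"~{est_steps_per_epoch:,.0f} steps per epoch")
print(f"~{est_examples_per_epoch / 1e6:.1f}M training examples per epoch")
```

Under these assumptions the training set is on the order of twenty million sentence pairs. If training used several devices or gradient accumulation, the true figure would be a multiple of this estimate.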
Framework versions
- Transformers 4.19.2
- PyTorch 1.11.0+cu113
- Datasets 2.2.2
- Tokenizers 0.12.1