mlm_bert-steps27053-bs4096-0.0003-8-8-512-0.1
This model is a fine-tuned version of an unspecified base model on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 1.8228
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0003
- train_batch_size: 512
- eval_batch_size: 512
- seed: 42
- distributed_type: multi-GPU
- num_devices: 2
- gradient_accumulation_steps: 8
- total_train_batch_size: 8192
- total_eval_batch_size: 1024
- optimizer: Adam with betas=(0.9,0.98) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- training_steps: 27053
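The derived values in the list above can be checked with a little arithmetic. The following is a minimal sketch, not the training script itself: the constant names and the `linear_lr` helper are hypothetical, and it assumes that the warmup length is the warmup ratio times the step budget, rounded up, and that the linear scheduler ramps from 0 to the peak rate and then decays linearly to 0.

```python
import math

# Values copied from the hyperparameter list above.
LEARNING_RATE = 3e-4
PER_DEVICE_TRAIN_BATCH_SIZE = 512
NUM_DEVICES = 2
GRADIENT_ACCUMULATION_STEPS = 8
WARMUP_RATIO = 0.1
TRAINING_STEPS = 27053

# Number of sequences that contribute to one optimizer update.
total_train_batch_size = (
    PER_DEVICE_TRAIN_BATCH_SIZE * NUM_DEVICES * GRADIENT_ACCUMULATION_STEPS
)

# Assumption: warmup length is ratio * total steps, rounded up.
warmup_steps = math.ceil(TRAINING_STEPS * WARMUP_RATIO)


def linear_lr(step: int) -> float:
    """Hypothetical helper: learning rate at an optimizer step under
    linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        return LEARNING_RATE * step / warmup_steps
    remaining = max(0, TRAINING_STEPS - step)
    return LEARNING_RATE * remaining / (TRAINING_STEPS - warmup_steps)


print(total_train_batch_size)  # 8192
print(warmup_steps)            # 2706
```

The product 512 × 2 × 8 reproduces the reported `total_train_batch_size` of 8192, and 512 × 2 gives the reported `total_eval_batch_size` of 1024, since gradient accumulation does not apply to evaluation.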
Training results
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
9.2672 | 0.15 | 100 | 9.1858 |
8.166 | 0.3 | 200 | 8.0656 |
6.9338 | 0.44 | 300 | 6.8679 |
6.4326 | 0.59 | 400 | 6.3892 |
6.2326 | 0.74 | 500 | 6.1984 |
6.1182 | 0.89 | 600 | 6.0812 |
6.0395 | 1.03 | 700 | 6.0034 |
5.9768 | 1.18 | 800 | 5.9629 |
5.9271 | 1.33 | 900 | 5.9083 |
5.8918 | 1.48 | 1000 | 5.8730 |
5.8617 | 1.63 | 1100 | 5.8512 |
5.84 | 1.77 | 1200 | 5.8212 |
5.8115 | 1.92 | 1300 | 5.8011 |
5.7889 | 2.07 | 1400 | 5.7758 |
5.7719 | 2.22 | 1500 | 5.7586 |
5.7512 | 2.37 | 1600 | 5.7396 |
5.7314 | 2.51 | 1700 | 5.7243 |
5.7207 | 2.66 | 1800 | 5.7065 |
5.7027 | 2.81 | 1900 | 5.6940 |
5.6889 | 2.96 | 2000 | 5.6751 |
5.6756 | 3.1 | 2100 | 5.6593 |
5.6648 | 3.25 | 2200 | 5.6497 |
5.6471 | 3.4 | 2300 | 5.6359 |
5.6326 | 3.55 | 2400 | 5.6251 |
5.6297 | 3.7 | 2500 | 5.6168 |
5.6125 | 3.84 | 2600 | 5.6039 |
5.6059 | 3.99 | 2700 | 5.5956 |
5.5906 | 4.14 | 2800 | 5.5832 |
5.5898 | 4.29 | 2900 | 5.5857 |
5.5816 | 4.44 | 3000 | 5.5710 |
5.5691 | 4.58 | 3100 | 5.5611 |
5.4637 | 4.73 | 3200 | 5.4215 |
5.3117 | 4.88 | 3300 | 5.2148 |
5.1268 | 5.03 | 3400 | 4.9871 |
4.9629 | 5.17 | 3500 | 4.7933 |
4.8129 | 5.32 | 3600 | 4.5922 |
4.4889 | 5.47 | 3700 | 4.2237 |
4.0294 | 5.62 | 3800 | 3.7873 |
3.6563 | 5.77 | 3900 | 3.4882 |
3.4771 | 5.91 | 4000 | 3.3127 |
3.3476 | 6.06 | 4100 | 3.1826 |
3.2367 | 6.21 | 4200 | 3.0838 |
3.146 | 6.36 | 4300 | 2.9982 |
3.08 | 6.51 | 4400 | 2.9275 |
3.0095 | 6.65 | 4500 | 2.8609 |
2.946 | 6.8 | 4600 | 2.8042 |
2.8998 | 6.95 | 4700 | 2.7545 |
2.8479 | 7.1 | 4800 | 2.7136 |
2.811 | 7.24 | 4900 | 2.6697 |
2.7711 | 7.39 | 5000 | 2.6365 |
2.7317 | 7.54 | 5100 | 2.5999 |
2.6977 | 7.69 | 5200 | 2.5754 |
2.6692 | 7.84 | 5300 | 2.5444 |
2.64 | 7.98 | 5400 | 2.5163 |
2.6158 | 8.13 | 5500 | 2.4947 |
2.5959 | 8.28 | 5600 | 2.4712 |
2.5721 | 8.43 | 5700 | 2.4514 |
2.5509 | 8.58 | 5800 | 2.4347 |
2.5319 | 8.72 | 5900 | 2.4133 |
2.5184 | 8.87 | 6000 | 2.3941 |
2.5021 | 9.02 | 6100 | 2.3781 |
2.4867 | 9.17 | 6200 | 2.3639 |
2.4694 | 9.31 | 6300 | 2.3511 |
2.4585 | 9.46 | 6400 | 2.3360 |
2.4411 | 9.61 | 6500 | 2.3206 |
2.43 | 9.76 | 6600 | 2.3104 |
2.4194 | 9.91 | 6700 | 2.2952 |
2.4079 | 10.05 | 6800 | 2.2863 |
2.3947 | 10.2 | 6900 | 2.2767 |
2.3809 | 10.35 | 7000 | 2.2628 |
2.3741 | 10.5 | 7100 | 2.2527 |
2.361 | 10.64 | 7200 | 2.2429 |
2.3544 | 10.79 | 7300 | 2.2363 |
2.3429 | 10.94 | 7400 | 2.2265 |
2.3354 | 11.09 | 7500 | 2.2156 |
2.3232 | 11.24 | 7600 | 2.2077 |
2.3187 | 11.38 | 7700 | 2.2002 |
2.3104 | 11.53 | 7800 | 2.1944 |
2.3009 | 11.68 | 7900 | 2.1853 |
2.2959 | 11.83 | 8000 | 2.1755 |
2.285 | 11.98 | 8100 | 2.1701 |
2.2847 | 12.12 | 8200 | 2.1632 |
2.2711 | 12.27 | 8300 | 2.1556 |
2.2617 | 12.42 | 8400 | 2.1455 |
2.2552 | 12.57 | 8500 | 2.1415 |
2.2466 | 12.71 | 8600 | 2.1311 |
2.2437 | 12.86 | 8700 | 2.1235 |
2.2372 | 13.01 | 8800 | 2.1178 |
2.2271 | 13.16 | 8900 | 2.1109 |
2.2232 | 13.31 | 9000 | 2.1061 |
2.2188 | 13.45 | 9100 | 2.1014 |
2.2111 | 13.6 | 9200 | 2.0953 |
2.2014 | 13.75 | 9300 | 2.0884 |
2.2005 | 13.9 | 9400 | 2.0823 |
2.1918 | 14.05 | 9500 | 2.0775 |
2.1874 | 14.19 | 9600 | 2.0713 |
2.1828 | 14.34 | 9700 | 2.0672 |
2.1802 | 14.49 | 9800 | 2.0624 |
2.1688 | 14.64 | 9900 | 2.0577 |
2.1686 | 14.78 | 10000 | 2.0535 |
2.1642 | 14.93 | 10100 | 2.0471 |
2.1598 | 15.08 | 10200 | 2.0424 |
2.1561 | 15.23 | 10300 | 2.0390 |
2.1528 | 15.38 | 10400 | 2.0353 |
2.1491 | 15.52 | 10500 | 2.0346 |
2.1437 | 15.67 | 10600 | 2.0272 |
2.1418 | 15.82 | 10700 | 2.0239 |
2.1371 | 15.97 | 10800 | 2.0206 |
2.1323 | 16.12 | 10900 | 2.0180 |
2.1309 | 16.26 | 11000 | 2.0133 |
2.1274 | 16.41 | 11100 | 2.0117 |
2.1225 | 16.56 | 11200 | 2.0069 |
2.1187 | 16.71 | 11300 | 2.0035 |
2.1173 | 16.85 | 11400 | 2.0018 |
2.1152 | 17.0 | 11500 | 1.9992 |
2.1059 | 17.15 | 11600 | 1.9942 |
2.1081 | 17.3 | 11700 | 1.9920 |
2.106 | 17.45 | 11800 | 1.9886 |
2.1013 | 17.59 | 11900 | 1.9868 |
2.0974 | 17.74 | 12000 | 1.9850 |
2.0956 | 17.89 | 12100 | 1.9795 |
2.0916 | 18.04 | 12200 | 1.9789 |
2.089 | 18.19 | 12300 | 1.9773 |
2.0876 | 18.33 | 12400 | 1.9740 |
2.0871 | 18.48 | 12500 | 1.9692 |
2.0772 | 18.63 | 12600 | 1.9686 |
2.0823 | 18.78 | 12700 | 1.9661 |
2.0756 | 18.92 | 12800 | 1.9621 |
2.0748 | 19.07 | 12900 | 1.9625 |
2.0708 | 19.22 | 13000 | 1.9572 |
2.0736 | 19.37 | 13100 | 1.9544 |
2.0681 | 19.52 | 13200 | 1.9531 |
2.0639 | 19.66 | 13300 | 1.9522 |
2.0603 | 19.81 | 13400 | 1.9493 |
2.0666 | 19.96 | 13500 | 1.9465 |
2.0609 | 20.11 | 13600 | 1.9461 |
2.0552 | 20.26 | 13700 | 1.9417 |
2.0596 | 20.4 | 13800 | 1.9420 |
2.0569 | 20.55 | 13900 | 1.9376 |
2.0546 | 20.7 | 14000 | 1.9385 |
2.0518 | 20.85 | 14100 | 1.9342 |
2.0519 | 20.99 | 14200 | 1.9310 |
2.0503 | 21.14 | 14300 | 1.9322 |
2.0471 | 21.29 | 14400 | 1.9303 |
2.0447 | 21.44 | 14500 | 1.9282 |
2.044 | 21.59 | 14600 | 1.9236 |
2.041 | 21.73 | 14700 | 1.9224 |
2.0342 | 21.88 | 14800 | 1.9227 |
2.0333 | 22.03 | 14900 | 1.9222 |
2.0374 | 22.18 | 15000 | 1.9174 |
2.0319 | 22.32 | 15100 | 1.9173 |
2.0317 | 22.47 | 15200 | 1.9159 |
2.0304 | 22.62 | 15300 | 1.9135 |
2.0271 | 22.77 | 15400 | 1.9128 |
2.0283 | 22.92 | 15500 | 1.9123 |
2.0209 | 23.06 | 15600 | 1.9092 |
2.0227 | 23.21 | 15700 | 1.9069 |
2.0211 | 23.36 | 15800 | 1.9050 |
2.0195 | 23.51 | 15900 | 1.9062 |
2.0156 | 23.66 | 16000 | 1.9029 |
2.0155 | 23.8 | 16100 | 1.9015 |
2.0156 | 23.95 | 16200 | 1.9004 |
2.0133 | 24.1 | 16300 | 1.8988 |
2.0092 | 24.25 | 16400 | 1.8965 |
2.0091 | 24.39 | 16500 | 1.8974 |
2.01 | 24.54 | 16600 | 1.8964 |
2.0119 | 24.69 | 16700 | 1.8923 |
2.0055 | 24.84 | 16800 | 1.8925 |
2.0063 | 24.99 | 16900 | 1.8894 |
2.0046 | 25.13 | 17000 | 1.8895 |
2.0034 | 25.28 | 17100 | 1.8881 |
2.0014 | 25.43 | 17200 | 1.8872 |
2.0042 | 25.58 | 17300 | 1.8853 |
2.0014 | 25.73 | 17400 | 1.8840 |
1.9992 | 25.87 | 17500 | 1.8842 |
1.9948 | 26.02 | 17600 | 1.8826 |
1.9992 | 26.17 | 17700 | 1.8815 |
1.997 | 26.32 | 17800 | 1.8792 |
1.9942 | 26.46 | 17900 | 1.8801 |
1.9916 | 26.61 | 18000 | 1.8765 |
1.9965 | 26.76 | 18100 | 1.8764 |
1.9921 | 26.91 | 18200 | 1.8751 |
1.9907 | 27.06 | 18300 | 1.8745 |
1.9918 | 27.2 | 18400 | 1.8733 |
1.9864 | 27.35 | 18500 | 1.8727 |
1.9865 | 27.5 | 18600 | 1.8715 |
1.9881 | 27.65 | 18700 | 1.8699 |
1.9834 | 27.8 | 18800 | 1.8689 |
1.9835 | 27.94 | 18900 | 1.8677 |
1.9738 | 28.09 | 19000 | 1.8675 |
1.9807 | 28.24 | 19100 | 1.8679 |
1.9828 | 28.39 | 19200 | 1.8661 |
1.9813 | 28.53 | 19300 | 1.8645 |
1.9772 | 28.68 | 19400 | 1.8642 |
1.9766 | 28.83 | 19500 | 1.8643 |
1.9805 | 28.98 | 19600 | 1.8615 |
1.9746 | 29.13 | 19700 | 1.8627 |
1.9767 | 29.27 | 19800 | 1.8617 |
1.9751 | 29.42 | 19900 | 1.8605 |
1.9742 | 29.57 | 20000 | 1.8597 |
1.9734 | 29.72 | 20100 | 1.8574 |
1.9695 | 29.87 | 20200 | 1.8551 |
1.9695 | 30.01 | 20300 | 1.8560 |
1.9734 | 30.16 | 20400 | 1.8555 |
1.9726 | 30.31 | 20500 | 1.8527 |
1.9707 | 30.46 | 20600 | 1.8543 |
1.9683 | 30.6 | 20700 | 1.8534 |
1.9675 | 30.75 | 20800 | 1.8522 |
1.9668 | 30.9 | 20900 | 1.8496 |
1.9687 | 31.05 | 21000 | 1.8500 |
1.9678 | 31.2 | 21100 | 1.8493 |
1.9659 | 31.34 | 21200 | 1.8497 |
1.962 | 31.49 | 21300 | 1.8483 |
1.9637 | 31.64 | 21400 | 1.8483 |
1.9612 | 31.79 | 21500 | 1.8468 |
1.9642 | 31.93 | 21600 | 1.8463 |
1.9585 | 32.08 | 21700 | 1.8457 |
1.9616 | 32.23 | 21800 | 1.8458 |
1.9593 | 32.38 | 21900 | 1.8429 |
1.9579 | 32.53 | 22000 | 1.8431 |
1.9576 | 32.67 | 22100 | 1.8439 |
1.9552 | 32.82 | 22200 | 1.8409 |
1.9586 | 32.97 | 22300 | 1.8414 |
1.9582 | 33.12 | 22400 | 1.8415 |
1.9542 | 33.27 | 22500 | 1.8406 |
1.9557 | 33.41 | 22600 | 1.8406 |
1.9528 | 33.56 | 22700 | 1.8397 |
1.9542 | 33.71 | 22800 | 1.8374 |
1.9584 | 33.86 | 22900 | 1.8380 |
1.9549 | 34.0 | 23000 | 1.8366 |
1.9549 | 34.15 | 23100 | 1.8365 |
1.9551 | 34.3 | 23200 | 1.8368 |
1.9505 | 34.45 | 23300 | 1.8367 |
1.953 | 34.6 | 23400 | 1.8351 |
1.9508 | 34.74 | 23500 | 1.8345 |
1.9473 | 34.89 | 23600 | 1.8336 |
1.9507 | 35.04 | 23700 | 1.8339 |
1.9499 | 35.19 | 23800 | 1.8329 |
1.9486 | 35.34 | 23900 | 1.8342 |
1.946 | 35.48 | 24000 | 1.8313 |
1.9423 | 35.63 | 24100 | 1.8319 |
1.9439 | 35.78 | 24200 | 1.8315 |
1.9425 | 35.93 | 24300 | 1.8308 |
1.944 | 36.07 | 24400 | 1.8304 |
1.9462 | 36.22 | 24500 | 1.8301 |
1.9424 | 36.37 | 24600 | 1.8295 |
1.9443 | 36.52 | 24700 | 1.8294 |
1.9463 | 36.67 | 24800 | 1.8282 |
1.9448 | 36.81 | 24900 | 1.8290 |
1.9424 | 36.96 | 25000 | 1.8271 |
1.9428 | 37.11 | 25100 | 1.8285 |
1.9503 | 37.26 | 25200 | 1.8263 |
1.9474 | 37.41 | 25300 | 1.8277 |
1.9406 | 37.55 | 25400 | 1.8260 |
1.9407 | 37.7 | 25500 | 1.8267 |
1.946 | 37.85 | 25600 | 1.8257 |
1.9395 | 38.0 | 25700 | 1.8254 |
1.9412 | 38.14 | 25800 | 1.8255 |
1.9421 | 38.29 | 25900 | 1.8252 |
1.9398 | 38.44 | 26000 | 1.8257 |
1.9373 | 38.59 | 26100 | 1.8239 |
1.938 | 38.74 | 26200 | 1.8234 |
1.9399 | 38.88 | 26300 | 1.8230 |
1.9382 | 39.03 | 26400 | 1.8235 |
1.9377 | 39.18 | 26500 | 1.8216 |
1.9358 | 39.33 | 26600 | 1.8217 |
1.9366 | 39.48 | 26700 | 1.8227 |
1.9375 | 39.62 | 26800 | 1.8224 |
1.9385 | 39.77 | 26900 | 1.8233 |
1.9373 | 39.92 | 27000 | 1.8228 |
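A few derived figures can help when reading the table above. This is a rough sketch under stated assumptions: it treats the reported loss as a mean cross-entropy in nats (so that its exponential is a perplexity over the predicted tokens), and it estimates the epoch size from the rounded epoch column, so the results are approximate. The variable names are illustrative only.

```python
import math

# Final row of the results table above: epoch, step, validation loss.
FINAL_EPOCH = 39.92
FINAL_STEP = 27000
FINAL_VAL_LOSS = 1.8228

# From the hyperparameter list.
TOTAL_TRAIN_BATCH_SIZE = 8192

# Assumption: the loss is mean cross-entropy in nats, so exp(loss)
# can be read as a perplexity.
perplexity = math.exp(FINAL_VAL_LOSS)

# The epoch column is rounded to two decimals, so these are estimates.
steps_per_epoch = FINAL_STEP / FINAL_EPOCH
examples_per_epoch = steps_per_epoch * TOTAL_TRAIN_BATCH_SIZE

print(f"validation perplexity: about {perplexity:.2f}")
print(f"optimizer steps per epoch: about {steps_per_epoch:.0f}")
print(f"training examples per epoch: about {examples_per_epoch:,.0f}")
```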
Framework versions
- Transformers 4.34.1
- Pytorch 2.1.0+cu121
- Datasets 2.14.6
- Tokenizers 0.14.1