enlm-r
This model is a fine-tuned version of an unspecified base model on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 1.4837
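For a rough sense of scale, the evaluation loss can be converted to a perplexity. This is a minimal sketch that assumes the reported loss is a mean per-token cross-entropy measured in nats; the card does not state the training objective, so treat the result as indicative only. The helper name `loss_to_perplexity` is illustrative and not part of any library.

```python
import math


def loss_to_perplexity(loss: float) -> float:
    """Convert a mean cross-entropy loss (assumed to be in nats) to perplexity."""
    return math.exp(loss)


# Final evaluation loss reported in this card.
eval_loss = 1.4837
perplexity = loss_to_perplexity(eval_loss)
print(f"perplexity ~ {perplexity:.2f}")
```

If the objective is masked language modeling, this number is a pseudo-perplexity over masked positions and is not comparable to the perplexity of a causal language model.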
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0006
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- gradient_accumulation_steps: 128
- total_train_batch_size: 8192
- total_eval_batch_size: 64
- optimizer: Adam with betas=(0.9,0.98) and epsilon=1e-06
- lr_scheduler_type: polynomial
- lr_scheduler_warmup_steps: 24000
- num_epochs: 81
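The totals in the list above follow from the per-device values, and the scheduler settings imply a long warmup before any decay begins. The sketch below reproduces the batch-size arithmetic and illustrates linear warmup followed by polynomial decay. It makes several assumptions: `power=1.0` and `lr_end=1e-7` are assumed rather than taken from this card, `TOTAL_STEPS` is a hypothetical placeholder because the card does not state the total number of optimizer steps, and both function names are illustrative.

```python
def effective_batch_size(per_device: int, num_devices: int, grad_accum: int = 1) -> int:
    """Number of samples that contribute to one optimizer (or evaluation) step."""
    return per_device * num_devices * grad_accum


def polynomial_lr(step: int, peak_lr: float, warmup_steps: int, total_steps: int,
                  lr_end: float = 1e-7, power: float = 1.0) -> float:
    """Linear warmup to peak_lr, then polynomial decay down to lr_end."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    if step > total_steps:
        return lr_end
    remaining = 1 - (step - warmup_steps) / (total_steps - warmup_steps)
    return (peak_lr - lr_end) * remaining ** power + lr_end


# Values from the hyperparameter list.
train_total = effective_batch_size(16, 4, 128)  # 16 * 4 * 128
eval_total = effective_batch_size(16, 4)        # 16 * 4, no accumulation at eval time

PEAK_LR = 0.0006
WARMUP_STEPS = 24000
TOTAL_STEPS = 40000  # hypothetical: not stated in this card

for step in (0, 12000, 24000, 32000, 40000):
    lr = polynomial_lr(step, PEAK_LR, WARMUP_STEPS, TOTAL_STEPS)
    print(f"step {step:>6}: lr = {lr:.3e}")
```

Under these assumptions the learning rate is still ramping up for the first 24000 steps, which is more than half of the steps logged in the results table.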
Training results
| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| 6.4 | 0.33 | 160 | 10.7903 |
| 6.4 | 0.66 | 320 | 10.1431 |
| 6.4 | 0.99 | 480 | 9.8708 |
| 6.4 | 0.33 | 640 | 9.3884 |
| 6.4 | 0.66 | 800 | 8.7352 |
| 6.4 | 0.99 | 960 | 8.3341 |
| 6.4 | 1.33 | 1120 | 8.0614 |
| 6.4 | 1.66 | 1280 | 7.8582 |
| 4.2719 | 1.99 | 1440 | 7.4879 |
| 3.2 | 3.3 | 1600 | 7.2689 |
| 3.2 | 3.63 | 1760 | 7.1434 |
| 3.2 | 3.96 | 1920 | 7.0576 |
| 3.2 | 4.29 | 2080 | 7.0030 |
| 3.2 | 4.62 | 2240 | 6.9612 |
| 3.2 | 4.95 | 2400 | 6.9394 |
| 3.2 | 5.28 | 2560 | 6.9559 |
| 3.2 | 5.61 | 2720 | 6.8964 |
| 3.2 | 5.94 | 2880 | 6.8939 |
| 3.2 | 6.27 | 3040 | 6.8871 |
| 3.2 | 6.6 | 3200 | 6.8771 |
| 3.2 | 6.93 | 3360 | 6.8617 |
| 3.2 | 7.26 | 3520 | 6.8472 |
| 3.2 | 7.59 | 3680 | 6.8283 |
| 3.2 | 7.92 | 3840 | 6.8082 |
| 3.2 | 8.25 | 4000 | 6.8119 |
| 3.2 | 8.58 | 4160 | 6.7962 |
| 3.2 | 8.91 | 4320 | 6.7751 |
| 3.2 | 9.24 | 4480 | 6.7405 |
| 3.2 | 9.57 | 4640 | 6.7412 |
| 3.2 | 9.9 | 4800 | 6.7279 |
| 3.2 | 10.22 | 4960 | 6.7069 |
| 3.2 | 10.55 | 5120 | 6.6998 |
| 3.2 | 10.88 | 5280 | 6.6875 |
| 3.2 | 11.22 | 5440 | 6.6580 |
| 3.2 | 11.55 | 5600 | 6.6402 |
| 3.2 | 11.88 | 5760 | 6.6281 |
| 3.2 | 12.21 | 5920 | 6.6181 |
| 3.2 | 12.54 | 6080 | 6.5995 |
| 3.2 | 12.87 | 6240 | 6.5970 |
| 3.2 | 13.2 | 6400 | 6.5772 |
| 3.2 | 13.53 | 6560 | 6.5594 |
| 3.2 | 13.85 | 6720 | 6.5400 |
| 3.2 | 14.19 | 6880 | 6.5396 |
| 3.2 | 14.51 | 7040 | 6.5211 |
| 3.2 | 14.84 | 7200 | 6.5140 |
| 3.2 | 15.18 | 7360 | 6.4002 |
| 3.2 | 15.5 | 7520 | 6.3170 |
| 3.2 | 15.83 | 7680 | 6.2621 |
| 3.2 | 16.16 | 7840 | 6.2253 |
| 3.2 | 16.49 | 8000 | 6.1722 |
| 3.2 | 16.82 | 8160 | 6.1106 |
| 3.2 | 17.15 | 8320 | 6.1281 |
| 3.2 | 17.48 | 8480 | 6.0019 |
| 3.2 | 17.81 | 8640 | 5.9069 |
| 3.2 | 18.14 | 8800 | 5.7105 |
| 3.2 | 18.47 | 8960 | 5.2741 |
| 3.2 | 18.8 | 9120 | 5.0369 |
| 5.0352 | 19.13 | 9280 | 4.8148 |
| 4.5102 | 19.26 | 9440 | 4.3175 |
| 4.1247 | 19.59 | 9600 | 3.9518 |
| 3.8443 | 20.12 | 9760 | 3.6712 |
| 3.6334 | 20.45 | 9920 | 3.4654 |
| 3.4698 | 20.78 | 10080 | 3.2994 |
| 3.3267 | 21.11 | 10240 | 3.1638 |
| 3.2173 | 21.44 | 10400 | 3.0672 |
| 3.1255 | 21.77 | 10560 | 2.9687 |
| 3.0344 | 22.1 | 10720 | 2.8865 |
| 2.9645 | 22.43 | 10880 | 2.8104 |
| 2.9046 | 22.76 | 11040 | 2.7497 |
| 2.8707 | 23.09 | 11200 | 2.7040 |
| 2.7903 | 23.42 | 11360 | 2.6416 |
| 2.7339 | 23.75 | 11520 | 2.5891 |
| 2.6894 | 24.08 | 11680 | 2.5370 |
| 2.6461 | 24.41 | 11840 | 2.4960 |
| 2.5976 | 24.74 | 12000 | 2.4508 |
| 2.5592 | 25.07 | 12160 | 2.4194 |
| 2.5305 | 25.4 | 12320 | 2.3790 |
| 2.4993 | 25.73 | 12480 | 2.3509 |
| 2.465 | 26.06 | 12640 | 2.3173 |
| 2.4455 | 26.39 | 12800 | 2.2934 |
| 2.4107 | 26.72 | 12960 | 2.2701 |
| 2.3883 | 27.05 | 13120 | 2.2378 |
| 2.3568 | 27.38 | 13280 | 2.2079 |
| 2.3454 | 27.71 | 13440 | 2.1919 |
| 2.3207 | 28.04 | 13600 | 2.1671 |
| 2.2963 | 28.37 | 13760 | 2.1513 |
| 2.2738 | 28.7 | 13920 | 2.1326 |
| 2.2632 | 29.03 | 14080 | 2.1176 |
| 2.2413 | 29.36 | 14240 | 2.0913 |
| 2.2193 | 29.69 | 14400 | 2.0772 |
| 2.2169 | 30.02 | 14560 | 2.0692 |
| 2.1848 | 30.35 | 14720 | 2.0411 |
| 2.1693 | 30.68 | 14880 | 2.0290 |
| 2.1964 | 31.01 | 15040 | 2.0169 |
| 2.1467 | 31.34 | 15200 | 2.0016 |
| 2.1352 | 31.67 | 15360 | 1.9880 |
| 2.1152 | 32.0 | 15520 | 1.9727 |
| 2.1098 | 32.33 | 15680 | 1.9604 |
| 2.0888 | 32.66 | 15840 | 1.9521 |
| 2.0837 | 32.99 | 16000 | 1.9394 |
| 2.0761 | 33.32 | 16160 | 1.9366 |
| 2.0635 | 33.65 | 16320 | 1.9200 |
| 2.0631 | 33.98 | 16480 | 1.9147 |
| 2.0448 | 34.31 | 16640 | 1.9053 |
| 2.0452 | 34.64 | 16800 | 1.8937 |
| 2.0303 | 34.97 | 16960 | 1.8801 |
| 2.0184 | 35.3 | 17120 | 1.8752 |
| 2.0115 | 35.63 | 17280 | 1.8667 |
| 2.0042 | 35.96 | 17440 | 1.8626 |
| 2.002 | 36.29 | 17600 | 1.8565 |
| 1.9918 | 36.62 | 17760 | 1.8475 |
| 1.9868 | 36.95 | 17920 | 1.8420 |
| 1.9796 | 37.28 | 18080 | 1.8376 |
| 1.976 | 37.61 | 18240 | 1.8318 |
| 1.9647 | 37.94 | 18400 | 1.8225 |
| 1.9561 | 38.27 | 18560 | 1.8202 |
| 1.9544 | 38.6 | 18720 | 1.8084 |
| 1.9454 | 38.93 | 18880 | 1.8057 |
| 1.9333 | 39.26 | 19040 | 1.8030 |
| 1.9411 | 39.59 | 19200 | 1.7966 |
| 1.9289 | 39.92 | 19360 | 1.7865 |
| 1.9261 | 40.25 | 19520 | 1.7815 |
| 1.9207 | 40.58 | 19680 | 1.7881 |
| 1.9164 | 40.91 | 19840 | 1.7747 |
| 1.9152 | 41.24 | 20000 | 1.7786 |
| 1.914 | 41.57 | 20160 | 1.7664 |
| 1.901 | 41.9 | 20320 | 1.7586 |
| 1.8965 | 42.23 | 20480 | 1.7554 |
| 1.8982 | 42.56 | 20640 | 1.7524 |
| 1.8941 | 42.89 | 20800 | 1.7460 |
| 1.8834 | 43.22 | 20960 | 1.7488 |
| 1.8841 | 43.55 | 21120 | 1.7486 |
| 1.8846 | 43.88 | 21280 | 1.7424 |
| 1.8763 | 44.21 | 21440 | 1.7352 |
| 1.8688 | 44.54 | 21600 | 1.7349 |
| 1.8714 | 44.87 | 21760 | 1.7263 |
| 1.8653 | 45.2 | 21920 | 1.7282 |
| 1.8673 | 45.53 | 22080 | 1.7195 |
| 1.8682 | 45.85 | 22240 | 1.7266 |
| 1.8532 | 46.19 | 22400 | 1.7180 |
| 1.8553 | 46.51 | 22560 | 1.7137 |
| 1.8569 | 46.84 | 22720 | 1.7158 |
| 1.8469 | 47.18 | 22880 | 1.7117 |
| 1.845 | 47.5 | 23040 | 1.7031 |
| 1.8475 | 47.83 | 23200 | 1.7089 |
| 1.845 | 48.16 | 23360 | 1.7018 |
| 1.8391 | 48.49 | 23520 | 1.6945 |
| 1.8456 | 48.82 | 23680 | 1.7015 |
| 1.8305 | 49.15 | 23840 | 1.6964 |
| 1.8324 | 49.48 | 24000 | 1.6900 |
| 1.7763 | 49.81 | 24160 | 1.6449 |
| 1.7728 | 50.14 | 24320 | 1.6436 |
| 1.7576 | 50.47 | 24480 | 1.6268 |
| 1.7354 | 50.8 | 24640 | 1.6088 |
| 1.74 | 51.13 | 24800 | 1.6156 |
| 1.7251 | 51.06 | 24960 | 1.6041 |
| 1.719 | 51.39 | 25120 | 1.5938 |
| 1.7257 | 52.12 | 25280 | 1.5983 |
| 1.7184 | 52.45 | 25440 | 1.5919 |
| 1.7093 | 52.78 | 25600 | 1.5848 |
| 1.7114 | 53.11 | 25760 | 1.5922 |
| 1.7076 | 53.44 | 25920 | 1.5843 |
| 1.7 | 53.77 | 26080 | 1.5807 |
| 1.7027 | 54.1 | 26240 | 1.5811 |
| 1.704 | 54.43 | 26400 | 1.5766 |
| 1.6958 | 54.76 | 26560 | 1.5756 |
| 1.6976 | 55.09 | 26720 | 1.5773 |
| 1.6944 | 55.42 | 26880 | 1.5725 |
| 1.6891 | 55.75 | 27040 | 1.5685 |
| 1.6936 | 56.08 | 27200 | 1.5750 |
| 1.6893 | 56.41 | 27360 | 1.5696 |
| 1.6886 | 56.74 | 27520 | 1.5643 |
| 1.6936 | 57.07 | 27680 | 1.5691 |
| 1.6883 | 57.4 | 27840 | 1.5718 |
| 1.6832 | 57.73 | 28000 | 1.5660 |
| 1.9222 | 28.03 | 28160 | 1.7107 |
| 1.7838 | 28.19 | 28320 | 1.6345 |
| 1.7843 | 28.36 | 28480 | 1.6445 |
| 1.7809 | 28.52 | 28640 | 1.6461 |
| 1.783 | 28.69 | 28800 | 1.6505 |
| 1.7869 | 28.85 | 28960 | 1.6364 |
| 1.778 | 29.02 | 29120 | 1.6363 |
| 1.775 | 29.18 | 29280 | 1.6364 |
| 1.7697 | 29.34 | 29440 | 1.6345 |
| 1.7719 | 29.51 | 29600 | 1.6261 |
| 1.7454 | 61.16 | 29760 | 1.6099 |
| 1.741 | 61.49 | 29920 | 1.6006 |
| 1.7314 | 62.02 | 30080 | 1.6041 |
| 1.7314 | 62.35 | 30240 | 1.5914 |
| 1.7246 | 62.68 | 30400 | 1.5917 |
| 1.7642 | 63.01 | 30560 | 1.5923 |
| 1.7221 | 63.34 | 30720 | 1.5857 |
| 1.7185 | 63.67 | 30880 | 1.5836 |
| 1.7022 | 64.0 | 31040 | 1.5836 |
| 1.7107 | 64.33 | 31200 | 1.5739 |
| 1.7082 | 64.66 | 31360 | 1.5724 |
| 1.7055 | 64.99 | 31520 | 1.5734 |
| 1.7019 | 65.32 | 31680 | 1.5707 |
| 1.699 | 65.65 | 31840 | 1.5649 |
| 1.6963 | 65.98 | 32000 | 1.5685 |
| 1.6935 | 66.31 | 32160 | 1.5673 |
| 1.6899 | 66.64 | 32320 | 1.5648 |
| 1.6869 | 66.97 | 32480 | 1.5620 |
| 1.6867 | 67.3 | 32640 | 1.5564 |
| 1.6861 | 67.63 | 32800 | 1.5552 |
| 1.6831 | 67.96 | 32960 | 1.5496 |
| 1.6778 | 68.29 | 33120 | 1.5479 |
| 1.6742 | 68.62 | 33280 | 1.5501 |
| 1.6737 | 68.95 | 33440 | 1.5441 |
| 1.6725 | 69.28 | 33600 | 1.5399 |
| 1.6683 | 69.61 | 33760 | 1.5398 |
| 1.6689 | 69.94 | 33920 | 1.5374 |
| 1.6634 | 70.27 | 34080 | 1.5385 |
| 1.6638 | 70.6 | 34240 | 1.5332 |
| 1.6614 | 70.93 | 34400 | 1.5329 |
| 1.6544 | 71.26 | 34560 | 1.5292 |
| 1.6532 | 71.59 | 34720 | 1.5268 |
| 1.6511 | 71.92 | 34880 | 1.5225 |
| 1.6506 | 72.25 | 35040 | 1.5219 |
| 1.6496 | 72.58 | 35200 | 1.5202 |
| 1.6468 | 72.91 | 35360 | 1.5199 |
| 1.6424 | 73.24 | 35520 | 1.5220 |
| 1.642 | 73.57 | 35680 | 1.5145 |
| 1.6415 | 73.9 | 35840 | 1.5139 |
| 1.6419 | 74.23 | 36000 | 1.5120 |
| 1.633 | 74.56 | 36160 | 1.5113 |
| 1.6354 | 74.89 | 36320 | 1.5139 |
| 1.6312 | 75.22 | 36480 | 1.5068 |
| 1.6298 | 75.55 | 36640 | 1.5056 |
| 1.6268 | 75.88 | 36800 | 1.5000 |
| 1.6277 | 76.21 | 36960 | 1.5033 |
| 1.6198 | 76.54 | 37120 | 1.4988 |
| 1.6246 | 76.87 | 37280 | 1.4978 |
| 1.6184 | 77.2 | 37440 | 1.4966 |
| 1.6187 | 77.53 | 37600 | 1.4954 |
| 1.6192 | 77.85 | 37760 | 1.4951 |
| 1.6134 | 78.19 | 37920 | 1.4936 |
| 1.6176 | 78.51 | 38080 | 1.4908 |
| 1.6103 | 78.84 | 38240 | 1.4904 |
| 1.612 | 79.18 | 38400 | 1.4919 |
| 1.611 | 79.5 | 38560 | 1.4891 |
| 1.6082 | 79.83 | 38720 | 1.4837 |
| 1.6047 | 80.16 | 38880 | 1.4859 |
| 1.6058 | 80.49 | 39040 | 1.4814 |
| 1.602 | 80.82 | 39200 | 1.4837 |
Framework versions
- Transformers 4.20.1
- Pytorch 1.11.0
- Datasets 2.3.2
- Tokenizers 0.12.1
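
To compare a local environment against the versions listed above, a small check like the following may help. It is a sketch using only the standard library; `check_versions` is an illustrative helper, and `torch` is used as the distribution name for PyTorch.

```python
from importlib import metadata

# Versions listed in this card; "torch" is the distribution name for PyTorch.
EXPECTED = {
    "transformers": "4.20.1",
    "torch": "1.11.0",
    "datasets": "2.3.2",
    "tokenizers": "0.12.1",
}


def check_versions(expected: dict) -> dict:
    """Return {package: (expected, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, want in expected.items():
        try:
            have = metadata.version(name)
        except metadata.PackageNotFoundError:
            have = None
        # startswith tolerates local build tags such as "1.11.0+cu113".
        if have is None or not have.startswith(want):
            mismatches[name] = (want, have)
    return mismatches


if __name__ == "__main__":
    for name, (want, have) in check_versions(EXPECTED).items():
        print(f"{name}: expected {want}, found {have or 'not installed'}")
```

An empty result means every listed package is installed at the expected version; other versions may still work, but they were not the ones recorded for this training run.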