# enlm-roberta
This model is a fine-tuned version of [manirai91/enlm-roberta](https://huggingface.co/manirai91/enlm-roberta) on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 1.4193
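
Assuming this value is the mean token-level cross-entropy reported by the Trainer (the card does not state the training objective), it corresponds to a perplexity of roughly 4.13:

```python
import math

# Assumption: the reported eval loss is mean cross-entropy in nats,
# as the transformers Trainer reports it; perplexity is its exponential.
eval_loss = 1.4193
print(math.exp(eval_loss))  # ~4.13
```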
## Model description
More information needed
## Intended uses & limitations
More information needed
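
In the absence of documented usage, here is a minimal loading sketch. It assumes the checkpoint is published on the Hugging Face Hub under `manirai91/enlm-roberta` and exposes a RoBERTa-style masked-language-modeling head; verify both against the actual repository before relying on it:

```python
from transformers import pipeline

# Assumption: RoBERTa-style masked LM, so RoBERTa's <mask> token applies.
# The example sentence is purely illustrative.
fill_mask = pipeline("fill-mask", model="manirai91/enlm-roberta")
print(fill_mask("The capital of France is <mask>."))
```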
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a configuration sketch follows the list):
- learning_rate: 6e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- gradient_accumulation_steps: 128
- total_train_batch_size: 8192
- total_eval_batch_size: 64
- optimizer: Adam with betas=(0.9,0.98) and epsilon=1e-06
- lr_scheduler_type: polynomial
- num_epochs: 10
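
A minimal sketch of how these values map onto the `transformers` `TrainingArguments` API; the Trainer-based setup is an assumption consistent with this auto-generated card, and any name not listed above (such as `output_dir`) is illustrative:

```python
from transformers import TrainingArguments

# Sketch only: reconstructs the listed hyperparameters.
training_args = TrainingArguments(
    output_dir="enlm-roberta-finetuned",  # hypothetical path
    learning_rate=6e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    gradient_accumulation_steps=128,
    lr_scheduler_type="polynomial",
    num_train_epochs=10,
    adam_beta1=0.9,
    adam_beta2=0.98,
    adam_epsilon=1e-6,
)
# With 4 GPUs (e.g. launched via `torchrun --nproc_per_node 4 train.py`),
# the effective train batch size is 16 * 4 * 128 = 8192 and the effective
# eval batch size is 16 * 4 = 64, matching the totals listed above.
```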
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
1.5984 | 0.13 | 160 | 1.4905 |
1.6149 | 0.27 | 320 | 1.4969 |
1.6285 | 0.61 | 480 | 1.5087 |
1.6473 | 0.96 | 640 | 1.5173 |
1.679 | 1.3 | 800 | 1.5441 |
1.683 | 1.64 | 960 | 1.5576 |
1.6942 | 1.99 | 1120 | 1.5681 |
1.6921 | 2.33 | 1280 | 1.5659 |
1.6986 | 2.67 | 1440 | 1.5684 |
1.8496 | 3.02 | 1600 | 1.5586 |
1.6807 | 3.36 | 1760 | 1.5503 |
1.6744 | 3.7 | 1920 | 1.5466 |
1.6838 | 4.05 | 2080 | 1.5471 |
1.6725 | 4.39 | 2240 | 1.5371 |
1.6663 | 4.73 | 2400 | 1.5433 |
1.6644 | 5.08 | 2560 | 1.5321 |
1.66 | 5.42 | 2720 | 1.5325 |
1.6535 | 5.76 | 2880 | 1.5272 |
1.651 | 6.11 | 3040 | 1.5253 |
1.6432 | 6.45 | 3200 | 1.5207 |
1.6452 | 6.79 | 3360 | 1.5239 |
1.6398 | 7.14 | 3520 | 1.5168 |
1.6308 | 7.48 | 3680 | 1.5088 |
1.6332 | 7.82 | 3840 | 1.5065 |
1.6261 | 8.17 | 4000 | 1.5024 |
1.6194 | 8.51 | 4160 | 1.5085 |
1.6178 | 8.85 | 4320 | 1.4961 |
1.6137 | 9.2 | 4480 | 1.4949 |
1.613 | 9.54 | 4640 | 1.4953 |
1.6048 | 9.88 | 4800 | 1.4905 |
1.6058 | 10.23 | 4960 | 1.4893 |
1.6036 | 10.57 | 5120 | 1.4826 |
1.5976 | 10.91 | 5280 | 1.4846 |
1.593 | 11.26 | 5440 | 1.4792 |
1.5914 | 11.6 | 5600 | 1.4734 |
1.5863 | 11.94 | 5760 | 1.4731 |
1.5828 | 12.29 | 5920 | 1.4702 |
1.5831 | 12.63 | 6080 | 1.4649 |
1.5796 | 12.97 | 6240 | 1.4611 |
1.5717 | 13.32 | 6400 | 1.4580 |
1.5737 | 13.66 | 6560 | 1.4576 |
1.7137 | 14.0 | 6720 | 1.4571 |
1.5651 | 14.35 | 6880 | 1.4543 |
1.561 | 14.69 | 7040 | 1.4469 |
1.5578 | 15.25 | 7200 | 1.4469 |
1.5531 | 15.6 | 7360 | 1.4430 |
1.5548 | 15.94 | 7520 | 1.4408 |
1.5523 | 16.28 | 7680 | 1.4390 |
1.5467 | 16.63 | 7840 | 1.4357 |
1.5467 | 16.97 | 8000 | 1.4328 |
1.5406 | 17.23 | 8160 | 1.4290 |
1.5379 | 17.58 | 8320 | 1.4321 |
1.5349 | 17.92 | 8480 | 1.4277 |
1.5343 | 18.26 | 8640 | 1.4238 |
1.5302 | 18.61 | 8800 | 1.4206 |
1.5293 | 18.95 | 8960 | 1.4198 |
1.5278 | 19.29 | 9120 | 1.4207 |
1.523 | 19.64 | 9280 | 1.4193 |
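
The final validation loss above (1.4193) comes from the Trainer's evaluation loop. A self-contained sketch of reproducing such an evaluation, again assuming a RoBERTa-style masked LM; `eval.txt` is a placeholder, since the actual evaluation dataset is not documented in this card:

```python
import math

from datasets import load_dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model = AutoModelForMaskedLM.from_pretrained("manirai91/enlm-roberta")
tokenizer = AutoTokenizer.from_pretrained("manirai91/enlm-roberta")

# Placeholder data: the evaluation set used for this card is unknown.
raw = load_dataset("text", data_files={"validation": "eval.txt"})
tokenized = raw["validation"].map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True,
    remove_columns=["text"],
)

# Random 15% masking (the collator default), so the exact loss will vary
# slightly from run to run.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=True)
trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="eval-out", per_device_eval_batch_size=16),
    eval_dataset=tokenized,
    data_collator=collator,
)
metrics = trainer.evaluate()
print(metrics["eval_loss"], math.exp(metrics["eval_loss"]))
```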
### Framework versions
- Transformers 4.20.1
- PyTorch 1.11.0
- Datasets 2.3.2
- Tokenizers 0.12.1