v29
This model is a fine-tuned version of EleutherAI/gpt-neo-1.3B. The training dataset was not recorded. It achieves the following results on the evaluation set after the final training epoch:
- Loss: 5.1914
- Accuracy: 0.3454
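
For a causal language model, the reported loss is usually the mean per-token cross-entropy in nats, in which case exponentiating it gives perplexity. The card does not state how the loss was computed, so the following is a minimal sketch under that assumption; `perplexity_from_loss` is a hypothetical helper name, not part of any library.

```python
import math


def perplexity_from_loss(loss: float) -> float:
    """Convert a mean cross-entropy loss (assumed to be in nats) to perplexity."""
    return math.exp(loss)


eval_loss = 5.1914  # evaluation loss reported above
eval_perplexity = perplexity_from_loss(eval_loss)
print(f"perplexity: {eval_perplexity:.1f}")
```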
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 4
- eval_batch_size: 1
- seed: 42
- distributed_type: multi-GPU
- num_devices: 2
- total_train_batch_size: 8
- total_eval_batch_size: 2
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 210
- num_epochs: 30.0
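
The batch-size and scheduler entries above can be checked with a little arithmetic. The sketch below assumes the usual meaning of a linear schedule with warmup: the learning rate rises linearly from 0 to its peak over the warmup steps, then decays linearly to 0 at the last step. The total of 2160 optimizer steps is taken from the final row of the training results table. The function name `linear_lr` is hypothetical and is not a library API.

```python
PEAK_LR = 1e-05        # learning_rate
WARMUP_STEPS = 210     # lr_scheduler_warmup_steps
TOTAL_STEPS = 2160     # 30 epochs x 72 steps per epoch, per the results table
TRAIN_BATCH_SIZE = 4   # per-device train_batch_size
NUM_DEVICES = 2        # num_devices


def linear_lr(step: int) -> float:
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < WARMUP_STEPS:
        # Warmup phase: scale up linearly from 0 to the peak rate.
        return PEAK_LR * step / WARMUP_STEPS
    # Decay phase: scale down linearly so the rate reaches 0 at TOTAL_STEPS.
    remaining = max(0, TOTAL_STEPS - step)
    return PEAK_LR * remaining / (TOTAL_STEPS - WARMUP_STEPS)


# Effective batch size across devices (no gradient accumulation is listed).
total_train_batch_size = TRAIN_BATCH_SIZE * NUM_DEVICES
print("total_train_batch_size:", total_train_batch_size)

for step in (0, 105, 210, 1185, 2160):
    print(step, linear_lr(step))
```

Under these assumptions the warmup ends at step 210, just before the end of the third epoch (step 216), so the peak learning rate is reached around that point.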
Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|---|---|---|---|---|
| 2.7512 | 1.0 | 72 | 2.7871 | 0.4451 |
| 2.6444 | 2.0 | 144 | 2.7832 | 0.4452 |
| 2.5407 | 3.0 | 216 | 2.7930 | 0.4429 |
| 2.4257 | 4.0 | 288 | 2.8125 | 0.4411 |
| 2.3014 | 5.0 | 360 | 2.8477 | 0.4377 |
| 2.1749 | 6.0 | 432 | 2.8945 | 0.4328 |
| 2.0323 | 7.0 | 504 | 2.9668 | 0.4256 |
| 1.8754 | 8.0 | 576 | 3.0352 | 0.4197 |
| 1.7043 | 9.0 | 648 | 3.1504 | 0.4117 |
| 1.5224 | 10.0 | 720 | 3.2773 | 0.4029 |
| 1.3273 | 11.0 | 792 | 3.4141 | 0.3945 |
| 1.1364 | 12.0 | 864 | 3.5625 | 0.3869 |
| 0.9499 | 13.0 | 936 | 3.7422 | 0.3784 |
| 0.7793 | 14.0 | 1008 | 3.8906 | 0.3731 |
| 0.6176 | 15.0 | 1080 | 4.0625 | 0.3669 |
| 0.4873 | 16.0 | 1152 | 4.1836 | 0.3630 |
| 0.3717 | 17.0 | 1224 | 4.3281 | 0.3589 |
| 0.2797 | 18.0 | 1296 | 4.4570 | 0.3564 |
| 0.2054 | 19.0 | 1368 | 4.5703 | 0.3539 |
| 0.1517 | 20.0 | 1440 | 4.6719 | 0.3525 |
| 0.1115 | 21.0 | 1512 | 4.7539 | 0.3511 |
| 0.0854 | 22.0 | 1584 | 4.8359 | 0.3494 |
| 0.0669 | 23.0 | 1656 | 4.9062 | 0.3492 |
| 0.0494 | 24.0 | 1728 | 4.9609 | 0.3486 |
| 0.0377 | 25.0 | 1800 | 5.0273 | 0.3472 |
| 0.0302 | 26.0 | 1872 | 5.0625 | 0.3471 |
| 0.026 | 27.0 | 1944 | 5.1406 | 0.3455 |
| 0.0255 | 28.0 | 2016 | 5.1211 | 0.3465 |
| 0.0241 | 29.0 | 2088 | 5.1367 | 0.3466 |
| 0.0197 | 30.0 | 2160 | 5.1914 | 0.3454 |
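
In the table, training loss falls steadily while validation loss rises after the second epoch, which is the usual signature of overfitting. The short sketch below locates the epoch with the lowest validation loss, using the per-epoch values copied from the table; the variable names are illustrative only.

```python
# Validation loss per epoch (epochs 1 through 30), copied from the table.
validation_losses = [
    2.7871, 2.7832, 2.7930, 2.8125, 2.8477, 2.8945,
    2.9668, 3.0352, 3.1504, 3.2773, 3.4141, 3.5625,
    3.7422, 3.8906, 4.0625, 4.1836, 4.3281, 4.4570,
    4.5703, 4.6719, 4.7539, 4.8359, 4.9062, 4.9609,
    5.0273, 5.0625, 5.1406, 5.1211, 5.1367, 5.1914,
]

# Index of the smallest loss; epochs are 1-based, list indices are 0-based.
best_index = min(range(len(validation_losses)), key=validation_losses.__getitem__)
best_epoch = best_index + 1
best_loss = validation_losses[best_index]

print(f"lowest validation loss {best_loss} at epoch {best_epoch}")
```

If intermediate checkpoints were saved, the one from that epoch would likely generalize better than the final one. The card does not say which checkpoints were kept.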
Framework versions
- Transformers 4.29.2
- Pytorch 2.0.1
- Datasets 2.12.0
- Tokenizers 0.13.3