# guten-rarity-all-end-19k-ctx-64
This model is a fine-tuned version of gpt2 on the generator dataset. It achieves the following results on the evaluation set:
- Loss: 4.4576
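For causal language models, a cross-entropy loss is often easier to interpret as perplexity, i.e. `exp(loss)`. A minimal sketch converting the reported evaluation loss; the value is simply the exponential of 4.4576, not a separately measured metric:

```python
import math

def perplexity(loss: float) -> float:
    """Convert a cross-entropy loss (in nats) to perplexity."""
    return math.exp(loss)

# Evaluation loss reported above.
print(perplexity(4.4576))  # roughly 86.3
```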
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0005
- train_batch_size: 64
- eval_batch_size: 64
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 1000
- num_epochs: 6
- mixed_precision_training: Native AMP
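The learning-rate schedule above (cosine decay after 1000 warmup steps) can be sketched as follows. This is a hand-rolled approximation of what `transformers.get_cosine_schedule_with_warmup` computes, not the exact Trainer internals; `TOTAL_STEPS = 20500` is an assumption taken from the final step in the training-results table:

```python
import math

LEARNING_RATE = 5e-4   # learning_rate above
WARMUP_STEPS = 1000    # lr_scheduler_warmup_steps above
TOTAL_STEPS = 20500    # assumed from the last step in the results table

def lr_at(step: int) -> float:
    """Learning rate under linear warmup followed by cosine decay to zero."""
    if step < WARMUP_STEPS:
        # Linear ramp from 0 up to the peak learning rate.
        return LEARNING_RATE * step / WARMUP_STEPS
    # Fraction of the post-warmup training completed, in [0, 1].
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return LEARNING_RATE * 0.5 * (1.0 + math.cos(math.pi * progress))

print(lr_at(500))    # halfway through warmup: 2.5e-4
print(lr_at(1000))   # warmup complete: peak 5e-4
print(lr_at(20500))  # end of training: ~0
```

With this schedule the rate peaks at step 1000 and decays smoothly, which matches the slowing loss improvements in the final epoch of the table below.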
### Training results
| Training Loss | Epoch | Step  | Validation Loss |
|:-------------:|:-----:|:-----:|:---------------:|
| 6.8243 | 0.15 | 500   | 5.7888 |
| 5.5606 | 0.29 | 1000  | 5.4446 |
| 5.2508 | 0.44 | 1500  | 5.2225 |
| 5.0772 | 0.59 | 2000  | 5.0928 |
| 4.9577 | 0.73 | 2500  | 5.0064 |
| 4.8676 | 0.88 | 3000  | 4.9375 |
| 4.7689 | 1.02 | 3500  | 4.8928 |
| 4.6483 | 1.17 | 4000  | 4.8522 |
| 4.6236 | 1.32 | 4500  | 4.8016 |
| 4.5769 | 1.46 | 5000  | 4.7621 |
| 4.5395 | 1.61 | 5500  | 4.7233 |
| 4.5035 | 1.76 | 6000  | 4.6906 |
| 4.4614 | 1.9  | 6500  | 4.6515 |
| 4.3778 | 2.05 | 7000  | 4.6380 |
| 4.2446 | 2.19 | 7500  | 4.6121 |
| 4.2402 | 2.34 | 8000  | 4.5856 |
| 4.221  | 2.49 | 8500  | 4.5575 |
| 4.2021 | 2.63 | 9000  | 4.5268 |
| 4.1908 | 2.78 | 9500  | 4.4977 |
| 4.1691 | 2.93 | 10000 | 4.4673 |
| 4.0317 | 3.07 | 10500 | 4.4820 |
| 3.931  | 3.22 | 11000 | 4.4766 |
| 3.9202 | 3.36 | 11500 | 4.4607 |
| 3.9241 | 3.51 | 12000 | 4.4389 |
| 3.9147 | 3.66 | 12500 | 4.4202 |
| 3.9027 | 3.8  | 13000 | 4.4001 |
| 3.8931 | 3.95 | 13500 | 4.3843 |
| 3.7317 | 4.1  | 14000 | 4.4054 |
| 3.653  | 4.24 | 14500 | 4.4036 |
| 3.6488 | 4.39 | 15000 | 4.3999 |
| 3.6513 | 4.53 | 15500 | 4.3908 |
| 3.6392 | 4.68 | 16000 | 4.3837 |
| 3.6341 | 4.83 | 16500 | 4.3767 |
| 3.632  | 4.97 | 17000 | 4.3707 |
| 3.4875 | 5.12 | 17500 | 4.3838 |
| 3.4673 | 5.27 | 18000 | 4.3848 |
| 3.4661 | 5.41 | 18500 | 4.3837 |
| 3.4643 | 5.56 | 19000 | 4.3829 |
| 3.463  | 5.71 | 19500 | 4.3827 |
| 3.4588 | 5.85 | 20000 | 4.3824 |
| 3.4591 | 6.0  | 20500 | 4.3825 |
### Framework versions
- Transformers 4.26.1
- Pytorch 1.11.0+cu113
- Datasets 2.13.0
- Tokenizers 0.13.3