# guten-no-merge-log-rarity
This model is a fine-tuned version of gpt2 on the generator dataset. It achieves the following results on the evaluation set:
- Loss: 4.0835
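For a causal language model, the evaluation loss is the mean cross-entropy per token, so its exponential gives the evaluation perplexity. A minimal sketch of that conversion (the loss value is taken from the card; the perplexity figure is derived, not reported):

```python
import math

# Final evaluation loss reported above (mean cross-entropy per token).
eval_loss = 4.0835

# Perplexity is the exponential of the per-token cross-entropy.
perplexity = math.exp(eval_loss)

print(round(perplexity, 1))  # ≈ 59.4
```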
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0005
- train_batch_size: 64
- eval_batch_size: 64
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 1000
- num_epochs: 6
- mixed_precision_training: Native AMP
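The learning-rate schedule above (linear warmup over 1,000 steps into a cosine decay, peaking at 5e-4) can be sketched in plain Python. The total step count used here (~10,350) is an estimate extrapolated from the step/epoch pairs in the results table, not a value the card states:

```python
import math

# Hyperparameters from the card.
PEAK_LR = 5e-4
WARMUP_STEPS = 1000
# Assumption: total optimizer steps over 6 epochs, extrapolated
# from the results table (~1,725 steps per epoch).
TOTAL_STEPS = 10350

def lr_at(step: int) -> float:
    """Learning rate under linear warmup followed by cosine decay to 0."""
    if step < WARMUP_STEPS:
        # Linear ramp from 0 up to the peak learning rate.
        return PEAK_LR * step / WARMUP_STEPS
    # Cosine decay from the peak down to 0 over the remaining steps.
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))

print(lr_at(0))     # 0.0 (start of warmup)
print(lr_at(1000))  # 0.0005 (peak, end of warmup)
```

This mirrors the behavior of `transformers.get_cosine_schedule_with_warmup`, which the `cosine` scheduler type selects.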
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
7.5185 | 0.12 | 200 | 5.9953 |
5.702 | 0.23 | 400 | 5.4671 |
5.3132 | 0.35 | 600 | 5.2102 |
5.0736 | 0.46 | 800 | 5.0308 |
4.9325 | 0.58 | 1000 | 4.9096 |
4.7986 | 0.69 | 1200 | 4.7945 |
4.6851 | 0.81 | 1400 | 4.7064 |
4.6142 | 0.93 | 1600 | 4.6351 |
4.4995 | 1.04 | 1800 | 4.5776 |
4.3886 | 1.16 | 2000 | 4.5242 |
4.3455 | 1.27 | 2200 | 4.4778 |
4.3119 | 1.39 | 2400 | 4.4343 |
4.2646 | 1.5 | 2600 | 4.3910 |
4.2227 | 1.62 | 2800 | 4.3531 |
4.1925 | 1.74 | 3000 | 4.3114 |
4.1501 | 1.85 | 3200 | 4.2712 |
4.129 | 1.97 | 3400 | 4.2403 |
3.9673 | 2.08 | 3600 | 4.2321 |
3.9145 | 2.2 | 3800 | 4.2166 |
3.9146 | 2.31 | 4000 | 4.1956 |
3.8992 | 2.43 | 4200 | 4.1732 |
3.8932 | 2.55 | 4400 | 4.1494 |
3.8646 | 2.66 | 4600 | 4.1281 |
3.8627 | 2.78 | 4800 | 4.1104 |
3.8537 | 2.89 | 5000 | 4.0890 |
3.8128 | 3.01 | 5200 | 4.0839 |
3.6101 | 3.12 | 5400 | 4.0845 |
3.611 | 3.24 | 5600 | 4.0771 |
3.6168 | 3.36 | 5800 | 4.0707 |
3.6047 | 3.47 | 6000 | 4.0544 |
3.6015 | 3.59 | 6200 | 4.0471 |
3.5941 | 3.7 | 6400 | 4.0331 |
3.5878 | 3.82 | 6600 | 4.0210 |
3.5797 | 3.94 | 6800 | 4.0098 |
3.4728 | 4.05 | 7000 | 4.0174 |
3.3399 | 4.17 | 7200 | 4.0220 |
3.3436 | 4.28 | 7400 | 4.0194 |
3.3467 | 4.4 | 7600 | 4.0145 |
3.3501 | 4.51 | 7800 | 4.0088 |
3.3493 | 4.63 | 8000 | 4.0028 |
3.3374 | 4.75 | 8200 | 3.9991 |
3.3364 | 4.86 | 8400 | 3.9946 |
3.3261 | 4.98 | 8600 | 3.9902 |
3.204 | 5.09 | 8800 | 4.0007 |
3.1715 | 5.21 | 9000 | 4.0032 |
3.1683 | 5.32 | 9200 | 4.0025 |
3.1708 | 5.44 | 9400 | 4.0026 |
3.1649 | 5.56 | 9600 | 4.0019 |
3.1773 | 5.67 | 9800 | 4.0012 |
3.1608 | 5.79 | 10000 | 4.0014 |
3.1538 | 5.9 | 10200 | 4.0012 |
### Framework versions
- Transformers 4.26.1
- PyTorch 1.11.0+cu113
- Datasets 2.13.0
- Tokenizers 0.13.3