bert-trainer-8b
This model was fine-tuned on the generator dataset; the base model is not specified. It achieves the following results on the evaluation set:
- Loss: 3.1639
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0005
- train_batch_size: 64
- eval_batch_size: 64
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 1000
- num_epochs: 32
- mixed_precision_training: Native AMP
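To make the learning-rate settings above concrete, here is a minimal sketch of a cosine schedule with linear warmup, written in plain Python. It assumes the common half-cosine decay to zero (the shape used by `get_cosine_schedule_with_warmup` in Transformers with its default settings); the function name `cosine_lr_with_warmup` is hypothetical, and `total_steps=16000` is an approximation taken from the last logged step, not a value stated in this card.

```python
import math


def cosine_lr_with_warmup(step, base_lr=5e-4, warmup_steps=1000, total_steps=16000):
    """Return the learning rate at a given optimizer step.

    base_lr and warmup_steps come from the hyperparameter list;
    total_steps is an assumed approximation of the full run length.
    """
    if step < warmup_steps:
        # Linear warmup from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase that has elapsed, capped at 1.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(progress, 1.0)
    # Half-cosine decay from base_lr down to 0.
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for s in (0, 500, 1000, 8500, 16000):
        print(s, cosine_lr_with_warmup(s))
```

Under these assumptions the rate peaks at 0.0005 at step 1000 and falls back to half of that value midway through the decay phase.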
Training results
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
6.5416 | 1.0 | 500 | 6.5207 |
6.393 | 1.99 | 1000 | 6.3903 |
6.2817 | 2.99 | 1500 | 6.3033 |
6.2274 | 3.98 | 2000 | 6.2671 |
6.179 | 4.98 | 2500 | 6.2431 |
6.1684 | 5.98 | 3000 | 6.2309 |
6.1244 | 6.97 | 3500 | 6.2114 |
6.0879 | 7.97 | 4000 | 6.1932 |
6.0643 | 8.96 | 4500 | 6.1791 |
6.0481 | 9.96 | 5000 | 6.1638 |
6.0231 | 10.96 | 5500 | 6.1581 |
5.9987 | 11.95 | 6000 | 6.1365 |
5.9989 | 12.95 | 6500 | 6.1194 |
5.9535 | 13.94 | 7000 | 6.1095 |
5.9139 | 14.94 | 7500 | 6.0890 |
5.8462 | 15.94 | 8000 | 6.0224 |
5.7689 | 16.93 | 8500 | 5.9266 |
5.6137 | 17.93 | 9000 | 5.7195 |
4.7163 | 18.92 | 9500 | 4.6131 |
4.0877 | 19.92 | 10000 | 4.0903 |
3.7832 | 20.92 | 10500 | 3.8340 |
3.6104 | 21.91 | 11000 | 3.6572 |
3.4615 | 22.91 | 11500 | 3.5278 |
3.3661 | 23.9 | 12000 | 3.4201 |
3.271 | 24.9 | 12500 | 3.3333 |
3.2179 | 25.9 | 13000 | 3.2720 |
3.1759 | 26.89 | 13500 | 3.2317 |
3.1419 | 27.89 | 14000 | 3.2006 |
3.1041 | 28.88 | 14500 | 3.1806 |
3.0836 | 29.88 | 15000 | 3.1693 |
3.0998 | 30.88 | 15500 | 3.1679 |
3.08 | 31.87 | 16000 | 3.1639 |
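As a rough aid for reading the table, the sketch below converts a validation loss into perplexity. It assumes the reported loss is a mean per-token cross-entropy in nats, which this card does not state; the helper name `perplexity` is hypothetical.

```python
import math


def perplexity(loss):
    """Convert a mean cross-entropy loss (assumed to be in nats) to perplexity."""
    return math.exp(loss)


# Validation losses copied from the first and last rows of the table.
first_eval_loss = 6.5207
final_eval_loss = 3.1639

if __name__ == "__main__":
    print(f"first eval perplexity: {perplexity(first_eval_loss):.1f}")
    print(f"final eval perplexity: {perplexity(final_eval_loss):.1f}")
```

If the assumption holds, the final validation loss of 3.1639 corresponds to a perplexity between 23 and 24.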
Framework versions
- Transformers 4.26.1
- Pytorch 1.13.1
- Datasets 2.9.0
- Tokenizers 0.13.2
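When reproducing the run, it can help to compare the local environment against the versions listed above. The following is a minimal sketch using the standard library's `importlib.metadata`; the PyPI distribution names (for example `torch` for Pytorch) and the helper name `version_report` are assumptions not given in this card.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions from the list above, keyed by assumed PyPI distribution name.
EXPECTED = {
    "transformers": "4.26.1",
    "torch": "1.13.1",
    "datasets": "2.9.0",
    "tokenizers": "0.13.2",
}


def version_report(expected=EXPECTED):
    """Map each package to (expected, installed, matches)."""
    report = {}
    for package, wanted in expected.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None  # Package is not installed in this environment.
        # Ignore local build suffixes such as "+cu117" when comparing.
        matches = installed is not None and installed.split("+")[0] == wanted
        report[package] = (wanted, installed, matches)
    return report


if __name__ == "__main__":
    for name, (wanted, installed, ok) in version_report().items():
        print(f"{name}: expected {wanted}, installed {installed}, match={ok}")
```

A mismatch does not necessarily prevent the model from loading; the report only flags where the environment differs from the one recorded here.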