# results
This model is a fine-tuned version of [google/bert_uncased_L-4_H-512_A-8](https://huggingface.co/google/bert_uncased_L-4_H-512_A-8) on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 1.6627
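Since the base checkpoint is a BERT variant, this is presumably a masked-language-modeling cross-entropy loss; under that assumption it corresponds to a perplexity of roughly exp(1.6627) ≈ 5.27.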
## Model description
More information needed
## Intended uses & limitations
More information needed
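The task this checkpoint was fine-tuned for is not documented. Assuming it retains a masked-language-modeling head (consistent with the BERT base model and the reported loss), a minimal usage sketch might look like the following; the checkpoint path `results` is a placeholder taken from this card's title.

```python
from transformers import AutoModelForMaskedLM, AutoTokenizer, pipeline

# Assumption: the fine-tuned checkpoint keeps BERT's masked-LM head.
# "results" is a placeholder path; substitute the actual model directory or hub id.
tokenizer = AutoTokenizer.from_pretrained("results")
model = AutoModelForMaskedLM.from_pretrained("results")

fill_mask = pipeline("fill-mask", model=model, tokenizer=tokenizer)
print(fill_mask("The capital of France is [MASK]."))
```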
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (see the `TrainingArguments` sketch after the list):
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 64
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- num_epochs: 20
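For reference, a minimal sketch of the equivalent 🤗 Transformers `TrainingArguments` is below. The output directory is a placeholder, and the 500-step evaluation/logging cadence is an assumption inferred from the results table; the Adam betas and epsilon listed above are the `Trainer` defaults.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="results",              # placeholder, taken from this card's title
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=64,
    seed=42,
    gradient_accumulation_steps=8,     # effective train batch size: 8 * 8 = 64
    lr_scheduler_type="cosine",
    num_train_epochs=20,
    adam_beta1=0.9,                    # Trainer defaults, as listed above
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    evaluation_strategy="steps",       # assumption: matches the 500-step cadence below
    eval_steps=500,
    logging_steps=500,
)
```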
### Training results
| Training Loss | Epoch | Step  | Validation Loss |
|:-------------:|:-----:|:-----:|:---------------:|
| No log        | 0.49  | 500   | 3.5536          |
| 3.752         | 0.97  | 1000  | 3.0406          |
| 3.752         | 1.46  | 1500  | 2.7601          |
| 2.6844        | 1.94  | 2000  | 2.5655          |
| 2.6844        | 2.43  | 2500  | 2.4174          |
| 2.3487        | 2.91  | 3000  | 2.3163          |
| 2.3487        | 3.4   | 3500  | 2.2146          |
| 2.1554        | 3.89  | 4000  | 2.1560          |
| 2.1554        | 4.37  | 4500  | 2.0935          |
| 2.019         | 4.86  | 5000  | 2.0375          |
| 2.019         | 5.34  | 5500  | 1.9942          |
| 1.9254        | 5.83  | 6000  | 1.9530          |
| 1.9254        | 6.32  | 6500  | 1.9215          |
| 1.8506        | 6.8   | 7000  | 1.8908          |
| 1.8506        | 7.29  | 7500  | 1.8693          |
| 1.793         | 7.77  | 8000  | 1.8399          |
| 1.793         | 8.26  | 8500  | 1.8191          |
| 1.7425        | 8.75  | 9000  | 1.8016          |
| 1.7425        | 9.23  | 9500  | 1.7760          |
| 1.7093        | 9.72  | 10000 | 1.7668          |
| 1.7093        | 10.2  | 10500 | 1.7474          |
| 1.6754        | 10.69 | 11000 | 1.7365          |
| 1.6754        | 11.18 | 11500 | 1.7229          |
| 1.6501        | 11.66 | 12000 | 1.7145          |
| 1.6501        | 12.15 | 12500 | 1.7029          |
| 1.633         | 12.63 | 13000 | 1.6965          |
| 1.633         | 13.12 | 13500 | 1.6878          |
| 1.6153        | 13.61 | 14000 | 1.6810          |
| 1.6153        | 14.09 | 14500 | 1.6775          |
| 1.6043        | 14.58 | 15000 | 1.6720          |
| 1.6043        | 15.06 | 15500 | 1.6719          |
| 1.5942        | 15.55 | 16000 | 1.6602          |
| 1.5942        | 16.03 | 16500 | 1.6643          |
| 1.5869        | 16.52 | 17000 | 1.6632          |
| 1.5869        | 17.01 | 17500 | 1.6551          |
| 1.5834        | 17.49 | 18000 | 1.6557          |
| 1.5834        | 17.98 | 18500 | 1.6561          |
| 1.5755        | 18.46 | 19000 | 1.6620          |
| 1.5755        | 18.95 | 19500 | 1.6524          |
| 1.5823        | 19.44 | 20000 | 1.6536          |
| 1.5823        | 19.92 | 20500 | 1.6627          |
### Framework versions
- Transformers 4.18.0
- Pytorch 1.12.1+cu113
- Datasets 2.4.0
- Tokenizers 0.12.1
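To sanity-check a reproduction environment against these versions, a quick version dump (this only prints versions; it does not pin or install anything):

```python
import datasets
import tokenizers
import torch
import transformers

# Compare against the versions listed above.
for name, module in [("Transformers", transformers), ("PyTorch", torch),
                     ("Datasets", datasets), ("Tokenizers", tokenizers)]:
    print(f"{name}: {module.__version__}")
```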