# bert-small-finer-longer
This model is a fine-tuned version of [muhtasham/bert-small-finer](https://huggingface.co/muhtasham/bert-small-finer) on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 1.4264
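Assuming this is the standard masked-language-modeling cross-entropy loss (in nats per token) reported by the Trainer, it can be converted to a perplexity for easier comparison across checkpoints; a minimal sketch:

```python
import math

# Final evaluation loss from the results table below.
eval_loss = 1.4264

# For a per-token cross-entropy loss, perplexity is simply exp(loss).
perplexity = math.exp(eval_loss)
print(f"perplexity ≈ {perplexity:.2f}")  # ≈ 4.16
```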
## Model description
More information needed
## Intended uses & limitations
More information needed
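No usage details are documented yet. If this checkpoint was trained with the masked-language-modeling objective (consistent with the loss reported above), it could be queried with the `fill-mask` pipeline. The sketch below is an assumption, not documented usage, and the Hub repo id is a guess inferred from the card title; verify it before relying on it.

```python
def predict_masked(text: str, k: int = 5,
                   model_id: str = "muhtasham/bert-small-finer-longer"):
    """Return the top-k fill-mask predictions for `text` (must contain [MASK]).

    `model_id` is a guess at the Hub repo id for this checkpoint -- verify it.
    """
    # Lazy import so the module can be inspected without transformers installed.
    from transformers import pipeline
    fill = pipeline("fill-mask", model=model_id, top_k=k)
    return [pred["token_str"] for pred in fill(text)]

# Example (downloads the checkpoint on first call):
# predict_masked("The bank reported a [MASK] in quarterly profits.")
```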
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 64
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- num_epochs: 20
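A minimal sketch of how these settings interact, assuming the standard cosine annealing (without warmup) that `lr_scheduler_type: cosine` typically configures; the exact Trainer schedule may differ slightly:

```python
import math

# Effective batch size: train_batch_size * gradient_accumulation_steps = 8 * 8 = 64
BASE_LR = 5e-5        # learning_rate
TOTAL_STEPS = 20500   # roughly the final optimizer step from the results table

def cosine_lr(step: int, base_lr: float = BASE_LR, total: int = TOTAL_STEPS) -> float:
    """Cosine-annealed learning rate (no warmup): decays from base_lr toward 0."""
    progress = min(step, total) / total
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))

print(cosine_lr(0))            # 5e-05 at the start of training
print(cosine_lr(TOTAL_STEPS))  # ~0 at the end of training
```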
### Training results
| Training Loss | Epoch | Step  | Validation Loss |
|:-------------:|:-----:|:-----:|:---------------:|
| No log        | 0.49  | 500   | 1.6683          |
| 1.5941        | 0.97  | 1000  | 1.6569          |
| 1.5941        | 1.46  | 1500  | 1.6436          |
| 1.5605        | 1.94  | 2000  | 1.6173          |
| 1.5605        | 2.43  | 2500  | 1.6073          |
| 1.5297        | 2.91  | 3000  | 1.6001          |
| 1.5297        | 3.4   | 3500  | 1.5815          |
| 1.5022        | 3.89  | 4000  | 1.5756          |
| 1.5022        | 4.37  | 4500  | 1.5568          |
| 1.4753        | 4.86  | 5000  | 1.5458          |
| 1.4753        | 5.34  | 5500  | 1.5399          |
| 1.4537        | 5.83  | 6000  | 1.5273          |
| 1.4537        | 6.32  | 6500  | 1.5192          |
| 1.433         | 6.8   | 7000  | 1.5099          |
| 1.433         | 7.29  | 7500  | 1.5083          |
| 1.4169        | 7.77  | 8000  | 1.4957          |
| 1.4169        | 8.26  | 8500  | 1.4914          |
| 1.3982        | 8.75  | 9000  | 1.4859          |
| 1.3982        | 9.23  | 9500  | 1.4697          |
| 1.3877        | 9.72  | 10000 | 1.4711          |
| 1.3877        | 10.2  | 10500 | 1.4608          |
| 1.3729        | 10.69 | 11000 | 1.4583          |
| 1.3729        | 11.18 | 11500 | 1.4513          |
| 1.3627        | 11.66 | 12000 | 1.4498          |
| 1.3627        | 12.15 | 12500 | 1.4396          |
| 1.357         | 12.63 | 13000 | 1.4415          |
| 1.357         | 13.12 | 13500 | 1.4347          |
| 1.3484        | 13.61 | 14000 | 1.4316          |
| 1.3484        | 14.09 | 14500 | 1.4319          |
| 1.3442        | 14.58 | 15000 | 1.4268          |
| 1.3442        | 15.06 | 15500 | 1.4293          |
| 1.3387        | 15.55 | 16000 | 1.4217          |
| 1.3387        | 16.03 | 16500 | 1.4241          |
| 1.3358        | 16.52 | 17000 | 1.4250          |
| 1.3358        | 17.01 | 17500 | 1.4196          |
| 1.3344        | 17.49 | 18000 | 1.4193          |
| 1.3344        | 17.98 | 18500 | 1.4200          |
| 1.3274        | 18.46 | 19000 | 1.4250          |
| 1.3274        | 18.95 | 19500 | 1.4168          |
| 1.3348        | 19.44 | 20000 | 1.4164          |
| 1.3348        | 19.92 | 20500 | 1.4264          |
### Framework versions
- Transformers 4.18.0
- Pytorch 1.12.1+cu113
- Datasets 2.4.0
- Tokenizers 0.12.1