# small-vanilla-target-glue-mnli

This model is a fine-tuned version of [google/bert_uncased_L-4_H-512_A-8](https://huggingface.co/google/bert_uncased_L-4_H-512_A-8) on the GLUE MNLI dataset. It achieves the following results on the evaluation set:
- Loss: 0.6020
- Accuracy: 0.7618
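As a usage sketch, the checkpoint can be loaded for MNLI-style premise/hypothesis classification with the standard `transformers` auto classes. The repo id below is taken from this card's title and may need a namespace prefix, and the label order should be checked against the checkpoint's `config.json` (`id2label`) rather than assumed:

```python
# Hedged sketch: NLI inference with this checkpoint.
# "small-vanilla-target-glue-mnli" is assumed to be the repo id; verify
# the actual hub path and the id2label mapping before relying on outputs.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "small-vanilla-target-glue-mnli"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

premise = "A soccer game with multiple males playing."
hypothesis = "Some men are playing a sport."
inputs = tokenizer(premise, hypothesis, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# MNLI is a 3-way task; map the argmax index through the model's own labels.
pred = logits.argmax(dim=-1).item()
print(model.config.id2label[pred])
```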
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 3e-05
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: constant
- num_epochs: 200
### Training results
| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| 0.9249        | 0.04  | 500   | 0.8197          | 0.6419   |
| 0.8154        | 0.08  | 1000  | 0.7776          | 0.6651   |
| 0.7747        | 0.12  | 1500  | 0.7455          | 0.6749   |
| 0.7523        | 0.16  | 2000  | 0.7315          | 0.6853   |
| 0.7445        | 0.2   | 2500  | 0.7097          | 0.6977   |
| 0.7337        | 0.24  | 3000  | 0.6981          | 0.7026   |
| 0.7165        | 0.29  | 3500  | 0.6876          | 0.7117   |
| 0.704         | 0.33  | 4000  | 0.6669          | 0.7192   |
| 0.68          | 0.37  | 4500  | 0.6773          | 0.7249   |
| 0.6834        | 0.41  | 5000  | 0.6513          | 0.7228   |
| 0.6808        | 0.45  | 5500  | 0.6467          | 0.7260   |
| 0.671         | 0.49  | 6000  | 0.6416          | 0.7275   |
| 0.6733        | 0.53  | 6500  | 0.6434          | 0.7372   |
| 0.6714        | 0.57  | 7000  | 0.6358          | 0.7394   |
| 0.6552        | 0.61  | 7500  | 0.6243          | 0.7423   |
| 0.6519        | 0.65  | 8000  | 0.6272          | 0.7380   |
| 0.6546        | 0.69  | 8500  | 0.6207          | 0.7440   |
| 0.6457        | 0.73  | 9000  | 0.6293          | 0.7424   |
| 0.6458        | 0.77  | 9500  | 0.6084          | 0.7492   |
| 0.6349        | 0.81  | 10000 | 0.6248          | 0.7431   |
| 0.6376        | 0.86  | 10500 | 0.6035          | 0.7531   |
| 0.6294        | 0.9   | 11000 | 0.5969          | 0.7561   |
| 0.6264        | 0.94  | 11500 | 0.5981          | 0.7518   |
| 0.6297        | 0.98  | 12000 | 0.5953          | 0.7539   |
| 0.5935        | 1.02  | 12500 | 0.6093          | 0.7602   |
| 0.5726        | 1.06  | 13000 | 0.5991          | 0.7628   |
| 0.5571        | 1.1   | 13500 | 0.5921          | 0.7606   |
| 0.5535        | 1.14  | 14000 | 0.6011          | 0.7559   |
| 0.5427        | 1.18  | 14500 | 0.6020          | 0.7618   |
### Framework versions
- Transformers 4.26.0.dev0
- Pytorch 1.13.0+cu116
- Datasets 2.8.1.dev0
- Tokenizers 0.13.2