t5-base_cola_mare_ar16_ex32_size-32_epochs-5_collected-stats
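This model is a fine-tuned version of t5-base on the GLUE dataset (the `cola` in the model name suggests the CoLA task, although the card does not state this). It achieves the following results on the evaluation set, which match the step-500 row of the training results table: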
- Loss: 0.4754
- Accuracy: 0.8178
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 32
- eval_batch_size: 64
- seed: 0
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 20
- num_epochs: 5
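
The scheduler settings above (linear schedule, 20 warmup steps, peak learning rate 5e-05) can be illustrated with a minimal sketch of the usual linear-warmup, linear-decay shape. This is not taken from the training code: the function name `linear_lr` is hypothetical, and `total_steps=1340` is an assumption, estimated as 5 epochs at roughly 268 steps per epoch (the training results list step 1000 at epoch 3.73).

```python
def linear_lr(step, base_lr=5e-05, warmup_steps=20, total_steps=1340):
    """Learning rate at an optimizer step: linear warmup, then linear decay to 0.

    total_steps is an assumed value (5 epochs * ~268 steps per epoch);
    the model card does not state it.
    """
    if step < warmup_steps:
        # Ramp up from 0 to base_lr over the warmup steps.
        return base_lr * step / max(1, warmup_steps)
    # Decay from base_lr at the end of warmup down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for s in (0, 10, 20, 500, 1000, 1340):
        print(s, linear_lr(s))
```

Under these assumptions the rate peaks at step 20 and falls steadily afterwards, so the later rows of the training log were produced at a much smaller learning rate than the early ones.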
Training results
Training Loss | Epoch | Step | Validation Loss | Accuracy |
---|---|---|---|---|
0.5704 | 0.19 | 50 | 0.5500 | 0.6913 |
0.4592 | 0.37 | 100 | 0.5609 | 0.7814 |
0.4641 | 0.56 | 150 | 0.4854 | 0.8121 |
0.4015 | 0.75 | 200 | 0.4908 | 0.8063 |
0.4365 | 0.93 | 250 | 0.5368 | 0.8063 |
0.3397 | 1.12 | 300 | 0.4968 | 0.8255 |
0.3187 | 1.31 | 350 | 0.4496 | 0.8236 |
0.3034 | 1.49 | 400 | 0.4710 | 0.8198 |
0.3725 | 1.68 | 450 | 0.5318 | 0.8236 |
0.4025 | 1.87 | 500 | 0.4754 | 0.8178 |
0.3018 | 2.05 | 550 | 0.5268 | 0.8274 |
0.3073 | 2.24 | 600 | 0.5359 | 0.8313 |
0.2784 | 2.43 | 650 | 0.4787 | 0.8332 |
0.2271 | 2.61 | 700 | 0.4870 | 0.8284 |
0.3142 | 2.80 | 750 | 0.5267 | 0.8360 |
0.3161 | 2.99 | 800 | 0.5216 | 0.8313 |
0.2491 | 3.17 | 850 | 0.5075 | 0.8332 |
0.3027 | 3.36 | 900 | 0.5142 | 0.8313 |
0.3070 | 3.54 | 950 | 0.5031 | 0.8360 |
0.3338 | 3.73 | 1000 | 0.5035 | 0.8351 |
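
The table above can be inspected programmatically, for example to estimate the number of steps per epoch or to find the evaluation steps with the best metrics. The following is a small sketch with the rows copied from the table; the helper names (`steps_per_epoch`, `best_accuracy_steps`, `lowest_loss_step`) are hypothetical and chosen only for illustration.

```python
# Rows copied from the training results table:
# (training_loss, epoch, step, validation_loss, accuracy)
ROWS = [
    (0.5704, 0.19, 50, 0.5500, 0.6913),
    (0.4592, 0.37, 100, 0.5609, 0.7814),
    (0.4641, 0.56, 150, 0.4854, 0.8121),
    (0.4015, 0.75, 200, 0.4908, 0.8063),
    (0.4365, 0.93, 250, 0.5368, 0.8063),
    (0.3397, 1.12, 300, 0.4968, 0.8255),
    (0.3187, 1.31, 350, 0.4496, 0.8236),
    (0.3034, 1.49, 400, 0.4710, 0.8198),
    (0.3725, 1.68, 450, 0.5318, 0.8236),
    (0.4025, 1.87, 500, 0.4754, 0.8178),
    (0.3018, 2.05, 550, 0.5268, 0.8274),
    (0.3073, 2.24, 600, 0.5359, 0.8313),
    (0.2784, 2.43, 650, 0.4787, 0.8332),
    (0.2271, 2.61, 700, 0.4870, 0.8284),
    (0.3142, 2.80, 750, 0.5267, 0.8360),
    (0.3161, 2.99, 800, 0.5216, 0.8313),
    (0.2491, 3.17, 850, 0.5075, 0.8332),
    (0.3027, 3.36, 900, 0.5142, 0.8313),
    (0.3070, 3.54, 950, 0.5031, 0.8360),
    (0.3338, 3.73, 1000, 0.5035, 0.8351),
]


def steps_per_epoch(rows):
    """Estimate steps per epoch from the last logged row (epochs are rounded)."""
    _, epoch, step, _, _ = rows[-1]
    return round(step / epoch)


def best_accuracy_steps(rows):
    """Return every step that reaches the highest logged accuracy."""
    best = max(row[4] for row in rows)
    return [row[2] for row in rows if row[4] == best]


def lowest_loss_step(rows):
    """Return the step with the lowest logged validation loss."""
    return min(rows, key=lambda row: row[3])[2]


if __name__ == "__main__":
    print("steps per epoch (estimate):", steps_per_epoch(ROWS))
    print("best accuracy at steps:", best_accuracy_steps(ROWS))
    print("lowest validation loss at step:", lowest_loss_step(ROWS))
```

Because the epoch column is rounded to two decimals, the steps-per-epoch figure is only an estimate.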
Framework versions
- Transformers 4.34.1
- Pytorch 2.1.0+cu118
- Datasets 2.14.6
- Tokenizers 0.14.1
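
To check whether a local environment matches the versions listed above, a short sketch using the standard library's `importlib.metadata` may help. The helper name `compare_versions` is hypothetical, and the mapping assumes the usual PyPI distribution names (`torch` for PyTorch).

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in the model card, keyed by assumed PyPI distribution name.
EXPECTED = {
    "transformers": "4.34.1",
    "torch": "2.1.0+cu118",
    "datasets": "2.14.6",
    "tokenizers": "0.14.1",
}


def compare_versions(expected=EXPECTED):
    """Map each package to (wanted, installed, matches).

    installed is None when the package is not installed.
    """
    report = {}
    for package, wanted in expected.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None
        report[package] = (wanted, installed, installed == wanted)
    return report


if __name__ == "__main__":
    for name, (wanted, installed, ok) in compare_versions().items():
        status = "ok" if ok else "differs"
        print(f"{name}: wanted {wanted}, installed {installed} ({status})")
```

A mismatch does not necessarily mean the model will fail to load; the check only reports differences from the environment recorded in the card.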