# 20230824043245
This model is a fine-tuned version of [bert-large-cased](https://huggingface.co/bert-large-cased) on the super_glue dataset. It achieves the following results on the evaluation set (a minimal inference sketch follows the results):
- Loss: 0.6512
- Accuracy: 0.7473
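
Since the usage sections below are still unfilled, here is a minimal inference sketch. It assumes the checkpoint is a standard two-label sequence-classification head and that it can be loaded under a repo id or local path named after this card; the card does not state the actual repo id, which SuperGLUE subset was used, or the label meanings.

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# "20230824043245" is a placeholder taken from this card's title;
# adjust it to the actual Hub repo id or local path of the fine-tuned weights.
model_id = "20230824043245"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

# Most SuperGLUE tasks are sentence-pair classification; this premise/hypothesis
# pair is purely illustrative, since the card does not name the subset or labels.
inputs = tokenizer(
    "The cat sat on the mat.",
    "There is a cat on the mat.",
    return_tensors="pt",
    truncation=True,
)
with torch.no_grad():
    probs = model(**inputs).logits.softmax(dim=-1)
print(probs)
```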
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a sketch of the equivalent `TrainingArguments` follows the list):
- learning_rate: 0.003
- train_batch_size: 4
- eval_batch_size: 8
- seed: 11
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 60.0
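
The snippet below mirrors the hyperparameters listed above as a `TrainingArguments` sketch; the actual training script is not included in this card, so `output_dir` and the surrounding `Trainer` setup are assumptions.

```python
from transformers import TrainingArguments

# Mirrors the hyperparameters listed above; output_dir is a placeholder, and the
# Adam betas/epsilon are passed explicitly even though they match the defaults.
training_args = TrainingArguments(
    output_dir="./bert-large-cased-super_glue",  # placeholder
    learning_rate=3e-3,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=60.0,
)
```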
### Training results
Training Loss | Epoch | Step | Validation Loss | Accuracy |
---|---|---|---|---|
1.0514 | 1.0 | 623 | 0.7220 | 0.5054 |
0.8415 | 2.0 | 1246 | 0.6761 | 0.5415 |
0.925 | 3.0 | 1869 | 0.7140 | 0.5126 |
0.8783 | 4.0 | 2492 | 0.6604 | 0.6245 |
0.7907 | 5.0 | 3115 | 0.6059 | 0.6787 |
0.7756 | 6.0 | 3738 | 0.6058 | 0.6931 |
0.7308 | 7.0 | 4361 | 1.0272 | 0.6173 |
0.7169 | 8.0 | 4984 | 0.7565 | 0.6679 |
0.689 | 9.0 | 5607 | 0.6401 | 0.7004 |
0.6368 | 10.0 | 6230 | 0.6674 | 0.7256 |
0.5682 | 11.0 | 6853 | 0.5540 | 0.7148 |
0.5974 | 12.0 | 7476 | 0.6804 | 0.7473 |
0.5286 | 13.0 | 8099 | 0.5929 | 0.7401 |
0.5348 | 14.0 | 8722 | 0.7100 | 0.7220 |
0.4956 | 15.0 | 9345 | 0.5456 | 0.7184 |
0.4654 | 16.0 | 9968 | 0.6426 | 0.7112 |
0.4273 | 17.0 | 10591 | 0.6307 | 0.7365 |
0.4259 | 18.0 | 11214 | 0.5385 | 0.7365 |
0.4454 | 19.0 | 11837 | 0.6045 | 0.7437 |
0.4176 | 20.0 | 12460 | 0.7234 | 0.7401 |
0.3953 | 21.0 | 13083 | 0.6217 | 0.7437 |
0.3847 | 22.0 | 13706 | 0.6348 | 0.7437 |
0.3717 | 23.0 | 14329 | 0.8536 | 0.7148 |
0.3512 | 24.0 | 14952 | 0.5710 | 0.7509 |
0.3237 | 25.0 | 15575 | 0.5594 | 0.7437 |
0.3102 | 26.0 | 16198 | 0.7130 | 0.7581 |
0.3302 | 27.0 | 16821 | 0.6404 | 0.7653 |
0.3066 | 28.0 | 17444 | 0.6608 | 0.7473 |
0.305 | 29.0 | 18067 | 0.6181 | 0.7617 |
0.2894 | 30.0 | 18690 | 0.7626 | 0.7329 |
0.2891 | 31.0 | 19313 | 0.6387 | 0.7545 |
0.2836 | 32.0 | 19936 | 0.5889 | 0.7437 |
0.2682 | 33.0 | 20559 | 0.7169 | 0.7473 |
0.2625 | 34.0 | 21182 | 0.6298 | 0.7617 |
0.246 | 35.0 | 21805 | 0.6207 | 0.7617 |
0.266 | 36.0 | 22428 | 0.6256 | 0.7473 |
0.2398 | 37.0 | 23051 | 0.7504 | 0.7617 |
0.2526 | 38.0 | 23674 | 0.6578 | 0.7473 |
0.2165 | 39.0 | 24297 | 0.6624 | 0.7617 |
0.2347 | 40.0 | 24920 | 0.6133 | 0.7365 |
0.2296 | 41.0 | 25543 | 0.6224 | 0.7509 |
0.2226 | 42.0 | 26166 | 0.6971 | 0.7473 |
0.2214 | 43.0 | 26789 | 0.6280 | 0.7509 |
0.2268 | 44.0 | 27412 | 0.6562 | 0.7473 |
0.2244 | 45.0 | 28035 | 0.6726 | 0.7509 |
0.2067 | 46.0 | 28658 | 0.6554 | 0.7581 |
0.1971 | 47.0 | 29281 | 0.5949 | 0.7581 |
0.2135 | 48.0 | 29904 | 0.6618 | 0.7437 |
0.2012 | 49.0 | 30527 | 0.6752 | 0.7581 |
0.1882 | 50.0 | 31150 | 0.6223 | 0.7581 |
0.2056 | 51.0 | 31773 | 0.6487 | 0.7473 |
0.1993 | 52.0 | 32396 | 0.6544 | 0.7509 |
0.197 | 53.0 | 33019 | 0.6673 | 0.7401 |
0.1867 | 54.0 | 33642 | 0.6563 | 0.7437 |
0.1715 | 55.0 | 34265 | 0.6780 | 0.7401 |
0.1787 | 56.0 | 34888 | 0.6906 | 0.7329 |
0.19 | 57.0 | 35511 | 0.6606 | 0.7437 |
0.1819 | 58.0 | 36134 | 0.6461 | 0.7437 |
0.1879 | 59.0 | 36757 | 0.6516 | 0.7437 |
0.1773 | 60.0 | 37380 | 0.6512 | 0.7473 |
### Framework versions
- Transformers 4.26.1
- Pytorch 2.0.1+cu118
- Datasets 2.12.0
- Tokenizers 0.13.3
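
A quick way to check that a local environment matches the versions above (assuming the four packages are importable; patch versions may differ):

```python
import datasets
import tokenizers
import torch
import transformers

# Expected per this card: Transformers 4.26.1, PyTorch 2.0.1+cu118,
# Datasets 2.12.0, Tokenizers 0.13.3.
print("transformers", transformers.__version__)
print("torch", torch.__version__)
print("datasets", datasets.__version__)
print("tokenizers", tokenizers.__version__)
```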