# distilbert_sa_GLUE_Experiment_stsb_96
This model is a fine-tuned version of [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased) on the GLUE STSB (Semantic Textual Similarity Benchmark) dataset. It achieves the following results on the evaluation set (a brief usage sketch follows the results):
- Loss: 2.2501
- Pearson: 0.0145
- Spearmanr: 0.0164
- Combined Score: 0.0154
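
A minimal sketch of scoring a sentence pair with this checkpoint, assuming the model is available under this name on the Hub or as a local directory (the path below is a placeholder, not a confirmed repo ID):

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_path = "distilbert_sa_GLUE_Experiment_stsb_96"  # placeholder path (assumption)

tokenizer = AutoTokenizer.from_pretrained(model_path)
# STSB is a regression task, so the checkpoint carries a single-output regression head.
model = AutoModelForSequenceClassification.from_pretrained(model_path)

inputs = tokenizer(
    "A man is playing a guitar.",
    "A person plays an instrument.",
    return_tensors="pt",
    truncation=True,
)
with torch.no_grad():
    score = model(**inputs).logits.squeeze().item()  # predicted similarity score
print(score)
```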
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (a configuration sketch follows the list):
- learning_rate: 5e-05
- train_batch_size: 256
- eval_batch_size: 256
- seed: 10
- distributed_type: multi-GPU
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 50
- mixed_precision_training: Native AMP
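
A minimal `TrainingArguments` sketch mirroring the hyperparameters above; this is not the original training script, and the output directory and per-device batch-size split across GPUs are assumptions:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="distilbert_sa_GLUE_Experiment_stsb_96",  # placeholder output directory
    learning_rate=5e-5,
    per_device_train_batch_size=256,  # the card reports a total batch size of 256; adjust for the number of GPUs
    per_device_eval_batch_size=256,
    seed=10,
    num_train_epochs=50,
    lr_scheduler_type="linear",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    fp16=True,  # Native AMP mixed precision
    evaluation_strategy="epoch",  # the results table reports one validation pass per epoch
)
```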
### Training results
| Training Loss | Epoch | Step | Validation Loss | Pearson | Spearmanr | Combined Score |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:---------:|:--------------:|
| 8.5764        | 1.0   | 23   | 6.5600          | -0.0093 | -0.0112   | -0.0102        |
| 7.7973        | 2.0   | 46   | 6.1824          | 0.0235  | 0.0229    | 0.0232         |
| 7.3288        | 3.0   | 69   | 5.7819          | -0.0634 | -0.0621   | -0.0628        |
| 6.8588        | 4.0   | 92   | 5.3627          | nan     | nan       | nan            |
| 6.3722        | 5.0   | 115  | 4.9405          | nan     | nan       | nan            |
| 5.8419        | 6.0   | 138  | 4.5257          | 0.0099  | 0.0107    | 0.0103         |
| 5.3405        | 7.0   | 161  | 4.1302          | nan     | nan       | nan            |
| 4.8794        | 8.0   | 184  | 3.7607          | nan     | nan       | nan            |
| 4.4156        | 9.0   | 207  | 3.4218          | -0.0075 | -0.0067   | -0.0071        |
| 3.991         | 10.0  | 230  | 3.1190          | 0.0246  | 0.0246    | 0.0246         |
| 3.6029        | 11.0  | 253  | 2.8558          | -0.0034 | -0.0006   | -0.0020        |
| 3.2636        | 12.0  | 276  | 2.6377          | nan     | nan       | nan            |
| 2.9656        | 13.0  | 299  | 2.4660          | 0.0137  | 0.0129    | 0.0133         |
| 2.7028        | 14.0  | 322  | 2.3432          | nan     | nan       | nan            |
| 2.4851        | 15.0  | 345  | 2.2710          | 0.0132  | 0.0145    | 0.0138         |
| 2.3576        | 16.0  | 368  | 2.2501          | 0.0145  | 0.0164    | 0.0154         |
| 2.2531        | 17.0  | 391  | 2.2773          | nan     | nan       | nan            |
| 2.2045        | 18.0  | 414  | 2.3342          | -0.0082 | -0.0113   | -0.0098        |
| 2.1967        | 19.0  | 437  | 2.3460          | nan     | nan       | nan            |
| 2.2041        | 20.0  | 460  | 2.3556          | -0.0025 | -0.0010   | -0.0017        |
| 2.1816        | 21.0  | 483  | 2.3715          | 0.0142  | 0.0160    | 0.0151         |
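
The reported Combined Score appears to be the mean of the Pearson and Spearman correlations (e.g. (0.0145 + 0.0164) / 2 ≈ 0.0154). A hedged sketch of that metric computation, using hypothetical prediction and reference arrays:

```python
from scipy.stats import pearsonr, spearmanr

predictions = [4.8, 2.1, 0.3, 3.9]   # hypothetical model outputs
references  = [5.0, 1.8, 0.0, 4.2]   # hypothetical gold similarity labels

pearson = pearsonr(predictions, references)[0]
spearman = spearmanr(predictions, references)[0]
combined = (pearson + spearman) / 2.0  # combined score = mean of the two correlations
print(f"pearson={pearson:.4f} spearmanr={spearman:.4f} combined={combined:.4f}")
```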
### Framework versions
- Transformers 4.26.0
- Pytorch 1.14.0a0+410ce96
- Datasets 2.8.0
- Tokenizers 0.13.2