<!-- This model card has been generated automatically according to the information the Trainer had access to. You should probably proofread and complete it, then remove this comment. -->
# Glue_distilbert
This model is a fine-tuned version of [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased) on the GLUE MRPC (Microsoft Research Paraphrase Corpus) dataset. It achieves the following results on the evaluation set:
- Loss: 1.1042
- Accuracy: 0.8505
- F1: 0.8961
- Combined Score: 0.8733
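The combined score appears to be the simple mean of accuracy and F1, a common convention in GLUE evaluation scripts (an assumption; the card does not define it). A quick check against the reported numbers:

```python
# Sketch: check that the reported combined score is the mean of accuracy
# and F1 (assumed convention; not stated explicitly in this card).
accuracy = 0.8505
f1 = 0.8961

combined = (accuracy + f1) / 2
print(round(combined, 4))  # 0.8733, matching the reported Combined Score
```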
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 32
- eval_batch_size: 32
- seed: 33
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 50
- mixed_precision_training: Native AMP
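As a sketch of what the `linear` scheduler implies, assuming no warmup (warmup steps are not listed in this card): the learning rate decays linearly from 5e-05 at step 0 to zero at the final step. With 115 steps per epoch and 50 epochs (see the results table), that is 5750 total steps.

```python
# Sketch of a linear learning-rate schedule with no warmup.
# (Warmup is not mentioned in the card, so zero warmup is an assumption.)
BASE_LR = 5e-05
TOTAL_STEPS = 5750  # 50 epochs x 115 optimizer steps per epoch

def linear_lr(step: int) -> float:
    """Learning rate after `step` optimizer steps."""
    return BASE_LR * max(0.0, 1.0 - step / TOTAL_STEPS)

print(linear_lr(0))     # base rate, 5e-05
print(linear_lr(2875))  # half the base rate (2.5e-05) at the midpoint
print(linear_lr(5750))  # 0.0 at the end of training
```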
### Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 | Combined Score |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:--:|:--------------:|
| 0.5066 | 1.0 | 115 | 0.3833 | 0.8358 | 0.8851 | 0.8604 |
| 0.3227 | 2.0 | 230 | 0.4336 | 0.8309 | 0.8844 | 0.8577 |
| 0.1764 | 3.0 | 345 | 0.4943 | 0.8309 | 0.8757 | 0.8533 |
| 0.0792 | 4.0 | 460 | 0.7271 | 0.8431 | 0.8861 | 0.8646 |
| 0.058 | 5.0 | 575 | 0.8374 | 0.8456 | 0.8945 | 0.8700 |
| 0.0594 | 6.0 | 690 | 0.7570 | 0.8309 | 0.8816 | 0.8563 |
| 0.0395 | 7.0 | 805 | 0.8640 | 0.8431 | 0.8897 | 0.8664 |
| 0.03 | 8.0 | 920 | 0.9007 | 0.8260 | 0.8799 | 0.8529 |
| 0.0283 | 9.0 | 1035 | 0.9479 | 0.8211 | 0.8685 | 0.8448 |
| 0.0127 | 10.0 | 1150 | 1.0686 | 0.8431 | 0.8915 | 0.8673 |
| 0.0097 | 11.0 | 1265 | 1.0752 | 0.8431 | 0.8919 | 0.8675 |
| 0.0164 | 12.0 | 1380 | 1.0627 | 0.8284 | 0.8801 | 0.8543 |
| 0.007 | 13.0 | 1495 | 1.1466 | 0.8333 | 0.8815 | 0.8574 |
| 0.0132 | 14.0 | 1610 | 1.1442 | 0.8456 | 0.8938 | 0.8697 |
| 0.0125 | 15.0 | 1725 | 1.1716 | 0.8235 | 0.8771 | 0.8503 |
| 0.0174 | 16.0 | 1840 | 1.1187 | 0.8333 | 0.8790 | 0.8562 |
| 0.0171 | 17.0 | 1955 | 1.1053 | 0.8456 | 0.8938 | 0.8697 |
| 0.0026 | 18.0 | 2070 | 1.2011 | 0.8309 | 0.8787 | 0.8548 |
| 0.0056 | 19.0 | 2185 | 1.3085 | 0.8260 | 0.8748 | 0.8504 |
| 0.0067 | 20.0 | 2300 | 1.3042 | 0.8333 | 0.8803 | 0.8568 |
| 0.0129 | 21.0 | 2415 | 1.1042 | 0.8505 | 0.8961 | 0.8733 |
| 0.0149 | 22.0 | 2530 | 1.1575 | 0.8235 | 0.8820 | 0.8527 |
| 0.0045 | 23.0 | 2645 | 1.2359 | 0.8407 | 0.8900 | 0.8654 |
| 0.0029 | 24.0 | 2760 | 1.3823 | 0.8211 | 0.8744 | 0.8477 |
| 0.0074 | 25.0 | 2875 | 1.2394 | 0.8431 | 0.8904 | 0.8668 |
| 0.002 | 26.0 | 2990 | 1.4450 | 0.8333 | 0.8859 | 0.8596 |
| 0.0039 | 27.0 | 3105 | 1.5102 | 0.8284 | 0.8805 | 0.8545 |
| 0.0015 | 28.0 | 3220 | 1.4767 | 0.8431 | 0.8915 | 0.8673 |
| 0.0062 | 29.0 | 3335 | 1.5101 | 0.8407 | 0.8926 | 0.8666 |
| 0.0054 | 30.0 | 3450 | 1.3861 | 0.8382 | 0.8893 | 0.8637 |
| 0.0001 | 31.0 | 3565 | 1.4101 | 0.8456 | 0.8948 | 0.8702 |
| 0.0 | 32.0 | 3680 | 1.4203 | 0.8480 | 0.8963 | 0.8722 |
| 0.002 | 33.0 | 3795 | 1.4526 | 0.8431 | 0.8923 | 0.8677 |
| 0.0019 | 34.0 | 3910 | 1.6265 | 0.8260 | 0.8842 | 0.8551 |
| 0.0029 | 35.0 | 4025 | 1.4788 | 0.8456 | 0.8945 | 0.8700 |
| 0.0 | 36.0 | 4140 | 1.4668 | 0.8480 | 0.8956 | 0.8718 |
| 0.0007 | 37.0 | 4255 | 1.5248 | 0.8456 | 0.8945 | 0.8700 |
| 0.0 | 38.0 | 4370 | 1.5202 | 0.8480 | 0.8960 | 0.8720 |
| 0.0033 | 39.0 | 4485 | 1.5541 | 0.8358 | 0.8878 | 0.8618 |
| 0.0017 | 40.0 | 4600 | 1.5097 | 0.8407 | 0.8904 | 0.8655 |
| 0.0 | 41.0 | 4715 | 1.5301 | 0.8407 | 0.8904 | 0.8655 |
| 0.0 | 42.0 | 4830 | 1.4974 | 0.8407 | 0.8862 | 0.8634 |
| 0.0018 | 43.0 | 4945 | 1.5483 | 0.8382 | 0.8896 | 0.8639 |
| 0.0 | 44.0 | 5060 | 1.5071 | 0.8480 | 0.8931 | 0.8706 |
| 0.0 | 45.0 | 5175 | 1.5104 | 0.8480 | 0.8935 | 0.8708 |
| 0.0011 | 46.0 | 5290 | 1.5445 | 0.8382 | 0.8896 | 0.8639 |
| 0.0012 | 47.0 | 5405 | 1.5378 | 0.8431 | 0.8900 | 0.8666 |
| 0.0 | 48.0 | 5520 | 1.5577 | 0.8407 | 0.8881 | 0.8644 |
| 0.0009 | 49.0 | 5635 | 1.5431 | 0.8407 | 0.8885 | 0.8646 |
| 0.0002 | 50.0 | 5750 | 1.5383 | 0.8431 | 0.8904 | 0.8668 |
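The step counts in the table are consistent with the listed batch size: the public GLUE MRPC training split contains 3,668 sentence pairs (an external fact about the dataset, not stated in this card), which at a batch size of 32 gives 115 optimizer steps per epoch, matching the 115-step increments above. A quick sanity check:

```python
import math

# Sanity check: steps per epoch implied by the batch size vs. the table.
# The MRPC train-split size (3668) is taken from the public GLUE dataset
# and is an assumption here, since this card does not state it.
TRAIN_EXAMPLES = 3668
BATCH_SIZE = 32
EPOCHS = 50

steps_per_epoch = math.ceil(TRAIN_EXAMPLES / BATCH_SIZE)
print(steps_per_epoch)           # 115, as in the table
print(steps_per_epoch * EPOCHS)  # 5750 total steps after 50 epochs
```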
### Framework versions
- Transformers 4.25.1
- Pytorch 1.13.0+cu116
- Datasets 2.8.0
- Tokenizers 0.13.2