# gpt2_winobias_finetuned
This model is a fine-tuned version of [gpt2](https://huggingface.co/gpt2) on an unspecified dataset (the model name suggests the WinoBias coreference-bias data). It achieves the following results on the evaluation set:
- Loss: 2.4430
- Accuracy: 0.7153
- Tp (true positives, as a fraction of the evaluation set): 0.3725
- Tn (true negatives): 0.3428
- Fp (false positives): 0.1572
- Fn (false negatives): 0.1275
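
A minimal usage sketch, assuming the checkpoint is published under the hypothetical ID `gpt2_winobias_finetuned` and exposes a binary sequence-classification head (the Tp/Tn/Fp/Fn metrics above suggest binary classification; the label semantics are not documented in this card):

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Hypothetical model ID; replace with the actual Hub repo name or a local path.
model_id = "gpt2_winobias_finetuned"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# GPT-2 has no pad token by default; reuse EOS in case padding is needed.
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
    model.config.pad_token_id = tokenizer.eos_token_id

# Example sentence in the WinoBias style; what the two classes mean is not
# specified in this card.
text = "The physician hired the secretary because he was overwhelmed with clients."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.softmax(dim=-1))
```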
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 64
- eval_batch_size: 64
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 50
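
A minimal sketch of how these hyperparameters map onto `transformers.TrainingArguments`; the output directory is a placeholder, and `eval_steps=20` is inferred from the evaluation cadence in the results table below rather than stated in this card:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="gpt2_winobias_finetuned",  # placeholder output path
    learning_rate=1e-4,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=50,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    evaluation_strategy="steps",  # evaluate every eval_steps
    eval_steps=20,                # matches the step column in the table below
)
```

These arguments would then be passed to a `Trainer` together with the model, datasets, and a metrics callback, none of which are specified here.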
### Training results
Training Loss | Epoch | Step | Validation Loss | Accuracy | Tp | Tn | Fp | Fn |
---|---|---|---|---|---|---|---|---|
0.7128 | 0.8 | 20 | 0.7014 | 0.5 | 0.0 | 0.5 | 0.0 | 0.5 |
0.7384 | 1.6 | 40 | 0.7479 | 0.5 | 0.0 | 0.5 | 0.0 | 0.5 |
0.7142 | 2.4 | 60 | 0.7035 | 0.5 | 0.0 | 0.5 | 0.0 | 0.5 |
0.7004 | 3.2 | 80 | 0.7548 | 0.5 | 0.5 | 0.0 | 0.5 | 0.0 |
0.7353 | 4.0 | 100 | 0.7191 | 0.5 | 0.0 | 0.5 | 0.0 | 0.5 |
0.7041 | 4.8 | 120 | 0.7120 | 0.5 | 0.5 | 0.0 | 0.5 | 0.0 |
0.7012 | 5.6 | 140 | 0.7019 | 0.5 | 0.0 | 0.5 | 0.0 | 0.5 |
0.695 | 6.4 | 160 | 0.7264 | 0.5 | 0.0 | 0.5 | 0.0 | 0.5 |
0.7069 | 7.2 | 180 | 0.6932 | 0.5 | 0.5 | 0.0 | 0.5 | 0.0 |
0.7208 | 8.0 | 200 | 0.7370 | 0.5 | 0.5 | 0.0 | 0.5 | 0.0 |
0.7203 | 8.8 | 220 | 0.6935 | 0.5 | 0.0 | 0.5 | 0.0 | 0.5 |
0.6968 | 9.6 | 240 | 0.6944 | 0.5 | 0.5 | 0.0 | 0.5 | 0.0 |
0.7162 | 10.4 | 260 | 0.7056 | 0.5 | 0.5 | 0.0 | 0.5 | 0.0 |
0.6966 | 11.2 | 280 | 0.6942 | 0.5 | 0.0 | 0.5 | 0.0 | 0.5 |
0.705 | 12.0 | 300 | 0.6963 | 0.5 | 0.0 | 0.5 | 0.0 | 0.5 |
0.7007 | 12.8 | 320 | 0.7010 | 0.4956 | 0.4716 | 0.0240 | 0.4760 | 0.0284 |
0.7039 | 13.6 | 340 | 0.6973 | 0.4937 | 0.4697 | 0.0240 | 0.4760 | 0.0303 |
0.6864 | 14.4 | 360 | 0.7336 | 0.4937 | 0.3946 | 0.0991 | 0.4009 | 0.1054 |
0.7068 | 15.2 | 380 | 0.7135 | 0.4987 | 0.1073 | 0.3914 | 0.1086 | 0.3927 |
0.6626 | 16.0 | 400 | 0.7278 | 0.5019 | 0.0259 | 0.4760 | 0.0240 | 0.4741 |
0.6502 | 16.8 | 420 | 0.7612 | 0.5025 | 0.2879 | 0.2146 | 0.2854 | 0.2121 |
0.5627 | 17.6 | 440 | 0.7975 | 0.5404 | 0.1932 | 0.3472 | 0.1528 | 0.3068 |
0.5473 | 18.4 | 460 | 0.7218 | 0.5972 | 0.3005 | 0.2967 | 0.2033 | 0.1995 |
0.4772 | 19.2 | 480 | 0.7505 | 0.6490 | 0.3529 | 0.2961 | 0.2039 | 0.1471 |
0.388 | 20.0 | 500 | 0.7515 | 0.6812 | 0.3624 | 0.3188 | 0.1812 | 0.1376 |
0.3102 | 20.8 | 520 | 0.9149 | 0.6894 | 0.3819 | 0.3074 | 0.1926 | 0.1181 |
0.2433 | 21.6 | 540 | 0.7770 | 0.7020 | 0.3630 | 0.3390 | 0.1610 | 0.1370 |
0.2379 | 22.4 | 560 | 0.9499 | 0.7102 | 0.3422 | 0.3681 | 0.1319 | 0.1578 |
0.1669 | 23.2 | 580 | 0.9379 | 0.7077 | 0.3794 | 0.3283 | 0.1717 | 0.1206 |
0.1622 | 24.0 | 600 | 0.9364 | 0.7077 | 0.3902 | 0.3176 | 0.1824 | 0.1098 |
0.1455 | 24.8 | 620 | 1.1195 | 0.6970 | 0.3359 | 0.3611 | 0.1389 | 0.1641 |
0.114 | 25.6 | 640 | 1.1392 | 0.7102 | 0.3580 | 0.3523 | 0.1477 | 0.1420 |
0.0714 | 26.4 | 660 | 1.4233 | 0.7165 | 0.3447 | 0.3718 | 0.1282 | 0.1553 |
0.0739 | 27.2 | 680 | 1.5302 | 0.7159 | 0.3567 | 0.3592 | 0.1408 | 0.1433 |
0.0774 | 28.0 | 700 | 2.3741 | 0.7096 | 0.3636 | 0.3460 | 0.1540 | 0.1364 |
0.0672 | 28.8 | 720 | 1.3433 | 0.7096 | 0.3567 | 0.3529 | 0.1471 | 0.1433 |
0.0616 | 29.6 | 740 | 1.6716 | 0.7172 | 0.3718 | 0.3453 | 0.1547 | 0.1282 |
0.0432 | 30.4 | 760 | 1.2928 | 0.7109 | 0.3561 | 0.3548 | 0.1452 | 0.1439 |
0.0417 | 31.2 | 780 | 1.6960 | 0.7052 | 0.3447 | 0.3605 | 0.1395 | 0.1553 |
0.0538 | 32.0 | 800 | 1.8484 | 0.7102 | 0.3516 | 0.3586 | 0.1414 | 0.1484 |
0.0513 | 32.8 | 820 | 2.0704 | 0.6963 | 0.3485 | 0.3479 | 0.1521 | 0.1515 |
0.0428 | 33.6 | 840 | 1.8172 | 0.7090 | 0.3630 | 0.3460 | 0.1540 | 0.1370 |
0.0317 | 34.4 | 860 | 1.8815 | 0.7121 | 0.3674 | 0.3447 | 0.1553 | 0.1326 |
0.0375 | 35.2 | 880 | 1.7032 | 0.7121 | 0.3561 | 0.3561 | 0.1439 | 0.1439 |
0.032 | 36.0 | 900 | 2.1972 | 0.7128 | 0.3573 | 0.3554 | 0.1446 | 0.1427 |
0.0173 | 36.8 | 920 | 2.2502 | 0.7165 | 0.3662 | 0.3504 | 0.1496 | 0.1338 |
0.0117 | 37.6 | 940 | 2.1330 | 0.7184 | 0.3687 | 0.3497 | 0.1503 | 0.1313 |
0.0206 | 38.4 | 960 | 2.0618 | 0.7191 | 0.3725 | 0.3466 | 0.1534 | 0.1275 |
0.0146 | 39.2 | 980 | 1.9688 | 0.7172 | 0.3592 | 0.3580 | 0.1420 | 0.1408 |
0.0108 | 40.0 | 1000 | 2.0846 | 0.7153 | 0.3592 | 0.3561 | 0.1439 | 0.1408 |
0.0131 | 40.8 | 1020 | 2.3518 | 0.7140 | 0.3592 | 0.3548 | 0.1452 | 0.1408 |
0.0095 | 41.6 | 1040 | 2.5874 | 0.7197 | 0.3712 | 0.3485 | 0.1515 | 0.1288 |
0.0331 | 42.4 | 1060 | 2.5151 | 0.7159 | 0.3649 | 0.3510 | 0.1490 | 0.1351 |
0.0037 | 43.2 | 1080 | 2.3016 | 0.7153 | 0.3643 | 0.3510 | 0.1490 | 0.1357 |
0.0212 | 44.0 | 1100 | 2.1693 | 0.7121 | 0.3554 | 0.3567 | 0.1433 | 0.1446 |
0.0109 | 44.8 | 1120 | 2.1769 | 0.7134 | 0.3580 | 0.3554 | 0.1446 | 0.1420 |
0.0032 | 45.6 | 1140 | 2.2651 | 0.7146 | 0.3649 | 0.3497 | 0.1503 | 0.1351 |
0.0122 | 46.4 | 1160 | 2.3623 | 0.7172 | 0.3712 | 0.3460 | 0.1540 | 0.1288 |
0.0029 | 47.2 | 1180 | 2.4197 | 0.7197 | 0.3763 | 0.3434 | 0.1566 | 0.1237 |
0.0197 | 48.0 | 1200 | 2.4860 | 0.7159 | 0.3718 | 0.3441 | 0.1559 | 0.1282 |
0.0127 | 48.8 | 1220 | 2.4478 | 0.7146 | 0.3725 | 0.3422 | 0.1578 | 0.1275 |
0.0273 | 49.6 | 1240 | 2.4430 | 0.7153 | 0.3725 | 0.3428 | 0.1572 | 0.1275 |
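
In every row above, Accuracy equals Tp + Tn, so Tp/Tn/Fp/Fn appear to be confusion-matrix counts divided by the size of the evaluation set. A minimal sketch of a `compute_metrics` callback that would report metrics in that form (the function and the positive-label convention are assumptions, not taken from this card):

```python
import numpy as np

def compute_metrics(eval_pred):
    """Report accuracy plus confusion-matrix entries as fractions of the eval set."""
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    labels = np.asarray(labels)
    n = len(labels)
    tp = float(np.sum((preds == 1) & (labels == 1))) / n
    tn = float(np.sum((preds == 0) & (labels == 0))) / n
    fp = float(np.sum((preds == 1) & (labels == 0))) / n
    fn = float(np.sum((preds == 0) & (labels == 1))) / n
    return {"accuracy": tp + tn, "tp": tp, "tn": tn, "fp": fp, "fn": fn}
```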
### Framework versions
- Transformers 4.26.1
- PyTorch 1.13.1
- Datasets 2.10.1
- Tokenizers 0.13.2