vit-base_rvl_tobacco_test_entropy_large
This model is a fine-tuned version of jordyvl/vit-base_rvl-cdip. The training dataset was not recorded (the Trainer logged it as `None`). It achieves the following results on the evaluation set:
- Loss: 0.4296
- Accuracy: 0.9
- Brier Loss: 0.1643
- NLL (negative log-likelihood): 1.3751
- F1 Micro: 0.9
- F1 Macro: 0.9013
- ECE (expected calibration error): 0.1074
- AURC (area under the risk-coverage curve): 0.0220
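
The card does not say how the calibration metrics above were computed. As a rough illustration only, the sketch below uses common textbook definitions of the multiclass Brier loss, the negative log-likelihood, and the expected calibration error with equal-width confidence bins. The function names, the 10-bin default, and the toy inputs are assumptions made for this example and are not taken from the training code.

```python
import math


def brier_loss(probs, labels):
    """Mean squared distance between each probability vector and its one-hot label."""
    total = 0.0
    for p, y in zip(probs, labels):
        total += sum((pk - (1.0 if k == y else 0.0)) ** 2 for k, pk in enumerate(p))
    return total / len(labels)


def nll(probs, labels, eps=1e-12):
    """Mean negative log of the probability assigned to the true class."""
    return -sum(math.log(max(p[y], eps)) for p, y in zip(probs, labels)) / len(labels)


def ece(probs, labels, n_bins=10):
    """Expected calibration error with equal-width bins [lo, hi) over confidence."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        conf = max(p)                       # confidence = top predicted probability
        pred = p.index(conf)                # predicted class
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, 1.0 if pred == y else 0.0))
    n = len(labels)
    total = 0.0
    for b in bins:
        if b:
            avg_conf = sum(c for c, _ in b) / len(b)
            acc = sum(a for _, a in b) / len(b)
            total += len(b) / n * abs(acc - avg_conf)  # weight gap by bin size
    return total


# Toy inputs (hypothetical): three samples, three classes.
toy_probs = [[0.85, 0.10, 0.05], [0.15, 0.75, 0.10], [0.30, 0.25, 0.45]]
toy_labels = [0, 1, 0]
print(brier_loss(toy_probs, toy_labels), nll(toy_probs, toy_labels), ece(toy_probs, toy_labels))
```

Other conventions exist (for example, a Brier score averaged over classes, or adaptive ECE bins), so values from this sketch may not match the numbers reported above.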
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 16
- total_train_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 100
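
The hyperparameters above interact in two ways that are easy to miss: the total batch size is the per-device batch size times the accumulation steps, and the warmup ratio is a fraction of the total number of optimizer steps. The sketch below works through both in plain Python. It assumes a single device, takes 1200 (the last step in the results table) as the total number of optimizer steps, and assumes the usual shape of a linear schedule (linear warmup to the peak rate, then linear decay to zero). `linear_lr` is a hypothetical helper written for this illustration, not a library function.

```python
import math

train_batch_size = 4             # per-device batch size from the list above
gradient_accumulation_steps = 16
learning_rate = 2e-5             # peak learning rate
warmup_ratio = 0.1
num_devices = 1                  # assumption: single-device training
total_steps = 1200               # assumption: last step shown in the results table

# 4 * 16 * 1 = 64, matching total_train_batch_size.
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices

# Warmup covers the first 10% of optimizer steps.
warmup_steps = math.ceil(total_steps * warmup_ratio)


def linear_lr(step, total=total_steps, warmup=warmup_steps, peak=learning_rate):
    """Learning rate at an optimizer step: linear warmup, then linear decay to zero."""
    if step < warmup:
        return peak * step / max(1, warmup)
    return peak * max(0.0, (total - step) / max(1, total - warmup))


for step in (0, 60, 120, 660, 1200):
    print(step, linear_lr(step))
```

Under these assumptions the rate peaks at step 120 and is already decaying for most of the run.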
Training results
Training Loss | Epoch | Step | Validation Loss | Accuracy | Brier Loss | NLL | F1 Micro | F1 Macro | ECE | AURC |
---|---|---|---|---|---|---|---|---|---|---|
No log | 0.96 | 12 | 2.5324 | 0.05 | 0.9010 | 18.0477 | 0.0500 | 0.0539 | 0.1568 | 0.9609 |
No log | 2.0 | 25 | 2.4440 | 0.23 | 0.8823 | 11.6657 | 0.23 | 0.1501 | 0.2626 | 0.7837 |
No log | 2.96 | 37 | 2.3026 | 0.435 | 0.8481 | 4.6649 | 0.435 | 0.2723 | 0.4061 | 0.2744 |
No log | 4.0 | 50 | 2.0897 | 0.71 | 0.7862 | 3.1934 | 0.7100 | 0.5915 | 0.5791 | 0.0865 |
No log | 4.96 | 62 | 1.8603 | 0.785 | 0.7065 | 1.8079 | 0.785 | 0.7047 | 0.6067 | 0.0508 |
No log | 6.0 | 75 | 1.6025 | 0.845 | 0.6037 | 1.3384 | 0.845 | 0.7944 | 0.5657 | 0.0393 |
No log | 6.96 | 87 | 1.3819 | 0.885 | 0.5101 | 1.1034 | 0.885 | 0.8660 | 0.5288 | 0.0355 |
No log | 8.0 | 100 | 1.1701 | 0.905 | 0.4187 | 0.9002 | 0.905 | 0.8958 | 0.4765 | 0.0302 |
No log | 8.96 | 112 | 1.0024 | 0.9 | 0.3462 | 0.8889 | 0.9 | 0.8913 | 0.3961 | 0.0259 |
No log | 10.0 | 125 | 0.8487 | 0.91 | 0.2796 | 1.0389 | 0.91 | 0.9067 | 0.3464 | 0.0225 |
No log | 10.96 | 137 | 0.7522 | 0.91 | 0.2395 | 1.0408 | 0.91 | 0.9090 | 0.2930 | 0.0225 |
No log | 12.0 | 150 | 0.6862 | 0.905 | 0.2163 | 1.2027 | 0.905 | 0.9094 | 0.2529 | 0.0250 |
No log | 12.96 | 162 | 0.6437 | 0.895 | 0.2035 | 1.0504 | 0.895 | 0.8998 | 0.2306 | 0.0230 |
No log | 14.0 | 175 | 0.5896 | 0.905 | 0.1851 | 0.8758 | 0.905 | 0.9040 | 0.2122 | 0.0201 |
No log | 14.96 | 187 | 0.5167 | 0.92 | 0.1528 | 0.7812 | 0.92 | 0.9119 | 0.1872 | 0.0143 |
No log | 16.0 | 200 | 0.4894 | 0.935 | 0.1450 | 0.9154 | 0.935 | 0.9314 | 0.1962 | 0.0186 |
No log | 16.96 | 212 | 0.4769 | 0.91 | 0.1451 | 1.1071 | 0.91 | 0.9109 | 0.1703 | 0.0196 |
No log | 18.0 | 225 | 0.4573 | 0.91 | 0.1399 | 1.0919 | 0.91 | 0.9084 | 0.1621 | 0.0182 |
No log | 18.96 | 237 | 0.4501 | 0.905 | 0.1409 | 0.9304 | 0.905 | 0.9056 | 0.1549 | 0.0177 |
No log | 20.0 | 250 | 0.4434 | 0.905 | 0.1393 | 1.2519 | 0.905 | 0.9056 | 0.1552 | 0.0180 |
No log | 20.96 | 262 | 0.4391 | 0.905 | 0.1403 | 1.2476 | 0.905 | 0.9056 | 0.1434 | 0.0178 |
No log | 22.0 | 275 | 0.4326 | 0.905 | 0.1401 | 1.2419 | 0.905 | 0.9056 | 0.1396 | 0.0180 |
No log | 22.96 | 287 | 0.4290 | 0.905 | 0.1407 | 1.2397 | 0.905 | 0.9051 | 0.1414 | 0.0181 |
No log | 24.0 | 300 | 0.4255 | 0.905 | 0.1408 | 1.2373 | 0.905 | 0.9051 | 0.1346 | 0.0182 |
No log | 24.96 | 312 | 0.4238 | 0.905 | 0.1418 | 1.2372 | 0.905 | 0.9051 | 0.1308 | 0.0183 |
No log | 26.0 | 325 | 0.4212 | 0.905 | 0.1423 | 1.2348 | 0.905 | 0.9051 | 0.1288 | 0.0184 |
No log | 26.96 | 337 | 0.4197 | 0.905 | 0.1429 | 1.2350 | 0.905 | 0.9051 | 0.1242 | 0.0187 |
No log | 28.0 | 350 | 0.4181 | 0.905 | 0.1436 | 1.2331 | 0.905 | 0.9051 | 0.1298 | 0.0187 |
No log | 28.96 | 362 | 0.4171 | 0.905 | 0.1443 | 1.2339 | 0.905 | 0.9051 | 0.1341 | 0.0188 |
No log | 30.0 | 375 | 0.4158 | 0.905 | 0.1449 | 1.2322 | 0.905 | 0.9051 | 0.1261 | 0.0190 |
No log | 30.96 | 387 | 0.4156 | 0.905 | 0.1458 | 1.2337 | 0.905 | 0.9051 | 0.1310 | 0.0190 |
No log | 32.0 | 400 | 0.4145 | 0.905 | 0.1463 | 1.2323 | 0.905 | 0.9051 | 0.1244 | 0.0192 |
No log | 32.96 | 412 | 0.4145 | 0.905 | 0.1472 | 1.2342 | 0.905 | 0.9051 | 0.1175 | 0.0193 |
No log | 34.0 | 425 | 0.4140 | 0.905 | 0.1477 | 1.2353 | 0.905 | 0.9051 | 0.1163 | 0.0194 |
No log | 34.96 | 437 | 0.4138 | 0.905 | 0.1485 | 1.2384 | 0.905 | 0.9051 | 0.1296 | 0.0195 |
No log | 36.0 | 450 | 0.4137 | 0.905 | 0.1491 | 1.3855 | 0.905 | 0.9051 | 0.1271 | 0.0195 |
No log | 36.96 | 462 | 0.4134 | 0.905 | 0.1497 | 1.3846 | 0.905 | 0.9051 | 0.1264 | 0.0196 |
No log | 38.0 | 475 | 0.4137 | 0.905 | 0.1504 | 1.3842 | 0.905 | 0.9051 | 0.1254 | 0.0196 |
No log | 38.96 | 487 | 0.4136 | 0.91 | 0.1509 | 1.3835 | 0.91 | 0.9109 | 0.1160 | 0.0196 |
0.5543 | 40.0 | 500 | 0.4140 | 0.91 | 0.1515 | 1.3831 | 0.91 | 0.9109 | 0.1190 | 0.0198 |
0.5543 | 40.96 | 512 | 0.4138 | 0.91 | 0.1519 | 1.3825 | 0.91 | 0.9109 | 0.1186 | 0.0198 |
0.5543 | 42.0 | 525 | 0.4143 | 0.91 | 0.1526 | 1.3822 | 0.91 | 0.9109 | 0.1180 | 0.0198 |
0.5543 | 42.96 | 537 | 0.4143 | 0.91 | 0.1530 | 1.3816 | 0.91 | 0.9109 | 0.1220 | 0.0199 |
0.5543 | 44.0 | 550 | 0.4148 | 0.91 | 0.1536 | 1.3815 | 0.91 | 0.9109 | 0.1214 | 0.0200 |
0.5543 | 44.96 | 562 | 0.4149 | 0.91 | 0.1539 | 1.3809 | 0.91 | 0.9109 | 0.1208 | 0.0200 |
0.5543 | 46.0 | 575 | 0.4154 | 0.91 | 0.1545 | 1.3807 | 0.91 | 0.9109 | 0.1173 | 0.0200 |
0.5543 | 46.96 | 587 | 0.4157 | 0.91 | 0.1549 | 1.3803 | 0.91 | 0.9109 | 0.1084 | 0.0202 |
0.5543 | 48.0 | 600 | 0.4162 | 0.91 | 0.1554 | 1.3801 | 0.91 | 0.9109 | 0.1080 | 0.0202 |
0.5543 | 48.96 | 612 | 0.4163 | 0.91 | 0.1557 | 1.3798 | 0.91 | 0.9109 | 0.1068 | 0.0202 |
0.5543 | 50.0 | 625 | 0.4169 | 0.91 | 0.1562 | 1.3795 | 0.91 | 0.9109 | 0.1066 | 0.0203 |
0.5543 | 50.96 | 637 | 0.4171 | 0.91 | 0.1565 | 1.3793 | 0.91 | 0.9109 | 0.1064 | 0.0203 |
0.5543 | 52.0 | 650 | 0.4177 | 0.91 | 0.1570 | 1.3791 | 0.91 | 0.9109 | 0.1120 | 0.0203 |
0.5543 | 52.96 | 662 | 0.4180 | 0.91 | 0.1573 | 1.3789 | 0.91 | 0.9109 | 0.1117 | 0.0203 |
0.5543 | 54.0 | 675 | 0.4185 | 0.91 | 0.1577 | 1.3786 | 0.91 | 0.9109 | 0.1065 | 0.0204 |
0.5543 | 54.96 | 687 | 0.4187 | 0.91 | 0.1579 | 1.3785 | 0.91 | 0.9109 | 0.1063 | 0.0204 |
0.5543 | 56.0 | 700 | 0.4193 | 0.91 | 0.1584 | 1.3782 | 0.91 | 0.9109 | 0.1062 | 0.0204 |
0.5543 | 56.96 | 712 | 0.4196 | 0.91 | 0.1586 | 1.3782 | 0.91 | 0.9109 | 0.1058 | 0.0206 |
0.5543 | 58.0 | 725 | 0.4200 | 0.91 | 0.1590 | 1.3779 | 0.91 | 0.9109 | 0.1060 | 0.0206 |
0.5543 | 58.96 | 737 | 0.4203 | 0.91 | 0.1592 | 1.3778 | 0.91 | 0.9109 | 0.1095 | 0.0207 |
0.5543 | 60.0 | 750 | 0.4209 | 0.91 | 0.1596 | 1.3776 | 0.91 | 0.9109 | 0.1055 | 0.0209 |
0.5543 | 60.96 | 762 | 0.4213 | 0.91 | 0.1598 | 1.3776 | 0.91 | 0.9109 | 0.1091 | 0.0209 |
0.5543 | 62.0 | 775 | 0.4216 | 0.91 | 0.1601 | 1.3772 | 0.91 | 0.9109 | 0.1019 | 0.0210 |
0.5543 | 62.96 | 787 | 0.4221 | 0.91 | 0.1603 | 1.3773 | 0.91 | 0.9109 | 0.1017 | 0.0211 |
0.5543 | 64.0 | 800 | 0.4225 | 0.905 | 0.1606 | 1.3769 | 0.905 | 0.9064 | 0.0997 | 0.0211 |
0.5543 | 64.96 | 812 | 0.4228 | 0.905 | 0.1608 | 1.3771 | 0.905 | 0.9064 | 0.0995 | 0.0211 |
0.5543 | 66.0 | 825 | 0.4232 | 0.9 | 0.1611 | 1.3766 | 0.9 | 0.9013 | 0.1046 | 0.0212 |
0.5543 | 66.96 | 837 | 0.4236 | 0.9 | 0.1613 | 1.3768 | 0.9 | 0.9013 | 0.1045 | 0.0213 |
0.5543 | 68.0 | 850 | 0.4240 | 0.9 | 0.1615 | 1.3764 | 0.9 | 0.9013 | 0.1026 | 0.0213 |
0.5543 | 68.96 | 862 | 0.4243 | 0.9 | 0.1617 | 1.3765 | 0.9 | 0.9013 | 0.1043 | 0.0213 |
0.5543 | 70.0 | 875 | 0.4247 | 0.9 | 0.1619 | 1.3762 | 0.9 | 0.9013 | 0.1060 | 0.0213 |
0.5543 | 70.96 | 887 | 0.4249 | 0.9 | 0.1620 | 1.3762 | 0.9 | 0.9013 | 0.1077 | 0.0214 |
0.5543 | 72.0 | 900 | 0.4254 | 0.9 | 0.1623 | 1.3760 | 0.9 | 0.9013 | 0.1057 | 0.0214 |
0.5543 | 72.96 | 912 | 0.4257 | 0.9 | 0.1624 | 1.3760 | 0.9 | 0.9013 | 0.1074 | 0.0213 |
0.5543 | 74.0 | 925 | 0.4259 | 0.9 | 0.1625 | 1.3758 | 0.9 | 0.9013 | 0.1056 | 0.0213 |
0.5543 | 74.96 | 937 | 0.4262 | 0.9 | 0.1627 | 1.3758 | 0.9 | 0.9013 | 0.1056 | 0.0214 |
0.5543 | 76.0 | 950 | 0.4266 | 0.9 | 0.1629 | 1.3757 | 0.9 | 0.9013 | 0.1058 | 0.0216 |
0.5543 | 76.96 | 962 | 0.4268 | 0.9 | 0.1630 | 1.3756 | 0.9 | 0.9013 | 0.1057 | 0.0216 |
0.5543 | 78.0 | 975 | 0.4271 | 0.9 | 0.1631 | 1.3755 | 0.9 | 0.9013 | 0.1076 | 0.0216 |
0.5543 | 78.96 | 987 | 0.4274 | 0.9 | 0.1632 | 1.3756 | 0.9 | 0.9013 | 0.1075 | 0.0216 |
0.0526 | 80.0 | 1000 | 0.4275 | 0.9 | 0.1634 | 1.3754 | 0.9 | 0.9013 | 0.1100 | 0.0217 |
0.0526 | 80.96 | 1012 | 0.4278 | 0.9 | 0.1635 | 1.3754 | 0.9 | 0.9013 | 0.1099 | 0.0218 |
0.0526 | 82.0 | 1025 | 0.4280 | 0.9 | 0.1636 | 1.3753 | 0.9 | 0.9013 | 0.1098 | 0.0218 |
0.0526 | 82.96 | 1037 | 0.4282 | 0.9 | 0.1637 | 1.3753 | 0.9 | 0.9013 | 0.1098 | 0.0218 |
0.0526 | 84.0 | 1050 | 0.4284 | 0.9 | 0.1638 | 1.3753 | 0.9 | 0.9013 | 0.1097 | 0.0218 |
0.0526 | 84.96 | 1062 | 0.4286 | 0.9 | 0.1638 | 1.3753 | 0.9 | 0.9013 | 0.1097 | 0.0218 |
0.0526 | 86.0 | 1075 | 0.4288 | 0.9 | 0.1639 | 1.3752 | 0.9 | 0.9013 | 0.1096 | 0.0218 |
0.0526 | 86.96 | 1087 | 0.4289 | 0.9 | 0.1640 | 1.3752 | 0.9 | 0.9013 | 0.1096 | 0.0218 |
0.0526 | 88.0 | 1100 | 0.4291 | 0.9 | 0.1641 | 1.3752 | 0.9 | 0.9013 | 0.1095 | 0.0218 |
0.0526 | 88.96 | 1112 | 0.4292 | 0.9 | 0.1641 | 1.3752 | 0.9 | 0.9013 | 0.1095 | 0.0218 |
0.0526 | 90.0 | 1125 | 0.4293 | 0.9 | 0.1642 | 1.3752 | 0.9 | 0.9013 | 0.1074 | 0.0219 |
0.0526 | 90.96 | 1137 | 0.4294 | 0.9 | 0.1642 | 1.3752 | 0.9 | 0.9013 | 0.1075 | 0.0219 |
0.0526 | 92.0 | 1150 | 0.4295 | 0.9 | 0.1642 | 1.3752 | 0.9 | 0.9013 | 0.1075 | 0.0220 |
0.0526 | 92.96 | 1162 | 0.4295 | 0.9 | 0.1643 | 1.3752 | 0.9 | 0.9013 | 0.1075 | 0.0220 |
0.0526 | 94.0 | 1175 | 0.4296 | 0.9 | 0.1643 | 1.3751 | 0.9 | 0.9013 | 0.1075 | 0.0220 |
0.0526 | 94.96 | 1187 | 0.4296 | 0.9 | 0.1643 | 1.3751 | 0.9 | 0.9013 | 0.1075 | 0.0220 |
0.0526 | 96.0 | 1200 | 0.4296 | 0.9 | 0.1643 | 1.3751 | 0.9 | 0.9013 | 0.1074 | 0.0220 |
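
The table ends at epoch 96.0 even though `num_epochs` is 100. A plausible explanation, inferred from the logged step and epoch values rather than stated anywhere in the card, is that the number of micro-batches per epoch is not a multiple of the accumulation steps, so each epoch contributes only whole optimizer steps. The sketch below reconstructs that arithmetic. The training-set size of 800 is derived from the row where step 25 coincides with epoch 2.0 and should be treated as an assumption.

```python
total_train_batch_size = 64      # from the hyperparameter list
train_batch_size = 4
gradient_accumulation_steps = 16
num_epochs = 100

# Step 25 is logged at epoch 2.0, so 25 * 64 samples span two epochs.
train_size = 25 * total_train_batch_size // 2                            # inferred: 800

micro_batches_per_epoch = train_size // train_batch_size                 # 200
steps_per_epoch = micro_batches_per_epoch // gradient_accumulation_steps # 12 (12.5 floored)
total_steps = steps_per_epoch * num_epochs                               # 1200


def epoch_at(step, size=train_size, batch=total_train_batch_size):
    """Fraction of the training set consumed after a given optimizer step."""
    return step * batch / size


for step in (12, 25, 37, 1200):
    print(step, round(epoch_at(step), 2))
```

Under this reading, the 1200 optimizer steps of the nominal 100 epochs process as many samples as 96 full passes over the data, which matches the final row.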
Framework versions
- Transformers 4.28.0.dev0
- PyTorch 1.12.1+cu113
- Datasets 2.12.0
- Tokenizers 0.12.1