# vit-base_rvl_tobacco_crl
This model is a fine-tuned version of jordyvl/vit-base_rvl-cdip on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 0.5075
- Accuracy: 0.92
- Brier Loss: 0.1544
- NLL (negative log-likelihood): 0.6650
- F1 Micro: 0.92
- F1 Macro: 0.9150
- ECE (expected calibration error): 0.1721
- AURC (area under the risk-coverage curve): 0.0193
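
The checkpoint can be loaded with the Hugging Face Transformers auto classes. The snippet below is a minimal inference sketch; the repository id `jordyvl/vit-base_rvl_tobacco_crl` and the image path `document.png` are assumptions and should be adjusted to your setup.

```python
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModelForImageClassification

# Assumed Hub repository id; adjust to wherever this checkpoint is hosted.
model_id = "jordyvl/vit-base_rvl_tobacco_crl"

processor = AutoImageProcessor.from_pretrained(model_id)
model = AutoModelForImageClassification.from_pretrained(model_id)
model.eval()

# Placeholder path to a scanned document image.
image = Image.open("document.png").convert("RGB")
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

predicted_class = model.config.id2label[logits.argmax(-1).item()]
print(predicted_class)
```
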
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- gradient_accumulation_steps: 16
- total_train_batch_size: 256
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 100
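
For reference, these settings roughly correspond to the `transformers.TrainingArguments` sketch below. It is an approximation of the configuration implied by the list above, not the exact training script; the output directory is a placeholder, and the total train batch size of 256 follows from 16 per-device samples × 16 gradient-accumulation steps.

```python
from transformers import TrainingArguments

# Approximate reconstruction of the hyperparameters listed above.
# output_dir is a placeholder; the Adam betas/epsilon match the stated values.
training_args = TrainingArguments(
    output_dir="vit-base_rvl_tobacco_crl",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=16,  # 16 x 16 = effective train batch size of 256
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=100,
)
```
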
### Training results

Training Loss | Epoch | Step | Validation Loss | Accuracy | Brier Loss | NLL | F1 Micro | F1 Macro | ECE | AURC |
---|---|---|---|---|---|---|---|---|---|---|
No log | 0.96 | 3 | 2.3823 | 0.045 | 0.9050 | 9.6078 | 0.045 | 0.0481 | 0.1570 | 0.9673 |
No log | 1.96 | 6 | 2.3642 | 0.05 | 0.9005 | 8.5700 | 0.0500 | 0.0549 | 0.1567 | 0.9599 |
No log | 2.96 | 9 | 2.3130 | 0.095 | 0.8925 | 6.9490 | 0.095 | 0.0853 | 0.1833 | 0.9127 |
No log | 3.96 | 12 | 2.2603 | 0.265 | 0.8804 | 5.6508 | 0.265 | 0.1642 | 0.2794 | 0.7458 |
No log | 4.96 | 15 | 2.2077 | 0.38 | 0.8637 | 4.0696 | 0.38 | 0.2272 | 0.3548 | 0.4172 |
No log | 5.96 | 18 | 2.1176 | 0.47 | 0.8411 | 2.4954 | 0.47 | 0.3062 | 0.4299 | 0.2410 |
No log | 6.96 | 21 | 2.0268 | 0.64 | 0.8132 | 2.0526 | 0.64 | 0.5126 | 0.5273 | 0.1330 |
No log | 7.96 | 24 | 1.9258 | 0.735 | 0.7792 | 1.7187 | 0.735 | 0.6337 | 0.5870 | 0.0787 |
No log | 8.96 | 27 | 1.8114 | 0.77 | 0.7409 | 1.3797 | 0.7700 | 0.6746 | 0.6034 | 0.0556 |
No log | 9.96 | 30 | 1.7062 | 0.8 | 0.6999 | 1.1402 | 0.8000 | 0.7266 | 0.6005 | 0.0466 |
No log | 10.96 | 33 | 1.5916 | 0.825 | 0.6548 | 0.9516 | 0.825 | 0.7706 | 0.5882 | 0.0427 |
No log | 11.96 | 36 | 1.4855 | 0.86 | 0.6103 | 0.8848 | 0.8600 | 0.8201 | 0.5829 | 0.0388 |
No log | 12.96 | 39 | 1.3944 | 0.87 | 0.5688 | 0.7924 | 0.87 | 0.8361 | 0.5720 | 0.0349 |
No log | 13.96 | 42 | 1.3176 | 0.895 | 0.5326 | 0.6952 | 0.895 | 0.8740 | 0.5576 | 0.0324 |
No log | 14.96 | 45 | 1.2435 | 0.9 | 0.4978 | 0.6632 | 0.9 | 0.8838 | 0.5370 | 0.0293 |
No log | 15.96 | 48 | 1.1760 | 0.915 | 0.4653 | 0.6368 | 0.915 | 0.9034 | 0.5272 | 0.0257 |
No log | 16.96 | 51 | 1.1101 | 0.915 | 0.4338 | 0.6194 | 0.915 | 0.9011 | 0.4963 | 0.0241 |
No log | 17.96 | 54 | 1.0518 | 0.915 | 0.4058 | 0.6131 | 0.915 | 0.9011 | 0.4750 | 0.0231 |
No log | 18.96 | 57 | 1.0011 | 0.915 | 0.3808 | 0.6125 | 0.915 | 0.9011 | 0.4479 | 0.0222 |
No log | 19.96 | 60 | 0.9471 | 0.92 | 0.3566 | 0.5890 | 0.92 | 0.9102 | 0.4353 | 0.0203 |
No log | 20.96 | 63 | 0.8962 | 0.915 | 0.3352 | 0.5856 | 0.915 | 0.9047 | 0.4245 | 0.0185 |
No log | 21.96 | 66 | 0.8635 | 0.92 | 0.3159 | 0.5865 | 0.92 | 0.9115 | 0.3999 | 0.0192 |
No log | 22.96 | 69 | 0.8333 | 0.93 | 0.2987 | 0.5791 | 0.93 | 0.9260 | 0.3917 | 0.0189 |
No log | 23.96 | 72 | 0.8079 | 0.925 | 0.2839 | 0.5871 | 0.925 | 0.9159 | 0.3733 | 0.0173 |
No log | 24.96 | 75 | 0.7644 | 0.93 | 0.2681 | 0.5755 | 0.93 | 0.9233 | 0.3644 | 0.0198 |
No log | 25.96 | 78 | 0.7443 | 0.925 | 0.2567 | 0.5750 | 0.925 | 0.9204 | 0.3419 | 0.0193 |
No log | 26.96 | 81 | 0.7250 | 0.93 | 0.2461 | 0.5722 | 0.93 | 0.9227 | 0.3345 | 0.0176 |
No log | 27.96 | 84 | 0.6988 | 0.93 | 0.2344 | 0.5118 | 0.93 | 0.9227 | 0.3151 | 0.0172 |
No log | 28.96 | 87 | 0.6923 | 0.935 | 0.2272 | 0.5730 | 0.935 | 0.9303 | 0.3162 | 0.0175 |
No log | 29.96 | 90 | 0.6752 | 0.935 | 0.2196 | 0.5646 | 0.935 | 0.9303 | 0.3016 | 0.0179 |
No log | 30.96 | 93 | 0.6576 | 0.93 | 0.2117 | 0.5554 | 0.93 | 0.9227 | 0.2934 | 0.0188 |
No log | 31.96 | 96 | 0.6476 | 0.93 | 0.2073 | 0.5617 | 0.93 | 0.9227 | 0.2867 | 0.0193 |
No log | 32.96 | 99 | 0.6349 | 0.93 | 0.2009 | 0.5648 | 0.93 | 0.9245 | 0.2818 | 0.0178 |
No log | 33.96 | 102 | 0.6195 | 0.92 | 0.1949 | 0.6098 | 0.92 | 0.9140 | 0.2612 | 0.0185 |
No log | 34.96 | 105 | 0.6158 | 0.92 | 0.1921 | 0.6190 | 0.92 | 0.9140 | 0.2659 | 0.0184 |
No log | 35.96 | 108 | 0.6093 | 0.93 | 0.1891 | 0.6182 | 0.93 | 0.9273 | 0.2616 | 0.0187 |
No log | 36.96 | 111 | 0.6007 | 0.925 | 0.1854 | 0.6169 | 0.925 | 0.9170 | 0.2561 | 0.0182 |
No log | 37.96 | 114 | 0.5877 | 0.925 | 0.1815 | 0.5400 | 0.925 | 0.9170 | 0.2575 | 0.0179 |
No log | 38.96 | 117 | 0.5887 | 0.925 | 0.1793 | 0.6079 | 0.925 | 0.9170 | 0.2544 | 0.0188 |
No log | 39.96 | 120 | 0.5865 | 0.915 | 0.1775 | 0.6123 | 0.915 | 0.9107 | 0.2510 | 0.0192 |
No log | 40.96 | 123 | 0.5753 | 0.925 | 0.1738 | 0.5984 | 0.925 | 0.9230 | 0.2323 | 0.0190 |
No log | 41.96 | 126 | 0.5727 | 0.92 | 0.1738 | 0.5394 | 0.92 | 0.9140 | 0.2305 | 0.0184 |
No log | 42.96 | 129 | 0.5644 | 0.92 | 0.1724 | 0.5476 | 0.92 | 0.9140 | 0.2276 | 0.0186 |
No log | 43.96 | 132 | 0.5597 | 0.92 | 0.1703 | 0.6031 | 0.92 | 0.9140 | 0.2285 | 0.0194 |
No log | 44.96 | 135 | 0.5597 | 0.92 | 0.1688 | 0.6026 | 0.92 | 0.9140 | 0.2216 | 0.0187 |
No log | 45.96 | 138 | 0.5580 | 0.925 | 0.1676 | 0.6051 | 0.925 | 0.9170 | 0.2194 | 0.0187 |
No log | 46.96 | 141 | 0.5541 | 0.925 | 0.1658 | 0.6063 | 0.925 | 0.9170 | 0.2252 | 0.0184 |
No log | 47.96 | 144 | 0.5533 | 0.925 | 0.1654 | 0.6153 | 0.925 | 0.9170 | 0.2164 | 0.0183 |
No log | 48.96 | 147 | 0.5464 | 0.925 | 0.1629 | 0.6085 | 0.925 | 0.9170 | 0.2225 | 0.0183 |
No log | 49.96 | 150 | 0.5407 | 0.925 | 0.1612 | 0.5988 | 0.925 | 0.9170 | 0.2187 | 0.0179 |
No log | 50.96 | 153 | 0.5432 | 0.92 | 0.1625 | 0.6095 | 0.92 | 0.9150 | 0.2040 | 0.0177 |
No log | 51.96 | 156 | 0.5425 | 0.915 | 0.1648 | 0.6964 | 0.915 | 0.9118 | 0.1977 | 0.0182 |
No log | 52.96 | 159 | 0.5376 | 0.915 | 0.1623 | 0.6959 | 0.915 | 0.9118 | 0.2129 | 0.0192 |
No log | 53.96 | 162 | 0.5299 | 0.915 | 0.1596 | 0.6710 | 0.915 | 0.9118 | 0.2120 | 0.0194 |
No log | 54.96 | 165 | 0.5240 | 0.92 | 0.1579 | 0.6072 | 0.92 | 0.9150 | 0.2076 | 0.0183 |
No log | 55.96 | 168 | 0.5297 | 0.92 | 0.1583 | 0.6704 | 0.92 | 0.9150 | 0.1997 | 0.0182 |
No log | 56.96 | 171 | 0.5307 | 0.915 | 0.1585 | 0.6782 | 0.915 | 0.9118 | 0.2091 | 0.0187 |
No log | 57.96 | 174 | 0.5257 | 0.925 | 0.1566 | 0.6692 | 0.925 | 0.9180 | 0.1970 | 0.0193 |
No log | 58.96 | 177 | 0.5281 | 0.925 | 0.1576 | 0.6703 | 0.925 | 0.9180 | 0.2007 | 0.0182 |
No log | 59.96 | 180 | 0.5282 | 0.92 | 0.1579 | 0.6690 | 0.92 | 0.9150 | 0.1842 | 0.0185 |
No log | 60.96 | 183 | 0.5212 | 0.92 | 0.1573 | 0.6672 | 0.92 | 0.9150 | 0.1957 | 0.0189 |
No log | 61.96 | 186 | 0.5203 | 0.92 | 0.1554 | 0.6655 | 0.92 | 0.9207 | 0.1918 | 0.0199 |
No log | 62.96 | 189 | 0.5166 | 0.915 | 0.1557 | 0.6689 | 0.915 | 0.9118 | 0.1817 | 0.0195 |
No log | 63.96 | 192 | 0.5168 | 0.915 | 0.1556 | 0.6695 | 0.915 | 0.9118 | 0.1895 | 0.0191 |
No log | 64.96 | 195 | 0.5153 | 0.915 | 0.1547 | 0.6661 | 0.915 | 0.9118 | 0.1879 | 0.0188 |
No log | 65.96 | 198 | 0.5157 | 0.915 | 0.1545 | 0.6665 | 0.915 | 0.9118 | 0.1890 | 0.0191 |
No log | 66.96 | 201 | 0.5181 | 0.915 | 0.1549 | 0.6703 | 0.915 | 0.9118 | 0.1890 | 0.0191 |
No log | 67.96 | 204 | 0.5168 | 0.915 | 0.1542 | 0.6686 | 0.915 | 0.9118 | 0.1882 | 0.0193 |
No log | 68.96 | 207 | 0.5120 | 0.93 | 0.1532 | 0.6643 | 0.93 | 0.9269 | 0.1901 | 0.0195 |
No log | 69.96 | 210 | 0.5091 | 0.92 | 0.1528 | 0.6596 | 0.92 | 0.9150 | 0.1866 | 0.0194 |
No log | 70.96 | 213 | 0.5093 | 0.92 | 0.1526 | 0.6607 | 0.92 | 0.9150 | 0.1847 | 0.0182 |
No log | 71.96 | 216 | 0.5143 | 0.925 | 0.1538 | 0.6675 | 0.925 | 0.9180 | 0.1789 | 0.0180 |
No log | 72.96 | 219 | 0.5145 | 0.925 | 0.1550 | 0.6728 | 0.925 | 0.9180 | 0.1765 | 0.0187 |
No log | 73.96 | 222 | 0.5090 | 0.92 | 0.1540 | 0.6658 | 0.92 | 0.9150 | 0.1904 | 0.0191 |
No log | 74.96 | 225 | 0.5069 | 0.92 | 0.1530 | 0.6606 | 0.92 | 0.9150 | 0.1840 | 0.0189 |
No log | 75.96 | 228 | 0.5051 | 0.92 | 0.1524 | 0.6624 | 0.92 | 0.9150 | 0.1925 | 0.0186 |
No log | 76.96 | 231 | 0.5089 | 0.92 | 0.1539 | 0.6698 | 0.92 | 0.9150 | 0.1759 | 0.0189 |
No log | 77.96 | 234 | 0.5053 | 0.92 | 0.1528 | 0.6647 | 0.92 | 0.9150 | 0.1748 | 0.0188 |
No log | 78.96 | 237 | 0.5028 | 0.92 | 0.1524 | 0.6598 | 0.92 | 0.9150 | 0.1821 | 0.0182 |
No log | 79.96 | 240 | 0.5043 | 0.92 | 0.1527 | 0.6615 | 0.92 | 0.9150 | 0.1810 | 0.0181 |
No log | 80.96 | 243 | 0.5014 | 0.92 | 0.1523 | 0.6622 | 0.92 | 0.9150 | 0.1733 | 0.0184 |
No log | 81.96 | 246 | 0.5035 | 0.92 | 0.1531 | 0.6635 | 0.92 | 0.9150 | 0.1791 | 0.0183 |
No log | 82.96 | 249 | 0.5052 | 0.92 | 0.1538 | 0.6669 | 0.92 | 0.9150 | 0.1799 | 0.0186 |
No log | 83.96 | 252 | 0.5040 | 0.92 | 0.1533 | 0.6640 | 0.92 | 0.9150 | 0.1833 | 0.0188 |
No log | 84.96 | 255 | 0.5008 | 0.92 | 0.1530 | 0.6588 | 0.92 | 0.9150 | 0.1735 | 0.0188 |
No log | 85.96 | 258 | 0.5027 | 0.915 | 0.1538 | 0.6599 | 0.915 | 0.9121 | 0.1751 | 0.0187 |
No log | 86.96 | 261 | 0.5075 | 0.915 | 0.1551 | 0.6661 | 0.915 | 0.9121 | 0.1684 | 0.0187 |
No log | 87.96 | 264 | 0.5107 | 0.92 | 0.1555 | 0.6734 | 0.92 | 0.9150 | 0.1748 | 0.0186 |
No log | 88.96 | 267 | 0.5035 | 0.92 | 0.1534 | 0.6676 | 0.92 | 0.9150 | 0.1810 | 0.0192 |
No log | 89.96 | 270 | 0.5006 | 0.92 | 0.1523 | 0.6624 | 0.92 | 0.9150 | 0.1867 | 0.0200 |
No log | 90.96 | 273 | 0.4984 | 0.92 | 0.1521 | 0.6605 | 0.92 | 0.9150 | 0.1704 | 0.0201 |
No log | 91.96 | 276 | 0.4976 | 0.92 | 0.1518 | 0.6586 | 0.92 | 0.9150 | 0.1702 | 0.0201 |
No log | 92.96 | 279 | 0.4986 | 0.92 | 0.1520 | 0.6584 | 0.92 | 0.9150 | 0.1701 | 0.0201 |
No log | 93.96 | 282 | 0.5005 | 0.92 | 0.1526 | 0.6596 | 0.92 | 0.9150 | 0.1714 | 0.0201 |
No log | 94.96 | 285 | 0.5025 | 0.92 | 0.1533 | 0.6614 | 0.92 | 0.9150 | 0.1820 | 0.0202 |
No log | 95.96 | 288 | 0.5043 | 0.92 | 0.1539 | 0.6634 | 0.92 | 0.9150 | 0.1721 | 0.0195 |
No log | 96.96 | 291 | 0.5056 | 0.92 | 0.1542 | 0.6644 | 0.92 | 0.9150 | 0.1783 | 0.0194 |
No log | 97.96 | 294 | 0.5075 | 0.92 | 0.1544 | 0.6648 | 0.92 | 0.9150 | 0.1723 | 0.0194 |
No log | 98.96 | 297 | 0.5077 | 0.92 | 0.1544 | 0.6649 | 0.92 | 0.9150 | 0.1722 | 0.0194 |
No log | 99.96 | 300 | 0.5075 | 0.92 | 0.1544 | 0.6650 | 0.92 | 0.9150 | 0.1721 | 0.0193 |
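
Brier loss, NLL, ECE, and AURC track calibration and selective-prediction quality rather than raw accuracy. The sketch below illustrates one common way to compute the first three from softmax probabilities and integer labels; it is a generic illustration (conventions vary, e.g. sum- vs. mean-over-classes Brier score and the number of ECE bins), not the evaluation code used to produce this table.

```python
import numpy as np

# Example shapes: probs is an (N, num_classes) softmax matrix, labels an (N,) int array.

def brier_loss(probs, labels):
    """Mean squared distance between predicted probabilities and one-hot labels."""
    one_hot = np.eye(probs.shape[1])[labels]
    return np.mean(np.sum((probs - one_hot) ** 2, axis=1))

def nll(probs, labels, eps=1e-12):
    """Average negative log-likelihood of the true class."""
    return -np.mean(np.log(probs[np.arange(len(labels)), labels] + eps))

def ece(probs, labels, n_bins=15):
    """Expected calibration error over equal-width confidence bins."""
    confidences = probs.max(axis=1)
    predictions = probs.argmax(axis=1)
    accuracies = (predictions == labels).astype(float)
    bin_edges = np.linspace(0.0, 1.0, n_bins + 1)
    error = 0.0
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            error += in_bin.mean() * abs(accuracies[in_bin].mean() - confidences[in_bin].mean())
    return error
```
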
### Framework versions

- Transformers 4.26.1
- PyTorch 1.13.1.post200
- Datasets 2.9.0
- Tokenizers 0.13.2
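
To check that a local environment matches the versions above, a quick sanity check could look like the following sketch (the expected versions in the comments are the ones reported in this card):

```python
import datasets
import tokenizers
import torch
import transformers

# Compare the installed versions against the ones listed above.
print("Transformers:", transformers.__version__)  # expected 4.26.1
print("PyTorch:", torch.__version__)              # expected 1.13.1.post200
print("Datasets:", datasets.__version__)          # expected 2.9.0
print("Tokenizers:", tokenizers.__version__)      # expected 0.13.2
```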