# vit-small_rvl_cdip_100_examples_per_class_kd_NKD_t1.0_g1.5
This model is a fine-tuned version of [WinKawaks/vit-small-patch16-224](https://huggingface.co/WinKawaks/vit-small-patch16-224). The training config did not record a dataset name; judging by the model name, it was trained on a 100-examples-per-class subset of RVL-CDIP with knowledge distillation. It achieves the following results on the evaluation set:
- Loss: 5.2014
- Accuracy: 0.635
- Brier loss: 0.5252
- NLL: 2.1069
- F1 micro: 0.635
- F1 macro: 0.6363
- ECE: 0.1836
- AURC: 0.1520
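For orientation, the calibration-oriented metrics above (Brier loss, NLL, ECE) can be computed directly from predicted class probabilities. The plain-Python sketch below is illustrative only: the function names are made up here, and the 15-bin ECE default is an assumption, not necessarily what the evaluation script used.

```python
import math

def brier_score(probs, labels):
    # Squared error between the probability vector and the one-hot
    # label, summed over classes and averaged over examples.
    n_classes = len(probs[0])
    total = 0.0
    for p, y in zip(probs, labels):
        total += sum((p[k] - (1.0 if k == y else 0.0)) ** 2
                     for k in range(n_classes))
    return total / len(probs)

def nll(probs, labels):
    # Average negative log-likelihood of the true class.
    return -sum(math.log(p[y]) for p, y in zip(probs, labels)) / len(probs)

def ece(probs, labels, n_bins=15):
    # Expected Calibration Error: bin predictions by confidence and
    # average |accuracy - confidence| weighted by bin occupancy.
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        conf = max(p)
        pred = p.index(conf)
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, 1.0 if pred == y else 0.0))
    total = 0.0
    for b in bins:
        if b:
            conf_avg = sum(c for c, _ in b) / len(b)
            acc_avg = sum(a for _, a in b) / len(b)
            total += len(b) / len(probs) * abs(acc_avg - conf_avg)
    return total
```

AURC (area under the risk-coverage curve) additionally requires ranking examples by confidence and integrating error rate over coverage, so it is omitted from this sketch.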
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 128
- eval_batch_size: 128
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 100
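The scheduler settings above (linear decay with a 0.1 warmup ratio) describe a learning rate that rises linearly for the first 10% of steps, then decays linearly to zero. The stand-alone function below sketches that shape; it is not the Trainer's actual implementation, and the 700-step total in the usage note is derived from the results table (7 steps/epoch × 100 epochs).

```python
def linear_warmup_lr(step, total_steps, base_lr=1e-4, warmup_ratio=0.1):
    # Linear warmup over the first `warmup_ratio` fraction of training,
    # then linear decay to zero, matching the card's scheduler settings.
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    return base_lr * max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))
```

For example, with 700 total steps the rate peaks at 1e-4 at step 70 and reaches zero at step 700.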
### Training results
Training Loss | Epoch | Step | Validation Loss | Accuracy | Brier Loss | NLL | F1 Micro | F1 Macro | ECE | AURC |
---|---|---|---|---|---|---|---|---|---|---|
No log | 1.0 | 7 | 7.0792 | 0.0825 | 0.9657 | 10.6330 | 0.0825 | 0.0736 | 0.1618 | 0.9054 |
No log | 2.0 | 14 | 6.4691 | 0.07 | 0.9514 | 8.1662 | 0.07 | 0.0643 | 0.1779 | 0.9461 |
No log | 3.0 | 21 | 5.8986 | 0.2975 | 0.8596 | 5.3260 | 0.2975 | 0.2944 | 0.2211 | 0.5304 |
No log | 4.0 | 28 | 5.5468 | 0.3925 | 0.7531 | 3.5791 | 0.3925 | 0.3860 | 0.2145 | 0.3645 |
No log | 5.0 | 35 | 5.2678 | 0.46 | 0.6755 | 3.2144 | 0.46 | 0.4517 | 0.1901 | 0.2901 |
No log | 6.0 | 42 | 5.1237 | 0.5075 | 0.6334 | 2.9369 | 0.5075 | 0.4888 | 0.1985 | 0.2550 |
No log | 7.0 | 49 | 5.1530 | 0.5125 | 0.6131 | 3.1196 | 0.5125 | 0.4843 | 0.1634 | 0.2417 |
No log | 8.0 | 56 | 5.0462 | 0.545 | 0.5898 | 2.7596 | 0.545 | 0.5376 | 0.1792 | 0.2232 |
No log | 9.0 | 63 | 5.1437 | 0.565 | 0.5759 | 2.9426 | 0.565 | 0.5660 | 0.1715 | 0.2208 |
No log | 10.0 | 70 | 4.9658 | 0.605 | 0.5382 | 2.4096 | 0.605 | 0.5945 | 0.1828 | 0.1734 |
No log | 11.0 | 77 | 5.1189 | 0.57 | 0.5592 | 2.5892 | 0.57 | 0.5677 | 0.1381 | 0.1952 |
No log | 12.0 | 84 | 5.2082 | 0.54 | 0.5774 | 2.7250 | 0.54 | 0.5323 | 0.1578 | 0.2144 |
No log | 13.0 | 91 | 4.9674 | 0.5775 | 0.5365 | 2.5469 | 0.5775 | 0.5654 | 0.1603 | 0.1824 |
No log | 14.0 | 98 | 5.0007 | 0.5875 | 0.5299 | 2.6635 | 0.5875 | 0.5778 | 0.1567 | 0.1701 |
No log | 15.0 | 105 | 4.9925 | 0.585 | 0.5417 | 2.6416 | 0.585 | 0.5760 | 0.1731 | 0.1896 |
No log | 16.0 | 112 | 4.8314 | 0.6425 | 0.4939 | 2.4601 | 0.6425 | 0.6444 | 0.1492 | 0.1506 |
No log | 17.0 | 119 | 4.8729 | 0.6075 | 0.5197 | 2.4297 | 0.6075 | 0.6054 | 0.1511 | 0.1702 |
No log | 18.0 | 126 | 4.8960 | 0.61 | 0.5085 | 2.2405 | 0.61 | 0.6197 | 0.1664 | 0.1657 |
No log | 19.0 | 133 | 4.8227 | 0.62 | 0.5032 | 2.4320 | 0.62 | 0.6177 | 0.1399 | 0.1615 |
No log | 20.0 | 140 | 4.9420 | 0.61 | 0.5160 | 2.3051 | 0.61 | 0.6119 | 0.1460 | 0.1722 |
No log | 21.0 | 147 | 4.8779 | 0.6125 | 0.5132 | 2.3564 | 0.6125 | 0.6080 | 0.1549 | 0.1639 |
No log | 22.0 | 154 | 4.9454 | 0.6125 | 0.5261 | 2.4064 | 0.6125 | 0.6155 | 0.1792 | 0.1733 |
No log | 23.0 | 161 | 4.8659 | 0.5925 | 0.5018 | 2.5961 | 0.5925 | 0.5897 | 0.1537 | 0.1607 |
No log | 24.0 | 168 | 4.8150 | 0.605 | 0.4996 | 2.2624 | 0.605 | 0.6050 | 0.1525 | 0.1588 |
No log | 25.0 | 175 | 4.8303 | 0.6175 | 0.4970 | 2.1999 | 0.6175 | 0.6204 | 0.1284 | 0.1515 |
No log | 26.0 | 182 | 4.8442 | 0.6225 | 0.5060 | 2.2842 | 0.6225 | 0.6251 | 0.1639 | 0.1614 |
No log | 27.0 | 189 | 4.8260 | 0.63 | 0.4953 | 2.2666 | 0.63 | 0.6345 | 0.1638 | 0.1531 |
No log | 28.0 | 196 | 4.8421 | 0.6375 | 0.4979 | 2.3173 | 0.6375 | 0.6344 | 0.1430 | 0.1525 |
No log | 29.0 | 203 | 4.9011 | 0.62 | 0.5066 | 2.2663 | 0.62 | 0.6221 | 0.1596 | 0.1602 |
No log | 30.0 | 210 | 4.8689 | 0.62 | 0.4994 | 2.1498 | 0.62 | 0.6260 | 0.1581 | 0.1567 |
No log | 31.0 | 217 | 4.8681 | 0.6075 | 0.5143 | 2.0979 | 0.6075 | 0.6080 | 0.1673 | 0.1641 |
No log | 32.0 | 224 | 4.8489 | 0.6 | 0.5074 | 2.1485 | 0.6 | 0.5913 | 0.1579 | 0.1613 |
No log | 33.0 | 231 | 4.8669 | 0.63 | 0.5037 | 2.3142 | 0.63 | 0.6272 | 0.1512 | 0.1519 |
No log | 34.0 | 238 | 4.8382 | 0.6075 | 0.5005 | 2.1817 | 0.6075 | 0.6038 | 0.1683 | 0.1552 |
No log | 35.0 | 245 | 4.8406 | 0.61 | 0.5012 | 2.2132 | 0.61 | 0.6019 | 0.1443 | 0.1518 |
No log | 36.0 | 252 | 4.8241 | 0.6275 | 0.5040 | 2.2466 | 0.6275 | 0.6182 | 0.1511 | 0.1563 |
No log | 37.0 | 259 | 4.8359 | 0.6225 | 0.4993 | 2.1727 | 0.6225 | 0.6201 | 0.1665 | 0.1570 |
No log | 38.0 | 266 | 4.8812 | 0.6025 | 0.5155 | 2.2712 | 0.6025 | 0.5990 | 0.1634 | 0.1649 |
No log | 39.0 | 273 | 4.8672 | 0.61 | 0.5075 | 2.1626 | 0.61 | 0.6073 | 0.1603 | 0.1592 |
No log | 40.0 | 280 | 4.9083 | 0.6175 | 0.5098 | 2.1507 | 0.6175 | 0.6204 | 0.1524 | 0.1594 |
No log | 41.0 | 287 | 4.8942 | 0.61 | 0.5132 | 2.2443 | 0.61 | 0.6070 | 0.1574 | 0.1618 |
No log | 42.0 | 294 | 4.9435 | 0.62 | 0.5177 | 2.1770 | 0.62 | 0.6186 | 0.1567 | 0.1664 |
No log | 43.0 | 301 | 4.8836 | 0.63 | 0.5089 | 2.1922 | 0.63 | 0.6300 | 0.1612 | 0.1553 |
No log | 44.0 | 308 | 4.9806 | 0.6225 | 0.5205 | 2.1855 | 0.6225 | 0.6213 | 0.1715 | 0.1631 |
No log | 45.0 | 315 | 4.9314 | 0.6225 | 0.5185 | 2.1783 | 0.6225 | 0.6182 | 0.1743 | 0.1631 |
No log | 46.0 | 322 | 4.8615 | 0.6275 | 0.4984 | 2.2407 | 0.6275 | 0.6259 | 0.1529 | 0.1497 |
No log | 47.0 | 329 | 4.8550 | 0.625 | 0.4985 | 2.1229 | 0.625 | 0.6261 | 0.1517 | 0.1531 |
No log | 48.0 | 336 | 4.9218 | 0.6125 | 0.5113 | 2.2200 | 0.6125 | 0.6114 | 0.1627 | 0.1588 |
No log | 49.0 | 343 | 4.9067 | 0.63 | 0.5102 | 2.2177 | 0.63 | 0.6299 | 0.1534 | 0.1567 |
No log | 50.0 | 350 | 4.9040 | 0.6125 | 0.5110 | 2.1105 | 0.6125 | 0.6136 | 0.1731 | 0.1559 |
No log | 51.0 | 357 | 4.9557 | 0.615 | 0.5180 | 2.2031 | 0.615 | 0.6157 | 0.1726 | 0.1602 |
No log | 52.0 | 364 | 4.9409 | 0.61 | 0.5195 | 2.2616 | 0.61 | 0.6079 | 0.1627 | 0.1618 |
No log | 53.0 | 371 | 4.9290 | 0.6225 | 0.5125 | 2.1352 | 0.6225 | 0.6227 | 0.1873 | 0.1549 |
No log | 54.0 | 378 | 4.9297 | 0.6225 | 0.5075 | 2.1558 | 0.6225 | 0.6216 | 0.1724 | 0.1530 |
No log | 55.0 | 385 | 4.9192 | 0.6225 | 0.5131 | 2.1572 | 0.6225 | 0.6220 | 0.1655 | 0.1578 |
No log | 56.0 | 392 | 4.9760 | 0.61 | 0.5203 | 2.1227 | 0.61 | 0.6092 | 0.1852 | 0.1594 |
No log | 57.0 | 399 | 4.9860 | 0.6125 | 0.5208 | 2.1996 | 0.6125 | 0.6154 | 0.1812 | 0.1608 |
No log | 58.0 | 406 | 4.9418 | 0.62 | 0.5176 | 2.1034 | 0.62 | 0.6220 | 0.1635 | 0.1549 |
No log | 59.0 | 413 | 4.9462 | 0.62 | 0.5143 | 2.1095 | 0.62 | 0.6221 | 0.1855 | 0.1553 |
No log | 60.0 | 420 | 4.9447 | 0.6175 | 0.5142 | 2.0731 | 0.6175 | 0.6180 | 0.1571 | 0.1533 |
No log | 61.0 | 427 | 4.9677 | 0.63 | 0.5091 | 2.1491 | 0.63 | 0.6346 | 0.1693 | 0.1498 |
No log | 62.0 | 434 | 4.9567 | 0.62 | 0.5089 | 2.1222 | 0.62 | 0.6242 | 0.1609 | 0.1546 |
No log | 63.0 | 441 | 4.9378 | 0.6325 | 0.5030 | 2.1787 | 0.6325 | 0.6310 | 0.1558 | 0.1471 |
No log | 64.0 | 448 | 4.9764 | 0.6175 | 0.5154 | 2.0751 | 0.6175 | 0.6192 | 0.1835 | 0.1549 |
No log | 65.0 | 455 | 4.9520 | 0.6325 | 0.5069 | 2.1067 | 0.6325 | 0.6352 | 0.1670 | 0.1499 |
No log | 66.0 | 462 | 4.9649 | 0.6375 | 0.5109 | 2.1016 | 0.6375 | 0.6361 | 0.1665 | 0.1506 |
No log | 67.0 | 469 | 5.0023 | 0.635 | 0.5174 | 2.2166 | 0.635 | 0.6350 | 0.1653 | 0.1543 |
No log | 68.0 | 476 | 5.0084 | 0.63 | 0.5187 | 2.1238 | 0.63 | 0.6302 | 0.1674 | 0.1535 |
No log | 69.0 | 483 | 4.9875 | 0.6325 | 0.5096 | 2.1744 | 0.6325 | 0.6345 | 0.1822 | 0.1510 |
No log | 70.0 | 490 | 5.0129 | 0.6325 | 0.5151 | 2.1042 | 0.6325 | 0.6335 | 0.1691 | 0.1535 |
No log | 71.0 | 497 | 5.0389 | 0.6275 | 0.5201 | 2.0941 | 0.6275 | 0.6283 | 0.1765 | 0.1550 |
3.4121 | 72.0 | 504 | 5.0288 | 0.6325 | 0.5168 | 2.1299 | 0.6325 | 0.6314 | 0.1802 | 0.1529 |
3.4121 | 73.0 | 511 | 5.0181 | 0.625 | 0.5121 | 2.1690 | 0.625 | 0.6236 | 0.1683 | 0.1511 |
3.4121 | 74.0 | 518 | 5.0422 | 0.625 | 0.5139 | 2.1323 | 0.625 | 0.6264 | 0.1873 | 0.1517 |
3.4121 | 75.0 | 525 | 5.0557 | 0.6325 | 0.5177 | 2.1695 | 0.6325 | 0.6342 | 0.1677 | 0.1503 |
3.4121 | 76.0 | 532 | 5.0440 | 0.6375 | 0.5113 | 2.1384 | 0.6375 | 0.6384 | 0.1714 | 0.1489 |
3.4121 | 77.0 | 539 | 5.0710 | 0.6375 | 0.5163 | 2.1017 | 0.6375 | 0.6397 | 0.1785 | 0.1508 |
3.4121 | 78.0 | 546 | 5.1024 | 0.63 | 0.5218 | 2.0905 | 0.63 | 0.6280 | 0.1724 | 0.1538 |
3.4121 | 79.0 | 553 | 5.0906 | 0.635 | 0.5186 | 2.1293 | 0.635 | 0.6358 | 0.1908 | 0.1509 |
3.4121 | 80.0 | 560 | 5.1027 | 0.63 | 0.5206 | 2.1292 | 0.63 | 0.6299 | 0.1850 | 0.1525 |
3.4121 | 81.0 | 567 | 5.1063 | 0.64 | 0.5161 | 2.1620 | 0.64 | 0.6404 | 0.1754 | 0.1489 |
3.4121 | 82.0 | 574 | 5.1267 | 0.64 | 0.5207 | 2.1291 | 0.64 | 0.6400 | 0.1849 | 0.1516 |
3.4121 | 83.0 | 581 | 5.1332 | 0.63 | 0.5224 | 2.1338 | 0.63 | 0.6322 | 0.1750 | 0.1522 |
3.4121 | 84.0 | 588 | 5.1408 | 0.6325 | 0.5233 | 2.1333 | 0.6325 | 0.6334 | 0.1797 | 0.1522 |
3.4121 | 85.0 | 595 | 5.1510 | 0.63 | 0.5224 | 2.1635 | 0.63 | 0.6301 | 0.1755 | 0.1522 |
3.4121 | 86.0 | 602 | 5.1536 | 0.6375 | 0.5215 | 2.1628 | 0.6375 | 0.6382 | 0.1683 | 0.1511 |
3.4121 | 87.0 | 609 | 5.1580 | 0.6325 | 0.5228 | 2.1348 | 0.6325 | 0.6328 | 0.1779 | 0.1523 |
3.4121 | 88.0 | 616 | 5.1701 | 0.64 | 0.5235 | 2.1352 | 0.64 | 0.6417 | 0.1818 | 0.1515 |
3.4121 | 89.0 | 623 | 5.1734 | 0.6375 | 0.5235 | 2.1354 | 0.6375 | 0.6385 | 0.1775 | 0.1515 |
3.4121 | 90.0 | 630 | 5.1779 | 0.635 | 0.5243 | 2.1334 | 0.635 | 0.6360 | 0.1842 | 0.1519 |
3.4121 | 91.0 | 637 | 5.1834 | 0.635 | 0.5241 | 2.1344 | 0.635 | 0.6363 | 0.1813 | 0.1521 |
3.4121 | 92.0 | 644 | 5.1877 | 0.6375 | 0.5247 | 2.1356 | 0.6375 | 0.6385 | 0.1871 | 0.1517 |
3.4121 | 93.0 | 651 | 5.1906 | 0.635 | 0.5245 | 2.1389 | 0.635 | 0.6360 | 0.1888 | 0.1520 |
3.4121 | 94.0 | 658 | 5.1935 | 0.635 | 0.5248 | 2.1083 | 0.635 | 0.6363 | 0.1831 | 0.1521 |
3.4121 | 95.0 | 665 | 5.1955 | 0.635 | 0.5249 | 2.1098 | 0.635 | 0.6363 | 0.1795 | 0.1521 |
3.4121 | 96.0 | 672 | 5.1978 | 0.635 | 0.5250 | 2.1079 | 0.635 | 0.6363 | 0.1820 | 0.1521 |
3.4121 | 97.0 | 679 | 5.1995 | 0.635 | 0.5251 | 2.1073 | 0.635 | 0.6363 | 0.1834 | 0.1521 |
3.4121 | 98.0 | 686 | 5.2004 | 0.635 | 0.5251 | 2.1072 | 0.635 | 0.6360 | 0.1834 | 0.1520 |
3.4121 | 99.0 | 693 | 5.2012 | 0.635 | 0.5252 | 2.1071 | 0.635 | 0.6360 | 0.1836 | 0.1520 |
3.4121 | 100.0 | 700 | 5.2014 | 0.635 | 0.5252 | 2.1069 | 0.635 | 0.6363 | 0.1836 | 0.1520 |
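The `kd_NKD_t1.0_g1.5` suffix in the model name suggests a distillation objective with temperature t = 1.0 and weight gamma = 1.5. The exact NKD (normalized knowledge distillation) loss is not recorded in this card; the sketch below shows only a generic temperature-scaled teacher-student distillation term with those two hyperparameters, as a rough illustration rather than the method actually used.

```python
import math

def softmax(logits, t=1.0):
    # Numerically stable softmax with temperature t.
    m = max(logits)
    exps = [math.exp((z - m) / t) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def distill_loss(student_logits, teacher_logits, label, t=1.0, gamma=1.5):
    # Cross-entropy on the true label plus a gamma-weighted,
    # temperature-softened KL term against the teacher (generic KD,
    # not the exact NKD formulation).
    ce = -math.log(softmax(student_logits)[label])
    sp = softmax(student_logits, t)
    tp = softmax(teacher_logits, t)
    kl = sum(q * math.log(q / p) for q, p in zip(tp, sp))
    return ce + gamma * (t ** 2) * kl
```

When student and teacher logits agree, the KL term vanishes and the loss reduces to plain cross-entropy.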
### Framework versions

- Transformers 4.26.1
- PyTorch 1.13.1.post200
- Datasets 2.9.0
- Tokenizers 0.13.2