# vit-base_rvl-cdip-small_rvl_cdip-NK1000_kd
This model is a fine-tuned version of [google/vit-base-patch16-224-in21k](https://huggingface.co/google/vit-base-patch16-224-in21k) on the RVL-CDIP-NK1000 dataset. It achieves the following results on the evaluation set:
- Loss: 1.6000
- Accuracy: 0.5805
- Brier Loss: 0.6398
- NLL: 2.9515
- F1 Micro: 0.5805
- F1 Macro: 0.5810
- ECE: 0.2379
- AURC: 0.2036
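The calibration metrics above are not part of the standard `evaluate` suite; the following is a minimal sketch of how Brier loss, NLL, and ECE can be computed from softmax probabilities (function names and the 15-bin choice are assumptions, not the exact evaluation code used for this card):

```python
import numpy as np

def brier_loss(probs, labels):
    # Mean squared distance between the predicted distribution
    # and the one-hot encoding of the true label.
    onehot = np.eye(probs.shape[1])[labels]
    return np.mean(np.sum((probs - onehot) ** 2, axis=1))

def nll(probs, labels, eps=1e-12):
    # Negative log-likelihood of the true class.
    return -np.mean(np.log(probs[np.arange(len(labels)), labels] + eps))

def ece(probs, labels, n_bins=15):
    # Expected Calibration Error: |accuracy - confidence| averaged
    # over equal-width confidence bins, weighted by bin occupancy.
    conf = probs.max(axis=1)
    pred = probs.argmax(axis=1)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    total = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (conf > lo) & (conf <= hi)
        if mask.any():
            acc = (pred[mask] == labels[mask]).mean()
            total += mask.mean() * abs(acc - conf[mask].mean())
    return total
```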
## Model description

More information needed
## Intended uses & limitations

More information needed
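Pending fuller documentation, here is a hedged inference sketch using the standard `transformers` image-classification API. The checkpoint path and the input filename are placeholders, not values confirmed by this card:

```python
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModelForImageClassification

# Placeholder path: substitute the Hub namespace where this checkpoint lives.
checkpoint = "vit-base_rvl-cdip-small_rvl_cdip-NK1000_kd"

processor = AutoImageProcessor.from_pretrained(checkpoint)
model = AutoModelForImageClassification.from_pretrained(checkpoint)

# Any document page image; RVL-CDIP images are grayscale scans.
image = Image.open("document.png").convert("RGB")
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

pred = logits.argmax(-1).item()
print(model.config.id2label[pred])
```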
## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 128
- eval_batch_size: 128
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 50
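Assuming the standard `Trainer` API was used, these values map directly onto `TrainingArguments`; a minimal sketch under that assumption (the `output_dir` is a placeholder, and the Adam settings listed above are the Trainer defaults):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="vit-base_rvl-cdip-small_rvl_cdip-NK1000_kd",  # placeholder
    learning_rate=1e-4,
    per_device_train_batch_size=128,
    per_device_eval_batch_size=128,
    seed=42,
    # Adam with betas=(0.9, 0.999) and epsilon=1e-08 is the default optimizer.
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=50,
    evaluation_strategy="epoch",
)
```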
### Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy | Brier Loss | NLL | F1 Micro | F1 Macro | ECE | AURC |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:----------:|:---:|:--------:|:--------:|:---:|:----:|
| No log | 1.0 | 125 | 2.4623 | 0.1792 | 0.9035 | 7.3776 | 0.1792 | 0.1152 | 0.0726 | 0.7119 |
| No log | 2.0 | 250 | 2.2456 | 0.218 | 0.8696 | 3.5941 | 0.218 | 0.1605 | 0.0646 | 0.6549 |
| No log | 3.0 | 375 | 2.0626 | 0.285 | 0.8239 | 3.2463 | 0.285 | 0.2178 | 0.0561 | 0.5455 |
| 2.2649 | 4.0 | 500 | 1.8007 | 0.3902 | 0.7487 | 3.1874 | 0.3902 | 0.3500 | 0.0567 | 0.4166 |
| 2.2649 | 5.0 | 625 | 1.6948 | 0.4228 | 0.7156 | 3.1334 | 0.4228 | 0.3809 | 0.0669 | 0.3709 |
| 2.2649 | 6.0 | 750 | 1.5414 | 0.4725 | 0.6656 | 2.8978 | 0.4725 | 0.4410 | 0.0621 | 0.3146 |
| 2.2649 | 7.0 | 875 | 1.4740 | 0.4848 | 0.6464 | 2.6879 | 0.4848 | 0.4556 | 0.0645 | 0.2942 |
| 1.4861 | 8.0 | 1000 | 1.3662 | 0.5198 | 0.6079 | 2.6641 | 0.5198 | 0.4973 | 0.0653 | 0.2514 |
| 1.4861 | 9.0 | 1125 | 1.3400 | 0.5417 | 0.5949 | 2.6876 | 0.5417 | 0.5364 | 0.0613 | 0.2381 |
| 1.4861 | 10.0 | 1250 | 1.3414 | 0.542 | 0.5968 | 2.6382 | 0.542 | 0.5267 | 0.0917 | 0.2336 |
| 1.4861 | 11.0 | 1375 | 1.3402 | 0.5395 | 0.5935 | 2.6955 | 0.5395 | 0.5418 | 0.0774 | 0.2303 |
| 1.0134 | 12.0 | 1500 | 1.3721 | 0.537 | 0.6035 | 2.6887 | 0.537 | 0.5271 | 0.1148 | 0.2301 |
| 1.0134 | 13.0 | 1625 | 1.3683 | 0.5455 | 0.6005 | 2.7328 | 0.5455 | 0.5383 | 0.1229 | 0.2270 |
| 1.0134 | 14.0 | 1750 | 1.4969 | 0.5363 | 0.6360 | 2.9430 | 0.5363 | 0.5293 | 0.1733 | 0.2346 |
| 1.0134 | 15.0 | 1875 | 1.5422 | 0.5295 | 0.6487 | 2.9876 | 0.5295 | 0.5341 | 0.1774 | 0.2442 |
| 0.594 | 16.0 | 2000 | 1.5237 | 0.5543 | 0.6329 | 2.9785 | 0.5543 | 0.5550 | 0.1900 | 0.2242 |
| 0.594 | 17.0 | 2125 | 1.6365 | 0.5298 | 0.6667 | 3.1126 | 0.5298 | 0.5332 | 0.2148 | 0.2498 |
| 0.594 | 18.0 | 2250 | 1.6367 | 0.5413 | 0.6663 | 3.0856 | 0.5413 | 0.5429 | 0.2332 | 0.2313 |
| 0.594 | 19.0 | 2375 | 1.7407 | 0.543 | 0.6811 | 3.2768 | 0.543 | 0.5379 | 0.2478 | 0.2327 |
| 0.3116 | 20.0 | 2500 | 1.7899 | 0.5535 | 0.6816 | 3.4174 | 0.5535 | 0.5459 | 0.2524 | 0.2308 |
| 0.3116 | 21.0 | 2625 | 1.8270 | 0.545 | 0.6990 | 3.2131 | 0.545 | 0.5401 | 0.2683 | 0.2459 |
| 0.3116 | 22.0 | 2750 | 1.8178 | 0.538 | 0.7029 | 3.3342 | 0.538 | 0.5392 | 0.2646 | 0.2471 |
| 0.3116 | 23.0 | 2875 | 1.8589 | 0.5337 | 0.7086 | 3.4584 | 0.5337 | 0.5332 | 0.2668 | 0.2505 |
| 0.1975 | 24.0 | 3000 | 1.8554 | 0.5363 | 0.7072 | 3.3578 | 0.5363 | 0.5360 | 0.2754 | 0.2448 |
| 0.1975 | 25.0 | 3125 | 1.8389 | 0.5397 | 0.7023 | 3.2630 | 0.5397 | 0.5377 | 0.2724 | 0.2457 |
| 0.1975 | 26.0 | 3250 | 1.8596 | 0.5423 | 0.7076 | 3.3014 | 0.5423 | 0.5463 | 0.2804 | 0.2355 |
| 0.1975 | 27.0 | 3375 | 1.8342 | 0.55 | 0.6890 | 3.3997 | 0.55 | 0.5451 | 0.2646 | 0.2286 |
| 0.1448 | 28.0 | 3500 | 1.8707 | 0.548 | 0.7045 | 3.3058 | 0.548 | 0.5428 | 0.2805 | 0.2372 |
| 0.1448 | 29.0 | 3625 | 1.8214 | 0.546 | 0.6979 | 3.2599 | 0.546 | 0.5455 | 0.2674 | 0.2372 |
| 0.1448 | 30.0 | 3750 | 1.8021 | 0.5537 | 0.6896 | 3.2681 | 0.5537 | 0.5549 | 0.2664 | 0.2307 |
| 0.1448 | 31.0 | 3875 | 1.8335 | 0.551 | 0.6938 | 3.3393 | 0.551 | 0.5522 | 0.2740 | 0.2262 |
| 0.1165 | 32.0 | 4000 | 1.7620 | 0.5473 | 0.6851 | 3.1437 | 0.5473 | 0.5463 | 0.2626 | 0.2328 |
| 0.1165 | 33.0 | 4125 | 1.7496 | 0.5527 | 0.6850 | 3.1206 | 0.5527 | 0.5515 | 0.2678 | 0.2257 |
| 0.1165 | 34.0 | 4250 | 1.7095 | 0.56 | 0.6691 | 3.1142 | 0.56 | 0.5631 | 0.2511 | 0.2232 |
| 0.1165 | 35.0 | 4375 | 1.7775 | 0.543 | 0.6943 | 3.2500 | 0.543 | 0.5428 | 0.2719 | 0.2309 |
| 0.0964 | 36.0 | 4500 | 1.7212 | 0.5653 | 0.6715 | 3.1218 | 0.5653 | 0.5642 | 0.2513 | 0.2212 |
| 0.0964 | 37.0 | 4625 | 1.6819 | 0.5633 | 0.6612 | 3.0858 | 0.5633 | 0.5605 | 0.2447 | 0.2172 |
| 0.0964 | 38.0 | 4750 | 1.7017 | 0.5617 | 0.6726 | 3.0501 | 0.5617 | 0.5636 | 0.2596 | 0.2218 |
| 0.0964 | 39.0 | 4875 | 1.6995 | 0.564 | 0.6690 | 3.1110 | 0.564 | 0.5656 | 0.2471 | 0.2209 |
| 0.0805 | 40.0 | 5000 | 1.6639 | 0.566 | 0.6594 | 3.1202 | 0.566 | 0.5677 | 0.2405 | 0.2180 |
| 0.0805 | 41.0 | 5125 | 1.6265 | 0.57 | 0.6504 | 3.0491 | 0.57 | 0.5725 | 0.2368 | 0.2125 |
| 0.0805 | 42.0 | 5250 | 1.6325 | 0.568 | 0.6534 | 3.0176 | 0.568 | 0.5696 | 0.2429 | 0.2090 |
| 0.0805 | 43.0 | 5375 | 1.6029 | 0.5775 | 0.6418 | 2.9852 | 0.5775 | 0.5778 | 0.2330 | 0.2072 |
| 0.0678 | 44.0 | 5500 | 1.5963 | 0.5725 | 0.6417 | 2.9674 | 0.5725 | 0.5720 | 0.2378 | 0.2080 |
| 0.0678 | 45.0 | 5625 | 1.5820 | 0.58 | 0.6365 | 2.9070 | 0.58 | 0.5793 | 0.2312 | 0.2033 |
| 0.0678 | 46.0 | 5750 | 1.5828 | 0.5773 | 0.6368 | 2.9425 | 0.5773 | 0.5766 | 0.2367 | 0.2028 |
| 0.0678 | 47.0 | 5875 | 1.5854 | 0.5807 | 0.6368 | 2.9375 | 0.5807 | 0.5816 | 0.2341 | 0.2035 |
| 0.0566 | 48.0 | 6000 | 1.5948 | 0.58 | 0.6396 | 2.9457 | 0.58 | 0.5812 | 0.2372 | 0.2037 |
| 0.0566 | 49.0 | 6125 | 1.5972 | 0.5813 | 0.6393 | 2.9527 | 0.5813 | 0.5817 | 0.2360 | 0.2038 |
| 0.0566 | 50.0 | 6250 | 1.6000 | 0.5805 | 0.6398 | 2.9515 | 0.5805 | 0.5810 | 0.2379 | 0.2036 |
### Framework versions
- Transformers 4.26.1
- Pytorch 1.13.1.post200
- Datasets 2.9.0
- Tokenizers 0.13.2