# vit-base_rvl-cdip-tiny_rvl_cdip-NK1000_kd_MSE
This model is a fine-tuned version of [google/vit-base-patch16-224-in21k](https://huggingface.co/google/vit-base-patch16-224-in21k) on the None dataset.
It achieves the following results on the evaluation set:
- Loss: 2.1927
- Accuracy: 0.5835
- Brier Loss: 0.6740
- NLL: 3.1975
- F1 Micro: 0.5835
- F1 Macro: 0.5865
- ECE: 0.2742
- AURC: 0.2074
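
Brier loss, NLL (negative log-likelihood), ECE (expected calibration error), and AURC (area under the risk-coverage curve) are probabilistic and calibration metrics rather than standard Trainer outputs. As a hedged illustration only, not the evaluation code actually used for this card, the first three can be computed from logits like this:

```python
# Illustrative computation of Brier loss, NLL, and ECE from model logits.
# This is a sketch, not the exact evaluation code used for this model card.
import torch
import torch.nn.functional as F

def calibration_metrics(logits: torch.Tensor, labels: torch.Tensor, n_bins: int = 10):
    """logits: (N, num_classes) raw scores; labels: (N,) integer class ids."""
    probs = F.softmax(logits, dim=-1)
    one_hot = F.one_hot(labels, num_classes=probs.shape[-1]).float()

    # Brier loss: squared error between predicted probabilities and the
    # one-hot target, summed over classes and averaged over examples.
    brier = ((probs - one_hot) ** 2).sum(dim=-1).mean()

    # NLL: negative log-likelihood of the true class (i.e. cross-entropy).
    nll = F.cross_entropy(logits, labels)

    # ECE: bin predictions by confidence and average the |accuracy - confidence|
    # gap, weighted by the fraction of examples falling in each bin.
    conf, pred = probs.max(dim=-1)
    correct = pred.eq(labels).float()
    edges = torch.linspace(0.0, 1.0, n_bins + 1)
    ece = torch.zeros(())
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (conf > lo) & (conf <= hi)
        if in_bin.any():
            ece = ece + in_bin.float().mean() * (correct[in_bin].mean() - conf[in_bin].mean()).abs()
    return brier.item(), nll.item(), ece.item()
```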
## Model description
More information needed
## Intended uses & limitations
More information needed
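
Pending proper documentation, the model should load like any fine-tuned ViT image classifier. A minimal inference sketch follows, assuming the card's title matches the Hub repository name; the `<user>` prefix and the input filename are placeholders:

```python
# Minimal inference sketch; the repository id below is a placeholder and
# should be replaced with the actual Hugging Face Hub path of this model.
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModelForImageClassification

repo_id = "<user>/vit-base_rvl-cdip-tiny_rvl_cdip-NK1000_kd_MSE"  # placeholder
processor = AutoImageProcessor.from_pretrained(repo_id)
model = AutoModelForImageClassification.from_pretrained(repo_id)

image = Image.open("example_document.png").convert("RGB")  # hypothetical input
inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(model.config.id2label[logits.argmax(-1).item()])
```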
## Training and evaluation data
More information needed
## Training procedure
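
The `kd_MSE` suffix in the model name suggests knowledge distillation with a mean-squared-error objective on teacher logits. The actual recipe is not documented in this card; purely as an illustration, a common form of such a loss is:

```python
# Illustrative MSE logit-distillation loss. The actual objective used to
# train this model is not documented here and may differ.
import torch.nn.functional as F

def kd_mse_loss(student_logits, teacher_logits, labels, alpha: float = 0.5):
    mse = F.mse_loss(student_logits, teacher_logits)  # match the teacher's logits
    ce = F.cross_entropy(student_logits, labels)      # standard supervised term
    return alpha * mse + (1.0 - alpha) * ce
```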
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 64
- eval_batch_size: 64
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 50
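
These settings map onto Transformers `TrainingArguments` roughly as follows; this is a sketch, not the original training script, and the Adam betas and epsilon listed above are the Trainer defaults:

```python
# Sketch of the hyperparameters above expressed as TrainingArguments.
# The original training script is not included in this card.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="vit-base_rvl-cdip-tiny_rvl_cdip-NK1000_kd_MSE",
    learning_rate=1e-4,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    seed=42,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=50,
    # adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-8 are the defaults.
)
```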
### Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy | Brier Loss | NLL | F1 Micro | F1 Macro | ECE | AURC |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:----------:|:---:|:--------:|:--------:|:---:|:----:|
| No log | 1.0 | 250 | 4.2227 | 0.1325 | 0.9130 | 6.8924 | 0.1325 | 0.0728 | 0.0573 | 0.7519 |
| 4.2305 | 2.0 | 500 | 3.9645 | 0.1638 | 0.8922 | 5.8361 | 0.1638 | 0.1235 | 0.0588 | 0.7012 |
| 4.2305 | 3.0 | 750 | 3.6177 | 0.285 | 0.8227 | 4.3429 | 0.285 | 0.2289 | 0.0627 | 0.5424 |
| 3.6208 | 4.0 | 1000 | 3.2220 | 0.3733 | 0.7617 | 3.5860 | 0.3733 | 0.3356 | 0.0606 | 0.4322 |
| 3.6208 | 5.0 | 1250 | 3.0177 | 0.4045 | 0.7308 | 3.7807 | 0.4045 | 0.3770 | 0.0721 | 0.3835 |
| 2.9674 | 6.0 | 1500 | 2.8203 | 0.4365 | 0.7032 | 3.3569 | 0.4365 | 0.4130 | 0.0969 | 0.3443 |
| 2.9674 | 7.0 | 1750 | 2.6164 | 0.4557 | 0.6762 | 3.4281 | 0.4557 | 0.4413 | 0.0810 | 0.3058 |
| 2.5154 | 8.0 | 2000 | 2.4991 | 0.472 | 0.6651 | 3.3938 | 0.472 | 0.4524 | 0.1092 | 0.2846 |
| 2.5154 | 9.0 | 2250 | 2.4375 | 0.4878 | 0.6826 | 3.1749 | 0.4878 | 0.4603 | 0.1631 | 0.2872 |
| 2.2165 | 10.0 | 2500 | 2.3537 | 0.5018 | 0.6686 | 3.1767 | 0.5018 | 0.4855 | 0.1589 | 0.2743 |
| 2.2165 | 11.0 | 2750 | 2.2613 | 0.515 | 0.6276 | 3.1281 | 0.515 | 0.5141 | 0.1101 | 0.2457 |
| 1.9636 | 12.0 | 3000 | 2.2592 | 0.5242 | 0.6624 | 3.1164 | 0.5242 | 0.5131 | 0.1840 | 0.2515 |
| 1.9636 | 13.0 | 3250 | 2.1751 | 0.5315 | 0.6190 | 3.2643 | 0.5315 | 0.5268 | 0.1349 | 0.2288 |
| 1.7526 | 14.0 | 3500 | 2.2171 | 0.5248 | 0.6546 | 3.1179 | 0.5248 | 0.5162 | 0.1889 | 0.2537 |
| 1.7526 | 15.0 | 3750 | 2.1185 | 0.5507 | 0.6126 | 3.1117 | 0.5507 | 0.5496 | 0.1578 | 0.2219 |
| 1.5673 | 16.0 | 4000 | 2.0807 | 0.5537 | 0.6208 | 3.2624 | 0.5537 | 0.5459 | 0.1735 | 0.2151 |
| 1.5673 | 17.0 | 4250 | 2.0743 | 0.5677 | 0.6095 | 3.2650 | 0.5677 | 0.5683 | 0.1628 | 0.2090 |
| 1.3823 | 18.0 | 4500 | 2.1201 | 0.5605 | 0.6454 | 3.1499 | 0.5605 | 0.5558 | 0.2130 | 0.2316 |
| 1.3823 | 19.0 | 4750 | 2.0835 | 0.5655 | 0.6312 | 3.2920 | 0.5655 | 0.5666 | 0.2015 | 0.2149 |
| 1.2113 | 20.0 | 5000 | 2.0809 | 0.5675 | 0.6284 | 3.2923 | 0.5675 | 0.5675 | 0.2180 | 0.2047 |
| 1.2113 | 21.0 | 5250 | 2.1507 | 0.5633 | 0.6608 | 3.2713 | 0.5633 | 0.5668 | 0.2380 | 0.2183 |
| 1.0543 | 22.0 | 5500 | 2.1295 | 0.5683 | 0.6476 | 3.5120 | 0.5683 | 0.5672 | 0.2369 | 0.2105 |
| 1.0543 | 23.0 | 5750 | 2.1610 | 0.5675 | 0.6564 | 3.3818 | 0.5675 | 0.5625 | 0.2393 | 0.2166 |
| 0.9098 | 24.0 | 6000 | 2.0862 | 0.5735 | 0.6562 | 3.3228 | 0.5735 | 0.5782 | 0.2528 | 0.2047 |
| 0.9098 | 25.0 | 6250 | 2.0680 | 0.5727 | 0.6439 | 3.2971 | 0.5727 | 0.5767 | 0.2357 | 0.2050 |
| 0.7832 | 26.0 | 6500 | 2.1829 | 0.5763 | 0.6667 | 3.3547 | 0.5763 | 0.5792 | 0.2627 | 0.2084 |
| 0.7832 | 27.0 | 6750 | 2.1163 | 0.586 | 0.6479 | 3.2468 | 0.586 | 0.5894 | 0.2509 | 0.2016 |
| 0.6572 | 28.0 | 7000 | 2.1492 | 0.5715 | 0.6612 | 3.4268 | 0.5715 | 0.5780 | 0.2642 | 0.2114 |
| 0.6572 | 29.0 | 7250 | 2.1975 | 0.5723 | 0.6777 | 3.4662 | 0.5723 | 0.5739 | 0.2749 | 0.2202 |
| 0.5632 | 30.0 | 7500 | 2.1733 | 0.5693 | 0.6767 | 3.3743 | 0.5693 | 0.5745 | 0.2737 | 0.2170 |
| 0.5632 | 31.0 | 7750 | 2.1694 | 0.5807 | 0.6661 | 3.3917 | 0.5807 | 0.5814 | 0.2645 | 0.2193 |
| 0.4827 | 32.0 | 8000 | 2.1585 | 0.5805 | 0.6671 | 3.3811 | 0.5805 | 0.5812 | 0.2692 | 0.2150 |
| 0.4827 | 33.0 | 8250 | 2.1963 | 0.5767 | 0.6754 | 3.4575 | 0.5767 | 0.5835 | 0.2710 | 0.2160 |
| 0.4134 | 34.0 | 8500 | 2.1720 | 0.581 | 0.6694 | 3.3663 | 0.581 | 0.5811 | 0.2672 | 0.2131 |
| 0.4134 | 35.0 | 8750 | 2.1880 | 0.575 | 0.6759 | 3.4587 | 0.575 | 0.5790 | 0.2783 | 0.2105 |
| 0.3541 | 36.0 | 9000 | 2.1482 | 0.581 | 0.6628 | 3.2956 | 0.581 | 0.5842 | 0.2712 | 0.2056 |
| 0.3541 | 37.0 | 9250 | 2.1631 | 0.5885 | 0.6652 | 3.3217 | 0.5885 | 0.5915 | 0.2671 | 0.2069 |
| 0.3078 | 38.0 | 9500 | 2.2036 | 0.577 | 0.6811 | 3.3564 | 0.577 | 0.5803 | 0.2849 | 0.2141 |
| 0.3078 | 39.0 | 9750 | 2.1904 | 0.5753 | 0.6756 | 3.2783 | 0.5753 | 0.5765 | 0.2756 | 0.2135 |
| 0.2671 | 40.0 | 10000 | 2.1774 | 0.5775 | 0.6685 | 3.3109 | 0.5775 | 0.5813 | 0.2700 | 0.2084 |
| 0.2671 | 41.0 | 10250 | 2.1822 | 0.5807 | 0.6730 | 3.2139 | 0.5807 | 0.5842 | 0.2770 | 0.2100 |
| 0.2331 | 42.0 | 10500 | 2.1673 | 0.5817 | 0.6705 | 3.2960 | 0.5817 | 0.5864 | 0.2757 | 0.2070 |
| 0.2331 | 43.0 | 10750 | 2.1730 | 0.5765 | 0.6705 | 3.2195 | 0.5765 | 0.5807 | 0.2784 | 0.2072 |
| 0.2038 | 44.0 | 11000 | 2.1709 | 0.585 | 0.6649 | 3.1928 | 0.585 | 0.5893 | 0.2627 | 0.2055 |
| 0.2038 | 45.0 | 11250 | 2.1745 | 0.5783 | 0.6678 | 3.1900 | 0.5783 | 0.5811 | 0.2736 | 0.2061 |
| 0.1792 | 46.0 | 11500 | 2.1824 | 0.5835 | 0.6682 | 3.1909 | 0.5835 | 0.5858 | 0.2719 | 0.2070 |
| 0.1792 | 47.0 | 11750 | 2.1892 | 0.584 | 0.6716 | 3.2457 | 0.584 | 0.5864 | 0.2706 | 0.2082 |
| 0.16 | 48.0 | 12000 | 2.1820 | 0.5835 | 0.6716 | 3.2011 | 0.5835 | 0.5857 | 0.2743 | 0.2073 |
| 0.16 | 49.0 | 12250 | 2.1884 | 0.582 | 0.6736 | 3.2114 | 0.582 | 0.5856 | 0.2755 | 0.2073 |
| 0.1465 | 50.0 | 12500 | 2.1927 | 0.5835 | 0.6740 | 3.1975 | 0.5835 | 0.5865 | 0.2742 | 0.2074 |
### Framework versions
- Transformers 4.26.1
- Pytorch 1.13.1.post200
- Datasets 2.9.0
- Tokenizers 0.13.2