# vit-base_rvl-cdip-tiny_rvl_cdip-NK1000_kd_NKD_t1.0_g1.5
This model is a fine-tuned version of google/vit-base-patch16-224-in21k. The training dataset was not recorded by the Trainer, though the repository name points to an RVL-CDIP NK1000 subset. It achieves the following results on the evaluation set (a usage sketch follows the metrics):
- Loss: 6.1672
- Accuracy: 0.5797
- Brier Loss: 0.6731
- NLL: 3.2703
- F1 Micro: 0.5797
- F1 Macro: 0.5841
- ECE: 0.2854
- AURC: 0.1976
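Since this is a ViT image-classification checkpoint, it can presumably be loaded through the standard `transformers` auto classes. The snippet below is a minimal sketch: the repo id and image path are placeholders, and the label set depends on how the checkpoint was exported.

```python
from PIL import Image
import torch
from transformers import AutoImageProcessor, AutoModelForImageClassification

# Placeholder repo id; substitute the actual Hub path of this checkpoint.
repo_id = "<namespace>/vit-base_rvl-cdip-tiny_rvl_cdip-NK1000_kd_NKD_t1.0_g1.5"

processor = AutoImageProcessor.from_pretrained(repo_id)
model = AutoModelForImageClassification.from_pretrained(repo_id)
model.eval()

# Classify a single document image (the path is a placeholder).
image = Image.open("document.png").convert("RGB")
inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
pred = logits.argmax(-1).item()
print(model.config.id2label[pred])
```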
## Model description
More information needed
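The repository name encodes the distillation setup: knowledge distillation (`kd`) with an NKD (normalized knowledge distillation) loss at temperature `t1.0` and non-target weight `g1.5`. The training code is not part of this card, so the following is only a hypothetical sketch of an NKD-style loss for reference; the teacher model and the exact formulation used here are assumptions.

```python
import torch
import torch.nn.functional as F

def nkd_loss(student_logits, teacher_logits, labels, temp=1.0, gamma=1.5):
    """NKD-style loss sketch: a target term plus a re-normalized non-target term.

    Hypothetical reconstruction inferred from the name `kd_NKD_t1.0_g1.5`;
    not the actual training code of this checkpoint.
    """
    idx = labels.unsqueeze(1)

    # Target term: teacher's target-class probability weights the student's
    # target log-probability (computed without temperature).
    log_s = F.log_softmax(student_logits, dim=1)
    t = F.softmax(teacher_logits, dim=1)
    target_term = -(t.gather(1, idx) * log_s.gather(1, idx)).mean()

    # Non-target term: drop the target class, re-normalize the remaining
    # distribution at `temp`, and cross-entropy teacher against student.
    neg_inf = torch.finfo(student_logits.dtype).min
    s_nt = F.log_softmax((student_logits / temp).scatter(1, idx, neg_inf), dim=1)
    t_nt = F.softmax((teacher_logits / temp).scatter(1, idx, neg_inf), dim=1)
    s_nt = s_nt.scatter(1, idx, 0.0)  # zero the target slot so 0 * log(0) never appears
    non_target_term = -(t_nt * s_nt).sum(dim=1).mean()

    return target_term + gamma * (temp ** 2) * non_target_term
```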
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (a `TrainingArguments` sketch follows the list):
- learning_rate: 0.0001
- train_batch_size: 64
- eval_batch_size: 64
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 50
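For reference, these values map onto the standard `Trainer` API roughly as follows. This is a sketch, not the original training script; `output_dir` is a placeholder, and the Adam settings shown are simply the `Trainer` defaults, which match the values listed above.

```python
from transformers import TrainingArguments

# Sketch mirroring the listed hyperparameters; not the original script.
training_args = TrainingArguments(
    output_dir="vit-base_rvl-cdip-kd",   # placeholder
    learning_rate=1e-4,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    seed=42,
    adam_beta1=0.9,                      # Trainer defaults, matching the card
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=50,
)
```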
### Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy | Brier Loss | NLL | F1 Micro | F1 Macro | ECE | AURC |
|---|---|---|---|---|---|---|---|---|---|---|
| No log | 1.0 | 250 | 6.2188 | 0.1343 | 0.9122 | 5.7464 | 0.1343 | 0.0834 | 0.0536 | 0.7565 |
| 6.3619 | 2.0 | 500 | 6.0878 | 0.1565 | 0.8959 | 5.2310 | 0.1565 | 0.1126 | 0.0670 | 0.7122 |
| 6.3619 | 3.0 | 750 | 5.7358 | 0.2963 | 0.8276 | 3.6085 | 0.2963 | 0.2563 | 0.0948 | 0.5210 |
| 5.9224 | 4.0 | 1000 | 5.5272 | 0.382 | 0.7742 | 3.2631 | 0.382 | 0.3481 | 0.1205 | 0.4212 |
| 5.9224 | 5.0 | 1250 | 5.3271 | 0.4235 | 0.7257 | 3.1338 | 0.4235 | 0.4033 | 0.1172 | 0.3609 |
| 5.4818 | 6.0 | 1500 | 5.2958 | 0.4343 | 0.7063 | 3.0800 | 0.4343 | 0.4119 | 0.0915 | 0.3431 |
| 5.4818 | 7.0 | 1750 | 5.1042 | 0.4865 | 0.6655 | 2.9165 | 0.4865 | 0.4753 | 0.1281 | 0.2790 |
| 5.1995 | 8.0 | 2000 | 5.0990 | 0.4868 | 0.6566 | 2.9361 | 0.4868 | 0.4782 | 0.1000 | 0.2775 |
| 5.1995 | 9.0 | 2250 | 4.9973 | 0.5008 | 0.6235 | 2.7450 | 0.5008 | 0.4878 | 0.0901 | 0.2533 |
| 5.0048 | 10.0 | 2500 | 4.9471 | 0.516 | 0.6182 | 2.7522 | 0.516 | 0.5141 | 0.0855 | 0.2455 |
| 5.0048 | 11.0 | 2750 | 4.9331 | 0.5225 | 0.6072 | 2.7517 | 0.5225 | 0.5198 | 0.0724 | 0.2397 |
| 4.8157 | 12.0 | 3000 | 4.9154 | 0.5343 | 0.5948 | 2.8289 | 0.5343 | 0.5274 | 0.0614 | 0.2331 |
| 4.8157 | 13.0 | 3250 | 4.9063 | 0.5252 | 0.5985 | 2.8356 | 0.5252 | 0.5193 | 0.0565 | 0.2343 |
| 4.6678 | 14.0 | 3500 | 4.9772 | 0.536 | 0.5988 | 2.8902 | 0.536 | 0.5233 | 0.0580 | 0.2359 |
| 4.6678 | 15.0 | 3750 | 4.8401 | 0.5517 | 0.5759 | 2.7486 | 0.5517 | 0.5526 | 0.0618 | 0.2150 |
| 4.5289 | 16.0 | 4000 | 4.8798 | 0.5617 | 0.5704 | 2.7557 | 0.5617 | 0.5581 | 0.0618 | 0.2134 |
| 4.5289 | 17.0 | 4250 | 4.8518 | 0.5527 | 0.5710 | 2.8619 | 0.5527 | 0.5556 | 0.0451 | 0.2103 |
| 4.3805 | 18.0 | 4500 | 4.8751 | 0.5623 | 0.5696 | 2.7950 | 0.5623 | 0.5607 | 0.0577 | 0.2081 |
| 4.3805 | 19.0 | 4750 | 4.9057 | 0.5593 | 0.5767 | 2.9991 | 0.5593 | 0.5611 | 0.0608 | 0.2145 |
| 4.2463 | 20.0 | 5000 | 4.9515 | 0.5595 | 0.5730 | 2.9144 | 0.5595 | 0.5578 | 0.0792 | 0.2119 |
| 4.2463 | 21.0 | 5250 | 4.9867 | 0.5625 | 0.5742 | 2.8184 | 0.5625 | 0.5635 | 0.0896 | 0.2121 |
| 4.1211 | 22.0 | 5500 | 4.9772 | 0.5683 | 0.5703 | 3.0845 | 0.5683 | 0.5682 | 0.0771 | 0.2050 |
| 4.1211 | 23.0 | 5750 | 4.9923 | 0.5667 | 0.5767 | 3.0160 | 0.5667 | 0.5699 | 0.1001 | 0.2041 |
| 3.9862 | 24.0 | 6000 | 5.0275 | 0.5687 | 0.5772 | 3.0111 | 0.5687 | 0.5705 | 0.1119 | 0.2012 |
| 3.9862 | 25.0 | 6250 | 5.1046 | 0.5607 | 0.5890 | 3.2599 | 0.5607 | 0.5623 | 0.1284 | 0.2060 |
| 3.8573 | 26.0 | 6500 | 5.1868 | 0.5607 | 0.6002 | 3.1568 | 0.5607 | 0.5669 | 0.1427 | 0.2085 |
| 3.8573 | 27.0 | 6750 | 5.1975 | 0.569 | 0.5962 | 3.1893 | 0.569 | 0.5729 | 0.1442 | 0.2037 |
| 3.7598 | 28.0 | 7000 | 5.2735 | 0.561 | 0.6090 | 3.3290 | 0.561 | 0.5674 | 0.1608 | 0.2087 |
| 3.7598 | 29.0 | 7250 | 5.2898 | 0.5695 | 0.6063 | 3.2247 | 0.5695 | 0.5719 | 0.1744 | 0.2025 |
| 3.6544 | 30.0 | 7500 | 5.3092 | 0.566 | 0.6142 | 3.2588 | 0.566 | 0.5725 | 0.1776 | 0.2064 |
| 3.6544 | 31.0 | 7750 | 5.4251 | 0.564 | 0.6214 | 3.2408 | 0.564 | 0.5641 | 0.1938 | 0.2066 |
| 3.5698 | 32.0 | 8000 | 5.4274 | 0.573 | 0.6217 | 3.3516 | 0.573 | 0.5780 | 0.1959 | 0.2036 |
| 3.5698 | 33.0 | 8250 | 5.4650 | 0.5665 | 0.6301 | 3.3685 | 0.5665 | 0.5765 | 0.2088 | 0.2054 |
| 3.4966 | 34.0 | 8500 | 5.4854 | 0.5733 | 0.6250 | 3.2985 | 0.5733 | 0.5754 | 0.2079 | 0.2027 |
| 3.4966 | 35.0 | 8750 | 5.5474 | 0.5837 | 0.6261 | 3.2816 | 0.5837 | 0.5860 | 0.2134 | 0.1990 |
| 3.4285 | 36.0 | 9000 | 5.5979 | 0.5725 | 0.6371 | 3.3105 | 0.5725 | 0.5763 | 0.2248 | 0.2023 |
| 3.4285 | 37.0 | 9250 | 5.7002 | 0.576 | 0.6452 | 3.2637 | 0.576 | 0.5771 | 0.2396 | 0.2034 |
| 3.377 | 38.0 | 9500 | 5.6932 | 0.5777 | 0.6448 | 3.3403 | 0.5777 | 0.5825 | 0.2362 | 0.2023 |
| 3.377 | 39.0 | 9750 | 5.7180 | 0.5795 | 0.6409 | 3.2664 | 0.5795 | 0.5848 | 0.2382 | 0.1990 |
| 3.3344 | 40.0 | 10000 | 5.7943 | 0.5765 | 0.6502 | 3.4052 | 0.5765 | 0.5810 | 0.2524 | 0.2001 |
| 3.3344 | 41.0 | 10250 | 5.8347 | 0.5737 | 0.6562 | 3.3472 | 0.5737 | 0.5793 | 0.2555 | 0.2006 |
| 3.2925 | 42.0 | 10500 | 5.9010 | 0.5835 | 0.6529 | 3.2352 | 0.5835 | 0.5867 | 0.2563 | 0.1987 |
| 3.2925 | 43.0 | 10750 | 5.9119 | 0.5787 | 0.6550 | 3.2640 | 0.5787 | 0.5829 | 0.2611 | 0.1976 |
| 3.2573 | 44.0 | 11000 | 5.9355 | 0.5765 | 0.6609 | 3.2903 | 0.5765 | 0.5811 | 0.2620 | 0.2004 |
| 3.2573 | 45.0 | 11250 | 6.0046 | 0.58 | 0.6643 | 3.2248 | 0.58 | 0.5843 | 0.2691 | 0.1992 |
| 3.2269 | 46.0 | 11500 | 6.0610 | 0.5847 | 0.6659 | 3.2719 | 0.5847 | 0.5888 | 0.2705 | 0.1974 |
| 3.2269 | 47.0 | 11750 | 6.0938 | 0.5787 | 0.6718 | 3.2559 | 0.5787 | 0.5840 | 0.2801 | 0.1989 |
| 3.2025 | 48.0 | 12000 | 6.1306 | 0.5787 | 0.6711 | 3.2546 | 0.5787 | 0.5823 | 0.2830 | 0.1974 |
| 3.2025 | 49.0 | 12250 | 6.1521 | 0.5823 | 0.6725 | 3.2590 | 0.5823 | 0.5867 | 0.2822 | 0.1976 |
| 3.1849 | 50.0 | 12500 | 6.1672 | 0.5797 | 0.6731 | 3.2703 | 0.5797 | 0.5841 | 0.2854 | 0.1976 |
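The table tracks calibration metrics (Brier loss, NLL, ECE, AURC) alongside accuracy. The exact implementations used during evaluation are not recorded on this card; for reference, a common formulation of the multiclass Brier score and ECE from predicted probabilities looks like the sketch below (the 15 equal-width confidence bins are an assumption, not a recorded setting).

```python
import numpy as np

def brier_score(probs, labels):
    """Mean squared distance between predicted probabilities and one-hot labels."""
    onehot = np.eye(probs.shape[1])[labels]
    return np.mean(np.sum((probs - onehot) ** 2, axis=1))

def expected_calibration_error(probs, labels, n_bins=15):
    """Average |accuracy - confidence| gap over equal-width confidence bins."""
    conf = probs.max(axis=1)
    correct = (probs.argmax(axis=1) == labels).astype(float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (conf > lo) & (conf <= hi)
        if in_bin.any():
            ece += in_bin.mean() * abs(correct[in_bin].mean() - conf[in_bin].mean())
    return ece
```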
### Framework versions
- Transformers 4.26.1
- Pytorch 1.13.1.post200
- Datasets 2.9.0
- Tokenizers 0.13.2