# dit-base-finetuned-rvlcdip-small_rvl_cdip-NK1000_kd_CEKD_t2.5_a0.5
This model is a fine-tuned version of [WinKawaks/vit-small-patch16-224](https://huggingface.co/WinKawaks/vit-small-patch16-224) on the `small_rvl_cdip-NK1000` dataset (an RVL-CDIP subset, per the model name). It achieves the following results on the evaluation set (the calibration metrics are sketched in code after the list):
- Loss: 0.5443
- Accuracy: 0.8293
- Brier Loss: 0.2695
- NLL: 1.2500
- F1 Micro: 0.8293
- F1 Macro: 0.8301
- ECE: 0.0874
- AURC: 0.0693
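Brier loss, NLL (negative log-likelihood), ECE (expected calibration error), and AURC (area under the risk-coverage curve) are calibration and selective-prediction metrics rather than standard `Trainer` outputs. As a reference point only (the card does not include the actual evaluation code), here is a minimal NumPy sketch of how Brier score and ECE are commonly computed from softmax probabilities:

```python
import numpy as np

def brier_score(probs: np.ndarray, labels: np.ndarray) -> float:
    """Mean squared distance between softmax probabilities and one-hot labels."""
    onehot = np.eye(probs.shape[1])[labels]
    return float(np.mean(np.sum((probs - onehot) ** 2, axis=1)))

def expected_calibration_error(probs: np.ndarray, labels: np.ndarray, n_bins: int = 10) -> float:
    """Average |confidence - accuracy| gap over equal-width confidence bins."""
    confidences = probs.max(axis=1)
    correct = (probs.argmax(axis=1) == labels).astype(float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            ece += in_bin.mean() * abs(correct[in_bin].mean() - confidences[in_bin].mean())
    return float(ece)
```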
## Model description
More information needed
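While the description is missing, the model name suggests a knowledge-distillation setup: a ViT-small student trained against a `dit-base-finetuned-rvlcdip` teacher with a combined cross-entropy + distillation objective (`CEKD`), temperature 2.5 (`t2.5`), and mixing weight 0.5 (`a0.5`). Assuming the usual Hinton-style formulation (the exact loss is not published with this card), a minimal PyTorch sketch:

```python
import torch.nn.functional as F

def cekd_loss(student_logits, teacher_logits, labels, temperature=2.5, alpha=0.5):
    """alpha * hard-label CE + (1 - alpha) * temperature-scaled KL to the teacher."""
    ce = F.cross_entropy(student_logits, labels)
    kd = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature**2  # rescale so gradients match the hard-label term
    return alpha * ce + (1.0 - alpha) * kd
```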
## Intended uses & limitations
More information needed
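In the absence of usage notes: the checkpoint should load with the standard `transformers` image-classification API. A minimal sketch, where the repo id and input file are placeholders:

```python
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModelForImageClassification

# Placeholder repo id; substitute the actual Hub path of this checkpoint.
ckpt = "dit-base-finetuned-rvlcdip-small_rvl_cdip-NK1000_kd_CEKD_t2.5_a0.5"

processor = AutoImageProcessor.from_pretrained(ckpt)
model = AutoModelForImageClassification.from_pretrained(ckpt)

image = Image.open("scanned_document.png").convert("RGB")  # placeholder input
inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(model.config.id2label[logits.argmax(-1).item()])
```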
## Training and evaluation data
More information needed
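The `small_rvl_cdip-NK1000` subset named in the model title is not published with this card. As a stand-in, the full RVL-CDIP dataset on the Hugging Face Hub (dataset id assumed here) can be streamed with `datasets`:

```python
from datasets import load_dataset

# Stream to avoid downloading the full archive (tens of GB); "rvl_cdip" is the
# assumed Hub id, not a dataset confirmed by this card.
dataset = load_dataset("rvl_cdip", split="train", streaming=True)
example = next(iter(dataset))
print(example["label"])  # integer id into the 16 RVL-CDIP document classes
```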
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a `TrainingArguments` sketch follows the list):
- learning_rate: 0.0001
- train_batch_size: 96
- eval_batch_size: 96
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 50
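For reference, a sketch of how these values map onto `transformers.TrainingArguments`; the original training script (including the distillation wrapper) is not part of this card:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="dit-base-finetuned-rvlcdip-small_rvl_cdip-NK1000_kd_CEKD_t2.5_a0.5",
    learning_rate=1e-4,
    per_device_train_batch_size=96,
    per_device_eval_batch_size=96,
    seed=42,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=50,
    # Adam betas=(0.9, 0.999) and epsilon=1e-08 are the TrainingArguments defaults.
)
```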
### Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy | Brier Loss | NLL | F1 Micro | F1 Macro | ECE | AURC |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:----------:|:---:|:--------:|:--------:|:---:|:----:|
| No log | 1.0 | 167 | 1.1672 | 0.6082 | 0.5198 | 2.4714 | 0.6082 | 0.6021 | 0.0681 | 0.1731 |
| No log | 2.0 | 334 | 0.8928 | 0.7177 | 0.4028 | 2.0687 | 0.7178 | 0.7161 | 0.0715 | 0.0997 |
| 1.1617 | 3.0 | 501 | 0.7584 | 0.7602 | 0.3454 | 1.8041 | 0.7602 | 0.7637 | 0.0762 | 0.0747 |
| 1.1617 | 4.0 | 668 | 0.7048 | 0.768 | 0.3262 | 1.6695 | 0.768 | 0.7687 | 0.0550 | 0.0687 |
| 1.1617 | 5.0 | 835 | 0.6921 | 0.7745 | 0.3214 | 1.6428 | 0.7745 | 0.7683 | 0.0477 | 0.0738 |
| 0.4396 | 6.0 | 1002 | 0.6616 | 0.789 | 0.3105 | 1.5655 | 0.7890 | 0.7897 | 0.0586 | 0.0714 |
| 0.4396 | 7.0 | 1169 | 0.6363 | 0.794 | 0.3001 | 1.5694 | 0.7940 | 0.7970 | 0.0549 | 0.0674 |
| 0.4396 | 8.0 | 1336 | 0.6753 | 0.7792 | 0.3259 | 1.4975 | 0.7792 | 0.7804 | 0.0647 | 0.0777 |
| 0.2291 | 9.0 | 1503 | 0.6247 | 0.8025 | 0.2979 | 1.4968 | 0.8025 | 0.8037 | 0.0705 | 0.0669 |
| 0.2291 | 10.0 | 1670 | 0.6347 | 0.799 | 0.3032 | 1.4834 | 0.799 | 0.8011 | 0.0720 | 0.0743 |
| 0.2291 | 11.0 | 1837 | 0.6328 | 0.7975 | 0.3045 | 1.4998 | 0.7975 | 0.8031 | 0.0773 | 0.0659 |
| 0.1575 | 12.0 | 2004 | 0.6442 | 0.7965 | 0.3097 | 1.4447 | 0.7965 | 0.7979 | 0.0714 | 0.0824 |
| 0.1575 | 13.0 | 2171 | 0.6354 | 0.8013 | 0.3043 | 1.4874 | 0.8013 | 0.8035 | 0.0712 | 0.0741 |
| 0.1575 | 14.0 | 2338 | 0.6443 | 0.799 | 0.3091 | 1.5848 | 0.799 | 0.8022 | 0.0791 | 0.0859 |
| 0.1285 | 15.0 | 2505 | 0.6357 | 0.8017 | 0.3042 | 1.5670 | 0.8017 | 0.8002 | 0.0799 | 0.0685 |
| 0.1285 | 16.0 | 2672 | 0.6166 | 0.807 | 0.2965 | 1.4806 | 0.807 | 0.8056 | 0.0720 | 0.0745 |
| 0.1285 | 17.0 | 2839 | 0.6433 | 0.7993 | 0.3159 | 1.5024 | 0.7993 | 0.8023 | 0.0805 | 0.0857 |
| 0.1121 | 18.0 | 3006 | 0.6102 | 0.8147 | 0.2960 | 1.4550 | 0.8148 | 0.8144 | 0.0775 | 0.0698 |
| 0.1121 | 19.0 | 3173 | 0.6616 | 0.7995 | 0.3146 | 1.6009 | 0.7995 | 0.7962 | 0.0892 | 0.0883 |
| 0.1121 | 20.0 | 3340 | 0.6163 | 0.8037 | 0.3029 | 1.4525 | 0.8037 | 0.8059 | 0.0920 | 0.0771 |
| 0.1012 | 21.0 | 3507 | 0.6186 | 0.8093 | 0.3017 | 1.5539 | 0.8093 | 0.8111 | 0.0920 | 0.0712 |
| 0.1012 | 22.0 | 3674 | 0.5982 | 0.8137 | 0.2930 | 1.4533 | 0.8137 | 0.8140 | 0.0815 | 0.0668 |
| 0.1012 | 23.0 | 3841 | 0.5928 | 0.822 | 0.2864 | 1.4312 | 0.822 | 0.8218 | 0.0723 | 0.0818 |
| 0.0888 | 24.0 | 4008 | 0.5931 | 0.8135 | 0.2900 | 1.4129 | 0.8135 | 0.8143 | 0.0894 | 0.0706 |
| 0.0888 | 25.0 | 4175 | 0.5807 | 0.8183 | 0.2849 | 1.4241 | 0.8183 | 0.8203 | 0.0903 | 0.0683 |
| 0.0888 | 26.0 | 4342 | 0.5859 | 0.8193 | 0.2869 | 1.4385 | 0.8193 | 0.8194 | 0.0879 | 0.0698 |
| 0.0828 | 27.0 | 4509 | 0.5957 | 0.8147 | 0.2941 | 1.4132 | 0.8148 | 0.8151 | 0.0847 | 0.0732 |
| 0.0828 | 28.0 | 4676 | 0.5791 | 0.818 | 0.2852 | 1.4231 | 0.818 | 0.8185 | 0.0896 | 0.0612 |
| 0.0828 | 29.0 | 4843 | 0.5888 | 0.8137 | 0.2895 | 1.3998 | 0.8137 | 0.8148 | 0.0925 | 0.0740 |
| 0.0776 | 30.0 | 5010 | 0.5633 | 0.8225 | 0.2798 | 1.3391 | 0.8225 | 0.8234 | 0.0878 | 0.0760 |
| 0.0776 | 31.0 | 5177 | 0.5635 | 0.8247 | 0.2785 | 1.3193 | 0.8247 | 0.8256 | 0.0900 | 0.0587 |
| 0.0776 | 32.0 | 5344 | 0.5580 | 0.8223 | 0.2784 | 1.2970 | 0.8223 | 0.8241 | 0.0905 | 0.0704 |
| 0.0727 | 33.0 | 5511 | 0.5502 | 0.826 | 0.2724 | 1.2733 | 0.826 | 0.8268 | 0.0865 | 0.0619 |
| 0.0727 | 34.0 | 5678 | 0.5448 | 0.8293 | 0.2720 | 1.2237 | 0.8293 | 0.8303 | 0.0820 | 0.0639 |
| 0.0727 | 35.0 | 5845 | 0.5480 | 0.8257 | 0.2729 | 1.2867 | 0.8257 | 0.8271 | 0.0928 | 0.0586 |
| 0.0696 | 36.0 | 6012 | 0.5437 | 0.8293 | 0.2703 | 1.2427 | 0.8293 | 0.8298 | 0.0871 | 0.0630 |
| 0.0696 | 37.0 | 6179 | 0.5460 | 0.8253 | 0.2712 | 1.2629 | 0.8253 | 0.8262 | 0.0912 | 0.0598 |
| 0.0696 | 38.0 | 6346 | 0.5425 | 0.8295 | 0.2703 | 1.2440 | 0.8295 | 0.8303 | 0.0899 | 0.0611 |
| 0.0677 | 39.0 | 6513 | 0.5421 | 0.8307 | 0.2690 | 1.2453 | 0.8308 | 0.8319 | 0.0835 | 0.0665 |
| 0.0677 | 40.0 | 6680 | 0.5406 | 0.8287 | 0.2689 | 1.2465 | 0.8287 | 0.8296 | 0.0895 | 0.0612 |
| 0.0677 | 41.0 | 6847 | 0.5423 | 0.8277 | 0.2696 | 1.2735 | 0.8277 | 0.8284 | 0.0893 | 0.0604 |
| 0.0663 | 42.0 | 7014 | 0.5406 | 0.8297 | 0.2676 | 1.2403 | 0.8297 | 0.8306 | 0.0894 | 0.0657 |
| 0.0663 | 43.0 | 7181 | 0.5410 | 0.8313 | 0.2686 | 1.2359 | 0.8313 | 0.8323 | 0.0895 | 0.0635 |
| 0.0663 | 44.0 | 7348 | 0.5416 | 0.8287 | 0.2685 | 1.2308 | 0.8287 | 0.8295 | 0.0883 | 0.0647 |
| 0.0652 | 45.0 | 7515 | 0.5431 | 0.8275 | 0.2697 | 1.2374 | 0.8275 | 0.8282 | 0.0932 | 0.0648 |
| 0.0652 | 46.0 | 7682 | 0.5433 | 0.8295 | 0.2693 | 1.2347 | 0.8295 | 0.8303 | 0.0891 | 0.0681 |
| 0.0652 | 47.0 | 7849 | 0.5441 | 0.8277 | 0.2696 | 1.2433 | 0.8277 | 0.8286 | 0.0882 | 0.0681 |
| 0.0651 | 48.0 | 8016 | 0.5439 | 0.8293 | 0.2695 | 1.2358 | 0.8293 | 0.8301 | 0.0888 | 0.0692 |
| 0.0651 | 49.0 | 8183 | 0.5445 | 0.8287 | 0.2696 | 1.2499 | 0.8287 | 0.8296 | 0.0882 | 0.0695 |
| 0.0651 | 50.0 | 8350 | 0.5443 | 0.8293 | 0.2695 | 1.2500 | 0.8293 | 0.8301 | 0.0874 | 0.0693 |
### Framework versions
- Transformers 4.26.1
- Pytorch 1.13.1.post200
- Datasets 2.9.0
- Tokenizers 0.13.2