vit-base_rvl-cdip-small_rvl_cdip-NK1000_kd_MSE
This model is a fine-tuned version of WinKawaks/vit-small-patch16-224. The Trainer did not record a dataset name (it was logged as "None"), so the training dataset is not specified here. It achieves the following results on the evaluation set:
- Loss: 0.3508
- Accuracy: 0.861
- Brier Loss: 0.2072
- Nll: 1.3138
- F1 Micro: 0.861
- F1 Macro: 0.8630
- Ece: 0.0470
- Aurc: 0.0290
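The card does not say how these metrics were computed. As a rough guide to reading them, here is a minimal sketch of common definitions of accuracy, Brier loss, and expected calibration error (ECE). The function names, the 10-bin equal-width ECE, and the toy probabilities are all assumptions for illustration; the evaluation code behind the numbers above may differ.

```python
def accuracy(probs, labels):
    # Fraction of samples whose highest-probability class matches the label.
    hits = sum(1 for p, y in zip(probs, labels) if p.index(max(p)) == y)
    return hits / len(labels)


def brier_loss(probs, labels):
    # Multi-class Brier score: squared error against the one-hot target,
    # summed over classes and averaged over samples (assumed convention).
    total = 0.0
    for p, y in zip(probs, labels):
        total += sum((pk - (1.0 if k == y else 0.0)) ** 2 for k, pk in enumerate(p))
    return total / len(labels)


def expected_calibration_error(probs, labels, n_bins=10):
    # Group predictions by confidence into equal-width bins, then take the
    # sample-weighted mean of |bin accuracy - bin confidence|.
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        conf = max(p)
        correct = 1.0 if p.index(conf) == y else 0.0
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, correct))
    n = len(labels)
    ece = 0.0
    for b in bins:
        if b:
            avg_conf = sum(c for c, _ in b) / len(b)
            avg_acc = sum(a for _, a in b) / len(b)
            ece += (len(b) / n) * abs(avg_acc - avg_conf)
    return ece


# Toy predictions (hypothetical, not taken from this model).
probs = [
    [0.75, 0.15, 0.10],
    [0.05, 0.85, 0.10],
    [0.55, 0.30, 0.15],
    [0.20, 0.15, 0.65],
]
labels = [0, 1, 1, 2]

acc = accuracy(probs, labels)
brier = brier_loss(probs, labels)
ece = expected_calibration_error(probs, labels)
```

Lower is better for Brier loss and ECE. A low ECE alongside high accuracy suggests the predicted confidences track the observed hit rate reasonably well.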
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 96
- eval_batch_size: 96
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 50
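To make the linear scheduler with a 0.1 warmup ratio concrete, the sketch below reproduces the learning rate at a given optimizer step in plain Python. It assumes the usual behaviour of a linear schedule with warmup (linear ramp from 0 to the base rate, then linear decay to 0) and takes the total of 8350 steps from the final row of the training results table. The names `lr_at`, `BASE_LR`, and `TOTAL_STEPS` are illustrative and not part of any library.

```python
import math

BASE_LR = 1e-4        # learning_rate from the list above
WARMUP_RATIO = 0.1    # lr_scheduler_warmup_ratio from the list above
TOTAL_STEPS = 8350    # assumption: 50 epochs x 167 steps, per the results table

# Number of warmup steps implied by the ratio.
warmup_steps = math.ceil(TOTAL_STEPS * WARMUP_RATIO)


def lr_at(step):
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < warmup_steps:
        # Ramp up linearly from 0 to BASE_LR.
        return BASE_LR * step / max(1, warmup_steps)
    # Decay linearly from BASE_LR to 0 over the remaining steps.
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - warmup_steps)


# Learning rate at the start, end of warmup, midpoint of decay, and final step.
schedule_samples = {s: lr_at(s) for s in (0, warmup_steps, 4592, TOTAL_STEPS)}
```

Under these assumptions, warmup covers the first 835 steps, which is the first five epochs of training.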
Training results
Training Loss | Epoch | Step | Validation Loss | Accuracy | Brier Loss | Nll | F1 Micro | F1 Macro | Ece | Aurc |
---|---|---|---|---|---|---|---|---|---|---|
No log | 1.0 | 167 | 2.0205 | 0.6145 | 0.5068 | 2.2570 | 0.6145 | 0.6027 | 0.0547 | 0.1646 |
No log | 2.0 | 334 | 1.3347 | 0.7055 | 0.3976 | 1.8932 | 0.7055 | 0.7001 | 0.0615 | 0.1010 |
2.0481 | 3.0 | 501 | 0.9336 | 0.764 | 0.3401 | 1.6693 | 0.764 | 0.7665 | 0.0772 | 0.0686 |
2.0481 | 4.0 | 668 | 0.7982 | 0.7895 | 0.3047 | 1.5439 | 0.7895 | 0.7930 | 0.0670 | 0.0569 |
2.0481 | 5.0 | 835 | 0.7154 | 0.7973 | 0.3037 | 1.5600 | 0.7973 | 0.7969 | 0.0836 | 0.0571 |
0.5656 | 6.0 | 1002 | 0.6158 | 0.8113 | 0.2903 | 1.4591 | 0.8113 | 0.8139 | 0.0921 | 0.0493 |
0.5656 | 7.0 | 1169 | 0.5531 | 0.8207 | 0.2707 | 1.4410 | 0.8207 | 0.8236 | 0.0832 | 0.0431 |
0.5656 | 8.0 | 1336 | 0.5706 | 0.815 | 0.2826 | 1.4722 | 0.815 | 0.8208 | 0.0881 | 0.0465 |
0.2988 | 9.0 | 1503 | 0.4654 | 0.8355 | 0.2488 | 1.3791 | 0.8355 | 0.8368 | 0.0745 | 0.0382 |
0.2988 | 10.0 | 1670 | 0.4695 | 0.8315 | 0.2579 | 1.3701 | 0.8315 | 0.8333 | 0.0813 | 0.0403 |
0.2988 | 11.0 | 1837 | 0.4358 | 0.8405 | 0.2424 | 1.3500 | 0.8405 | 0.8424 | 0.0725 | 0.0361 |
0.1829 | 12.0 | 2004 | 0.4333 | 0.8425 | 0.2402 | 1.3740 | 0.8425 | 0.8446 | 0.0662 | 0.0362 |
0.1829 | 13.0 | 2171 | 0.4239 | 0.8462 | 0.2326 | 1.3541 | 0.8462 | 0.8477 | 0.0648 | 0.0335 |
0.1829 | 14.0 | 2338 | 0.3902 | 0.8488 | 0.2263 | 1.2996 | 0.8488 | 0.8512 | 0.0642 | 0.0318 |
0.1215 | 15.0 | 2505 | 0.3740 | 0.8522 | 0.2194 | 1.3374 | 0.8522 | 0.8543 | 0.0595 | 0.0313 |
0.1215 | 16.0 | 2672 | 0.3735 | 0.8548 | 0.2189 | 1.3420 | 0.8547 | 0.8553 | 0.0525 | 0.0320 |
0.1215 | 17.0 | 2839 | 0.3700 | 0.8538 | 0.2161 | 1.3217 | 0.8537 | 0.8561 | 0.0521 | 0.0304 |
0.082 | 18.0 | 3006 | 0.3574 | 0.8548 | 0.2164 | 1.3245 | 0.8547 | 0.8561 | 0.0583 | 0.0301 |
0.082 | 19.0 | 3173 | 0.3669 | 0.8555 | 0.2140 | 1.3197 | 0.8555 | 0.8572 | 0.0538 | 0.0304 |
0.082 | 20.0 | 3340 | 0.3561 | 0.8548 | 0.2125 | 1.3367 | 0.8547 | 0.8560 | 0.0540 | 0.0296 |
0.0535 | 21.0 | 3507 | 0.3495 | 0.854 | 0.2116 | 1.3422 | 0.854 | 0.8558 | 0.0556 | 0.0294 |
0.0535 | 22.0 | 3674 | 0.3412 | 0.8602 | 0.2092 | 1.2970 | 0.8602 | 0.8621 | 0.0527 | 0.0293 |
0.0535 | 23.0 | 3841 | 0.3445 | 0.8595 | 0.2086 | 1.2979 | 0.8595 | 0.8613 | 0.0500 | 0.0286 |
0.0309 | 24.0 | 4008 | 0.3456 | 0.8585 | 0.2105 | 1.3220 | 0.8585 | 0.8601 | 0.0507 | 0.0292 |
0.0309 | 25.0 | 4175 | 0.3451 | 0.862 | 0.2091 | 1.3080 | 0.8620 | 0.8640 | 0.0465 | 0.0290 |
0.0309 | 26.0 | 4342 | 0.3484 | 0.8578 | 0.2090 | 1.3165 | 0.8578 | 0.8596 | 0.0527 | 0.0290 |
0.019 | 27.0 | 4509 | 0.3452 | 0.8612 | 0.2072 | 1.3133 | 0.8612 | 0.8634 | 0.0494 | 0.0288 |
0.019 | 28.0 | 4676 | 0.3451 | 0.8598 | 0.2089 | 1.3197 | 0.8598 | 0.8619 | 0.0515 | 0.0295 |
0.019 | 29.0 | 4843 | 0.3445 | 0.8618 | 0.2072 | 1.3057 | 0.8618 | 0.8633 | 0.0496 | 0.0294 |
0.0137 | 30.0 | 5010 | 0.3452 | 0.8592 | 0.2078 | 1.3108 | 0.8592 | 0.8609 | 0.0499 | 0.0292 |
0.0137 | 31.0 | 5177 | 0.3439 | 0.8615 | 0.2074 | 1.2960 | 0.8615 | 0.8631 | 0.0495 | 0.0286 |
0.0137 | 32.0 | 5344 | 0.3475 | 0.8618 | 0.2080 | 1.3146 | 0.8618 | 0.8638 | 0.0468 | 0.0288 |
0.01 | 33.0 | 5511 | 0.3468 | 0.8605 | 0.2080 | 1.3095 | 0.8605 | 0.8624 | 0.0470 | 0.0291 |
0.01 | 34.0 | 5678 | 0.3454 | 0.8638 | 0.2060 | 1.3094 | 0.8638 | 0.8653 | 0.0465 | 0.0285 |
0.01 | 35.0 | 5845 | 0.3463 | 0.8612 | 0.2067 | 1.3145 | 0.8612 | 0.8632 | 0.0479 | 0.0287 |
0.0071 | 36.0 | 6012 | 0.3466 | 0.8615 | 0.2070 | 1.3189 | 0.8615 | 0.8634 | 0.0449 | 0.0289 |
0.0071 | 37.0 | 6179 | 0.3457 | 0.8635 | 0.2065 | 1.3085 | 0.8635 | 0.8653 | 0.0487 | 0.0287 |
0.0071 | 38.0 | 6346 | 0.3471 | 0.8618 | 0.2066 | 1.3132 | 0.8618 | 0.8637 | 0.0488 | 0.0286 |
0.0047 | 39.0 | 6513 | 0.3481 | 0.8615 | 0.2067 | 1.3116 | 0.8615 | 0.8632 | 0.0485 | 0.0288 |
0.0047 | 40.0 | 6680 | 0.3482 | 0.8618 | 0.2074 | 1.3149 | 0.8618 | 0.8638 | 0.0512 | 0.0290 |
0.0047 | 41.0 | 6847 | 0.3488 | 0.862 | 0.2072 | 1.3162 | 0.8620 | 0.8640 | 0.0467 | 0.0287 |
0.0029 | 42.0 | 7014 | 0.3485 | 0.862 | 0.2069 | 1.3136 | 0.8620 | 0.8640 | 0.0466 | 0.0288 |
0.0029 | 43.0 | 7181 | 0.3492 | 0.8612 | 0.2072 | 1.3151 | 0.8612 | 0.8633 | 0.0470 | 0.0288 |
0.0029 | 44.0 | 7348 | 0.3492 | 0.8615 | 0.2070 | 1.3117 | 0.8615 | 0.8634 | 0.0459 | 0.0289 |
0.0019 | 45.0 | 7515 | 0.3502 | 0.8612 | 0.2073 | 1.3153 | 0.8612 | 0.8632 | 0.0460 | 0.0289 |
0.0019 | 46.0 | 7682 | 0.3500 | 0.8615 | 0.2072 | 1.3136 | 0.8615 | 0.8634 | 0.0474 | 0.0290 |
0.0019 | 47.0 | 7849 | 0.3505 | 0.862 | 0.2072 | 1.3153 | 0.8620 | 0.8640 | 0.0457 | 0.0289 |
0.0014 | 48.0 | 8016 | 0.3507 | 0.861 | 0.2072 | 1.3113 | 0.861 | 0.8630 | 0.0475 | 0.0290 |
0.0014 | 49.0 | 8183 | 0.3508 | 0.861 | 0.2071 | 1.3111 | 0.861 | 0.8630 | 0.0474 | 0.0290 |
0.0014 | 50.0 | 8350 | 0.3508 | 0.861 | 0.2072 | 1.3138 | 0.861 | 0.8630 | 0.0470 | 0.0290 |
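The headline metrics at the top of this card match the final epoch, but the table shows validation loss bottoming out at epoch 22 (0.3412) and accuracy peaking at epoch 34 (0.8638). The card does not say whether the published weights come from the last or the best checkpoint. The sketch below shows one way to pick a checkpoint by metric, using three rows copied from the table; the tuple layout and variable names are illustrative.

```python
# (epoch, validation_loss, accuracy, ece), copied from three rows of the table above.
rows = [
    (22, 0.3412, 0.8602, 0.0527),
    (34, 0.3454, 0.8638, 0.0465),
    (50, 0.3508, 0.8610, 0.0470),
]

# Pick the epoch with the lowest validation loss and the one with the highest accuracy.
best_by_loss = min(rows, key=lambda r: r[1])
best_by_accuracy = max(rows, key=lambda r: r[2])

print(f"lowest validation loss: epoch {best_by_loss[0]} ({best_by_loss[1]})")
print(f"highest accuracy: epoch {best_by_accuracy[0]} ({best_by_accuracy[2]})")
```

The differences between these epochs are small (under 0.3 percentage points of accuracy), so the choice of checkpoint matters little here.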
Framework versions
- Transformers 4.26.1
- PyTorch 1.13.1.post200
- Datasets 2.9.0
- Tokenizers 0.13.2