vit-base_rvl_cdip_crl
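This model is a fine-tuned version of jordyvl/vit-base_rvl-cdip. The Trainer did not record the name of the fine-tuning dataset (it was logged as "None"). The model achieves the following results on the evaluation set, which correspond to the final epoch (epoch 10) in the training results table: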
- Loss: 0.6238
- Accuracy: 0.8956
- Brier Loss: 0.1819
- NLL (negative log-likelihood): 1.1791
- F1 Micro: 0.8957
- F1 Macro: 0.8958
- ECE (expected calibration error): 0.0846
- AURC (area under the risk-coverage curve): 0.0210
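The card does not say how the calibration metrics were computed, so the following is only a minimal sketch of common textbook definitions, not the evaluation code behind the numbers above. It assumes the multiclass Brier loss is the per-sample sum of squared errors against the one-hot label, that NLL is the mean negative log-probability of the true class, and that ECE uses equal-width confidence bins. The function names, the bin count, and the toy inputs are all illustrative. The reported values may follow different conventions (for example, the reported NLL differs from the reported loss), and AURC is not covered here.

```python
import math

def brier_loss(probs, labels):
    """Mean over samples of the squared error between probabilities and the one-hot label."""
    total = 0.0
    for p, y in zip(probs, labels):
        total += sum((pk - (1.0 if k == y else 0.0)) ** 2 for k, pk in enumerate(p))
    return total / len(labels)

def nll(probs, labels, eps=1e-12):
    """Mean negative log-probability assigned to the true class."""
    return -sum(math.log(max(p[y], eps)) for p, y in zip(probs, labels)) / len(labels)

def ece(probs, labels, n_bins=10):
    """Expected calibration error with equal-width confidence bins (n_bins is an assumption)."""
    # Each bin tracks: sample count, summed confidence, number of correct predictions.
    bins = [[0, 0.0, 0] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        conf = max(p)
        pred = p.index(conf)
        b = min(int(conf * n_bins), n_bins - 1)
        bins[b][0] += 1
        bins[b][1] += conf
        bins[b][2] += int(pred == y)
    n = len(labels)
    return sum(
        abs(correct / count - conf_sum / count) * count / n
        for count, conf_sum, correct in bins
        if count
    )

# Toy example: three samples, two classes (made-up values for illustration).
toy_probs = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8]]
toy_labels = [0, 1, 1]
print(brier_loss(toy_probs, toy_labels), nll(toy_probs, toy_labels), ece(toy_probs, toy_labels))
```

Lower is better for all three; a model can keep the same accuracy while these values get worse if its confidence drifts away from its actual hit rate.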
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- gradient_accumulation_steps: 16
- total_train_batch_size: 256
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 10
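As a rough check on how these settings fit together, the sketch below recomputes the effective batch size and the learning rate that a linear schedule with warmup would give at a few steps. It assumes a single device, takes the total of 12,500 optimizer steps and 1,250 steps per epoch from the training results table, and assumes the warmup length is the ceiling of total steps times the warmup ratio. The name `linear_lr` is illustrative and not a library function.

```python
import math

LEARNING_RATE = 2e-05
TRAIN_BATCH_SIZE = 16
GRAD_ACCUM_STEPS = 16
WARMUP_RATIO = 0.1
TOTAL_STEPS = 12500      # last "Step" value in the training results table
STEPS_PER_EPOCH = 1250   # step count after epoch 1.0 in the same table

# Samples contributing to one optimizer update (single device assumed).
effective_batch_size = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS

# Approximate number of training samples seen per epoch (the last batch may be partial).
approx_samples_per_epoch = STEPS_PER_EPOCH * effective_batch_size

# Assumption: warmup length is ceil(total_steps * warmup_ratio).
warmup_steps = math.ceil(TOTAL_STEPS * WARMUP_RATIO)

def linear_lr(step):
    """Learning rate at an optimizer step: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return LEARNING_RATE * step / max(1, warmup_steps)
    remaining = max(0, TOTAL_STEPS - step)
    return LEARNING_RATE * remaining / max(1, TOTAL_STEPS - warmup_steps)

for step in (0, 625, 1250, 6875, 12500):
    print(step, linear_lr(step))
```

Under these assumptions the learning rate peaks at 2e-05 at the end of the first epoch and then decays linearly to zero by the last step.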
Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy | Brier Loss | NLL | F1 Micro | F1 Macro | ECE | AURC |
|---|---|---|---|---|---|---|---|---|---|---|
| 0.1844 | 1.0 | 1250 | 0.4411 | 0.8961 | 0.1614 | 1.1240 | 0.8961 | 0.8963 | 0.0528 | 0.0161 |
| 0.1394 | 2.0 | 2500 | 0.4830 | 0.8927 | 0.1716 | 1.1324 | 0.8927 | 0.8927 | 0.0646 | 0.0175 |
| 0.1 | 3.0 | 3750 | 0.5257 | 0.8911 | 0.1791 | 1.1569 | 0.8911 | 0.8912 | 0.0737 | 0.0187 |
| 0.068 | 4.0 | 5000 | 0.5497 | 0.8913 | 0.1806 | 1.1705 | 0.8913 | 0.8913 | 0.0770 | 0.0192 |
| 0.048 | 5.0 | 6250 | 0.5762 | 0.8915 | 0.1834 | 1.1906 | 0.8915 | 0.8914 | 0.0808 | 0.0195 |
| 0.033 | 6.0 | 7500 | 0.5877 | 0.8936 | 0.1822 | 1.1690 | 0.8936 | 0.8938 | 0.0817 | 0.0196 |
| 0.0231 | 7.0 | 8750 | 0.6000 | 0.8938 | 0.1822 | 1.1867 | 0.8938 | 0.8939 | 0.0833 | 0.0206 |
| 0.0162 | 8.0 | 10000 | 0.6187 | 0.8948 | 0.1834 | 1.1827 | 0.8948 | 0.8949 | 0.0841 | 0.0208 |
| 0.0123 | 9.0 | 11250 | 0.6191 | 0.8953 | 0.1824 | 1.1868 | 0.8953 | 0.8955 | 0.0836 | 0.0207 |
| 0.0102 | 10.0 | 12500 | 0.6238 | 0.8956 | 0.1819 | 1.1791 | 0.8957 | 0.8958 | 0.0846 | 0.0210 |
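One way to read the table is to compare epochs on validation loss and accuracy rather than looking only at the last row. The sketch below copies a few columns from the table and checks which epoch scores best on each. It is only an illustration of how to inspect the logged values; the variable names are hypothetical, and nothing here indicates which checkpoint was actually published.

```python
# (epoch, training loss, validation loss, accuracy, ECE), copied from the training results table
ROWS = [
    (1, 0.1844, 0.4411, 0.8961, 0.0528),
    (2, 0.1394, 0.4830, 0.8927, 0.0646),
    (3, 0.1, 0.5257, 0.8911, 0.0737),
    (4, 0.068, 0.5497, 0.8913, 0.0770),
    (5, 0.048, 0.5762, 0.8915, 0.0808),
    (6, 0.033, 0.5877, 0.8936, 0.0817),
    (7, 0.0231, 0.6000, 0.8938, 0.0833),
    (8, 0.0162, 0.6187, 0.8948, 0.0841),
    (9, 0.0123, 0.6191, 0.8953, 0.0836),
    (10, 0.0102, 0.6238, 0.8956, 0.0846),
]

best_by_val_loss = min(ROWS, key=lambda r: r[2])
best_by_accuracy = max(ROWS, key=lambda r: r[3])
best_by_ece = min(ROWS, key=lambda r: r[4])

# Check whether the two losses move in opposite directions across consecutive epochs.
train_loss_falls = all(a[1] > b[1] for a, b in zip(ROWS, ROWS[1:]))
val_loss_rises = all(a[2] < b[2] for a, b in zip(ROWS, ROWS[1:]))

print("lowest validation loss at epoch", best_by_val_loss[0])
print("highest accuracy at epoch", best_by_accuracy[0])
print("lowest ECE at epoch", best_by_ece[0])
print("training loss falls every epoch:", train_loss_falls)
print("validation loss rises every epoch:", val_loss_rises)
```

On these logged values, training loss falls every epoch while validation loss rises every epoch, and the first epoch has the lowest validation loss and ECE; accuracy at epoch 10 is within 0.0005 of epoch 1.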
Framework versions
- Transformers 4.26.1
- PyTorch 1.13.1.post200
- Datasets 2.9.0
- Tokenizers 0.13.2