# dit-finetuned_rvl_tobacco_crl
This model is a fine-tuned version of [microsoft/dit-base-finetuned-rvlcdip](https://huggingface.co/microsoft/dit-base-finetuned-rvlcdip) on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 0.5587
- Accuracy: 0.935
- Brier loss: 0.1614
- NLL: 0.7604
- F1 (micro): 0.935
- F1 (macro): 0.9244
- ECE: 0.2320
- AURC: 0.0099
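Brier loss and ECE are calibration metrics computed from the model's predicted class probabilities. As a reference, here is a minimal NumPy sketch of their standard definitions (this is not the exact evaluation code used for this card):

```python
import numpy as np

def brier_score(probs, labels):
    # Multiclass Brier score: mean over samples of the squared error
    # between the predicted probability vector and the one-hot label.
    onehot = np.eye(probs.shape[1])[labels]
    return np.mean(np.sum((probs - onehot) ** 2, axis=1))

def expected_calibration_error(probs, labels, n_bins=10):
    # ECE: average gap between confidence and accuracy over equal-width
    # confidence bins, weighted by the fraction of samples in each bin.
    conf = probs.max(axis=1)
    correct = (probs.argmax(axis=1) == labels).astype(float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (conf > lo) & (conf <= hi)
        if mask.any():
            ece += mask.mean() * abs(correct[mask].mean() - conf[mask].mean())
    return ece
```

NLL here is the negative log-likelihood of the true class, and AURC is the area under the risk–coverage curve; lower is better for all four.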
## Model description

More information needed
## Intended uses & limitations

More information needed
## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- gradient_accumulation_steps: 16
- total_train_batch_size: 256
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 100
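Several derived quantities follow directly from these values; the sketch below sanity-checks them (the steps-per-epoch figure is read off the results table, not a training argument):

```python
# Effective batch size: per-device batch size times accumulation steps.
train_batch_size = 16
gradient_accumulation_steps = 16
total_train_batch_size = train_batch_size * gradient_accumulation_steps  # 256

# Warmup length under warmup_ratio=0.1: the results table shows 3 optimizer
# steps per epoch, so 100 epochs give 300 total steps and 30 warmup steps.
steps_per_epoch = 3
num_epochs = 100
total_steps = steps_per_epoch * num_epochs
warmup_steps = int(0.1 * total_steps)
```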
### Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy | Brier Loss | NLL | F1 Micro | F1 Macro | ECE | AURC |
|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|
No log | 0.96 | 3 | 2.3702 | 0.005 | 0.9059 | 8.8946 | 0.005 | 0.0048 | 0.1391 | 0.9945 |
No log | 1.96 | 6 | 2.3612 | 0.005 | 0.9052 | 8.8523 | 0.005 | 0.0048 | 0.1390 | 0.9942 |
No log | 2.96 | 9 | 2.3406 | 0.005 | 0.9038 | 8.5993 | 0.005 | 0.0048 | 0.1389 | 0.9939 |
No log | 3.96 | 12 | 2.3415 | 0.01 | 0.9018 | 7.9436 | 0.01 | 0.0091 | 0.1413 | 0.9925 |
No log | 4.96 | 15 | 2.3297 | 0.02 | 0.8991 | 7.1590 | 0.02 | 0.0303 | 0.1461 | 0.9874 |
No log | 5.96 | 18 | 2.3164 | 0.05 | 0.8958 | 6.6598 | 0.0500 | 0.0527 | 0.1638 | 0.9769 |
No log | 6.96 | 21 | 2.3021 | 0.16 | 0.8918 | 6.3655 | 0.16 | 0.1028 | 0.2371 | 0.9458 |
No log | 7.96 | 24 | 2.2776 | 0.18 | 0.8869 | 6.0316 | 0.18 | 0.1076 | 0.2436 | 0.9253 |
No log | 8.96 | 27 | 2.2639 | 0.195 | 0.8811 | 4.6385 | 0.195 | 0.1154 | 0.2533 | 0.8971 |
No log | 9.96 | 30 | 2.2388 | 0.215 | 0.8736 | 3.3475 | 0.2150 | 0.1273 | 0.2506 | 0.8034 |
No log | 10.96 | 33 | 2.2053 | 0.3 | 0.8635 | 2.6087 | 0.3 | 0.1896 | 0.2977 | 0.6413 |
No log | 11.96 | 36 | 2.1526 | 0.39 | 0.8496 | 2.2967 | 0.39 | 0.2387 | 0.3672 | 0.4311 |
No log | 12.96 | 39 | 2.1007 | 0.475 | 0.8335 | 1.8576 | 0.4750 | 0.3168 | 0.4171 | 0.3033 |
No log | 13.96 | 42 | 2.0444 | 0.575 | 0.8173 | 1.4725 | 0.575 | 0.3985 | 0.4782 | 0.2201 |
No log | 14.96 | 45 | 1.9806 | 0.6 | 0.7977 | 1.2973 | 0.6 | 0.4402 | 0.5055 | 0.1902 |
No log | 15.96 | 48 | 1.9183 | 0.645 | 0.7791 | 1.2239 | 0.645 | 0.4909 | 0.5235 | 0.1422 |
No log | 16.96 | 51 | 1.8671 | 0.705 | 0.7619 | 1.1838 | 0.705 | 0.5916 | 0.5620 | 0.1137 |
No log | 17.96 | 54 | 1.8104 | 0.785 | 0.7434 | 1.0809 | 0.785 | 0.6794 | 0.6135 | 0.0684 |
No log | 18.96 | 57 | 1.7607 | 0.805 | 0.7265 | 1.0477 | 0.805 | 0.6970 | 0.6211 | 0.0550 |
No log | 19.96 | 60 | 1.7100 | 0.825 | 0.7079 | 1.0221 | 0.825 | 0.7236 | 0.6304 | 0.0478 |
No log | 20.96 | 63 | 1.6615 | 0.825 | 0.6892 | 1.0108 | 0.825 | 0.7236 | 0.6201 | 0.0445 |
No log | 21.96 | 66 | 1.6118 | 0.845 | 0.6705 | 0.9972 | 0.845 | 0.7428 | 0.6329 | 0.0364 |
No log | 22.96 | 69 | 1.5710 | 0.845 | 0.6520 | 0.9833 | 0.845 | 0.7417 | 0.6179 | 0.0337 |
No log | 23.96 | 72 | 1.5225 | 0.85 | 0.6329 | 0.9634 | 0.85 | 0.7421 | 0.6078 | 0.0306 |
No log | 24.96 | 75 | 1.4797 | 0.865 | 0.6132 | 0.9521 | 0.865 | 0.7643 | 0.6058 | 0.0256 |
No log | 25.96 | 78 | 1.4397 | 0.865 | 0.5934 | 0.9487 | 0.865 | 0.7643 | 0.5917 | 0.0249 |
No log | 26.96 | 81 | 1.4028 | 0.87 | 0.5735 | 0.9501 | 0.87 | 0.7708 | 0.5875 | 0.0229 |
No log | 27.96 | 84 | 1.3612 | 0.87 | 0.5537 | 0.9488 | 0.87 | 0.7708 | 0.5747 | 0.0225 |
No log | 28.96 | 87 | 1.3246 | 0.87 | 0.5347 | 0.9884 | 0.87 | 0.7756 | 0.5592 | 0.0220 |
No log | 29.96 | 90 | 1.2879 | 0.87 | 0.5163 | 0.9824 | 0.87 | 0.7759 | 0.5428 | 0.0225 |
No log | 30.96 | 93 | 1.2546 | 0.87 | 0.4993 | 0.9798 | 0.87 | 0.7752 | 0.5337 | 0.0225 |
No log | 31.96 | 96 | 1.2207 | 0.875 | 0.4815 | 0.9755 | 0.875 | 0.7822 | 0.5094 | 0.0218 |
No log | 32.96 | 99 | 1.1855 | 0.88 | 0.4628 | 0.9779 | 0.88 | 0.8016 | 0.5062 | 0.0206 |
No log | 33.96 | 102 | 1.1557 | 0.875 | 0.4467 | 1.0389 | 0.875 | 0.7946 | 0.4862 | 0.0210 |
No log | 34.96 | 105 | 1.1322 | 0.885 | 0.4327 | 0.9684 | 0.885 | 0.8176 | 0.4812 | 0.0210 |
No log | 35.96 | 108 | 1.1061 | 0.895 | 0.4176 | 0.9561 | 0.895 | 0.8405 | 0.4697 | 0.0206 |
No log | 36.96 | 111 | 1.0796 | 0.9 | 0.4027 | 0.9468 | 0.9 | 0.8513 | 0.4678 | 0.0203 |
No log | 37.96 | 114 | 1.0579 | 0.91 | 0.3907 | 0.8753 | 0.91 | 0.8753 | 0.4617 | 0.0195 |
No log | 38.96 | 117 | 1.0277 | 0.91 | 0.3774 | 0.8706 | 0.91 | 0.8772 | 0.4447 | 0.0187 |
No log | 39.96 | 120 | 1.0031 | 0.915 | 0.3647 | 0.8547 | 0.915 | 0.8837 | 0.4374 | 0.0175 |
No log | 40.96 | 123 | 0.9803 | 0.925 | 0.3535 | 0.8474 | 0.925 | 0.9037 | 0.4327 | 0.0172 |
No log | 41.96 | 126 | 0.9621 | 0.92 | 0.3440 | 0.8505 | 0.92 | 0.8985 | 0.4129 | 0.0182 |
No log | 42.96 | 129 | 0.9428 | 0.91 | 0.3347 | 0.8515 | 0.91 | 0.8846 | 0.3943 | 0.0191 |
No log | 43.96 | 132 | 0.9231 | 0.92 | 0.3249 | 0.8403 | 0.92 | 0.9003 | 0.4079 | 0.0177 |
No log | 44.96 | 135 | 0.9075 | 0.93 | 0.3159 | 0.8224 | 0.93 | 0.9139 | 0.4073 | 0.0167 |
No log | 45.96 | 138 | 0.8876 | 0.925 | 0.3073 | 0.8091 | 0.925 | 0.9096 | 0.3878 | 0.0173 |
No log | 46.96 | 141 | 0.8799 | 0.93 | 0.2977 | 0.8091 | 0.93 | 0.9148 | 0.3785 | 0.0161 |
No log | 47.96 | 144 | 0.8567 | 0.915 | 0.2901 | 0.8123 | 0.915 | 0.8922 | 0.3693 | 0.0173 |
No log | 48.96 | 147 | 0.8430 | 0.92 | 0.2837 | 0.8045 | 0.92 | 0.9055 | 0.3525 | 0.0177 |
No log | 49.96 | 150 | 0.8270 | 0.925 | 0.2764 | 0.7970 | 0.925 | 0.9132 | 0.3499 | 0.0172 |
No log | 50.96 | 153 | 0.8168 | 0.925 | 0.2685 | 0.7991 | 0.925 | 0.9132 | 0.3417 | 0.0164 |
No log | 51.96 | 156 | 0.7975 | 0.93 | 0.2598 | 0.7987 | 0.93 | 0.9184 | 0.3379 | 0.0148 |
No log | 52.96 | 159 | 0.7821 | 0.935 | 0.2522 | 0.7911 | 0.935 | 0.9245 | 0.3345 | 0.0137 |
No log | 53.96 | 162 | 0.7693 | 0.935 | 0.2468 | 0.7805 | 0.935 | 0.9245 | 0.3423 | 0.0135 |
No log | 54.96 | 165 | 0.7486 | 0.93 | 0.2416 | 0.7829 | 0.93 | 0.9195 | 0.3272 | 0.0142 |
No log | 55.96 | 168 | 0.7409 | 0.93 | 0.2381 | 0.7833 | 0.93 | 0.9195 | 0.3216 | 0.0144 |
No log | 56.96 | 171 | 0.7290 | 0.93 | 0.2327 | 0.7815 | 0.93 | 0.9195 | 0.3055 | 0.0138 |
No log | 57.96 | 174 | 0.7137 | 0.935 | 0.2268 | 0.7757 | 0.935 | 0.9245 | 0.3039 | 0.0128 |
No log | 58.96 | 177 | 0.7026 | 0.935 | 0.2214 | 0.7698 | 0.935 | 0.9245 | 0.2911 | 0.0122 |
No log | 59.96 | 180 | 0.6935 | 0.935 | 0.2168 | 0.7604 | 0.935 | 0.9245 | 0.2853 | 0.0119 |
No log | 60.96 | 183 | 0.6855 | 0.935 | 0.2134 | 0.7605 | 0.935 | 0.9245 | 0.2895 | 0.0117 |
No log | 61.96 | 186 | 0.6755 | 0.94 | 0.2094 | 0.8161 | 0.94 | 0.9330 | 0.2902 | 0.0114 |
No log | 62.96 | 189 | 0.6641 | 0.94 | 0.2046 | 0.8131 | 0.94 | 0.9330 | 0.2761 | 0.0108 |
No log | 63.96 | 192 | 0.6536 | 0.94 | 0.2000 | 0.8113 | 0.94 | 0.9330 | 0.2865 | 0.0104 |
No log | 64.96 | 195 | 0.6441 | 0.94 | 0.1964 | 0.8071 | 0.94 | 0.9330 | 0.2739 | 0.0103 |
No log | 65.96 | 198 | 0.6395 | 0.94 | 0.1937 | 0.7997 | 0.94 | 0.9330 | 0.2771 | 0.0102 |
No log | 66.96 | 201 | 0.6345 | 0.94 | 0.1915 | 0.7930 | 0.94 | 0.9330 | 0.2764 | 0.0104 |
No log | 67.96 | 204 | 0.6355 | 0.94 | 0.1901 | 0.7901 | 0.94 | 0.9330 | 0.2763 | 0.0105 |
No log | 68.96 | 207 | 0.6302 | 0.94 | 0.1880 | 0.7887 | 0.94 | 0.9330 | 0.2631 | 0.0108 |
No log | 69.96 | 210 | 0.6242 | 0.94 | 0.1858 | 0.7887 | 0.94 | 0.9330 | 0.2595 | 0.0109 |
No log | 70.96 | 213 | 0.6182 | 0.94 | 0.1837 | 0.7898 | 0.94 | 0.9330 | 0.2628 | 0.0105 |
No log | 71.96 | 216 | 0.6129 | 0.94 | 0.1816 | 0.7910 | 0.94 | 0.9330 | 0.2597 | 0.0103 |
No log | 72.96 | 219 | 0.6085 | 0.94 | 0.1795 | 0.7878 | 0.94 | 0.9330 | 0.2572 | 0.0101 |
No log | 73.96 | 222 | 0.6049 | 0.94 | 0.1777 | 0.7837 | 0.94 | 0.9330 | 0.2561 | 0.0099 |
No log | 74.96 | 225 | 0.6004 | 0.94 | 0.1756 | 0.7824 | 0.94 | 0.9330 | 0.2372 | 0.0093 |
No log | 75.96 | 228 | 0.5966 | 0.94 | 0.1740 | 0.7799 | 0.94 | 0.9330 | 0.2436 | 0.0093 |
No log | 76.96 | 231 | 0.5934 | 0.94 | 0.1731 | 0.7803 | 0.94 | 0.9330 | 0.2395 | 0.0094 |
No log | 77.96 | 234 | 0.5902 | 0.94 | 0.1722 | 0.7779 | 0.94 | 0.9330 | 0.2470 | 0.0096 |
No log | 78.96 | 237 | 0.5866 | 0.94 | 0.1709 | 0.7737 | 0.94 | 0.9330 | 0.2456 | 0.0097 |
No log | 79.96 | 240 | 0.5833 | 0.94 | 0.1696 | 0.7702 | 0.94 | 0.9330 | 0.2361 | 0.0099 |
No log | 80.96 | 243 | 0.5809 | 0.94 | 0.1687 | 0.7693 | 0.94 | 0.9330 | 0.2346 | 0.0099 |
No log | 81.96 | 246 | 0.5786 | 0.94 | 0.1679 | 0.7701 | 0.94 | 0.9330 | 0.2333 | 0.0100 |
No log | 82.96 | 249 | 0.5772 | 0.935 | 0.1675 | 0.7703 | 0.935 | 0.9244 | 0.2296 | 0.0099 |
No log | 83.96 | 252 | 0.5758 | 0.935 | 0.1671 | 0.7703 | 0.935 | 0.9244 | 0.2373 | 0.0101 |
No log | 84.96 | 255 | 0.5741 | 0.935 | 0.1664 | 0.7686 | 0.935 | 0.9244 | 0.2362 | 0.0100 |
No log | 85.96 | 258 | 0.5725 | 0.935 | 0.1657 | 0.7665 | 0.935 | 0.9244 | 0.2328 | 0.0099 |
No log | 86.96 | 261 | 0.5710 | 0.935 | 0.1651 | 0.7650 | 0.935 | 0.9244 | 0.2313 | 0.0101 |
No log | 87.96 | 264 | 0.5692 | 0.935 | 0.1646 | 0.7641 | 0.935 | 0.9244 | 0.2323 | 0.0101 |
No log | 88.96 | 267 | 0.5674 | 0.935 | 0.1641 | 0.7641 | 0.935 | 0.9244 | 0.2303 | 0.0100 |
No log | 89.96 | 270 | 0.5659 | 0.935 | 0.1636 | 0.7640 | 0.935 | 0.9244 | 0.2290 | 0.0098 |
No log | 90.96 | 273 | 0.5648 | 0.935 | 0.1633 | 0.7636 | 0.935 | 0.9244 | 0.2281 | 0.0098 |
No log | 91.96 | 276 | 0.5639 | 0.935 | 0.1630 | 0.7632 | 0.935 | 0.9244 | 0.2345 | 0.0100 |
No log | 92.96 | 279 | 0.5628 | 0.935 | 0.1626 | 0.7628 | 0.935 | 0.9244 | 0.2340 | 0.0100 |
No log | 93.96 | 282 | 0.5619 | 0.935 | 0.1623 | 0.7623 | 0.935 | 0.9244 | 0.2334 | 0.0101 |
No log | 94.96 | 285 | 0.5609 | 0.935 | 0.1620 | 0.7615 | 0.935 | 0.9244 | 0.2328 | 0.0100 |
No log | 95.96 | 288 | 0.5602 | 0.935 | 0.1617 | 0.7610 | 0.935 | 0.9244 | 0.2324 | 0.0099 |
No log | 96.96 | 291 | 0.5596 | 0.935 | 0.1616 | 0.7605 | 0.935 | 0.9244 | 0.2322 | 0.0099 |
No log | 97.96 | 294 | 0.5589 | 0.935 | 0.1615 | 0.7604 | 0.935 | 0.9244 | 0.2321 | 0.0099 |
No log | 98.96 | 297 | 0.5589 | 0.935 | 0.1614 | 0.7604 | 0.935 | 0.9244 | 0.2320 | 0.0099 |
No log | 99.96 | 300 | 0.5587 | 0.935 | 0.1614 | 0.7604 | 0.935 | 0.9244 | 0.2320 | 0.0099 |
### Framework versions

- Transformers 4.26.1
- PyTorch 1.13.1.post200
- Datasets 2.9.0
- Tokenizers 0.13.2