<!-- This model card has been generated automatically according to the information the Trainer had access to. You should probably proofread and complete it, then remove this comment. -->
test_crl_entropy_large
This model is a fine-tuned version of jordyvl/vit-base_rvl-cdip. The training dataset was not recorded (the Trainer logged its name as None). It achieves the following results on the evaluation set:
- Loss: 0.4581
- Accuracy: 0.915
- Brier Loss: 0.1532
- NLL (negative log-likelihood): 1.2415
- F1 Micro: 0.915
- F1 Macro: 0.9138
- ECE (expected calibration error): 0.1052
- AURC (area under the risk-coverage curve): 0.0187
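The calibration metrics listed above can be illustrated with a minimal sketch. The exact implementations used to produce this card are not documented here, so the definitions below (multi-class Brier loss summed over classes, mean negative log-likelihood, and ECE with 10 equal-width confidence bins) are common conventions chosen as assumptions. The function names and the toy data are hypothetical, and AURC is not covered.

```python
import math


def brier_loss(probs, labels):
    """Mean over samples of the squared distance between the predicted
    probability vector and the one-hot label, summed over classes."""
    total = 0.0
    for p, y in zip(probs, labels):
        total += sum((pk - (1.0 if k == y else 0.0)) ** 2 for k, pk in enumerate(p))
    return total / len(labels)


def nll(probs, labels, eps=1e-12):
    """Mean negative log-probability assigned to the true class."""
    return -sum(math.log(max(p[y], eps)) for p, y in zip(probs, labels)) / len(labels)


def ece(probs, labels, n_bins=10):
    """Expected calibration error with equal-width confidence bins
    (assumption: 10 bins; other bin counts give different values)."""
    # Per bin: [sample count, summed confidence, summed correctness]
    bins = [[0, 0.0, 0.0] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        conf = max(p)
        pred = p.index(conf)
        b = min(int(conf * n_bins), n_bins - 1)  # clamp conf == 1.0 into the last bin
        bins[b][0] += 1
        bins[b][1] += conf
        bins[b][2] += 1.0 if pred == y else 0.0
    n = len(labels)
    # Sum of (n_b / n) * |accuracy_b - confidence_b| over non-empty bins
    return sum(abs(s_conf - s_corr) / n for count, s_conf, s_corr in bins if count)


# Hypothetical toy predictions for a 3-class problem
toy_probs = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.3, 0.3, 0.4], [0.6, 0.3, 0.1]]
toy_labels = [0, 1, 2, 1]
print(brier_loss(toy_probs, toy_labels), nll(toy_probs, toy_labels), ece(toy_probs, toy_labels))
```

Under these definitions a uniform prediction over K classes has a Brier loss of 1 - 1/K and an NLL of log K, which gives a rough baseline for reading the early rows of the training log.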
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 16
- total_train_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 100
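The relationship between these values can be sketched in a few lines. The effective batch size is the per-device batch size times the gradient accumulation steps (assuming a single device, which the card does not state), and the linear scheduler ramps the learning rate up during warmup and then decays it to zero. The total step count of 1200 is read from the last row of the training log and is an assumption about the planned schedule length; `linear_lr` is a hypothetical helper that mirrors the usual linear-with-warmup rule, not code taken from the training run.

```python
import math

# Values from the hyperparameter list
train_batch_size = 4
gradient_accumulation_steps = 16
base_lr = 2e-05
warmup_ratio = 0.1

# Assumptions not stated in the card
num_devices = 1     # single-device training
total_steps = 1200  # last optimizer step shown in the training log

# 4 * 16 * 1 = 64, matching total_train_batch_size
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices

# Warmup length derived from the ratio (120 steps under these assumptions)
warmup_steps = math.ceil(total_steps * warmup_ratio)


def linear_lr(step):
    """Learning rate at an optimizer step: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))
    return base_lr * remaining


for step in (0, warmup_steps // 2, warmup_steps, total_steps // 2, total_steps):
    print(step, linear_lr(step))
```

The learning rate reaches its peak of 2e-05 at the end of warmup and is zero at the final step.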
Training results
Training Loss | Epoch | Step | Validation Loss | Accuracy | Brier Loss | NLL | F1 Micro | F1 Macro | ECE | AURC |
---|---|---|---|---|---|---|---|---|---|---|
No log | 0.96 | 12 | 2.5859 | 0.05 | 0.9010 | 18.0478 | 0.0500 | 0.0539 | 0.1568 | 0.9609 |
No log | 2.0 | 25 | 2.5164 | 0.225 | 0.8825 | 11.6684 | 0.225 | 0.1480 | 0.2600 | 0.7882 |
No log | 2.96 | 37 | 2.3841 | 0.435 | 0.8485 | 4.9677 | 0.435 | 0.2723 | 0.4030 | 0.2751 |
No log | 4.0 | 50 | 2.1765 | 0.72 | 0.7878 | 3.1818 | 0.72 | 0.6092 | 0.5962 | 0.0863 |
No log | 4.96 | 62 | 1.9498 | 0.8 | 0.7104 | 1.5065 | 0.8000 | 0.7202 | 0.6069 | 0.0453 |
No log | 6.0 | 75 | 1.6936 | 0.85 | 0.6116 | 1.3333 | 0.85 | 0.8023 | 0.5897 | 0.0375 |
No log | 6.96 | 87 | 1.4794 | 0.885 | 0.5207 | 1.1014 | 0.885 | 0.8689 | 0.5429 | 0.0360 |
No log | 8.0 | 100 | 1.2753 | 0.905 | 0.4293 | 0.8992 | 0.905 | 0.8958 | 0.4920 | 0.0289 |
No log | 8.96 | 112 | 1.1140 | 0.91 | 0.3546 | 1.0480 | 0.91 | 0.8979 | 0.4226 | 0.0229 |
No log | 10.0 | 125 | 0.9650 | 0.91 | 0.2872 | 0.8855 | 0.91 | 0.9076 | 0.3683 | 0.0210 |
No log | 10.96 | 137 | 0.8788 | 0.915 | 0.2493 | 1.0637 | 0.915 | 0.9156 | 0.3070 | 0.0202 |
No log | 12.0 | 150 | 0.7997 | 0.9 | 0.2207 | 1.0602 | 0.9 | 0.9020 | 0.2929 | 0.0201 |
No log | 12.96 | 162 | 0.7460 | 0.915 | 0.2033 | 0.8874 | 0.915 | 0.9106 | 0.2675 | 0.0175 |
No log | 14.0 | 175 | 0.6631 | 0.925 | 0.1713 | 0.8091 | 0.925 | 0.9178 | 0.2271 | 0.0137 |
No log | 14.96 | 187 | 0.6219 | 0.925 | 0.1593 | 0.9488 | 0.925 | 0.9169 | 0.2183 | 0.0177 |
No log | 16.0 | 200 | 0.5861 | 0.93 | 0.1531 | 0.9181 | 0.93 | 0.9345 | 0.2110 | 0.0193 |
No log | 16.96 | 212 | 0.5557 | 0.93 | 0.1407 | 0.9323 | 0.93 | 0.9247 | 0.1725 | 0.0186 |
No log | 18.0 | 225 | 0.5394 | 0.92 | 0.1446 | 0.7790 | 0.92 | 0.9150 | 0.1847 | 0.0165 |
No log | 18.96 | 237 | 0.5170 | 0.93 | 0.1345 | 0.7822 | 0.93 | 0.9269 | 0.1598 | 0.0157 |
No log | 20.0 | 250 | 0.5079 | 0.91 | 0.1356 | 0.9286 | 0.91 | 0.9084 | 0.1598 | 0.0161 |
No log | 20.96 | 262 | 0.4945 | 0.92 | 0.1342 | 0.7583 | 0.92 | 0.9150 | 0.1470 | 0.0153 |
No log | 22.0 | 275 | 0.4850 | 0.91 | 0.1330 | 0.7760 | 0.91 | 0.9084 | 0.1398 | 0.0155 |
No log | 22.96 | 287 | 0.4828 | 0.91 | 0.1334 | 0.9411 | 0.91 | 0.9084 | 0.1487 | 0.0154 |
No log | 24.0 | 300 | 0.4758 | 0.91 | 0.1324 | 0.9241 | 0.91 | 0.9084 | 0.1294 | 0.0153 |
No log | 24.96 | 312 | 0.4712 | 0.91 | 0.1327 | 1.0746 | 0.91 | 0.9084 | 0.1322 | 0.0156 |
No log | 26.0 | 325 | 0.4672 | 0.91 | 0.1321 | 1.0726 | 0.91 | 0.9084 | 0.1248 | 0.0153 |
No log | 26.96 | 337 | 0.4659 | 0.91 | 0.1331 | 1.0712 | 0.91 | 0.9084 | 0.1320 | 0.0153 |
No log | 28.0 | 350 | 0.4618 | 0.91 | 0.1323 | 1.0693 | 0.91 | 0.9084 | 0.1276 | 0.0151 |
No log | 28.96 | 362 | 0.4564 | 0.91 | 0.1315 | 1.0707 | 0.91 | 0.9084 | 0.1284 | 0.0156 |
No log | 30.0 | 375 | 0.4587 | 0.91 | 0.1348 | 1.0736 | 0.91 | 0.9080 | 0.1334 | 0.0160 |
No log | 30.96 | 387 | 0.4554 | 0.91 | 0.1334 | 1.0663 | 0.91 | 0.9080 | 0.1104 | 0.0155 |
No log | 32.0 | 400 | 0.4512 | 0.91 | 0.1331 | 1.0691 | 0.91 | 0.9080 | 0.1079 | 0.0158 |
No log | 32.96 | 412 | 0.4569 | 0.91 | 0.1385 | 1.2348 | 0.91 | 0.9080 | 0.1184 | 0.0166 |
No log | 34.0 | 425 | 0.4516 | 0.91 | 0.1362 | 1.0761 | 0.91 | 0.9080 | 0.1178 | 0.0160 |
No log | 34.96 | 437 | 0.4531 | 0.905 | 0.1376 | 1.0636 | 0.905 | 0.9051 | 0.1252 | 0.0156 |
No log | 36.0 | 450 | 0.4526 | 0.91 | 0.1384 | 1.1004 | 0.91 | 0.9080 | 0.1128 | 0.0164 |
No log | 36.96 | 462 | 0.4460 | 0.91 | 0.1345 | 1.0779 | 0.91 | 0.9080 | 0.1093 | 0.0173 |
No log | 38.0 | 475 | 0.4422 | 0.905 | 0.1347 | 1.0680 | 0.905 | 0.9047 | 0.1237 | 0.0163 |
No log | 38.96 | 487 | 0.4510 | 0.905 | 0.1410 | 1.2254 | 0.905 | 0.9065 | 0.1138 | 0.0176 |
0.6142 | 40.0 | 500 | 0.4426 | 0.91 | 0.1370 | 1.2232 | 0.91 | 0.9080 | 0.1067 | 0.0163 |
0.6142 | 40.96 | 512 | 0.4469 | 0.91 | 0.1407 | 1.2227 | 0.91 | 0.9080 | 0.1285 | 0.0159 |
0.6142 | 42.0 | 525 | 0.4456 | 0.91 | 0.1403 | 1.2315 | 0.91 | 0.9080 | 0.1099 | 0.0163 |
0.6142 | 42.96 | 537 | 0.4336 | 0.905 | 0.1319 | 1.2228 | 0.905 | 0.9028 | 0.1132 | 0.0180 |
0.6142 | 44.0 | 550 | 0.4435 | 0.915 | 0.1357 | 1.0493 | 0.915 | 0.9133 | 0.1055 | 0.0146 |
0.6142 | 44.96 | 562 | 0.5008 | 0.905 | 0.1585 | 1.4460 | 0.905 | 0.9070 | 0.1219 | 0.0234 |
0.6142 | 46.0 | 575 | 0.4496 | 0.915 | 0.1428 | 1.3785 | 0.915 | 0.9138 | 0.1162 | 0.0164 |
0.6142 | 46.96 | 587 | 0.4627 | 0.915 | 0.1476 | 1.3925 | 0.915 | 0.9138 | 0.1166 | 0.0178 |
0.6142 | 48.0 | 600 | 0.4545 | 0.915 | 0.1459 | 1.3840 | 0.915 | 0.9138 | 0.1145 | 0.0172 |
0.6142 | 48.96 | 612 | 0.4528 | 0.915 | 0.1457 | 1.3862 | 0.915 | 0.9138 | 0.1088 | 0.0175 |
0.6142 | 50.0 | 625 | 0.4578 | 0.915 | 0.1480 | 1.3905 | 0.915 | 0.9138 | 0.1079 | 0.0176 |
0.6142 | 50.96 | 637 | 0.4574 | 0.91 | 0.1473 | 1.3863 | 0.91 | 0.9105 | 0.1135 | 0.0178 |
0.6142 | 52.0 | 650 | 0.4497 | 0.915 | 0.1455 | 1.3807 | 0.915 | 0.9138 | 0.1072 | 0.0172 |
0.6142 | 52.96 | 662 | 0.4614 | 0.905 | 0.1503 | 1.3893 | 0.905 | 0.9076 | 0.1144 | 0.0174 |
0.6142 | 54.0 | 675 | 0.4560 | 0.915 | 0.1480 | 1.3884 | 0.915 | 0.9138 | 0.1134 | 0.0179 |
0.6142 | 54.96 | 687 | 0.4460 | 0.905 | 0.1457 | 1.2136 | 0.905 | 0.9076 | 0.1129 | 0.0163 |
0.6142 | 56.0 | 700 | 0.4499 | 0.91 | 0.1480 | 1.3847 | 0.91 | 0.9109 | 0.1075 | 0.0176 |
0.6142 | 56.96 | 712 | 0.4520 | 0.915 | 0.1477 | 1.3829 | 0.915 | 0.9138 | 0.1113 | 0.0175 |
0.6142 | 58.0 | 725 | 0.4519 | 0.915 | 0.1481 | 1.5477 | 0.915 | 0.9138 | 0.1059 | 0.0187 |
0.6142 | 58.96 | 737 | 0.4399 | 0.915 | 0.1440 | 1.0570 | 0.915 | 0.9138 | 0.1110 | 0.0167 |
0.6142 | 60.0 | 750 | 0.4451 | 0.91 | 0.1473 | 1.2166 | 0.91 | 0.9080 | 0.1071 | 0.0172 |
0.6142 | 60.96 | 762 | 0.4538 | 0.915 | 0.1502 | 1.2374 | 0.915 | 0.9138 | 0.1115 | 0.0183 |
0.6142 | 62.0 | 775 | 0.4489 | 0.91 | 0.1487 | 1.2152 | 0.91 | 0.9105 | 0.1049 | 0.0171 |
0.6142 | 62.96 | 787 | 0.4489 | 0.915 | 0.1488 | 1.2265 | 0.915 | 0.9138 | 0.1048 | 0.0183 |
0.6142 | 64.0 | 800 | 0.4480 | 0.915 | 0.1481 | 1.2384 | 0.915 | 0.9138 | 0.1092 | 0.0176 |
0.6142 | 64.96 | 812 | 0.4422 | 0.91 | 0.1465 | 1.0633 | 0.91 | 0.9080 | 0.1064 | 0.0164 |
0.6142 | 66.0 | 825 | 0.4410 | 0.91 | 0.1462 | 1.2394 | 0.91 | 0.9080 | 0.1060 | 0.0166 |
0.6142 | 66.96 | 837 | 0.4380 | 0.91 | 0.1460 | 1.2227 | 0.91 | 0.9080 | 0.1006 | 0.0165 |
0.6142 | 68.0 | 850 | 0.4434 | 0.91 | 0.1478 | 1.3838 | 0.91 | 0.9080 | 0.1025 | 0.0171 |
0.6142 | 68.96 | 862 | 0.4511 | 0.91 | 0.1504 | 1.3874 | 0.91 | 0.9080 | 0.1023 | 0.0177 |
0.6142 | 70.0 | 875 | 0.4405 | 0.91 | 0.1460 | 1.3857 | 0.91 | 0.9080 | 0.1076 | 0.0185 |
0.6142 | 70.96 | 887 | 0.4558 | 0.915 | 0.1518 | 1.3987 | 0.915 | 0.9138 | 0.1061 | 0.0191 |
0.6142 | 72.0 | 900 | 0.4434 | 0.91 | 0.1480 | 1.2224 | 0.91 | 0.9080 | 0.1021 | 0.0176 |
0.6142 | 72.96 | 912 | 0.4455 | 0.91 | 0.1491 | 1.2312 | 0.91 | 0.9080 | 0.1031 | 0.0177 |
0.6142 | 74.0 | 925 | 0.4549 | 0.915 | 0.1521 | 1.2332 | 0.915 | 0.9138 | 0.1066 | 0.0183 |
0.6142 | 74.96 | 937 | 0.4567 | 0.91 | 0.1516 | 1.2475 | 0.91 | 0.9109 | 0.0993 | 0.0194 |
0.6142 | 76.0 | 950 | 0.4465 | 0.905 | 0.1490 | 1.2255 | 0.905 | 0.9047 | 0.1050 | 0.0182 |
0.6142 | 76.96 | 962 | 0.4425 | 0.91 | 0.1476 | 1.2256 | 0.91 | 0.9080 | 0.1037 | 0.0180 |
0.6142 | 78.0 | 975 | 0.4535 | 0.91 | 0.1519 | 1.2320 | 0.91 | 0.9080 | 0.1070 | 0.0184 |
0.6142 | 78.96 | 987 | 0.4509 | 0.915 | 0.1507 | 1.2352 | 0.915 | 0.9138 | 0.1036 | 0.0185 |
0.0861 | 80.0 | 1000 | 0.4566 | 0.91 | 0.1527 | 1.2324 | 0.91 | 0.9109 | 0.1061 | 0.0185 |
0.0861 | 80.96 | 1012 | 0.4544 | 0.915 | 0.1513 | 1.3928 | 0.915 | 0.9138 | 0.1074 | 0.0183 |
0.0861 | 82.0 | 1025 | 0.4544 | 0.915 | 0.1512 | 1.2383 | 0.915 | 0.9138 | 0.1034 | 0.0186 |
0.0861 | 82.96 | 1037 | 0.4597 | 0.915 | 0.1532 | 1.3974 | 0.915 | 0.9138 | 0.1014 | 0.0188 |
0.0861 | 84.0 | 1050 | 0.4508 | 0.905 | 0.1515 | 1.2298 | 0.905 | 0.9047 | 0.1029 | 0.0180 |
0.0861 | 84.96 | 1062 | 0.4507 | 0.915 | 0.1508 | 1.2297 | 0.915 | 0.9138 | 0.1058 | 0.0178 |
0.0861 | 86.0 | 1075 | 0.4557 | 0.915 | 0.1524 | 1.3915 | 0.915 | 0.9138 | 0.1082 | 0.0183 |
0.0861 | 86.96 | 1087 | 0.4532 | 0.915 | 0.1516 | 1.2274 | 0.915 | 0.9138 | 0.1077 | 0.0184 |
0.0861 | 88.0 | 1100 | 0.4492 | 0.91 | 0.1506 | 1.2265 | 0.91 | 0.9080 | 0.1018 | 0.0181 |
0.0861 | 88.96 | 1112 | 0.4502 | 0.91 | 0.1511 | 1.2304 | 0.91 | 0.9080 | 0.1062 | 0.0184 |
0.0861 | 90.0 | 1125 | 0.4582 | 0.91 | 0.1536 | 1.2361 | 0.91 | 0.9080 | 0.1046 | 0.0186 |
0.0861 | 90.96 | 1137 | 0.4576 | 0.915 | 0.1531 | 1.2504 | 0.915 | 0.9138 | 0.1046 | 0.0187 |
0.0861 | 92.0 | 1150 | 0.4589 | 0.915 | 0.1533 | 1.3958 | 0.915 | 0.9138 | 0.1052 | 0.0188 |
0.0861 | 92.96 | 1162 | 0.4594 | 0.915 | 0.1536 | 1.2503 | 0.915 | 0.9138 | 0.1114 | 0.0188 |
0.0861 | 94.0 | 1175 | 0.4570 | 0.915 | 0.1529 | 1.2441 | 0.915 | 0.9138 | 0.1078 | 0.0185 |
0.0861 | 94.96 | 1187 | 0.4577 | 0.915 | 0.1531 | 1.2402 | 0.915 | 0.9138 | 0.1052 | 0.0186 |
0.0861 | 96.0 | 1200 | 0.4581 | 0.915 | 0.1532 | 1.2415 | 0.915 | 0.9138 | 0.1052 | 0.0187 |
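The evaluation results reported at the top of this card match the final row (step 1200), but some earlier rows have a lower validation loss or a higher accuracy. A minimal sketch for scanning rows of this log is shown below. It uses three rows copied from the table; the column keys and the `parse_row` helper are hypothetical names introduced for illustration. Which checkpoint was actually saved is not stated in this card.

```python
# Three rows copied verbatim from the training log above
rows = """\
0.6142 | 42.96 | 537 | 0.4336 | 0.905 | 0.1319 | 1.2228 | 0.905 | 0.9028 | 0.1132 | 0.0180 |
No log | 16.0 | 200 | 0.5861 | 0.93 | 0.1531 | 0.9181 | 0.93 | 0.9345 | 0.2110 | 0.0193 |
0.0861 | 96.0 | 1200 | 0.4581 | 0.915 | 0.1532 | 1.2415 | 0.915 | 0.9138 | 0.1052 | 0.0187 |
"""

# Hypothetical keys, in the same order as the table header
COLUMNS = [
    "training_loss", "epoch", "step", "validation_loss", "accuracy",
    "brier_loss", "nll", "f1_micro", "f1_macro", "ece", "aurc",
]


def parse_row(line):
    """Split one pipe-separated row into a dict of floats ("No log" becomes None)."""
    cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
    return {
        name: (None if cell == "No log" else float(cell))
        for name, cell in zip(COLUMNS, cells)
    }


records = [parse_row(line) for line in rows.splitlines() if line.strip()]

best_by_loss = min(records, key=lambda r: r["validation_loss"])
best_by_accuracy = max(records, key=lambda r: r["accuracy"])

print("lowest validation loss at step", int(best_by_loss["step"]))
print("highest accuracy at step", int(best_by_accuracy["step"]))
```

Pasting the full table into `rows` applies the same comparison to every logged evaluation.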
Framework versions
- Transformers 4.28.0.dev0
- PyTorch 1.12.1+cu113
- Datasets 2.12.0
- Tokenizers 0.12.1