Public100_MobileBERT_20epoch_notweettokenizer
This model is a fine-tuned version of Youssef320/Public100_MobileBERT_5epoch_2hidden_notweettokenizer on an unspecified dataset (the Trainer recorded the dataset name as None). It achieves the following results on the evaluation set:
- Loss: 3.5035
- Top 1 Macro F1 Score: 0.0718
- Top 1 Weighted F1 Score: 0.1188
- Top 3 Macro F1 Score: 0.1751
- Top 3 Weighted F1 Score: 0.2652
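The card does not say how the Top 1 and Top 3 F1 scores are computed. The sketch below shows one common convention, offered only as an assumption: a sample counts as correctly classified when its true label appears among the k highest-ranked predictions, and otherwise the top-ranked label is used as the prediction. The function names `topk_adjusted_predictions` and `f1_scores` are hypothetical and do not come from the training code.

```python
from collections import Counter


def topk_adjusted_predictions(y_true, ranked_preds, k):
    """Map ranked label lists to single predictions under a top-k rule.

    Assumed convention: if the true label is within the first k ranked
    labels, treat it as the prediction; otherwise use the top-1 label.
    """
    return [
        true if true in ranked[:k] else ranked[0]
        for true, ranked in zip(y_true, ranked_preds)
    ]


def f1_scores(y_true, y_pred):
    """Return (macro F1, support-weighted F1) over all observed labels."""
    labels = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)
    per_class = {}
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        denom = 2 * tp + fp + fn
        per_class[label] = 2 * tp / denom if denom else 0.0
    macro = sum(per_class.values()) / len(labels)
    weighted = sum(per_class[l] * support[l] for l in labels) / len(y_true)
    return macro, weighted


# Toy data: each inner list ranks labels from most to least likely.
y_true = [0, 1, 2, 2]
ranked = [[0, 1, 2], [0, 1, 2], [2, 0, 1], [1, 2, 0]]

top1_macro, top1_weighted = f1_scores(y_true, topk_adjusted_predictions(y_true, ranked, 1))
top2_macro, top2_weighted = f1_scores(y_true, topk_adjusted_predictions(y_true, ranked, 2))
```

Macro F1 averages the per-class scores equally, while weighted F1 scales each class by its support. A macro score well below the weighted score, as reported here, is what one would expect when rare classes are predicted poorly.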
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0002
- train_batch_size: 64
- eval_batch_size: 64
- seed: 42
- gradient_accumulation_steps: 32
- total_train_batch_size: 2048
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: constant
- num_epochs: 15.0
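The batch-size figures above are related by simple arithmetic, which the following sketch checks. It assumes training on a single device (the product 64 × 32 already equals the reported total) and reads the steps-per-epoch value from the results table, where step 512 coincides with epoch 1.0. The examples-per-epoch figure is only an upper-bound estimate, since the last accumulated batch of an epoch may be partial.

```python
# Values copied from the hyperparameter list.
train_batch_size = 64
gradient_accumulation_steps = 32
num_epochs = 15

# Effective batch size per optimizer step (assumes one device).
total_train_batch_size = train_batch_size * gradient_accumulation_steps

# From the results table: step 512 is logged at epoch 1.0.
steps_per_epoch = 512
total_optimizer_steps = steps_per_epoch * num_epochs

# Rough upper bound on training examples seen per epoch.
approx_examples_per_epoch = steps_per_epoch * total_train_batch_size

# Evaluation is logged every 64 steps, i.e. every 1/8 of an epoch.
eval_interval_epochs = 64 / steps_per_epoch

print(total_train_batch_size, total_optimizer_steps, approx_examples_per_epoch)
```

The computed total of 7680 optimizer steps matches the final row of the results table.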
Training results
Training Loss | Epoch | Step | Validation Loss | Top 1 Macro F1 Score | Top 1 Weighted F1 Score | Top 3 Macro F1 Score | Top 3 Weighted F1 Score |
---|---|---|---|---|---|---|---|
3.6108 | 0.12 | 64 | 3.6491 | 0.0460 | 0.0857 | 0.1326 | 0.2206 |
3.623 | 0.25 | 128 | 3.6510 | 0.0476 | 0.0879 | 0.1333 | 0.2190 |
3.6082 | 0.38 | 192 | 3.6414 | 0.0481 | 0.0884 | 0.1363 | 0.2239 |
3.6153 | 0.5 | 256 | 3.6450 | 0.0464 | 0.0870 | 0.1330 | 0.2219 |
3.6232 | 0.62 | 320 | 3.6390 | 0.0487 | 0.0904 | 0.1336 | 0.2226 |
3.6417 | 0.75 | 384 | 3.6367 | 0.0464 | 0.0865 | 0.1333 | 0.2206 |
3.6587 | 0.88 | 448 | 3.6313 | 0.0486 | 0.0902 | 0.1365 | 0.2238 |
3.6255 | 1.0 | 512 | 3.6313 | 0.0490 | 0.0902 | 0.1368 | 0.2252 |
3.6402 | 1.12 | 576 | 3.6298 | 0.0493 | 0.0907 | 0.1369 | 0.2235 |
3.6114 | 1.25 | 640 | 3.6287 | 0.0501 | 0.0919 | 0.1361 | 0.2246 |
3.6172 | 1.38 | 704 | 3.6249 | 0.0520 | 0.0943 | 0.1359 | 0.2230 |
3.6161 | 1.5 | 768 | 3.6256 | 0.0501 | 0.0927 | 0.1364 | 0.2255 |
3.615 | 1.62 | 832 | 3.6216 | 0.0512 | 0.0933 | 0.1411 | 0.2292 |
3.6177 | 1.75 | 896 | 3.6191 | 0.0521 | 0.0942 | 0.1369 | 0.2263 |
3.5927 | 1.88 | 960 | 3.6179 | 0.0503 | 0.0921 | 0.1378 | 0.2261 |
3.6205 | 2.0 | 1024 | 3.6176 | 0.0518 | 0.0939 | 0.1423 | 0.2323 |
3.6113 | 2.12 | 1088 | 3.6143 | 0.0538 | 0.0963 | 0.1417 | 0.2302 |
3.5706 | 2.25 | 1152 | 3.6199 | 0.0514 | 0.0918 | 0.1408 | 0.2258 |
3.6033 | 2.38 | 1216 | 3.6109 | 0.0521 | 0.0935 | 0.1391 | 0.2282 |
3.6152 | 2.5 | 1280 | 3.6094 | 0.0536 | 0.0969 | 0.1434 | 0.2307 |
3.5987 | 2.62 | 1344 | 3.6071 | 0.0557 | 0.0985 | 0.1477 | 0.2352 |
3.5949 | 2.75 | 1408 | 3.6046 | 0.0537 | 0.0955 | 0.1440 | 0.2302 |
3.6053 | 2.88 | 1472 | 3.6024 | 0.0565 | 0.1005 | 0.1418 | 0.2325 |
3.6039 | 3.0 | 1536 | 3.6019 | 0.0540 | 0.0969 | 0.1455 | 0.2343 |
3.5723 | 3.12 | 1600 | 3.6026 | 0.0549 | 0.0978 | 0.1473 | 0.2345 |
3.5799 | 3.25 | 1664 | 3.5989 | 0.0566 | 0.0995 | 0.1452 | 0.2333 |
3.5675 | 3.38 | 1728 | 3.5996 | 0.0542 | 0.0964 | 0.1467 | 0.2341 |
3.5797 | 3.5 | 1792 | 3.5957 | 0.0565 | 0.1003 | 0.1503 | 0.2382 |
3.5864 | 3.62 | 1856 | 3.5945 | 0.0569 | 0.1003 | 0.1483 | 0.2350 |
3.5545 | 3.75 | 1920 | 3.5892 | 0.0582 | 0.1016 | 0.1526 | 0.2405 |
3.5809 | 3.88 | 1984 | 3.5883 | 0.0582 | 0.1023 | 0.1511 | 0.2412 |
3.57 | 4.0 | 2048 | 3.5862 | 0.0557 | 0.0996 | 0.1485 | 0.2387 |
3.549 | 4.12 | 2112 | 3.5883 | 0.0601 | 0.1038 | 0.1518 | 0.2409 |
3.5567 | 4.25 | 2176 | 3.5855 | 0.0589 | 0.1034 | 0.1491 | 0.2391 |
3.5657 | 4.38 | 2240 | 3.5847 | 0.0582 | 0.1012 | 0.1515 | 0.2417 |
3.5727 | 4.5 | 2304 | 3.5792 | 0.0591 | 0.1033 | 0.1518 | 0.2418 |
3.5482 | 4.62 | 2368 | 3.5821 | 0.0583 | 0.1031 | 0.1540 | 0.2425 |
3.5766 | 4.75 | 2432 | 3.5783 | 0.0585 | 0.1026 | 0.1479 | 0.2394 |
3.5395 | 4.88 | 2496 | 3.5756 | 0.0607 | 0.1046 | 0.1567 | 0.2436 |
3.5614 | 5.0 | 2560 | 3.5715 | 0.0610 | 0.1062 | 0.1560 | 0.2472 |
3.5559 | 5.12 | 2624 | 3.5714 | 0.0622 | 0.1070 | 0.1580 | 0.2476 |
3.5387 | 5.25 | 2688 | 3.5777 | 0.0622 | 0.1067 | 0.1554 | 0.2453 |
3.5466 | 5.38 | 2752 | 3.5704 | 0.0614 | 0.1061 | 0.1577 | 0.2460 |
3.5567 | 5.5 | 2816 | 3.5701 | 0.0622 | 0.1067 | 0.1560 | 0.2456 |
3.5367 | 5.62 | 2880 | 3.5722 | 0.0630 | 0.1081 | 0.1542 | 0.2456 |
3.5276 | 5.75 | 2944 | 3.5662 | 0.0625 | 0.1065 | 0.1596 | 0.2479 |
3.5505 | 5.88 | 3008 | 3.5632 | 0.0629 | 0.1079 | 0.1572 | 0.2480 |
3.5482 | 6.0 | 3072 | 3.5659 | 0.0636 | 0.1097 | 0.1574 | 0.2497 |
3.5075 | 6.12 | 3136 | 3.5632 | 0.0631 | 0.1081 | 0.1577 | 0.2469 |
3.5373 | 6.25 | 3200 | 3.5617 | 0.0627 | 0.1081 | 0.1586 | 0.2485 |
3.5316 | 6.38 | 3264 | 3.5616 | 0.0636 | 0.1100 | 0.1578 | 0.2491 |
3.5259 | 6.5 | 3328 | 3.5591 | 0.0654 | 0.1102 | 0.1607 | 0.2501 |
3.5378 | 6.62 | 3392 | 3.5572 | 0.0635 | 0.1088 | 0.1602 | 0.2501 |
3.5438 | 6.75 | 3456 | 3.5615 | 0.0639 | 0.1086 | 0.1604 | 0.2503 |
3.5255 | 6.88 | 3520 | 3.5538 | 0.0625 | 0.1079 | 0.1608 | 0.2511 |
3.5187 | 7.0 | 3584 | 3.5533 | 0.0645 | 0.1089 | 0.1606 | 0.2510 |
3.5366 | 7.12 | 3648 | 3.5568 | 0.0656 | 0.1108 | 0.1640 | 0.2519 |
3.5333 | 7.25 | 3712 | 3.5520 | 0.0664 | 0.1120 | 0.1626 | 0.2536 |
3.5214 | 7.38 | 3776 | 3.5585 | 0.0652 | 0.1105 | 0.1611 | 0.2514 |
3.5543 | 7.5 | 3840 | 3.5544 | 0.0642 | 0.1107 | 0.1607 | 0.2522 |
3.5317 | 7.62 | 3904 | 3.5499 | 0.0656 | 0.1115 | 0.1602 | 0.2515 |
3.5196 | 7.75 | 3968 | 3.5472 | 0.0685 | 0.1161 | 0.1669 | 0.2561 |
3.5596 | 7.88 | 4032 | 3.5505 | 0.0644 | 0.1090 | 0.1627 | 0.2509 |
3.5157 | 8.0 | 4096 | 3.5524 | 0.0655 | 0.1124 | 0.1643 | 0.2528 |
3.5271 | 8.12 | 4160 | 3.5479 | 0.0686 | 0.1148 | 0.1654 | 0.2559 |
3.502 | 8.25 | 4224 | 3.5503 | 0.0672 | 0.1135 | 0.1632 | 0.2521 |
3.505 | 8.38 | 4288 | 3.5473 | 0.0693 | 0.1160 | 0.1676 | 0.2564 |
3.5303 | 8.5 | 4352 | 3.5470 | 0.0687 | 0.1140 | 0.1681 | 0.2570 |
3.5138 | 8.62 | 4416 | 3.5407 | 0.0684 | 0.1140 | 0.1670 | 0.2566 |
3.5142 | 8.75 | 4480 | 3.5414 | 0.0670 | 0.1130 | 0.1671 | 0.2551 |
3.5404 | 8.88 | 4544 | 3.5448 | 0.0674 | 0.1133 | 0.1663 | 0.2561 |
3.5134 | 9.0 | 4608 | 3.5430 | 0.0675 | 0.1124 | 0.1644 | 0.2545 |
3.5133 | 9.12 | 4672 | 3.5376 | 0.0688 | 0.1148 | 0.1680 | 0.2575 |
3.5285 | 9.25 | 4736 | 3.5426 | 0.0659 | 0.1116 | 0.1637 | 0.2539 |
3.5025 | 9.38 | 4800 | 3.5395 | 0.0704 | 0.1168 | 0.1658 | 0.2558 |
3.4903 | 9.5 | 4864 | 3.5386 | 0.0670 | 0.1131 | 0.1679 | 0.2554 |
3.4992 | 9.62 | 4928 | 3.5355 | 0.0676 | 0.1136 | 0.1665 | 0.2561 |
3.5021 | 9.75 | 4992 | 3.5397 | 0.0686 | 0.1135 | 0.1694 | 0.2583 |
3.5202 | 9.88 | 5056 | 3.5347 | 0.0708 | 0.1165 | 0.1727 | 0.2610 |
3.506 | 10.0 | 5120 | 3.5407 | 0.0675 | 0.1129 | 0.1668 | 0.2556 |
3.505 | 10.12 | 5184 | 3.5366 | 0.0690 | 0.1151 | 0.1677 | 0.2576 |
3.4971 | 10.25 | 5248 | 3.5370 | 0.0693 | 0.1157 | 0.1695 | 0.2597 |
3.4993 | 10.38 | 5312 | 3.5344 | 0.0688 | 0.1154 | 0.1677 | 0.2585 |
3.5125 | 10.5 | 5376 | 3.5300 | 0.0704 | 0.1175 | 0.1707 | 0.2606 |
3.4975 | 10.62 | 5440 | 3.5339 | 0.0708 | 0.1172 | 0.1689 | 0.2580 |
3.5193 | 10.75 | 5504 | 3.5292 | 0.0716 | 0.1185 | 0.1718 | 0.2616 |
3.4898 | 10.88 | 5568 | 3.5301 | 0.0715 | 0.1181 | 0.1702 | 0.2607 |
3.4971 | 11.0 | 5632 | 3.5312 | 0.0701 | 0.1173 | 0.1698 | 0.2600 |
3.4909 | 11.12 | 5696 | 3.5280 | 0.0695 | 0.1167 | 0.1705 | 0.2609 |
3.485 | 11.25 | 5760 | 3.5312 | 0.0704 | 0.1152 | 0.1730 | 0.2591 |
3.4916 | 11.38 | 5824 | 3.5272 | 0.0716 | 0.1186 | 0.1710 | 0.2612 |
3.5049 | 11.5 | 5888 | 3.5269 | 0.0710 | 0.1170 | 0.1728 | 0.2608 |
3.5037 | 11.62 | 5952 | 3.5234 | 0.0727 | 0.1185 | 0.1749 | 0.2634 |
3.508 | 11.75 | 6016 | 3.5232 | 0.0725 | 0.1204 | 0.1749 | 0.2636 |
3.492 | 11.88 | 6080 | 3.5205 | 0.0728 | 0.1197 | 0.1738 | 0.2642 |
3.471 | 12.0 | 6144 | 3.5253 | 0.0730 | 0.1203 | 0.1734 | 0.2625 |
3.478 | 12.12 | 6208 | 3.5231 | 0.0730 | 0.1208 | 0.1750 | 0.2662 |
3.4745 | 12.25 | 6272 | 3.5231 | 0.0722 | 0.1197 | 0.1751 | 0.2644 |
3.4906 | 12.38 | 6336 | 3.5234 | 0.0707 | 0.1168 | 0.1715 | 0.2602 |
3.4625 | 12.5 | 6400 | 3.5206 | 0.0722 | 0.1196 | 0.1720 | 0.2620 |
3.5299 | 12.62 | 6464 | 3.5193 | 0.0732 | 0.1181 | 0.1766 | 0.2622 |
3.4727 | 12.75 | 6528 | 3.5210 | 0.0713 | 0.1174 | 0.1760 | 0.2628 |
3.4615 | 12.88 | 6592 | 3.5209 | 0.0711 | 0.1168 | 0.1723 | 0.2608 |
3.5003 | 13.0 | 6656 | 3.5176 | 0.0729 | 0.1203 | 0.1736 | 0.2635 |
3.467 | 13.12 | 6720 | 3.5196 | 0.0717 | 0.1182 | 0.1735 | 0.2618 |
3.4596 | 13.25 | 6784 | 3.5208 | 0.0738 | 0.1202 | 0.1755 | 0.2644 |
3.4484 | 13.38 | 6848 | 3.5176 | 0.0731 | 0.1197 | 0.1750 | 0.2629 |
3.4744 | 13.5 | 6912 | 3.5185 | 0.0740 | 0.1200 | 0.1748 | 0.2639 |
3.4826 | 13.62 | 6976 | 3.5146 | 0.0729 | 0.1203 | 0.1754 | 0.2657 |
3.4905 | 13.75 | 7040 | 3.5138 | 0.0750 | 0.1214 | 0.1768 | 0.2669 |
3.4898 | 13.88 | 7104 | 3.5144 | 0.0750 | 0.1224 | 0.1771 | 0.2653 |
3.515 | 14.0 | 7168 | 3.5119 | 0.0724 | 0.1196 | 0.1795 | 0.2668 |
3.4785 | 14.12 | 7232 | 3.5126 | 0.0750 | 0.1237 | 0.1773 | 0.2683 |
3.4617 | 14.25 | 7296 | 3.5121 | 0.0744 | 0.1224 | 0.1766 | 0.2660 |
3.4653 | 14.38 | 7360 | 3.5126 | 0.0737 | 0.1217 | 0.1777 | 0.2679 |
3.4669 | 14.5 | 7424 | 3.5098 | 0.0722 | 0.1187 | 0.1731 | 0.2616 |
3.4712 | 14.62 | 7488 | 3.5101 | 0.0734 | 0.1209 | 0.1765 | 0.2657 |
3.4592 | 14.75 | 7552 | 3.5091 | 0.0768 | 0.1229 | 0.1802 | 0.2700 |
3.4568 | 14.88 | 7616 | 3.5067 | 0.0746 | 0.1221 | 0.1788 | 0.2688 |
3.4643 | 15.0 | 7680 | 3.5035 | 0.0718 | 0.1188 | 0.1751 | 0.2652 |
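The final checkpoint (step 7680) has the lowest validation loss in the table, but several F1 scores peak earlier, for example at step 7552. The sketch below shows one way to pick a checkpoint by a chosen metric. It parses a handful of rows copied from the table above; the column keys and the `parse_rows` helper are hypothetical names introduced for illustration.

```python
# A few rows copied verbatim from the training results table.
ROWS = """\
3.4905 | 13.75 | 7040 | 3.5138 | 0.0750 | 0.1214 | 0.1768 | 0.2669 |
3.4785 | 14.12 | 7232 | 3.5126 | 0.0750 | 0.1237 | 0.1773 | 0.2683 |
3.4592 | 14.75 | 7552 | 3.5091 | 0.0768 | 0.1229 | 0.1802 | 0.2700 |
3.4568 | 14.88 | 7616 | 3.5067 | 0.0746 | 0.1221 | 0.1788 | 0.2688 |
3.4643 | 15.0 | 7680 | 3.5035 | 0.0718 | 0.1188 | 0.1751 | 0.2652 |
"""

# Hypothetical short keys for the eight table columns.
COLUMNS = [
    "train_loss", "epoch", "step", "val_loss",
    "top1_macro_f1", "top1_weighted_f1", "top3_macro_f1", "top3_weighted_f1",
]


def parse_rows(text):
    """Turn pipe-separated table rows into a list of dicts."""
    rows = []
    for line in text.strip().splitlines():
        cells = [cell.strip() for cell in line.split("|") if cell.strip()]
        row = dict(zip(COLUMNS, map(float, cells)))
        row["step"] = int(row["step"])
        rows.append(row)
    return rows


rows = parse_rows(ROWS)

# Lower is better for loss, higher is better for F1.
best_by_loss = min(rows, key=lambda r: r["val_loss"])
best_by_top1_macro = max(rows, key=lambda r: r["top1_macro_f1"])
best_by_top3_weighted = max(rows, key=lambda r: r["top3_weighted_f1"])

print(best_by_loss["step"], best_by_top1_macro["step"], best_by_top3_weighted["step"])
```

Which criterion to use depends on the downstream goal. The differences between neighboring checkpoints are small and may be within evaluation noise.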
Framework versions
- Transformers 4.20.1
- PyTorch 1.12.1+cu102
- Datasets 2.0.0
- Tokenizers 0.11.0