# k40-B128-klue-roberta-large-finetuned-train_dataset
This model is a fine-tuned version of [klue/roberta-large](https://huggingface.co/klue/roberta-large) on an unspecified dataset. It achieves the following results on the evaluation set:
- Exact Match: 59.5833
- F1: 67.7399
- Loss: 2.0625
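
The Exact Match and F1 metrics point to a span-extraction question-answering head (klue/roberta-large is a Korean encoder, so the task is plausibly KLUE MRC, though the card does not say). Assuming the checkpoint was exported with a question-answering head, a minimal loading sketch follows; the model path is a placeholder, not a confirmed Hub id:

```python
from transformers import pipeline

# Hypothetical Hub path: substitute the namespace this checkpoint
# was actually published under.
qa = pipeline(
    "question-answering",
    model="<namespace>/k40-B128-klue-roberta-large-finetuned-train_dataset",
)

result = qa(
    question="대한민국의 수도는 어디인가?",  # "What is the capital of South Korea?"
    context="대한민국의 수도는 서울이다.",    # "The capital of South Korea is Seoul."
)
print(result["answer"], result["score"])
```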
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a `TrainingArguments` sketch follows the list):
- learning_rate: 1e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 16
- total_train_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 30
- mixed_precision_training: Native AMP
- label_smoothing_factor: 0.1
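
For reproduction, these settings map onto `transformers.TrainingArguments` roughly as below. This is a sketch rather than the exact training script: the output directory is a guess, and `fp16=True` stands in for "Native AMP". Note that 8 (per-device batch) × 16 (accumulation steps) gives the listed total train batch size of 128, and Adam with betas=(0.9, 0.999) and epsilon=1e-08 is the Trainer's default optimizer, so it needs no explicit argument.

```python
from transformers import TrainingArguments

# A sketch of the configuration listed above, not the verified script.
training_args = TrainingArguments(
    output_dir="k40-B128-klue-roberta-large-finetuned-train_dataset",  # assumed
    learning_rate=1e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=16,  # effective batch: 8 x 16 = 128
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=30,
    fp16=True,  # "Native AMP" mixed-precision training
    label_smoothing_factor=0.1,
)
```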
### Training results

| Training Loss | Epoch | Step | Exact Match | F1 | Validation Loss |
|:-------------:|:-----:|:----:|:-----------:|:--:|:---------------:|
| 5.3337 | 0.42 | 20 | 0.8333 | 3.3629 | 4.3672 |
| 3.4151 | 0.83 | 40 | 54.5833 | 63.9656 | 2.3438 |
| 1.6499 | 1.25 | 60 | 59.5833 | 69.6270 | 1.4004 |
| 1.2309 | 1.67 | 80 | 64.5833 | 75.2648 | 1.1523 |
| 0.9588 | 2.08 | 100 | 62.9167 | 75.0527 | 1.2793 |
| 0.7368 | 2.5 | 120 | 65.8333 | 75.2080 | 1.0957 |
| 0.7607 | 2.92 | 140 | 62.9167 | 72.4981 | 1.2559 |
| 0.6658 | 3.33 | 160 | 61.25 | 69.8265 | 1.3887 |
| 0.7594 | 3.75 | 180 | 64.5833 | 74.1011 | 1.2891 |
| 0.6627 | 4.17 | 200 | 60.8333 | 69.9682 | 1.2480 |
| 0.5145 | 4.58 | 220 | 62.5 | 72.4746 | 1.3770 |
| 0.4045 | 5.0 | 240 | 67.0833 | 76.2119 | 1.1836 |
| 0.3877 | 5.42 | 260 | 62.5 | 73.4728 | 1.3740 |
| 0.265 | 5.83 | 280 | 65.4167 | 75.0515 | 1.2695 |
| 0.1806 | 6.25 | 300 | 66.6667 | 76.6871 | 1.2969 |
| 0.1215 | 6.67 | 320 | 67.5 | 77.1973 | 1.4023 |
| 0.1149 | 7.08 | 340 | 62.9167 | 73.2902 | 1.5371 |
| 0.1462 | 7.5 | 360 | 64.5833 | 74.8781 | 1.5 |
| 0.2145 | 7.92 | 380 | 62.9167 | 72.5644 | 1.5156 |
| 0.2666 | 8.33 | 400 | 64.1667 | 73.7049 | 1.2617 |
| 0.2852 | 8.75 | 420 | 62.9167 | 72.7485 | 1.5898 |
| 0.4549 | 9.17 | 440 | 64.1667 | 73.7479 | 1.3828 |
| 0.4197 | 9.58 | 460 | 60.4167 | 70.6652 | 1.5967 |
| 0.4102 | 10.0 | 480 | 60.0 | 68.9411 | 1.4824 |
| 0.3265 | 10.42 | 500 | 58.3333 | 68.6892 | 1.5420 |
| 0.282 | 10.83 | 520 | 64.1667 | 73.6281 | 1.4746 |
| 0.1944 | 11.25 | 540 | 59.5833 | 69.5715 | 1.8154 |
| 0.1885 | 11.67 | 560 | 66.6667 | 75.0492 | 1.5430 |
| 0.1429 | 12.08 | 580 | 67.9167 | 75.9797 | 1.5938 |
| 0.055 | 12.5 | 600 | 66.25 | 74.7943 | 1.8848 |
| 0.0699 | 12.92 | 620 | 68.3333 | 75.8125 | 1.7393 |
| 0.049 | 13.33 | 640 | 65.4167 | 74.6337 | 1.9912 |
| 0.0892 | 13.75 | 660 | 65.0 | 73.7429 | 1.9355 |
| 0.0951 | 14.17 | 680 | 65.4167 | 72.9353 | 1.7695 |
| 0.1836 | 14.58 | 700 | 60.4167 | 67.7400 | 1.8242 |
| 0.1577 | 15.0 | 720 | 62.9167 | 73.3646 | 2.0352 |
| 0.1808 | 15.42 | 740 | 58.3333 | 68.1992 | 1.9990 |
| 0.2799 | 15.83 | 760 | 58.75 | 68.8545 | 1.7031 |
| 0.1927 | 16.25 | 780 | 63.3333 | 71.2111 | 1.8945 |
| 0.217 | 16.67 | 800 | 63.3333 | 72.2130 | 1.5957 |
| 0.1768 | 17.08 | 820 | 62.9167 | 72.2659 | 1.7617 |
| 0.122 | 17.5 | 840 | 60.0 | 68.5236 | 1.9043 |
| 0.1132 | 17.92 | 860 | 62.0833 | 72.1359 | 1.7256 |
| 0.0574 | 18.33 | 880 | 62.5 | 72.6974 | 1.7656 |
| 0.0516 | 18.75 | 900 | 64.1667 | 74.5338 | 1.7256 |
| 0.0302 | 19.17 | 920 | 68.75 | 77.9872 | 1.8203 |
| 0.023 | 19.58 | 940 | 64.1667 | 73.2770 | 2.0469 |
| 0.0567 | 20.0 | 960 | 66.25 | 73.3724 | 2.0586 |
| 0.0949 | 20.42 | 980 | 63.75 | 73.3629 | 2.0215 |
| 0.0937 | 20.83 | 1000 | 67.5 | 76.6887 | 1.4590 |
| 0.1035 | 21.25 | 1020 | 60.0 | 69.8886 | 2.0293 |
| 0.1576 | 21.67 | 1040 | 60.8333 | 69.5714 | 1.8809 |
| 0.2365 | 22.08 | 1060 | 62.9167 | 72.0051 | 1.8057 |
| 0.1557 | 22.5 | 1080 | 63.75 | 72.1278 | 1.8008 |
| 0.1286 | 22.92 | 1100 | 66.6667 | 74.1009 | 2.0176 |
| 0.1205 | 23.33 | 1120 | 60.4167 | 68.4010 | 2.2031 |
| 0.0916 | 23.75 | 1140 | 63.3333 | 70.3330 | 2.2461 |
| 0.0715 | 24.17 | 1160 | 65.4167 | 72.7984 | 1.9834 |
| 0.049 | 24.58 | 1180 | 66.6667 | 74.0249 | 1.8330 |
| 0.038 | 25.0 | 1200 | 67.0833 | 74.7748 | 1.9727 |
| 0.0199 | 25.42 | 1220 | 67.5 | 75.0162 | 2.0703 |
| 0.0202 | 25.83 | 1240 | 66.6667 | 74.4894 | 2.1953 |
| 0.0737 | 26.25 | 1260 | 65.0 | 72.1118 | 1.9307 |
| 0.0711 | 26.67 | 1280 | 65.4167 | 71.8644 | 2.1289 |
| 0.0919 | 27.08 | 1300 | 63.75 | 71.8534 | 1.8633 |
| 0.1059 | 27.5 | 1320 | 67.5 | 74.1489 | 1.6973 |
| 0.1114 | 27.92 | 1340 | 62.0833 | 70.8896 | 1.7539 |
| 0.1717 | 28.33 | 1360 | 60.8333 | 70.2865 | 2.125 |
| 0.1708 | 28.75 | 1380 | 65.4167 | 72.7123 | 2.125 |
| 0.1359 | 29.17 | 1400 | 60.8333 | 69.9801 | 2.0508 |
| 0.0985 | 29.58 | 1420 | 57.0833 | 64.9947 | 2.4922 |
| 0.1041 | 30.0 | 1440 | 59.5833 | 67.7399 | 2.0625 |
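
Exact Match and F1 here look like the standard SQuAD-style span-QA metrics on a 0-100 scale (the final row matches the headline results above). A minimal sketch of computing them, assuming the SQuAD metric from `datasets` was used (the version pinned below still ships `load_metric`):

```python
from datasets import load_metric

# Assumed metric implementation; the card does not name it.
squad_metric = load_metric("squad")

predictions = [{"id": "q0", "prediction_text": "서울"}]
references = [
    # answer_start is required by the SQuAD format but unused
    # by the EM/F1 scoring itself.
    {"id": "q0", "answers": {"text": ["서울"], "answer_start": [0]}}
]

print(squad_metric.compute(predictions=predictions, references=references))
# -> {'exact_match': 100.0, 'f1': 100.0}
```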
### Framework versions
- Transformers 4.25.1
- Pytorch 1.13.0+cu117
- Datasets 2.7.1
- Tokenizers 0.13.2