# 8_BigBird_train_korquad-1-2_aihub_final
This model is a fine-tuned version of [monologg/kobigbird-bert-base](https://huggingface.co/monologg/kobigbird-bert-base). The training dataset was not recorded by the Trainer; judging from the model name, it likely combines KorQuAD 1.0/2.0 with AI Hub data, but this is unconfirmed.

It achieves the following results on the evaluation set:
- Exact Match: 69.4165
- F1: 84.3396
- Loss: 0.6682
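
As a minimal usage sketch (the model id below is assumed from this card's title; point it at the actual hub repo or a local checkpoint directory):

```python
from transformers import pipeline

# Repo id is assumed from this card's title; adjust to the real hub path or local dir.
qa = pipeline("question-answering", model="8_BigBird_train_korquad-1-2_aihub_final")

result = qa(
    question="대한민국의 수도는 어디인가?",  # "What is the capital of South Korea?"
    context="대한민국의 수도는 서울이다.",   # "The capital of South Korea is Seoul."
)
print(result)  # e.g. {'score': ..., 'start': ..., 'end': ..., 'answer': '서울'}
```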
## Model description

The base model, monologg/kobigbird-bert-base (KoBigBird), is a Korean BigBird encoder that uses block-sparse attention to handle longer inputs than a standard BERT. This checkpoint fine-tunes it for extractive question answering; no further description was recorded.
## Intended uses & limitations

The reported metrics (Exact Match, F1) indicate extractive question answering on Korean text. Further usage guidance and limitations were not documented.
## Training and evaluation data

Not recorded by the Trainer. The model name suggests KorQuAD 1.0/2.0 together with AI Hub question-answering data, but this is unconfirmed.
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (reconstructed as a `TrainingArguments` sketch after this list):
- learning_rate: 1e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 10
- mixed_precision_training: Native AMP
- label_smoothing_factor: 0.1
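
Assuming these map one-to-one onto 🤗 Transformers `TrainingArguments` (argument names as of Transformers 4.25), a minimal reconstruction looks like this; `output_dir` and anything not listed above (logging and evaluation cadence, save strategy) are assumed placeholders:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./outputs",          # assumed placeholder
    learning_rate=1e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=4,   # 8 * 4 = 32 effective train batch size
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=10,
    fp16=True,                       # Native AMP mixed precision
    label_smoothing_factor=0.1,
    # Adam betas=(0.9, 0.999) and epsilon=1e-08 are the TrainingArguments defaults.
)
```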
### Training results
| Training Loss | Epoch | Step | Exact Match | F1 | Validation Loss |
|:-------------:|:-----:|:----:|:-----------:|:--:|:---------------:|
2.2126 | 0.15 | 1000 | 45.3591 | 61.7326 | 1.7300 |
1.2225 | 0.3 | 2000 | 58.8824 | 75.9193 | 1.0819 |
1.0341 | 0.46 | 3000 | 62.4417 | 78.6100 | 0.9518 |
0.9669 | 0.61 | 4000 | 63.1104 | 79.4809 | 0.9092 |
0.9701 | 0.76 | 5000 | 63.6355 | 79.8112 | 0.8893 |
0.942 | 0.91 | 6000 | 64.7531 | 80.8430 | 0.8519 |
0.9174 | 1.07 | 7000 | 65.4937 | 81.2925 | 0.8199 |
0.8669 | 1.22 | 8000 | 65.7316 | 81.4858 | 0.8086 |
0.877 | 1.37 | 9000 | 66.0368 | 81.7286 | 0.8033 |
0.8191 | 1.52 | 10000 | 66.3465 | 81.9348 | 0.7912 |
0.815 | 1.68 | 11000 | 66.4991 | 81.9664 | 0.7767 |
0.827 | 1.83 | 12000 | 66.8357 | 82.2967 | 0.7648 |
0.7817 | 1.98 | 13000 | 66.9031 | 82.3438 | 0.7636 |
0.8217 | 2.13 | 14000 | 67.0242 | 82.4987 | 0.7512 |
0.7624 | 2.28 | 15000 | 67.3609 | 82.6260 | 0.7452 |
0.8055 | 2.44 | 16000 | 67.2980 | 82.6597 | 0.7414 |
0.7582 | 2.59 | 17000 | 67.3609 | 82.7631 | 0.7380 |
0.7335 | 2.74 | 18000 | 67.7379 | 82.9105 | 0.7403 |
0.7316 | 2.89 | 19000 | 67.8321 | 83.0111 | 0.7332 |
0.7711 | 3.05 | 20000 | 67.9713 | 83.1762 | 0.7223 |
0.7457 | 3.2 | 21000 | 67.7469 | 83.1017 | 0.7317 |
0.7474 | 3.35 | 22000 | 67.9758 | 83.2512 | 0.7218 |
0.7197 | 3.5 | 23000 | 68.0431 | 83.1601 | 0.7158 |
0.7314 | 3.66 | 24000 | 68.2496 | 83.3966 | 0.7106 |
0.7215 | 3.81 | 25000 | 68.2585 | 83.4286 | 0.7102 |
0.7122 | 3.96 | 26000 | 68.3842 | 83.4214 | 0.7112 |
0.6783 | 4.11 | 27000 | 68.5009 | 83.5003 | 0.7086 |
0.6702 | 4.27 | 28000 | 68.3393 | 83.4976 | 0.7059 |
0.6927 | 4.42 | 29000 | 68.4740 | 83.5456 | 0.7106 |
0.7001 | 4.57 | 30000 | 68.4605 | 83.5271 | 0.7064 |
0.7046 | 4.72 | 31000 | 68.4919 | 83.5903 | 0.7024 |
0.7109 | 4.87 | 32000 | 68.6804 | 83.6991 | 0.6959 |
0.669 | 5.03 | 33000 | 68.6311 | 83.7599 | 0.6978 |
0.6838 | 5.18 | 34000 | 68.4022 | 83.5342 | 0.7013 |
0.7297 | 5.33 | 35000 | 68.7478 | 83.6654 | 0.6917 |
0.6427 | 5.48 | 36000 | 68.7837 | 83.8393 | 0.6899 |
0.6631 | 5.64 | 37000 | 68.8330 | 83.8576 | 0.6945 |
0.6358 | 5.79 | 38000 | 68.8600 | 83.7743 | 0.6998 |
0.6466 | 5.94 | 39000 | 68.9138 | 83.8577 | 0.6893 |
0.6745 | 6.09 | 40000 | 68.9677 | 83.9212 | 0.6863 |
0.6499 | 6.25 | 41000 | 68.8375 | 83.8774 | 0.6897 |
0.6682 | 6.4 | 42000 | 68.9946 | 83.9670 | 0.6835 |
0.6455 | 6.55 | 43000 | 69.0395 | 83.9357 | 0.6849 |
0.6606 | 6.7 | 44000 | 69.1158 | 84.1033 | 0.6803 |
0.6946 | 6.85 | 45000 | 69.0440 | 83.9837 | 0.6783 |
0.6454 | 7.01 | 46000 | 68.9004 | 83.8873 | 0.6860 |
0.6426 | 7.16 | 47000 | 69.0575 | 84.0540 | 0.6847 |
0.6693 | 7.31 | 48000 | 69.1697 | 84.1046 | 0.6776 |
0.6485 | 7.46 | 49000 | 69.1562 | 84.0952 | 0.6855 |
0.6574 | 7.62 | 50000 | 69.1472 | 84.0841 | 0.6738 |
0.6419 | 7.77 | 51000 | 69.0754 | 84.1166 | 0.6807 |
0.633 | 7.92 | 52000 | 69.2729 | 84.1880 | 0.6719 |
0.6217 | 8.07 | 53000 | 69.3402 | 84.1996 | 0.6783 |
0.627 | 8.23 | 54000 | 69.2684 | 84.1698 | 0.6829 |
0.6259 | 8.38 | 55000 | 69.1697 | 84.1268 | 0.6842 |
0.6009 | 8.53 | 56000 | 69.2011 | 84.1144 | 0.6759 |
0.5852 | 8.68 | 57000 | 69.3178 | 84.2026 | 0.6842 |
0.6258 | 8.83 | 58000 | 68.9048 | 84.0519 | 0.6780 |
0.6517 | 8.99 | 59000 | 69.2774 | 84.2748 | 0.6686 |
0.6044 | 9.14 | 60000 | 69.4614 | 84.2718 | 0.6735 |
0.6255 | 9.29 | 61000 | 69.4659 | 84.3243 | 0.6726 |
0.6003 | 9.44 | 62000 | 69.3178 | 84.2107 | 0.6694 |
0.6053 | 9.6 | 63000 | 69.6095 | 84.3137 | 0.6723 |
0.6105 | 9.75 | 64000 | 69.3357 | 84.2198 | 0.6704 |
0.6116 | 9.9 | 65000 | 69.4165 | 84.3396 | 0.6682 |
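
The headline metrics at the top of this card come from the final evaluation at step 65000 (epoch 9.9); the best exact match logged during training was 69.6095 at step 63000 (epoch 9.6).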
### Framework versions
- Transformers 4.25.1
- Pytorch 1.13.0+cu117
- Datasets 2.7.1
- Tokenizers 0.13.2
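
To approximate this environment, `pip install transformers==4.25.1 datasets==2.7.1 tokenizers==0.13.2` plus the PyTorch 1.13.0 wheel built for CUDA 11.7 should match the versions above (assuming a Linux/CUDA setup; CPU-only wheels will also work for inference).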