# 8_roberta-large_train_korquad-1_2_aihubf
This model is a fine-tuned version of [klue/roberta-large](https://huggingface.co/klue/roberta-large) on an unspecified dataset (the model name suggests KorQuAD plus AI Hub Korean question-answering data). It achieves the following results on the evaluation set (see the metric sketch after this list):
- Exact Match: 70.8573
- F1: 85.1795
- Loss: 0.6914
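Exact Match and F1 here are the standard SQuAD-style extractive-QA metrics: exact match is the percentage of predictions that match a ground-truth answer exactly, and F1 is token-level overlap averaged over examples. As a minimal sketch (not part of the original card), they can be computed with the `evaluate` library's `squad` metric; the IDs and answer strings below are made up for illustration:

```python
import evaluate

# SQuAD-style metrics: exact match and token-level F1 over answer spans.
squad_metric = evaluate.load("squad")

# Hypothetical prediction/reference pair, purely for illustration.
predictions = [{"id": "q1", "prediction_text": "서울"}]
references = [{"id": "q1", "answers": {"text": ["서울"], "answer_start": [10]}}]

results = squad_metric.compute(predictions=predictions, references=references)
print(results)  # {'exact_match': 100.0, 'f1': 100.0}
```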
## Model description
More information needed
## Intended uses & limitations
More information needed
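No usage notes were provided, but the metrics above indicate an extractive question-answering model, so a minimal inference sketch with the Transformers `question-answering` pipeline might look like the following. The model path and the Korean question/context pair are assumptions, not taken from the original card:

```python
from transformers import pipeline

# Hypothetical checkpoint path; replace with the actual local or Hub path.
model_path = "8_roberta-large_train_korquad-1_2_aihubf"

qa = pipeline("question-answering", model=model_path, tokenizer=model_path)

# Made-up Korean question/context pair for illustration.
result = qa(
    question="대한민국의 수도는 어디인가?",
    context="대한민국의 수도는 서울이다.",
)
print(result["answer"], result["score"])
```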
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a `TrainingArguments` sketch follows the list):
- learning_rate: 1e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 15
- mixed_precision_training: Native AMP
- label_smoothing_factor: 0.1
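These settings map roughly onto the following `TrainingArguments`; this is a hedged reconstruction from the listed values, not the original training script, and `output_dir` is a placeholder:

```python
from transformers import TrainingArguments

# Approximate reconstruction of the hyperparameters listed above.
training_args = TrainingArguments(
    output_dir="./outputs",           # placeholder; original path unknown
    learning_rate=1e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=15,
    fp16=True,                        # "Native AMP" mixed precision
    label_smoothing_factor=0.1,
)
```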
### Training results
| Training Loss | Epoch | Step   | Exact Match | F1      | Validation Loss |
|:-------------:|:-----:|:------:|:-----------:|:-------:|:---------------:|
| 1.7528        | 0.55  | 20000  | 66.9569     | 82.7721 | 0.7109          |
| 0.6827        | 1.1   | 40000  | 69.1023     | 84.0073 | 0.6357          |
| 0.6049        | 1.65  | 60000  | 70.0135     | 84.8732 | 0.5845          |
| 0.5649        | 2.21  | 80000  | 70.4758     | 85.0331 | 0.5737          |
| 0.5425        | 2.76  | 100000 | 70.3456     | 85.0041 | 0.5879          |
| 0.5395        | 3.31  | 120000 | 70.5072     | 85.0318 | 0.5742          |
| 0.5279        | 3.86  | 140000 | 70.7226     | 85.3219 | 0.5708          |
| 0.4925        | 4.41  | 160000 | 70.9425     | 85.3718 | 0.5713          |
| 0.4861        | 4.96  | 180000 | 71.0144     | 85.4729 | 0.5630          |
| 0.4813        | 5.51  | 200000 | 70.7496     | 85.3388 | 0.5757          |
| 0.4819        | 6.06  | 220000 | 71.1580     | 85.4708 | 0.5884          |
| 0.4481        | 6.62  | 240000 | 71.1311     | 85.4844 | 0.5850          |
| 0.4404        | 7.17  | 260000 | 71.2118     | 85.4463 | 0.5986          |
| 0.4452        | 7.72  | 280000 | 71.0009     | 85.3122 | 0.5947          |
| 0.4338        | 8.27  | 300000 | 71.1984     | 85.4052 | 0.6113          |
| 0.4144        | 8.82  | 320000 | 71.2433     | 85.4699 | 0.6001          |
| 0.4016        | 9.37  | 340000 | 71.2522     | 85.4297 | 0.6099          |
| 0.4122        | 9.92  | 360000 | 71.1715     | 85.3448 | 0.5923          |
| 0.3966        | 10.47 | 380000 | 71.1984     | 85.4874 | 0.6240          |
| 0.3825        | 11.03 | 400000 | 71.4093     | 85.5420 | 0.6309          |
| 0.3639        | 11.58 | 420000 | 70.9336     | 85.3509 | 0.6431          |
| 0.3728        | 12.13 | 440000 | 70.9425     | 85.2109 | 0.6562          |
| 0.3655        | 12.68 | 460000 | 71.0503     | 85.3442 | 0.6543          |
| 0.3476        | 13.23 | 480000 | 71.0637     | 85.3476 | 0.6963          |
| 0.332         | 13.78 | 500000 | 70.9560     | 85.2729 | 0.6963          |
| 0.337         | 14.33 | 520000 | 70.5700     | 85.1620 | 0.7109          |
| 0.3415        | 14.89 | 540000 | 70.8573     | 85.1795 | 0.6914          |
### Framework versions
- Transformers 4.25.1
- Pytorch 1.13.0+cu117
- Datasets 2.7.1
- Tokenizers 0.13.2