# fine-tuned-DatasetQAS-IDK-MRC-with-xlm-roberta-large-with-ITTL-with-freeze-LR-1e-05
This model is a fine-tuned version of [xlm-roberta-large](https://huggingface.co/xlm-roberta-large) on the IDK-MRC dataset. It achieves the following results on the evaluation set:
- Loss: 0.8698
- Exact Match: 74.6073
- F1: 81.6214
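Below is a minimal usage sketch, assuming the checkpoint is published with a standard extractive question-answering head and that IDK-MRC's Indonesian-language format applies. The repository id is a placeholder; substitute the actual hub id where this model is hosted.

```python
# Hedged usage sketch; the model id below is a placeholder, not a confirmed hub path.
from transformers import pipeline

qa = pipeline(
    "question-answering",
    model="your-username/fine-tuned-DatasetQAS-IDK-MRC-with-xlm-roberta-large-with-ITTL-with-freeze-LR-1e-05",
)

# Illustrative Indonesian example: "Who invented the telephone?"
result = qa(
    question="Siapa penemu telepon?",
    context="Telepon ditemukan oleh Alexander Graham Bell pada tahun 1876.",
)
print(result)  # dict with 'score', 'start', 'end', and 'answer' keys
```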
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a hedged `TrainingArguments` sketch follows the list):
- learning_rate: 1e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 16
- total_train_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 10
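The sketch below reconstructs these values as a `transformers.TrainingArguments` object. It covers only the hyperparameters listed above; dataset loading, tokenization, and the ITTL/layer-freezing setup are assumptions about the original training script and are out of scope here.

```python
# Hedged reconstruction of the listed hyperparameters; not the original script.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="fine-tuned-DatasetQAS-IDK-MRC-with-xlm-roberta-large-with-ITTL-with-freeze-LR-1e-05",
    learning_rate=1e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=16,  # effective train batch size: 8 * 16 = 128
    num_train_epochs=10,
    lr_scheduler_type="linear",
    seed=42,
    adam_beta1=0.9,                  # Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```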
### Training results
| Training Loss | Epoch | Step | Validation Loss | Exact Match | F1      |
|:-------------:|:-----:|:----:|:---------------:|:-----------:|:-------:|
| 6.2825        | 0.49  | 36   | 2.2341          | 49.2147     | 49.3071 |
| 3.465         | 0.98  | 72   | 1.8139          | 49.2147     | 49.4968 |
| 1.9165        | 1.48  | 108  | 1.3110          | 50.6545     | 59.1184 |
| 1.9165        | 1.97  | 144  | 0.9907          | 65.0524     | 72.4023 |
| 1.2487        | 2.46  | 180  | 0.9051          | 68.1937     | 75.7323 |
| 0.9426        | 2.95  | 216  | 0.8485          | 67.8010     | 75.3684 |
| 0.8069        | 3.45  | 252  | 0.8499          | 70.0262     | 77.7548 |
| 0.8069        | 3.94  | 288  | 0.9202          | 67.5393     | 74.8123 |
| 0.7193        | 4.44  | 324  | 0.7897          | 73.0366     | 79.9552 |
| 0.6234        | 4.92  | 360  | 0.7973          | 73.6911     | 80.5009 |
| 0.6234        | 5.42  | 396  | 0.8353          | 72.9058     | 80.2879 |
| 0.5583        | 5.91  | 432  | 0.8392          | 73.4293     | 80.6345 |
| 0.5263        | 6.41  | 468  | 0.8477          | 73.5602     | 81.0016 |
| 0.4642        | 6.9   | 504  | 0.8355          | 74.6073     | 81.7391 |
| 0.4642        | 7.39  | 540  | 0.8383          | 73.5602     | 81.1187 |
| 0.4381        | 7.88  | 576  | 0.8828          | 73.0366     | 79.8504 |
| 0.4099        | 8.38  | 612  | 0.8698          | 74.6073     | 81.6214 |
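The card does not state which metric implementation produced the Exact Match and F1 columns; the sketch below shows the conventional SQuAD-style computation via the `evaluate` library, with purely illustrative predictions and references. Since IDK-MRC contains unanswerable questions, the `squad_v2` variant of the metric may match the original evaluation more closely.

```python
# Minimal sketch of conventional EM/F1 scoring for extractive QA; the choice of
# metric implementation is an assumption, and the example data is illustrative.
import evaluate

squad_metric = evaluate.load("squad")

predictions = [{"id": "q1", "prediction_text": "Alexander Graham Bell"}]
references = [
    {
        "id": "q1",
        "answers": {"text": ["Alexander Graham Bell"], "answer_start": [23]},
    }
]

print(squad_metric.compute(predictions=predictions, references=references))
# {'exact_match': 100.0, 'f1': 100.0}
```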
### Framework versions
- Transformers 4.26.1
- Pytorch 1.13.1+cu117
- Datasets 2.2.0
- Tokenizers 0.13.2