# MultipleQG-Full_Ctxt_Only-filtered_0_15_BertQA
This model is a fine-tuned version of [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased) on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 0.6132
- Rouge1: 0.8802
- Rouge2: 0.6072
- Rougel: 0.6465
- Rougelsum: 0.6465
- Exact Match: 0.0
- Precision: [0.9531208276748657, 0.9756398797035217]
- Recall: [0.9530860781669617, 0.9767947793006897]
- F1: [0.9531034827232361, 0.9762169718742371]
- Hashcode: roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.28.0)

The Precision, Recall, and F1 entries are BERTScore results; the hashcode records the BERTScore configuration (roberta-large embeddings, layer 17, no IDF weighting, bert-score 0.3.12).
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 10
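With a linear scheduler and no warmup steps listed, the learning rate decays from 2e-05 to zero over training. A minimal sketch of that schedule, assuming zero warmup and using the step counts from the results table below (371 steps per epoch, 3710 total):

```python
def linear_lr(step: int, total_steps: int = 3710, base_lr: float = 2e-5) -> float:
    """Linearly decay the learning rate from base_lr at step 0 to 0 at total_steps.

    Sketch only: assumes no warmup, since none is listed above.
    """
    remaining = max(0.0, 1.0 - step / total_steps)
    return base_lr * remaining


print(linear_lr(0))     # 2e-05 (start of training)
print(linear_lr(1855))  # 1e-05 (halfway, end of epoch 5)
print(linear_lr(3710))  # 0.0  (end of training)
```

In the actual run this schedule would be applied per optimizer step by the Trainer; the function above only illustrates the decay shape.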
### Training results
| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Exact Match | Precision | Recall | F1 | Hashcode |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
1.134 | 1.0 | 371 | 0.7058 | 0.5419 | 0.2270 | 0.3234 | 0.3234 | 0.0 | [0.9516705274581909, 0.9165871143341064] | [0.9475164413452148, 0.9262969493865967] | [0.9495888948440552, 0.9214164614677429] | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.28.0) |
0.6373 | 2.0 | 742 | 0.6281 | 0.7103 | 0.3534 | 0.4532 | 0.4532 | 0.0 | [0.9580843448638916, 0.9491904973983765] | [0.9564405679702759, 0.9551339149475098] | [0.9572617411613464, 0.9521529674530029] | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.28.0) |
0.5373 | 3.0 | 1113 | 0.5986 | 0.7438 | 0.4155 | 0.5042 | 0.5042 | 0.0 | [0.9565318822860718, 0.9545961618423462] | [0.9535980820655823, 0.9588986039161682] | [0.955062747001648, 0.956742525100708] | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.28.0) |
0.4729 | 4.0 | 1484 | 0.5799 | 0.8038 | 0.4601 | 0.5572 | 0.5572 | 0.0 | [0.9547574520111084, 0.9647554159164429] | [0.9532629251480103, 0.9663883447647095] | [0.9540095925331116, 0.9655712246894836] | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.28.0) |
0.435 | 5.0 | 1855 | 0.6095 | 0.8298 | 0.5149 | 0.5972 | 0.5972 | 0.0 | [0.9546163082122803, 0.9788346290588379] | [0.9538379907608032, 0.9811053276062012] | [0.9542269706726074, 0.9799686670303345] | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.28.0) |
0.4077 | 6.0 | 2226 | 0.5844 | 0.8291 | 0.5040 | 0.5826 | 0.5826 | 0.0 | [0.9527148008346558, 0.9647933840751648] | [0.951576828956604, 0.9686386585235596] | [0.952145516872406, 0.9667121767997742] | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.28.0) |
0.3825 | 7.0 | 2597 | 0.5676 | 0.8650 | 0.5907 | 0.6354 | 0.6354 | 0.0 | [0.955225944519043, 0.9776592254638672] | [0.9552159905433655, 0.9790937900543213] | [0.9552209973335266, 0.9783759713172913] | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.28.0) |
0.3588 | 8.0 | 2968 | 0.5958 | 0.8736 | 0.5896 | 0.6366 | 0.6366 | 0.0 | [0.954782247543335, 0.9791052341461182] | [0.9545933604240417, 0.9802587032318115] | [0.954687774181366, 0.9796816110610962] | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.28.0) |
0.3405 | 9.0 | 3339 | 0.6026 | 0.8788 | 0.6048 | 0.6410 | 0.6410 | 0.0 | [0.953073263168335, 0.9756398797035217] | [0.9531681537628174, 0.9767947793006897] | [0.9531207084655762, 0.9762169718742371] | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.28.0) |
0.3253 | 10.0 | 3710 | 0.6132 | 0.8802 | 0.6072 | 0.6465 | 0.6465 | 0.0 | [0.9531208276748657, 0.9756398797035217] | [0.9530860781669617, 0.9767947793006897] | [0.9531034827232361, 0.9762169718742371] | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.28.0) |
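Note that validation loss bottoms out at epoch 7 (0.5676) and drifts upward afterwards, even though the Rouge scores keep improving through epoch 10; the reported final checkpoint is the epoch-10 one. A quick sketch, using the validation losses copied from the table above, of how one might locate the loss-minimizing epoch:

```python
# Validation losses per epoch, copied from the "Validation Loss" column above.
val_losses = {
    1: 0.7058, 2: 0.6281, 3: 0.5986, 4: 0.5799, 5: 0.6095,
    6: 0.5844, 7: 0.5676, 8: 0.5958, 9: 0.6026, 10: 0.6132,
}

# Epoch with the lowest validation loss.
best_epoch = min(val_losses, key=val_losses.get)
print(best_epoch, val_losses[best_epoch])  # 7 0.5676
```

Whether the epoch-7 or epoch-10 checkpoint is preferable depends on which metric matters for the downstream use; the card does not state which selection criterion was used.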
### Framework versions
- Transformers 4.28.0
- Pytorch 1.12.1+cu113
- Datasets 2.7.1
- Tokenizers 0.13.2