t5-small-squad-qg-a2c-spt-valid
This model is a fine-tuned version of lmqg/t5-small-squad-qg on the qg_squad dataset. It achieves the following results on the evaluation set (these are the values from the final epoch, step 11840):
- Loss: 3.5585
- Bleu: 0.1856
- Precisions: [0.4899881007730557, 0.23798056024064962, 0.14699694604682728, 0.09541131612394267]
- Brevity Penalty: 0.9231
- Length Ratio: 0.9259
- Translation Length: 126899
- Reference Length: 137056
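The reported metrics are related to one another through the standard BLEU definition: the score is the brevity penalty multiplied by the geometric mean of the four n-gram precisions. The following is a minimal sketch, using only the numbers listed above, that recomputes the length ratio, brevity penalty, and BLEU as a consistency check. It assumes uniform weights over 1- to 4-grams and no smoothing; the variable names are illustrative and not part of any library.

```python
import math

# Values copied from the evaluation results listed above.
precisions = [
    0.4899881007730557,   # 1-gram precision
    0.23798056024064962,  # 2-gram precision
    0.14699694604682728,  # 3-gram precision
    0.09541131612394267,  # 4-gram precision
]
translation_length = 126899
reference_length = 137056

# Length ratio: generated tokens divided by reference tokens.
length_ratio = translation_length / reference_length

# Brevity penalty: 1.0 when the output is at least as long as the
# reference, otherwise exp(1 - reference_length / translation_length).
if translation_length >= reference_length:
    brevity_penalty = 1.0
else:
    brevity_penalty = math.exp(1 - reference_length / translation_length)

# Geometric mean of the n-gram precisions (uniform weights assumed).
geo_mean = math.exp(sum(math.log(p) for p in precisions) / len(precisions))

bleu = brevity_penalty * geo_mean

print(f"length ratio:    {length_ratio:.4f}")
print(f"brevity penalty: {brevity_penalty:.4f}")
print(f"BLEU:            {bleu:.4f}")
```

The recomputed values should agree with the reported Length Ratio, Brevity Penalty, and Bleu to about four decimal places. Because the brevity penalty is below 1, the generated questions are on average shorter than the references.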
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 64
- eval_batch_size: 64
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 10
- label_smoothing_factor: 0.15
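To make the learning-rate settings concrete, here is a minimal sketch of a linear decay schedule built from the values above and from the step counts in the training results table (1184 steps per epoch, 10 epochs). It assumes zero warmup steps, since no warmup is listed, and a single training process with no gradient accumulation. The function `linear_lr` is a hypothetical helper written for illustration, not a Trainer API.

```python
BASE_LR = 1e-4          # learning_rate from the list above
TRAIN_BATCH_SIZE = 64   # train_batch_size from the list above
STEPS_PER_EPOCH = 1184  # step count after epoch 1 in the results table
NUM_EPOCHS = 10
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS
WARMUP_STEPS = 0        # assumption: no warmup is listed in the card


def linear_lr(step, base_lr=BASE_LR, total=TOTAL_STEPS, warmup=WARMUP_STEPS):
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < warmup:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup)
    # Linear decay from base_lr down to 0 at the final step.
    remaining = max(0, total - step)
    return base_lr * remaining / max(1, total - warmup)


# With 1184 steps per epoch and batches of 64, the number of training
# examples lies in this range (the last batch may be partial).
min_examples = (STEPS_PER_EPOCH - 1) * TRAIN_BATCH_SIZE + 1
max_examples = STEPS_PER_EPOCH * TRAIN_BATCH_SIZE

for epoch in (0, 5, 10):
    step = epoch * STEPS_PER_EPOCH
    print(f"epoch {epoch:2d} (step {step:5d}): lr = {linear_lr(step):.2e}")
print(f"training examples: between {min_examples} and {max_examples}")
```

Under these assumptions the learning rate has fallen to half its initial value by the end of epoch 5 and reaches zero at step 11840.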
Training results
Training Loss | Epoch | Step | Validation Loss | Bleu | Precisions | Brevity Penalty | Length Ratio | Translation Length | Reference Length |
---|---|---|---|---|---|---|---|---|---|
3.4717 | 1.0 | 1184 | 3.5703 | 0.1850 | [0.4884210026960997, 0.23740423378300554, 0.14702360671696277, 0.09591845720324058] | 0.9198 | 0.9228 | 126479 | 137056 |
3.4432 | 2.0 | 2368 | 3.5676 | 0.1847 | [0.4899809765377299, 0.23739313808702955, 0.14709099076226004, 0.09610180163262601] | 0.9173 | 0.9205 | 126160 | 137056 |
3.4207 | 3.0 | 3552 | 3.5654 | 0.1855 | [0.48690609948692964, 0.236654650074526, 0.14669770766719153, 0.09533838196460138] | 0.9260 | 0.9286 | 127273 | 137056 |
3.4017 | 4.0 | 4736 | 3.5575 | 0.1861 | [0.4907433036243861, 0.23905491743183327, 0.14802083840498564, 0.09654473782730295] | 0.9195 | 0.9226 | 126449 | 137056 |
3.3862 | 5.0 | 5920 | 3.5540 | 0.1851 | [0.4916027385306181, 0.23877172085201795, 0.14769450336757936, 0.09608281170511601] | 0.9164 | 0.9197 | 126053 | 137056 |
3.3715 | 6.0 | 7104 | 3.5619 | 0.1847 | [0.4897172642552519, 0.23742624822429256, 0.14650127350144848, 0.09495653320731078] | 0.9209 | 0.9239 | 126620 | 137056 |
3.3602 | 7.0 | 8288 | 3.5581 | 0.1857 | [0.49199648336329865, 0.2390627732121, 0.14782006380301063, 0.09637410897534923] | 0.9180 | 0.9212 | 126257 | 137056 |
3.3523 | 8.0 | 9472 | 3.5575 | 0.1856 | [0.4896288812767368, 0.23802266135985578, 0.14728396021137705, 0.09588544697859817] | 0.9215 | 0.9244 | 126698 | 137056 |
3.3439 | 9.0 | 10656 | 3.5582 | 0.1862 | [0.4919672196048933, 0.23971752696254087, 0.14848694668474074, 0.09658739962940087] | 0.9183 | 0.9215 | 126295 | 137056 |
3.3395 | 10.0 | 11840 | 3.5585 | 0.1856 | [0.4899881007730557, 0.23798056024064962, 0.14699694604682728, 0.09541131612394267] | 0.9231 | 0.9259 | 126899 | 137056 |
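The table shows that the final epoch is not the best one on either validation metric. A short sketch that picks out the best epochs follows; it uses only the Epoch, Validation Loss, and Bleu columns copied from the table, and the variable names are illustrative.

```python
# (epoch, validation_loss, bleu) copied from the training results table.
results = [
    (1, 3.5703, 0.1850),
    (2, 3.5676, 0.1847),
    (3, 3.5654, 0.1855),
    (4, 3.5575, 0.1861),
    (5, 3.5540, 0.1851),
    (6, 3.5619, 0.1847),
    (7, 3.5581, 0.1857),
    (8, 3.5575, 0.1856),
    (9, 3.5582, 0.1862),
    (10, 3.5585, 0.1856),
]

# Epoch with the lowest validation loss.
best_loss_epoch, best_loss, _ = min(results, key=lambda row: row[1])

# Epoch with the highest BLEU.
best_bleu_epoch, _, best_bleu = max(results, key=lambda row: row[2])

# Spread of BLEU across all epochs, to gauge how much training changed it.
bleu_values = [row[2] for row in results]
bleu_spread = max(bleu_values) - min(bleu_values)

print(f"lowest validation loss: {best_loss} at epoch {best_loss_epoch}")
print(f"highest BLEU: {best_bleu} at epoch {best_bleu_epoch}")
print(f"BLEU spread across epochs: {bleu_spread:.4f}")
```

The lowest validation loss occurs at epoch 5 and the highest BLEU at epoch 9, while BLEU varies by only about 0.0015 across all ten epochs. The card does not say which checkpoint was kept; the headline results match the epoch 10 row.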
Framework versions
- Transformers 4.27.4
- PyTorch 1.9.0
- Datasets 2.9.0
- Tokenizers 0.13.2