# flan-t5-base-SQuAD-qag-ep12
This model is a fine-tuned version of [google/flan-t5-base](https://huggingface.co/google/flan-t5-base) on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 0.9491
- Rouge1: 39.8393
- Rouge2: 18.8838
- Rougel: 36.5176
- Rougelsum: 36.5411
- F1: 19.9955
- Exact Match: 13.8849
- Gen Len: 18.4103
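The Rougel and Rougelsum scores above are F-measures over the longest common subsequence (LCS) of tokens between a generated sequence and its reference. As a rough illustration, here is a minimal pure-Python sketch of ROUGE-L; it uses plain whitespace tokenization and lowercasing only, whereas the `rouge_score` package used in evaluation also applies normalization and stemming, so the numbers will not match exactly.

```python
def rouge_l_f(prediction: str, reference: str) -> float:
    """Simplified ROUGE-L: F-measure over the longest common subsequence
    of whitespace tokens. Illustrative only -- the rouge_score package
    adds normalization/stemming on top of this."""
    pred = prediction.lower().split()
    ref = reference.lower().split()
    if not pred or not ref:
        return 0.0
    # Dynamic-programming table for LCS length.
    dp = [[0] * (len(ref) + 1) for _ in range(len(pred) + 1)]
    for i, p in enumerate(pred, 1):
        for j, r in enumerate(ref, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if p == r else max(dp[i - 1][j], dp[i][j - 1])
    lcs = dp[-1][-1]
    if lcs == 0:
        return 0.0
    precision = lcs / len(pred)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)
```

Rouge1 and Rouge2 are the analogous F-measures over unigram and bigram overlap rather than the LCS.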
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 72
- eval_batch_size: 144
- seed: 1799
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 12
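The `linear` scheduler above decays the learning rate linearly from its initial value toward zero over training. A minimal sketch of that schedule, mirroring the shape of transformers' `get_linear_schedule_with_warmup`; the card does not report a warmup setting or the exact total step count, so `warmup_steps=0` and the step totals used below are assumptions for illustration:

```python
def linear_lr(step: int, total_steps: int, base_lr: float = 2e-05,
              warmup_steps: int = 0) -> float:
    """Linear schedule: ramp up over warmup_steps, then decay to 0.

    warmup_steps=0 is an assumption -- the model card does not report
    a warmup configuration.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    return base_lr * max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))
```

With the hyperparameters above, each step applies an Adam update with betas (0.9, 0.999) and epsilon 1e-08 at the scheduled learning rate.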
### Training results
| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | F1 | Exact Match | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|:-----------:|:-------:|
| 1.2248 | 0.76 | 200 | 0.9934 | 38.669 | 17.2616 | 35.2097 | 35.2505 | 17.2799 | 12.0464 | 18.5138 |
| 1.1059 | 1.52 | 400 | 0.9766 | 38.6953 | 17.255 | 35.1034 | 35.1485 | 16.7969 | 12.0464 | 18.5167 |
| 1.0784 | 2.28 | 600 | 0.9667 | 38.2853 | 17.165 | 34.8432 | 34.8858 | 17.5185 | 12.5786 | 18.4906 |
| 1.0628 | 3.04 | 800 | 0.9635 | 38.7413 | 17.4317 | 35.2968 | 35.3255 | 17.8293 | 12.6754 | 18.4751 |
| 1.0498 | 3.8 | 1000 | 0.9586 | 39.4263 | 18.0546 | 35.9042 | 35.9572 | 18.8349 | 13.4011 | 18.4475 |
| 1.0264 | 4.56 | 1200 | 0.9574 | 39.3702 | 17.9767 | 35.8919 | 35.9221 | 18.8685 | 13.1592 | 18.4272 |
| 1.0282 | 5.32 | 1400 | 0.9546 | 39.395 | 18.2361 | 35.9973 | 36.0321 | 19.2805 | 13.4494 | 18.4272 |
| 1.0119 | 6.08 | 1600 | 0.9550 | 39.3239 | 18.2565 | 35.9434 | 35.9944 | 19.2463 | 13.5946 | 18.4272 |
| 1.0022 | 6.84 | 1800 | 0.9526 | 39.6112 | 18.4996 | 36.1855 | 36.2362 | 19.7996 | 13.9816 | 18.4107 |
| 1.0 | 7.6 | 2000 | 0.9516 | 39.6419 | 18.5781 | 36.2251 | 36.2741 | 19.7594 | 13.6913 | 18.3991 |
| 0.9946 | 8.37 | 2200 | 0.9509 | 39.536 | 18.4871 | 36.1189 | 36.1826 | 19.6993 | 13.5462 | 18.4224 |
| 0.9888 | 9.13 | 2400 | 0.9503 | 39.7414 | 18.6988 | 36.3305 | 36.3656 | 19.8152 | 13.6913 | 18.4286 |
| 0.9838 | 9.89 | 2600 | 0.9500 | 39.9269 | 18.8588 | 36.5126 | 36.5534 | 19.9364 | 13.9332 | 18.4136 |
| 0.987 | 10.65 | 2800 | 0.9491 | 39.8393 | 18.8838 | 36.5176 | 36.5411 | 19.9955 | 13.8849 | 18.4103 |
| 0.9809 | 11.41 | 3000 | 0.9494 | 39.7507 | 18.6815 | 36.348 | 36.3761 | 19.7615 | 13.7397 | 18.4112 |
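The F1 and Exact Match columns follow the usual SQuAD-style answer evaluation: Exact Match checks whether the normalized prediction equals the normalized reference, and F1 scores token-level overlap between the two. A simplified sketch of both, assuming lowercasing and whitespace collapsing as the only normalization (the official SQuAD script additionally strips punctuation and the articles "a", "an", "the"):

```python
from collections import Counter

def normalize(text: str) -> str:
    # Simplified normalization: lowercase and collapse whitespace.
    # The official SQuAD script also removes punctuation and articles.
    return " ".join(text.lower().split())

def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))

def token_f1(prediction: str, reference: str) -> float:
    """Token-level F1: harmonic mean of overlap precision and recall."""
    pred = normalize(prediction).split()
    ref = normalize(reference).split()
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)
```

Scores are averaged over the evaluation set and reported here as percentages.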
### Framework versions
- Transformers 4.18.0
- Pytorch 1.11.0+cu113
- Datasets 2.5.1
- Tokenizers 0.12.1
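To reproduce this environment, the pinned versions above can be installed roughly as follows (a sketch; assumes the standard PyPI package names, with the CUDA 11.3 PyTorch build pulled from the PyTorch wheel index):

```shell
pip install transformers==4.18.0 datasets==2.5.1 tokenizers==0.12.1
pip install torch==1.11.0+cu113 --extra-index-url https://download.pytorch.org/whl/cu113
```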