# flan-t5-base-TriviaQA-qag-ep10
This model is a fine-tuned version of [google/flan-t5-base](https://huggingface.co/google/flan-t5-base) on a dataset that was not recorded by the Trainer (presumably TriviaQA for question-answer generation, per the model name). It achieves the following results on the evaluation set:
- Loss: 1.3620
- Rouge1: 39.8143
- Rouge2: 17.6525
- Rougel: 33.7284
- Rougelsum: 33.7443
- F1: 11.1984
- Exact Match: 8.4778
- Gen Len: 18.7224
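
The F1 and Exact Match figures above are conventionally SQuAD-style token-level scores reported on a 0–100 scale. As a minimal sketch of how such scores are computed, assuming whitespace tokenization and simple lowercasing (the exact normalization used for this card is not documented):

```python
from collections import Counter

def normalize(text: str) -> str:
    # Minimal normalization: lowercase and collapse whitespace.
    # (An assumption; the card does not specify its normalization.)
    return " ".join(text.lower().split())

def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))

def token_f1(prediction: str, reference: str) -> float:
    """Token-level F1: harmonic mean of precision and recall over
    the multiset of tokens shared by prediction and reference."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    common = Counter(pred_tokens) & Counter(ref_tokens)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

Corpus-level scores are the per-example scores averaged over the evaluation set and multiplied by 100.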
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 72
- eval_batch_size: 144
- seed: 1799
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 10
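
With `lr_scheduler_type: linear` and no warmup steps reported, the learning rate decays linearly from 2e-05 toward zero over the course of training. A sketch of that schedule, with the total step count of roughly 8,570 inferred from the results table below (~857 optimizer steps per epoch × 10 epochs):

```python
def linear_lr(step: int, total_steps: int, base_lr: float = 2e-5,
              warmup_steps: int = 0) -> float:
    """Linear schedule in the style of transformers'
    get_linear_schedule_with_warmup: ramp up over warmup_steps,
    then decay linearly to zero at total_steps."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

# Approximate total optimizer steps for this run (an inference
# from the table, not a value stated in the card):
TOTAL_STEPS = 8570
```

At step 0 the rate is the full 2e-05, at the halfway point 1e-05, and at the final step 0.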
### Training results
| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | F1 | Exact Match | Gen Len |
|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| 1.718 | 0.58 | 500 | 1.4892 | 36.7354 | 14.5408 | 30.6756 | 30.6799 | 8.9323 | 6.2428 | 18.7535 |
| 1.6019 | 1.17 | 1000 | 1.4516 | 37.5084 | 15.1746 | 31.3612 | 31.3617 | 9.3629 | 6.885 | 18.7486 |
| 1.5581 | 1.75 | 1500 | 1.4303 | 37.8554 | 15.5686 | 31.7633 | 31.7591 | 10.2193 | 7.6044 | 18.7371 |
| 1.5273 | 2.33 | 2000 | 1.4158 | 38.1533 | 15.9332 | 32.103 | 32.1008 | 10.1181 | 7.5273 | 18.7349 |
| 1.5096 | 2.92 | 2500 | 1.4040 | 38.5218 | 16.3547 | 32.4545 | 32.4433 | 10.6931 | 8.0283 | 18.7178 |
| 1.4864 | 3.5 | 3000 | 1.3956 | 39.0574 | 16.8492 | 33.019 | 33.0265 | 10.8949 | 8.2209 | 18.7097 |
| 1.4785 | 4.08 | 3500 | 1.3890 | 38.9339 | 16.7759 | 32.8597 | 32.8672 | 10.5823 | 7.9769 | 18.7178 |
| 1.4622 | 4.67 | 4000 | 1.3826 | 39.2009 | 16.9529 | 33.1057 | 33.1127 | 10.9865 | 8.3751 | 18.7155 |
| 1.4505 | 5.25 | 4500 | 1.3779 | 39.3368 | 17.2173 | 33.2987 | 33.3018 | 10.9311 | 8.3109 | 18.7138 |
| 1.4483 | 5.83 | 5000 | 1.3728 | 39.7136 | 17.5316 | 33.607 | 33.6267 | 10.575 | 7.9512 | 18.7297 |
| 1.4329 | 6.42 | 5500 | 1.3706 | 39.4669 | 17.3885 | 33.412 | 33.4095 | 10.8311 | 8.2595 | 18.7306 |
| 1.4311 | 7.0 | 6000 | 1.3677 | 39.6353 | 17.4778 | 33.5505 | 33.5499 | 11.0071 | 8.4136 | 18.724 |
| 1.4229 | 7.58 | 6500 | 1.3661 | 39.6099 | 17.511 | 33.5553 | 33.5608 | 11.0084 | 8.4008 | 18.7231 |
| 1.4213 | 8.17 | 7000 | 1.3638 | 39.7805 | 17.6628 | 33.6967 | 33.7052 | 10.9316 | 8.3622 | 18.7141 |
| 1.4112 | 8.75 | 7500 | 1.3625 | 39.7281 | 17.6079 | 33.6809 | 33.6946 | 11.1526 | 8.4522 | 18.718 |
| 1.4141 | 9.33 | 8000 | 1.3623 | 39.6856 | 17.5403 | 33.5985 | 33.6198 | 11.1693 | 8.465 | 18.7188 |
| 1.4139 | 9.92 | 8500 | 1.3620 | 39.8143 | 17.6525 | 33.7284 | 33.7443 | 11.1984 | 8.4778 | 18.7224 |
### Framework versions
- Transformers 4.18.0
- Pytorch 1.11.0+cu113
- Datasets 2.5.1
- Tokenizers 0.12.1