# flan-t5-small-taboo-for-llms
This model is a fine-tuned version of [google/flan-t5-small](https://huggingface.co/google/flan-t5-small); the training dataset is not specified. It achieves the following results on the evaluation set:
- Loss: 2.4825
- Rouge1: 27.3897
- Rouge2: 9.9232
- Rougel: 24.2026
- Rougelsum: 24.6485
- Gen Len: 18.5172
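For readers unfamiliar with the metrics above: ROUGE-1 is an F1 score over unigram overlap between the generated text and the reference (ROUGE-2 uses bigrams, ROUGE-L the longest common subsequence). A minimal, illustrative sketch of ROUGE-1 F1 — the actual scores above come from the `rouge_score` package, which additionally applies stemming and other normalization:

```python
# Minimal ROUGE-1 F1 sketch (illustrative only; not the exact
# rouge_score implementation, which also stems and normalizes tokens).
from collections import Counter


def rouge1_f1(prediction: str, reference: str) -> float:
    """Unigram-overlap F1 between a prediction and a reference."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0
    # Clipped unigram overlap: each token counts at most as often
    # as it appears in the reference.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# 5 of 6 tokens overlap in both directions -> F1 = 5/6 ~ 0.8333
print(rouge1_f1("the cat sat on the mat", "the cat lay on the mat"))
```

The reported Rouge1 of 27.3897 is this kind of score averaged over the evaluation set and scaled to 0–100.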
## Model description
More information needed
## Intended uses & limitations
More information needed
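Until the card is completed, a hypothetical inference sketch using the standard Transformers seq2seq API. The fine-tuned checkpoint's Hub ID is not given here, so the base model `google/flan-t5-small` is used as a stand-in; substitute the fine-tuned repo name once it is published. The example prompt is likewise an assumption based on the model's name:

```python
# Hypothetical usage sketch. "google/flan-t5-small" is a stand-in for the
# fine-tuned checkpoint, whose Hub ID is not stated in this card.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "google/flan-t5-small"  # replace with the fine-tuned repo ID
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Example prompt; the actual task format depends on the training data.
prompt = "Describe the word 'apple' without using the word itself."
inputs = tokenizer(prompt, return_tensors="pt")
# max_new_tokens=20 roughly matches the ~18.5 average generation length
# reported in the evaluation results above.
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```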
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 30
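With 137 optimizer steps per epoch (see the results table below) and 30 epochs, training runs for 4110 steps in total. Since no warmup is listed, the linear scheduler simply decays the learning rate from 5e-05 to 0 over those steps. A small standalone sketch of that schedule (mirroring what `transformers`' linear schedule computes with zero warmup steps; the step counts are taken from this card):

```python
# Sketch of the linear LR schedule implied by the hyperparameters above
# (no warmup listed): the rate decays from learning_rate to 0 over all steps.
BASE_LR = 5e-05
STEPS_PER_EPOCH = 137               # from the training-results table
TOTAL_STEPS = 30 * STEPS_PER_EPOCH  # 30 epochs -> 4110 steps


def lr_at(step: int) -> float:
    """Learning rate after `step` optimizer updates."""
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / TOTAL_STEPS


print(lr_at(0))     # full rate at the start
print(lr_at(2055))  # half the rate at the halfway point
print(lr_at(4110))  # zero at the end of epoch 30
```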
### Training results
| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|
| No log | 1.0 | 137 | 2.5897 | 26.6789 | 9.9538 | 23.6637 | 24.2407 | 18.3621 |
| No log | 2.0 | 274 | 2.5560 | 25.4162 | 9.6277 | 22.7084 | 23.0883 | 18.3966 |
| No log | 3.0 | 411 | 2.5377 | 26.0239 | 9.7748 | 23.4425 | 23.7935 | 18.6034 |
| 2.8204 | 4.0 | 548 | 2.5241 | 26.6294 | 9.9168 | 23.8023 | 24.2756 | 18.7241 |
| 2.8204 | 5.0 | 685 | 2.5120 | 25.8274 | 9.9333 | 23.8865 | 24.0724 | 18.7586 |
| 2.8204 | 6.0 | 822 | 2.5031 | 26.7774 | 9.9651 | 24.3654 | 24.6102 | 18.6034 |
| 2.8204 | 7.0 | 959 | 2.4985 | 26.5058 | 10.0422 | 24.0403 | 24.635 | 18.4655 |
| 2.6101 | 8.0 | 1096 | 2.4934 | 26.6953 | 9.9536 | 24.0293 | 24.6809 | 18.4655 |
| 2.6101 | 9.0 | 1233 | 2.4907 | 26.7978 | 9.6249 | 23.714 | 23.9992 | 18.6034 |
| 2.6101 | 10.0 | 1370 | 2.4847 | 27.2135 | 9.878 | 23.8398 | 24.2389 | 18.5 |
| 2.4726 | 11.0 | 1507 | 2.4856 | 27.1799 | 9.9337 | 23.9393 | 24.4067 | 18.5172 |
| 2.4726 | 12.0 | 1644 | 2.4835 | 27.4491 | 10.1828 | 24.0926 | 24.4819 | 18.5 |
| 2.4726 | 13.0 | 1781 | 2.4825 | 27.3897 | 9.9232 | 24.2026 | 24.6485 | 18.5172 |
| 2.4726 | 14.0 | 1918 | 2.4836 | 27.5567 | 10.7405 | 24.2497 | 24.6566 | 18.5345 |
| 2.3731 | 15.0 | 2055 | 2.4872 | 27.7517 | 11.0182 | 24.1007 | 24.7218 | 18.4828 |
| 2.3731 | 16.0 | 2192 | 2.4852 | 27.3461 | 11.3381 | 24.084 | 24.5125 | 18.4655 |
| 2.3731 | 17.0 | 2329 | 2.4872 | 27.3558 | 11.1005 | 24.047 | 24.4973 | 18.4655 |
| 2.3731 | 18.0 | 2466 | 2.4841 | 26.9427 | 10.9288 | 23.7324 | 24.4298 | 18.5345 |
| 2.2967 | 19.0 | 2603 | 2.4881 | 27.5 | 10.8437 | 24.1593 | 24.6028 | 18.4483 |
| 2.2967 | 20.0 | 2740 | 2.4908 | 27.517 | 11.0039 | 24.1049 | 24.7111 | 18.5 |
| 2.2967 | 21.0 | 2877 | 2.4917 | 27.7333 | 10.935 | 24.4076 | 24.9887 | 18.4138 |
| 2.2553 | 22.0 | 3014 | 2.4926 | 27.6275 | 10.7562 | 24.2295 | 24.7476 | 18.4138 |
| 2.2553 | 23.0 | 3151 | 2.4945 | 27.9085 | 10.943 | 24.6135 | 25.2373 | 18.4138 |
| 2.2553 | 24.0 | 3288 | 2.4948 | 27.5261 | 10.7141 | 24.2429 | 24.816 | 18.4138 |
| 2.2553 | 25.0 | 3425 | 2.4931 | 27.5522 | 10.8702 | 24.5576 | 25.0714 | 18.4655 |
| 2.213 | 26.0 | 3562 | 2.4942 | 27.4758 | 11.0064 | 24.5062 | 25.05 | 18.4655 |
| 2.213 | 27.0 | 3699 | 2.4954 | 27.6967 | 11.1744 | 24.7646 | 25.3172 | 18.4655 |
| 2.213 | 28.0 | 3836 | 2.4951 | 27.7428 | 10.9365 | 24.6427 | 25.2432 | 18.5172 |
| 2.213 | 29.0 | 3973 | 2.4949 | 27.6877 | 10.9522 | 24.6101 | 25.2471 | 18.4655 |
| 2.1865 | 30.0 | 4110 | 2.4952 | 27.7295 | 11.0173 | 24.6556 | 25.2397 | 18.4655 |
### Framework versions
- Transformers 4.30.2
- Pytorch 2.0.1+cu118
- Datasets 2.13.0
- Tokenizers 0.13.3