# flan-t5-large-work-filters
This model is a fine-tuned version of [google/flan-t5-base](https://huggingface.co/google/flan-t5-base) on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 0.0362
- Rouge1: 41.8961
- Rouge2: 31.4402
- Rougel: 41.841
- Rougelsum: 41.9024
- Gen Len: 18.9259
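
The ROUGE figures above are on the usual 0–100 scale. As a point of reference, below is a minimal sketch of how scores of this kind can be computed with the Hugging Face `evaluate` library; the prediction/reference strings are placeholders, since the evaluation data is not documented here.

```python
# Sketch: computing ROUGE scores on the same 0-100 scale as the metrics above.
# The prediction/reference pairs are placeholders, not examples from the real evaluation set.
import evaluate

rouge = evaluate.load("rouge")
predictions = ["a decoded model output"]   # hypothetical generated text
references = ["the expected target text"]  # hypothetical gold target
scores = rouge.compute(predictions=predictions, references=references, use_stemmer=True)
print({k: round(v * 100, 4) for k, v in scores.items()})  # rouge1, rouge2, rougeL, rougeLsum
```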
## Model description
More information needed
## Intended uses & limitations
More information needed
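
The task and prompt format for this checkpoint are not documented. As a placeholder, here is a minimal text-to-text inference sketch with 🤗 Transformers; the model id is assumed from the title above and may need an organization prefix or a local path.

```python
# Minimal inference sketch. The model id and the input prompt are placeholders.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "flan-t5-large-work-filters"  # assumption: adjust to the actual Hub repo or local checkpoint path
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

inputs = tokenizer("your input text here", return_tensors="pt")
# The Gen Len reported above is roughly 19 tokens, so a small generation budget is enough.
output_ids = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```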
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a `Seq2SeqTrainingArguments` sketch mirroring them follows the list):
- learning_rate: 3e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 30
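
The values above correspond to a standard `Seq2SeqTrainingArguments` configuration. A sketch mirroring them is below; the output directory, per-epoch evaluation, and `predict_with_generate` flag are assumptions inferred from the results table, not documented settings.

```python
# Sketch of a Seq2SeqTrainingArguments setup matching the listed hyperparameters.
# Only the listed values come from this card; everything else is an assumption.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-large-work-filters",  # assumed output path
    learning_rate=3e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=30,
    evaluation_strategy="epoch",   # assumption: the results table reports one evaluation per epoch
    predict_with_generate=True,    # assumption: needed to produce the ROUGE / Gen Len metrics
)
# Adam betas=(0.9, 0.999) and epsilon=1e-08 match the Transformers defaults, so they are not set explicitly.
```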
### Training results
| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 0.5256 | 1.0 | 213 | 0.2899 | 41.8953 | 29.9382 | 41.4023 | 41.4413 | 18.9788 |
| 0.2377 | 2.0 | 426 | 0.1172 | 42.3662 | 31.0031 | 41.997 | 42.0997 | 19.0 |
| 0.1501 | 3.0 | 639 | 0.1091 | 42.0009 | 31.2986 | 41.8735 | 41.9067 | 19.0 |
| 0.1256 | 4.0 | 852 | 0.0905 | 43.6233 | 32.9567 | 43.5606 | 43.6054 | 18.9788 |
| 0.0997 | 5.0 | 1065 | 0.0936 | 43.4929 | 32.9118 | 43.5026 | 43.5589 | 18.9577 |
| 0.0792 | 6.0 | 1278 | 0.0743 | 43.3921 | 32.8388 | 43.3863 | 43.4487 | 18.9577 |
| 0.0738 | 7.0 | 1491 | 0.0613 | 42.3912 | 31.6893 | 42.3324 | 42.375 | 18.9577 |
| 0.0621 | 8.0 | 1704 | 0.0753 | 42.4408 | 31.7954 | 42.391 | 42.4501 | 18.9577 |
| 0.0664 | 9.0 | 1917 | 0.0568 | 42.0348 | 31.4631 | 41.9591 | 42.0159 | 18.9577 |
| 0.0575 | 10.0 | 2130 | 0.0576 | 43.0601 | 32.8756 | 42.9724 | 43.0502 | 18.9577 |
| 0.0488 | 11.0 | 2343 | 0.0473 | 42.3785 | 31.845 | 42.2759 | 42.37 | 18.9577 |
| 0.0528 | 12.0 | 2556 | 0.0503 | 43.1495 | 32.7992 | 43.1017 | 43.1919 | 18.9577 |
| 0.0392 | 13.0 | 2769 | 0.0407 | 42.0459 | 31.7063 | 41.9685 | 42.0368 | 18.9259 |
| 0.0462 | 14.0 | 2982 | 0.0446 | 43.473 | 33.1682 | 43.4607 | 43.5482 | 18.9259 |
| 0.0449 | 15.0 | 3195 | 0.0426 | 43.2263 | 32.5799 | 43.2171 | 43.255 | 18.9577 |
| 0.0432 | 16.0 | 3408 | 0.0419 | 42.2094 | 31.7081 | 42.1549 | 42.2244 | 18.9577 |
| 0.037 | 17.0 | 3621 | 0.0398 | 42.2089 | 31.5243 | 42.1439 | 42.213 | 18.9259 |
| 0.0376 | 18.0 | 3834 | 0.0402 | 42.624 | 31.7967 | 42.5462 | 42.6104 | 18.9259 |
| 0.0423 | 19.0 | 4047 | 0.0406 | 42.6076 | 31.9496 | 42.5665 | 42.6086 | 18.9259 |
| 0.0364 | 20.0 | 4260 | 0.0406 | 43.4863 | 33.0331 | 43.4492 | 43.5222 | 18.9259 |
| 0.0326 | 21.0 | 4473 | 0.0362 | 41.8961 | 31.4402 | 41.841 | 41.9024 | 18.9259 |
| 0.0302 | 22.0 | 4686 | 0.0410 | 42.9891 | 32.761 | 42.9509 | 42.9624 | 18.9259 |
| 0.0318 | 23.0 | 4899 | 0.0411 | 42.861 | 32.4727 | 42.8046 | 42.8544 | 18.9259 |
| 0.034 | 24.0 | 5112 | 0.0387 | 42.6177 | 32.1915 | 42.4974 | 42.5653 | 18.9259 |
| 0.0307 | 25.0 | 5325 | 0.0373 | 43.2371 | 32.9299 | 43.2075 | 43.2857 | 18.9259 |
| 0.0308 | 26.0 | 5538 | 0.0377 | 42.8476 | 32.5806 | 42.7802 | 42.837 | 18.9259 |
| 0.0282 | 27.0 | 5751 | 0.0381 | 42.9285 | 32.4737 | 42.8965 | 42.945 | 18.9259 |
| 0.0277 | 28.0 | 5964 | 0.0383 | 42.6384 | 31.6781 | 42.567 | 42.6305 | 18.9259 |
| 0.0316 | 29.0 | 6177 | 0.0380 | 42.9983 | 32.5656 | 42.9407 | 42.9974 | 18.9259 |
| 0.0357 | 30.0 | 6390 | 0.0378 | 43.0447 | 32.5656 | 43.0102 | 43.0802 | 18.9259 |
### Framework versions
- Transformers 4.27.2
- Pytorch 2.0.0+cu118
- Datasets 2.9.0
- Tokenizers 0.13.3