flan-t5-base-gecfirst-e8-b16
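This model is a fine-tuned version of google/flan-t5-base. The training dataset was not recorded (the Trainer reported its name as None). It achieves the following results on the evaluation set, which match the step 1110 (epoch 3.74) row of the training results table, the row with the lowest validation loss: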
- Loss: 0.2233
- Rouge1: 42.0185
- Rouge2: 34.0704
- Rougel: 42.0403
- Rougelsum: 41.8957
- Gen Len: 18.9865
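For readers unfamiliar with the metric, ROUGE-1 is in essence an F-measure over unigram overlap between a generated text and its reference. The sketch below is a simplified illustration using whitespace tokenization. It is not the scorer that produced the numbers above, which likely applies its own tokenization and normalization. The function name `rouge1_f1` and the example sentences are hypothetical. The scores in this card appear to be on a 0-100 scale, whereas the sketch returns a value in [0, 1].

```python
from collections import Counter


def rouge1_f1(prediction: str, reference: str) -> float:
    """Simplified ROUGE-1 F-measure: unigram overlap with clipped counts.

    Assumption: lowercase + whitespace tokenization, which is cruder than
    what a real ROUGE implementation does.
    """
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0

    # Counter intersection keeps the minimum count of each shared token,
    # so a repeated word cannot be matched more often than it occurs.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# Hypothetical example pair: three of four unigrams match.
print(rouge1_f1("she go to school", "she goes to school"))  # 0.75
```

ROUGE-2 applies the same idea to bigrams, and ROUGE-L replaces n-gram overlap with the longest common subsequence.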
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.001
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adafactor
- lr_scheduler_type: linear
- num_epochs: 8
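To make the `linear` scheduler setting concrete, the sketch below computes the learning rate at a given optimizer step under a linear decay from the listed `learning_rate` down to zero. Two values here are assumptions, not facts stated in this card: `warmup_steps=0` (no warmup is listed) and `total_steps=2376`, which is inferred from the logged epoch fractions (step 2368 at epoch 7.97 is consistent with about 297 steps per epoch over 8 epochs). The function name `linear_lr` is hypothetical.

```python
def linear_lr(step: int,
              base_lr: float = 0.001,    # learning_rate from this card
              total_steps: int = 2376,   # ASSUMPTION: inferred, 297 steps/epoch * 8 epochs
              warmup_steps: int = 0) -> float:  # ASSUMPTION: no warmup listed
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr at the end of warmup to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 1110, 2368):
    print(step, linear_lr(step))
```

Under these assumptions the learning rate at step 1110, the checkpoint with the lowest validation loss, would be a little over half of its initial value.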
Training results
Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
---|---|---|---|---|---|---|---|---|
0.9817 | 0.25 | 74 | 0.4035 | 38.3182 | 28.427 | 38.3215 | 38.2591 | 18.9882 |
0.6805 | 0.5 | 148 | 0.3467 | 39.9659 | 30.9972 | 40.04 | 39.9619 | 18.9831 |
0.5885 | 0.75 | 222 | 0.3205 | 40.4848 | 31.33 | 40.4959 | 40.3976 | 18.9797 |
0.5476 | 1.0 | 296 | 0.2869 | 40.2589 | 31.6407 | 40.3592 | 40.223 | 18.9831 |
0.4504 | 1.25 | 370 | 0.2754 | 40.7626 | 31.8985 | 40.755 | 40.6406 | 18.9831 |
0.4463 | 1.49 | 444 | 0.2650 | 40.908 | 32.2358 | 40.9207 | 40.8062 | 18.9831 |
0.4155 | 1.74 | 518 | 0.2561 | 41.056 | 32.4906 | 41.029 | 40.9401 | 18.9831 |
0.3948 | 1.99 | 592 | 0.2493 | 41.1813 | 32.7917 | 41.2183 | 41.1256 | 18.9831 |
0.329 | 2.24 | 666 | 0.2413 | 41.8005 | 33.7235 | 41.88 | 41.7556 | 18.9831 |
0.3195 | 2.49 | 740 | 0.2390 | 41.5207 | 33.2502 | 41.5599 | 41.4291 | 18.9848 |
0.3148 | 2.74 | 814 | 0.2344 | 41.5913 | 33.398 | 41.614 | 41.4909 | 18.9831 |
0.316 | 2.99 | 888 | 0.2266 | 41.6858 | 33.7369 | 41.731 | 41.6293 | 18.9831 |
0.2498 | 3.24 | 962 | 0.2353 | 41.7077 | 33.3652 | 41.7111 | 41.6256 | 18.9848 |
0.2534 | 3.49 | 1036 | 0.2299 | 41.8645 | 33.9926 | 41.9435 | 41.8168 | 18.9848 |
0.2435 | 3.74 | 1110 | 0.2233 | 42.0185 | 34.0704 | 42.0403 | 41.8957 | 18.9865 |
0.2514 | 3.99 | 1184 | 0.2292 | 41.9069 | 33.8917 | 41.9112 | 41.7937 | 18.9831 |
0.193 | 4.24 | 1258 | 0.2462 | 41.9671 | 34.0261 | 42.024 | 41.9178 | 18.9831 |
0.1927 | 4.48 | 1332 | 0.2322 | 42.2226 | 34.6158 | 42.3306 | 42.1946 | 18.9848 |
0.1984 | 4.73 | 1406 | 0.2278 | 41.9762 | 34.0828 | 41.999 | 41.9107 | 18.9848 |
0.1929 | 4.98 | 1480 | 0.2299 | 41.8244 | 33.831 | 41.8673 | 41.7786 | 18.9848 |
0.1522 | 5.23 | 1554 | 0.2432 | 41.9142 | 33.9634 | 41.9635 | 41.859 | 18.9848 |
0.1509 | 5.48 | 1628 | 0.2408 | 41.707 | 33.6909 | 41.7144 | 41.6345 | 18.9831 |
0.1457 | 5.73 | 1702 | 0.2426 | 42.1729 | 34.2971 | 42.2318 | 42.119 | 18.9848 |
0.1497 | 5.98 | 1776 | 0.2386 | 42.1408 | 34.3303 | 42.148 | 42.0599 | 18.9865 |
0.1195 | 6.23 | 1850 | 0.2627 | 41.897 | 34.0092 | 41.9336 | 41.8262 | 18.9865 |
0.1145 | 6.48 | 1924 | 0.2560 | 41.8456 | 33.7951 | 41.8578 | 41.7709 | 18.9865 |
0.1198 | 6.73 | 1998 | 0.2525 | 41.8393 | 33.6033 | 41.8313 | 41.7505 | 18.9831 |
0.114 | 6.98 | 2072 | 0.2524 | 41.8194 | 33.7992 | 41.8752 | 41.775 | 18.9848 |
0.1 | 7.23 | 2146 | 0.2690 | 41.9724 | 33.9339 | 42.0248 | 41.9444 | 18.9848 |
0.0948 | 7.47 | 2220 | 0.2715 | 41.8806 | 33.9232 | 41.9392 | 41.8432 | 18.9848 |
0.0947 | 7.72 | 2294 | 0.2722 | 41.8981 | 33.8642 | 41.9622 | 41.877 | 18.9848 |
0.0904 | 7.97 | 2368 | 0.2705 | 41.8596 | 33.9216 | 41.915 | 41.8342 | 18.9848 |
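The table above can be read programmatically to answer two questions: which evaluation step is best under each criterion, and how many optimizer steps make up one epoch. The sketch below copies the epoch, step, validation loss, and Rouge1 columns from the table. The helper `consistent_steps_per_epoch` is a hypothetical name, and the search range of 250 to 349 is an arbitrary assumption chosen around the step logged at epoch 1.0.

```python
# (epoch, step, validation_loss, rouge1) copied from the training results table.
ROWS = [
    (0.25, 74, 0.4035, 38.3182), (0.5, 148, 0.3467, 39.9659),
    (0.75, 222, 0.3205, 40.4848), (1.0, 296, 0.2869, 40.2589),
    (1.25, 370, 0.2754, 40.7626), (1.49, 444, 0.2650, 40.908),
    (1.74, 518, 0.2561, 41.056), (1.99, 592, 0.2493, 41.1813),
    (2.24, 666, 0.2413, 41.8005), (2.49, 740, 0.2390, 41.5207),
    (2.74, 814, 0.2344, 41.5913), (2.99, 888, 0.2266, 41.6858),
    (3.24, 962, 0.2353, 41.7077), (3.49, 1036, 0.2299, 41.8645),
    (3.74, 1110, 0.2233, 42.0185), (3.99, 1184, 0.2292, 41.9069),
    (4.24, 1258, 0.2462, 41.9671), (4.48, 1332, 0.2322, 42.2226),
    (4.73, 1406, 0.2278, 41.9762), (4.98, 1480, 0.2299, 41.8244),
    (5.23, 1554, 0.2432, 41.9142), (5.48, 1628, 0.2408, 41.707),
    (5.73, 1702, 0.2426, 42.1729), (5.98, 1776, 0.2386, 42.1408),
    (6.23, 1850, 0.2627, 41.897), (6.48, 1924, 0.2560, 41.8456),
    (6.73, 1998, 0.2525, 41.8393), (6.98, 2072, 0.2524, 41.8194),
    (7.23, 2146, 0.2690, 41.9724), (7.47, 2220, 0.2715, 41.8806),
    (7.72, 2294, 0.2722, 41.8981), (7.97, 2368, 0.2705, 41.8596),
]

# Best evaluation step under each criterion.
best_by_loss = min(ROWS, key=lambda row: row[2])
best_by_rouge1 = max(ROWS, key=lambda row: row[3])


def consistent_steps_per_epoch(rows, low=250, high=350):
    """Return every steps-per-epoch value that reproduces all logged
    epoch fractions when step / n is rounded to two decimals."""
    return [
        n for n in range(low, high)
        if all(round(step / n, 2) == epoch for epoch, step, _, _ in rows)
    ]


candidates = consistent_steps_per_epoch(ROWS)

print("lowest validation loss:", best_by_loss)
print("highest Rouge1:", best_by_rouge1)
print("steps per epoch consistent with the log:", candidates)
```

Validation loss is lowest at step 1110 and rises afterwards while training loss keeps falling, which is the usual signature of overfitting. Rouge1 peaks slightly later, at step 1332. If the run used a single device with no gradient accumulation (an assumption; the card does not say), 297 steps per epoch at a batch size of 16 would imply a training set of at most 4,752 examples.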
Framework versions
- Transformers 4.28.1
- PyTorch 1.11.0a0+b6df043
- Datasets 2.12.0
- Tokenizers 0.13.3