# flan_vary_merged_5800_1
This model is a fine-tuned version of [google/flan-t5-base](https://huggingface.co/google/flan-t5-base) on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 0.1597
- Rouge1: 66.8856
- Rouge2: 55.6869
- Rougel: 63.8241
- Rougelsum: 66.7005
- Gen Len: 16.3392
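
The Rouge1/Rouge2/Rougel/Rougelsum scores above are ROUGE F1 values on a 0–100 scale (the convention of the `rouge_score` package typically used with the Trainer), and Gen Len is the average length in tokens of the generated summaries. As a rough illustration of what ROUGE-1 measures, here is a minimal unigram-overlap F1 sketch; it is not the stemmed, bootstrapped implementation behind the numbers above:

```python
from collections import Counter

def rouge1_f1(candidate: str, reference: str) -> float:
    """Unigram-overlap F1 between a candidate and a reference summary."""
    cand = Counter(candidate.split())
    ref = Counter(reference.split())
    overlap = sum((cand & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

# A perfect match scores 1.0, which the table reports as 100 on its 0-100 scale.
print(rouge1_f1("the cat sat on the mat", "the cat sat on the mat"))
```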
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 200
- num_epochs: 10
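
With `lr_scheduler_type: linear` and 200 warmup steps, the learning rate ramps from 0 up to 5e-05 over the first 200 optimizer steps and then decays linearly toward 0 for the rest of training. A minimal sketch of that schedule follows; the total step count of ~5700 is an assumption read off the step/epoch columns of the training log, not a recorded value:

```python
def linear_schedule_lr(step: int, base_lr: float = 5e-5,
                       warmup_steps: int = 200,
                       total_steps: int = 5700) -> float:
    """Learning rate at `step` for linear warmup followed by linear decay,
    matching the shape of transformers' get_linear_schedule_with_warmup.
    total_steps=5700 is an estimate from the training log, not a logged value."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps  # ramp up from 0 to base_lr
    # decay linearly from base_lr at the end of warmup to 0 at total_steps
    return base_lr * max(0.0, (total_steps - step) / (total_steps - warmup_steps))
```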
### Training results
Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
---|---|---|---|---|---|---|---|---|
11.8095 | 0.35 | 200 | 0.5275 | 38.2792 | 29.3331 | 37.9276 | 38.1283 | 8.0624 |
0.4481 | 0.7 | 400 | 0.3046 | 64.4437 | 52.3632 | 62.0225 | 64.2515 | 16.4262 |
0.3616 | 1.05 | 600 | 0.2656 | 64.9871 | 53.1185 | 62.4919 | 64.739 | 16.4279 |
0.2944 | 1.41 | 800 | 0.2412 | 65.2117 | 53.5512 | 62.6779 | 64.9318 | 16.4464 |
0.264 | 1.76 | 1000 | 0.2295 | 65.5748 | 54.0948 | 62.9803 | 65.3339 | 16.3866 |
0.2571 | 2.11 | 1200 | 0.2223 | 65.7216 | 53.793 | 62.9877 | 65.491 | 16.1898 |
0.2364 | 2.46 | 1400 | 0.2164 | 65.5444 | 53.9296 | 62.9975 | 65.3055 | 16.3172 |
0.2293 | 2.81 | 1600 | 0.2029 | 65.7977 | 54.3067 | 63.1851 | 65.5544 | 16.1766 |
0.2129 | 3.16 | 1800 | 0.2006 | 65.8342 | 53.9105 | 63.163 | 65.6175 | 16.1757 |
0.2184 | 3.51 | 2000 | 0.1931 | 65.1608 | 53.7707 | 62.6719 | 64.9743 | 16.1547 |
0.1952 | 3.87 | 2200 | 0.1873 | 66.3361 | 54.8382 | 63.2054 | 66.0954 | 16.3155 |
0.1992 | 4.22 | 2400 | 0.1847 | 66.316 | 55.0379 | 63.5154 | 66.0694 | 16.3594 |
0.1873 | 4.57 | 2600 | 0.1811 | 66.4999 | 55.263 | 63.8319 | 66.2513 | 16.3146 |
0.1839 | 4.92 | 2800 | 0.1783 | 66.0055 | 54.3406 | 62.9554 | 65.7387 | 16.3304 |
0.1748 | 5.27 | 3000 | 0.1777 | 66.1592 | 54.8048 | 63.407 | 66.0067 | 16.3348 |
0.1844 | 5.62 | 3200 | 0.1736 | 66.7642 | 55.3404 | 63.7069 | 66.5324 | 16.2996 |
0.1745 | 5.98 | 3400 | 0.1698 | 66.3946 | 55.1716 | 63.5596 | 66.1663 | 16.3216 |
0.1739 | 6.33 | 3600 | 0.1678 | 66.4472 | 55.1785 | 63.602 | 66.2704 | 16.3049 |
0.1633 | 6.68 | 3800 | 0.1680 | 66.6666 | 55.4584 | 63.8058 | 66.4708 | 16.3445 |
0.1659 | 7.03 | 4000 | 0.1682 | 66.6592 | 55.3712 | 63.5841 | 66.4587 | 16.2953 |
0.1557 | 7.38 | 4200 | 0.1634 | 66.876 | 55.423 | 63.8431 | 66.5569 | 16.2434 |
0.158 | 7.73 | 4400 | 0.1622 | 66.6165 | 55.2948 | 63.5996 | 66.4314 | 16.3849 |
0.1647 | 8.08 | 4600 | 0.1622 | 66.7592 | 55.5552 | 63.7194 | 66.5229 | 16.2794 |
0.1579 | 8.44 | 4800 | 0.1614 | 66.7889 | 55.5768 | 63.8266 | 66.5511 | 16.3181 |
0.1526 | 8.79 | 5000 | 0.1610 | 66.7516 | 55.5383 | 63.6509 | 66.5754 | 16.261 |
0.1506 | 9.14 | 5200 | 0.1608 | 66.9266 | 55.6277 | 63.7712 | 66.6668 | 16.3445 |
0.1502 | 9.49 | 5400 | 0.1604 | 66.9759 | 55.6586 | 63.8856 | 66.7849 | 16.3251 |
0.158 | 9.84 | 5600 | 0.1597 | 66.8856 | 55.6869 | 63.8241 | 66.7005 | 16.3392 |
### Framework versions
- Transformers 4.34.0
- Pytorch 2.0.1+cu117
- Datasets 2.14.4
- Tokenizers 0.14.0