# flan-t5-large-extraction-all-dm_8000-ep10-nonstop
This model is a fine-tuned version of [google/flan-t5-large](https://huggingface.co/google/flan-t5-large) on an unspecified dataset. It achieves the following results on the evaluation set (a minimal inference sketch follows the metrics):
- Loss: 1.5334
- Hint Hit Num: 2.2682
- Hint Precision: 0.4261
- Num: 5.2682
- Gen Len: 18.776
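
The checkpoint can be loaded with the standard Transformers seq2seq API. The sketch below is illustrative only: the repo id is a placeholder, and the prompt format is an assumption, since the exact extraction prompt used during training is not documented here.

```python
# Minimal inference sketch. The repo id and the prompt format are
# placeholders/assumptions, not documented parts of this model card.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "flan-t5-large-extraction-all-dm_8000-ep10-nonstop"  # hypothetical repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "Extract the key phrases: <your article text here>"  # assumed prompt format
inputs = tokenizer(text, return_tensors="pt", truncation=True)
# Eval Gen Len was ~18.8 tokens, so a max_length around 20 matches the reported behavior.
outputs = model.generate(**inputs, max_length=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```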
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a sketch of the equivalent `Seq2SeqTrainingArguments` follows the list):
- learning_rate: 2e-05
- train_batch_size: 12
- eval_batch_size: 96
- seed: 1799
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 10
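
For reference, this is roughly how those values map onto `Seq2SeqTrainingArguments` in Transformers 4.18. Only the values listed above come from the training run; `output_dir` and anything not listed is an assumption.

```python
# Sketch of the reported hyperparameters as Seq2SeqTrainingArguments
# (Transformers 4.18). output_dir is an assumption.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-large-extraction-all-dm_8000-ep10-nonstop",  # assumed
    learning_rate=2e-5,
    per_device_train_batch_size=12,
    per_device_eval_batch_size=96,
    seed=1799,
    adam_beta1=0.9,        # Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=10,
)
```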
### Training results
| Training Loss | Epoch | Step | Validation Loss | Hint Hit Num | Hint Precision | Num    | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:------------:|:--------------:|:------:|:-------:|
| 2.1988        | 0.3   | 200  | 1.5870          | 2.6768       | 0.4798         | 5.6224 | 18.9782 |
| 1.976         | 0.6   | 400  | 1.5613          | 2.5624       | 0.4663         | 5.495  | 18.9114 |
| 1.9367        | 0.9   | 600  | 1.5303          | 2.4822       | 0.4551         | 5.4574 | 18.9418 |
| 1.8626        | 1.2   | 800  | 1.5336          | 2.3896       | 0.4403         | 5.3966 | 18.9096 |
| 1.8278        | 1.5   | 1000 | 1.5110          | 2.5016       | 0.4514         | 5.5236 | 18.9486 |
| 1.8115        | 1.8   | 1200 | 1.5116          | 2.2886       | 0.4269         | 5.3196 | 18.9194 |
| 1.776         | 2.1   | 1400 | 1.5212          | 2.3278       | 0.4326         | 5.3394 | 18.8936 |
| 1.7332        | 2.4   | 1600 | 1.5172          | 2.2982       | 0.4323         | 5.2878 | 18.828  |
| 1.7543        | 2.7   | 1800 | 1.5003          | 2.473        | 0.4522         | 5.4414 | 18.9048 |
| 1.7212        | 3.0   | 2000 | 1.5051          | 2.3878       | 0.4389         | 5.4032 | 18.854  |
| 1.6915        | 3.3   | 2200 | 1.5083          | 2.3352       | 0.4347         | 5.3186 | 18.836  |
| 1.6808        | 3.6   | 2400 | 1.5065          | 2.3414       | 0.4367         | 5.321  | 18.8136 |
| 1.6812        | 3.9   | 2600 | 1.5047          | 2.3422       | 0.4376         | 5.3144 | 18.812  |
| 1.6408        | 4.2   | 2800 | 1.5158          | 2.3108       | 0.4297         | 5.33   | 18.8116 |
| 1.6266        | 4.5   | 3000 | 1.5086          | 2.2752       | 0.4227         | 5.329  | 18.8472 |
| 1.6144        | 4.8   | 3200 | 1.5120          | 2.2434       | 0.4192         | 5.283  | 18.8684 |
| 1.6164        | 5.1   | 3400 | 1.5135          | 2.3636       | 0.4356         | 5.3754 | 18.8526 |
| 1.5981        | 5.4   | 3600 | 1.5202          | 2.245        | 0.4201         | 5.2762 | 18.8574 |
| 1.5923        | 5.7   | 3800 | 1.5190          | 2.2462       | 0.4208         | 5.28   | 18.8358 |
| 1.5835        | 6.0   | 4000 | 1.5182          | 2.2812       | 0.4249         | 5.3042 | 18.8182 |
| 1.577         | 6.3   | 4200 | 1.5268          | 2.2928       | 0.4254         | 5.335  | 18.8268 |
| 1.5572        | 6.6   | 4400 | 1.5229          | 2.261        | 0.4237         | 5.276  | 18.7788 |
| 1.5522        | 6.9   | 4600 | 1.5153          | 2.3372       | 0.4323         | 5.3516 | 18.8326 |
| 1.5095        | 7.2   | 4800 | 1.5334          | 2.2108       | 0.4195         | 5.2086 | 18.7338 |
| 1.5568        | 7.5   | 5000 | 1.5243          | 2.302        | 0.4305         | 5.2964 | 18.7742 |
| 1.5373        | 7.8   | 5200 | 1.5277          | 2.2502       | 0.4204         | 5.2868 | 18.8176 |
| 1.5191        | 8.1   | 5400 | 1.5321          | 2.2716       | 0.4247         | 5.2856 | 18.7934 |
| 1.5261        | 8.4   | 5600 | 1.5300          | 2.2938       | 0.4273         | 5.3064 | 18.7828 |
| 1.5202        | 8.7   | 5800 | 1.5337          | 2.2744       | 0.4236         | 5.3086 | 18.8092 |
| 1.4942        | 9.0   | 6000 | 1.5351          | 2.2522       | 0.4239         | 5.257  | 18.7704 |
| 1.4816        | 9.3   | 6200 | 1.5349          | 2.2528       | 0.4247         | 5.2518 | 18.7682 |
| 1.5169        | 9.6   | 6400 | 1.5339          | 2.2698       | 0.4265         | 5.2646 | 18.7736 |
| 1.5007        | 9.9   | 6600 | 1.5334          | 2.269        | 0.4263         | 5.2664 | 18.776  |
### Framework versions
- Transformers 4.18.0
- Pytorch 1.10.0+cu111
- Datasets 2.5.1
- Tokenizers 0.12.1
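
To reproduce this environment, a pinning along these lines should work; the CUDA 11.1 PyTorch wheel is served from PyTorch's own wheel index rather than PyPI, so treat the exact command as a sketch.

```bash
# Sketch: pin the reported framework versions.
pip install transformers==4.18.0 datasets==2.5.1 tokenizers==0.12.1
# The +cu111 wheel is hosted on PyTorch's wheel index, not PyPI.
pip install torch==1.10.0+cu111 -f https://download.pytorch.org/whl/torch_stable.html
```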