# flan-t5-large-extraction-all-dm_8000-ep25-nonstop

This model is a fine-tuned version of [google/flan-t5-large](https://huggingface.co/google/flan-t5-large) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 1.5739
- Hint Hit Num: 2.184
- Hint Precision: 0.4154
- Num: 5.218
- Gen Len: 18.738
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 24
- eval_batch_size: 96
- seed: 1799
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 25
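With `lr_scheduler_type: linear`, the learning rate decays linearly from `2e-05` to zero over the run. A minimal sketch of that schedule in plain Python, assuming the Trainer's default of zero warmup steps and a total step count inferred from the log table (about 334 optimizer steps per epoch for 25 epochs; the per-step values are an illustration, not logged data):

```python
import math

BASE_LR = 2e-05
STEPS_PER_EPOCH = math.ceil(8000 / 24)   # ~334 optimizer steps per epoch (assumed dataset size)
TOTAL_STEPS = 25 * STEPS_PER_EPOCH       # ~8350 steps over the whole run

def linear_lr(step: int, base_lr: float = BASE_LR,
              total_steps: int = TOTAL_STEPS, warmup_steps: int = 0) -> float:
    """Learning rate after `step` optimizer updates under linear warmup + decay."""
    if step < warmup_steps:
        # Ramp up from 0 to base_lr during warmup (unused here: warmup_steps=0).
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr to 0 between warmup and the final step.
    remaining = total_steps - step
    return base_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))
```

At the midpoint of training this yields exactly half the base rate, reaching zero at the final step.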
### Training results
| Training Loss | Epoch | Step | Validation Loss | Hint Hit Num | Hint Precision | Num | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:------------:|:--------------:|:---:|:-------:|
| 2.1513 | 0.6 | 200 | 1.5120 | 2.679 | 0.4838 | 5.551 | 18.94 |
| 1.9342 | 1.2 | 400 | 1.4818 | 2.563 | 0.4677 | 5.469 | 18.925 |
| 1.8591 | 1.8 | 600 | 1.4593 | 2.494 | 0.4607 | 5.396 | 18.899 |
| 1.7973 | 2.4 | 800 | 1.4597 | 2.389 | 0.4515 | 5.265 | 18.836 |
| 1.7824 | 2.99 | 1000 | 1.4494 | 2.387 | 0.4419 | 5.399 | 18.891 |
| 1.7263 | 3.59 | 1200 | 1.4597 | 2.278 | 0.4301 | 5.261 | 18.875 |
| 1.711 | 4.19 | 1400 | 1.4673 | 2.292 | 0.4314 | 5.272 | 18.826 |
| 1.6631 | 4.79 | 1600 | 1.4638 | 2.185 | 0.4163 | 5.177 | 18.832 |
| 1.6494 | 5.39 | 1800 | 1.4625 | 2.287 | 0.431 | 5.278 | 18.841 |
| 1.6328 | 5.99 | 2000 | 1.4584 | 2.209 | 0.4211 | 5.185 | 18.842 |
| 1.6008 | 6.59 | 2200 | 1.4677 | 2.299 | 0.4374 | 5.233 | 18.777 |
| 1.5646 | 7.19 | 2400 | 1.4902 | 2.182 | 0.4224 | 5.137 | 18.71 |
| 1.574 | 7.78 | 2600 | 1.4777 | 2.211 | 0.4235 | 5.19 | 18.781 |
| 1.5348 | 8.38 | 2800 | 1.4796 | 2.314 | 0.4311 | 5.317 | 18.792 |
| 1.5224 | 8.98 | 3000 | 1.4799 | 2.197 | 0.4212 | 5.17 | 18.805 |
| 1.4857 | 9.58 | 3200 | 1.4897 | 2.256 | 0.4296 | 5.221 | 18.755 |
| 1.4948 | 10.18 | 3400 | 1.5030 | 2.206 | 0.4203 | 5.201 | 18.76 |
| 1.4667 | 10.78 | 3600 | 1.4956 | 2.269 | 0.4319 | 5.203 | 18.772 |
| 1.4492 | 11.38 | 3800 | 1.5098 | 2.208 | 0.4191 | 5.235 | 18.801 |
| 1.4454 | 11.98 | 4000 | 1.5064 | 2.187 | 0.4153 | 5.22 | 18.799 |
| 1.4125 | 12.57 | 4200 | 1.5173 | 2.175 | 0.4164 | 5.182 | 18.766 |
| 1.426 | 13.17 | 4400 | 1.5299 | 2.162 | 0.414 | 5.189 | 18.772 |
| 1.3944 | 13.77 | 4600 | 1.5297 | 2.199 | 0.4182 | 5.224 | 18.797 |
| 1.382 | 14.37 | 4800 | 1.5301 | 2.204 | 0.4217 | 5.197 | 18.799 |
| 1.3836 | 14.97 | 5000 | 1.5303 | 2.188 | 0.4185 | 5.209 | 18.764 |
| 1.358 | 15.57 | 5200 | 1.5293 | 2.264 | 0.4283 | 5.261 | 18.812 |
| 1.3645 | 16.17 | 5400 | 1.5411 | 2.195 | 0.42 | 5.19 | 18.753 |
| 1.3455 | 16.77 | 5600 | 1.5417 | 2.267 | 0.4286 | 5.251 | 18.76 |
| 1.3395 | 17.37 | 5800 | 1.5436 | 2.207 | 0.4217 | 5.19 | 18.738 |
| 1.3302 | 17.96 | 6000 | 1.5468 | 2.268 | 0.4256 | 5.288 | 18.765 |
| 1.3329 | 18.56 | 6200 | 1.5488 | 2.265 | 0.4251 | 5.299 | 18.788 |
| 1.299 | 19.16 | 6400 | 1.5582 | 2.245 | 0.4253 | 5.25 | 18.717 |
| 1.3141 | 19.76 | 6600 | 1.5562 | 2.211 | 0.421 | 5.195 | 18.742 |
| 1.318 | 20.36 | 6800 | 1.5597 | 2.22 | 0.4204 | 5.24 | 18.776 |
| 1.2905 | 20.96 | 7000 | 1.5605 | 2.228 | 0.4224 | 5.24 | 18.745 |
| 1.2967 | 21.56 | 7200 | 1.5679 | 2.199 | 0.4149 | 5.255 | 18.798 |
| 1.2896 | 22.16 | 7400 | 1.5667 | 2.218 | 0.4212 | 5.229 | 18.736 |
| 1.2886 | 22.75 | 7600 | 1.5663 | 2.212 | 0.4175 | 5.262 | 18.8 |
| 1.2818 | 23.35 | 7800 | 1.5718 | 2.211 | 0.4193 | 5.228 | 18.757 |
| 1.2893 | 23.95 | 8000 | 1.5730 | 2.185 | 0.4155 | 5.215 | 18.737 |
| 1.2772 | 24.55 | 8200 | 1.5736 | 2.186 | 0.4153 | 5.224 | 18.753 |
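The Epoch column above is consistent with a training set of 8000 examples (suggested by the `dm_8000` suffix in the model name, an assumption) at `train_batch_size` 24, which gives `ceil(8000 / 24) = 334` optimizer steps per epoch. A quick cross-check:

```python
import math

# Assumed from the "dm_8000" model-name suffix and train_batch_size above.
steps_per_epoch = math.ceil(8000 / 24)  # 334

def epoch_at(step: int) -> float:
    """Fractional epoch reached after `step` optimizer updates, as logged."""
    return round(step / steps_per_epoch, 2)
```

This reproduces the logged pairs, e.g. step 200 at epoch 0.6 and step 8200 at epoch 24.55.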
### Framework versions
- Transformers 4.18.0
- Pytorch 1.10.0+cu111
- Datasets 2.5.1
- Tokenizers 0.12.1