# text_shortening_model_v42

This model is a fine-tuned version of [facebook/bart-large-xsum](https://huggingface.co/facebook/bart-large-xsum) on an unspecified dataset. It achieves the following results on the evaluation set:

- Loss: 3.2972
- Rouge1: 0.4588
- Rouge2: 0.2356
- Rougel: 0.4162
- Rougelsum: 0.4165
- Bert precision: 0.8664
- Bert recall: 0.8655
- Average word count: 8.5616
- Max word count: 16
- Min word count: 4
- Average token count: 16.1051
- % shortened texts with length > 12: 4.8048
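
For reference, a minimal loading-and-inference sketch. The Hub id below is a hypothetical placeholder (the card does not record where the model is published), and the generation settings are illustrative rather than the ones used for evaluation:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Hypothetical repository id; replace with the actual path to this model.
model_id = "your-username/text_shortening_model_v42"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "The committee has decided to postpone the annual general meeting until further notice."
inputs = tokenizer(text, return_tensors="pt", truncation=True)

# Beam search with a short generation budget, in line with the ~16-token
# average output length reported above; tune as needed.
output_ids = model.generate(**inputs, num_beams=4, max_new_tokens=32)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```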
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0003
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 20
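
As a sketch, these settings map onto `Seq2SeqTrainingArguments` roughly as follows; the output directory is a placeholder, and `evaluation_strategy="epoch"` is an assumption inferred from the per-epoch results table below:

```python
from transformers import Seq2SeqTrainingArguments

# Mirrors the hyperparameters listed above. Adam with betas=(0.9, 0.999) and
# epsilon=1e-08 is the transformers default optimizer, so it needs no
# explicit arguments here.
training_args = Seq2SeqTrainingArguments(
    output_dir="text_shortening_model_v42",  # placeholder path
    learning_rate=3e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=20,
    evaluation_strategy="epoch",   # assumption: the table below logs eval once per epoch
    predict_with_generate=True,    # assumption: needed for the ROUGE/BERTScore metrics
)
```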
### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Bert precision | Bert recall | Average word count | Max word count | Min word count | Average token count | % shortened texts with length > 12 |
|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|
| 1.1087 | 1.0 | 73 | 2.0307 | 0.4468 | 0.2283 | 0.3951 | 0.394 | 0.8582 | 0.8635 | 8.5435 | 15 | 4 | 14.6997 | 3.6036 |
| 0.6451 | 2.0 | 146 | 2.0108 | 0.4629 | 0.2419 | 0.4159 | 0.4142 | 0.8724 | 0.8668 | 8.1081 | 17 | 5 | 14.7718 | 4.2042 |
| 0.4594 | 3.0 | 219 | 1.9499 | 0.4267 | 0.229 | 0.3887 | 0.3882 | 0.8579 | 0.8575 | 8.3093 | 16 | 5 | 13.976 | 1.8018 |
| 0.4681 | 4.0 | 292 | 2.0819 | 0.4127 | 0.2049 | 0.3734 | 0.372 | 0.8549 | 0.8543 | 8.3123 | 17 | 4 | 15.3514 | 3.6036 |
| 0.334 | 5.0 | 365 | 2.1413 | 0.4302 | 0.2184 | 0.3885 | 0.3886 | 0.857 | 0.8595 | 8.8589 | 15 | 4 | 14.5285 | 3.6036 |
| 0.296 | 6.0 | 438 | 2.0881 | 0.4716 | 0.2349 | 0.4216 | 0.4217 | 0.8684 | 0.8706 | 8.7928 | 16 | 5 | 15.0841 | 6.006 |
| 0.2588 | 7.0 | 511 | 2.2671 | 0.4517 | 0.2262 | 0.4085 | 0.4079 | 0.8654 | 0.8632 | 8.4985 | 14 | 4 | 14.8258 | 3.3033 |
| 0.1883 | 8.0 | 584 | 2.4313 | 0.4572 | 0.2369 | 0.409 | 0.4099 | 0.8646 | 0.867 | 8.7207 | 16 | 5 | 14.2192 | 4.2042 |
| 0.1822 | 9.0 | 657 | 2.3293 | 0.4413 | 0.2154 | 0.3943 | 0.3936 | 0.857 | 0.8619 | 8.8318 | 16 | 4 | 16.2973 | 6.006 |
| 0.1298 | 10.0 | 730 | 2.4037 | 0.4614 | 0.2303 | 0.4145 | 0.4144 | 0.8668 | 0.866 | 8.4715 | 18 | 4 | 15.8348 | 6.3063 |
| 0.1413 | 11.0 | 803 | 2.7031 | 0.4533 | 0.2337 | 0.4099 | 0.4095 | 0.8656 | 0.8637 | 8.2943 | 16 | 4 | 15.9009 | 4.2042 |
| 0.0786 | 12.0 | 876 | 2.5766 | 0.441 | 0.2218 | 0.3982 | 0.3982 | 0.8609 | 0.8613 | 8.5916 | 16 | 4 | 15.8228 | 3.6036 |
| 0.0662 | 13.0 | 949 | 2.8013 | 0.4408 | 0.2177 | 0.3989 | 0.3984 | 0.8573 | 0.8596 | 8.5946 | 15 | 4 | 16.4204 | 4.2042 |
| 0.0635 | 14.0 | 1022 | 2.8125 | 0.44 | 0.2265 | 0.3974 | 0.3975 | 0.8591 | 0.8618 | 8.8919 | 17 | 4 | 16.7898 | 4.5045 |
| 0.0648 | 15.0 | 1095 | 2.7665 | 0.4642 | 0.2371 | 0.42 | 0.4197 | 0.8662 | 0.8675 | 8.7477 | 16 | 4 | 15.6186 | 4.8048 |
| 0.0446 | 16.0 | 1168 | 3.1244 | 0.4599 | 0.2327 | 0.4211 | 0.4205 | 0.8656 | 0.8667 | 8.6396 | 16 | 4 | 16.1351 | 5.7057 |
| 0.0475 | 17.0 | 1241 | 3.3107 | 0.4626 | 0.24 | 0.422 | 0.4221 | 0.8673 | 0.8696 | 8.7027 | 16 | 5 | 16.3934 | 5.4054 |
| 0.0332 | 18.0 | 1314 | 3.1808 | 0.465 | 0.2413 | 0.4231 | 0.4231 | 0.8672 | 0.867 | 8.5315 | 16 | 5 | 16.048 | 5.1051 |
| 0.0252 | 19.0 | 1387 | 3.2446 | 0.4587 | 0.2315 | 0.4142 | 0.4143 | 0.866 | 0.8655 | 8.5586 | 16 | 4 | 16.012 | 4.8048 |
| 0.0294 | 20.0 | 1460 | 3.2972 | 0.4588 | 0.2356 | 0.4162 | 0.4165 | 0.8664 | 0.8655 | 8.5616 | 16 | 4 | 16.1051 | 4.8048 |
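
The evaluation script itself is not included in this card; below is a sketch of how the reported metrics could be reproduced, using the `evaluate` library and plain Python for the length statistics (the prediction/reference pair is illustrative, not from the real evaluation set):

```python
import evaluate

rouge = evaluate.load("rouge")
bertscore = evaluate.load("bertscore")

# Illustrative data; the actual evaluation set is not documented in this card.
predictions = ["Committee postpones annual general meeting."]
references = ["The annual general meeting has been postponed until further notice."]

rouge_scores = rouge.compute(predictions=predictions, references=references)
bert_scores = bertscore.compute(predictions=predictions, references=references, lang="en")

# Length statistics matching the table columns above.
word_counts = [len(p.split()) for p in predictions]
length_stats = {
    "average_word_count": sum(word_counts) / len(word_counts),
    "max_word_count": max(word_counts),
    "min_word_count": min(word_counts),
    "% shortened texts with length > 12": 100 * sum(c > 12 for c in word_counts) / len(word_counts),
}

print(rouge_scores)
print(sum(bert_scores["precision"]) / len(bert_scores["precision"]),
      sum(bert_scores["recall"]) / len(bert_scores["recall"]))
print(length_stats)
```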
### Framework versions

- Transformers 4.33.1
- Pytorch 2.0.1+cu118
- Datasets 2.14.5
- Tokenizers 0.13.3