# distilbart-cnn-arxiv-pubmed-v3-e32
This model is a fine-tuned version of [theojolliffe/distilbart-cnn-arxiv-pubmed](https://huggingface.co/theojolliffe/distilbart-cnn-arxiv-pubmed) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 0.9622
- Rouge1: 58.4519
- Rouge2: 45.6847
- Rougel: 49.3188
- Rougelsum: 57.1351
- Gen Len: 141.9815
## Model description
More information needed
## Intended uses & limitations
More information needed
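Pending fuller documentation, the checkpoint can be tried with the Hugging Face Transformers summarization pipeline. The sketch below is illustrative only: the repository id is assumed from the model name above, and the generation settings are not values recorded in this card.

```python
from transformers import pipeline

# Repo id assumed from the model name above; adjust if the checkpoint lives elsewhere.
summarizer = pipeline(
    "summarization",
    model="theojolliffe/distilbart-cnn-arxiv-pubmed-v3-e32",
)

text = "Replace this placeholder with the document to be summarised."

# max_length/min_length are illustrative assumptions; the Gen Len of ~142 reported
# above suggests summaries tend to reach a roughly 142-token generation cap.
result = summarizer(text, max_length=142, min_length=56, truncation=True)
print(result[0]["summary_text"])
```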
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 32
- mixed_precision_training: Native AMP
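
For reference, here is a minimal sketch of how these settings map onto `Seq2SeqTrainingArguments` in Transformers 4.18. The output directory, evaluation strategy, and `predict_with_generate` flag are assumptions, since the exact training script is not recorded in this card; the Adam betas and epsilon listed above are the library defaults and need no explicit arguments.

```python
from transformers import Seq2SeqTrainingArguments

# Mirrors the hyperparameters listed above; output_dir is a placeholder.
training_args = Seq2SeqTrainingArguments(
    output_dir="distilbart-cnn-arxiv-pubmed-v3-e32",
    learning_rate=2e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=32,
    fp16=True,                     # Native AMP mixed precision
    evaluation_strategy="epoch",   # assumption: the results table reports per-epoch evaluation
    predict_with_generate=True,    # assumption: needed to compute ROUGE during evaluation
)
```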
### Training results
Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
---|---|---|---|---|---|---|---|---|
1.4924 | 1.0 | 795 | 1.0924 | 52.3565 | 32.9081 | 34.6648 | 49.6351 | 142.0 |
0.8865 | 2.0 | 1590 | 0.9394 | 54.2962 | 35.9725 | 38.3888 | 51.5708 | 140.9815 |
0.6979 | 3.0 | 2385 | 0.8831 | 53.6795 | 35.226 | 37.4988 | 51.4424 | 141.8704 |
0.4868 | 4.0 | 3180 | 0.8457 | 53.9141 | 35.2212 | 37.6423 | 51.63 | 142.0 |
0.3903 | 5.0 | 3975 | 0.8252 | 54.8908 | 36.8468 | 39.072 | 52.6068 | 141.8704 |
0.2725 | 6.0 | 4770 | 0.8338 | 54.2424 | 36.4675 | 39.6312 | 51.9973 | 142.0 |
0.2177 | 7.0 | 5565 | 0.8224 | 54.0085 | 36.9395 | 39.7131 | 51.8476 | 142.0 |
0.1736 | 8.0 | 6360 | 0.8001 | 55.5106 | 38.8828 | 41.7174 | 53.3171 | 141.7222 |
0.1368 | 9.0 | 7155 | 0.8036 | 56.7284 | 40.8327 | 42.8486 | 54.6505 | 141.8519 |
0.1272 | 10.0 | 7950 | 0.8197 | 54.5703 | 38.5037 | 41.591 | 52.4417 | 141.2963 |
0.0977 | 11.0 | 8745 | 0.8463 | 55.3691 | 40.5406 | 43.9156 | 53.6637 | 141.7593 |
0.0768 | 12.0 | 9540 | 0.8467 | 56.7099 | 41.6472 | 44.8171 | 54.8111 | 142.0 |
0.0702 | 13.0 | 10335 | 0.8488 | 56.6646 | 41.2164 | 43.8938 | 54.7209 | 142.0 |
0.0597 | 14.0 | 11130 | 0.8543 | 55.7245 | 40.9593 | 42.5698 | 53.8763 | 142.0 |
0.0514 | 15.0 | 11925 | 0.8567 | 56.4837 | 41.8224 | 44.5484 | 54.9102 | 142.0 |
0.045 | 16.0 | 12720 | 0.8794 | 57.5862 | 43.4725 | 46.3658 | 55.9579 | 142.0 |
0.0367 | 17.0 | 13515 | 0.8974 | 57.1023 | 42.9042 | 45.8444 | 55.2216 | 142.0 |
0.0346 | 18.0 | 14310 | 0.9143 | 57.7781 | 43.8333 | 47.0943 | 56.0032 | 142.0 |
0.03 | 19.0 | 15105 | 0.9044 | 56.9211 | 41.9678 | 44.5081 | 54.8092 | 141.6667 |
0.0241 | 20.0 | 15900 | 0.9109 | 57.7747 | 44.1122 | 46.5743 | 55.9199 | 141.8148 |
0.0225 | 21.0 | 16695 | 0.9180 | 56.2307 | 42.2787 | 45.602 | 54.6285 | 142.0 |
0.0184 | 22.0 | 17490 | 0.9120 | 57.4024 | 43.657 | 46.5646 | 55.4614 | 142.0 |
0.0182 | 23.0 | 18285 | 0.9262 | 57.292 | 42.8935 | 46.1294 | 55.3741 | 141.963 |
0.016 | 24.0 | 19080 | 0.9268 | 58.2018 | 44.3914 | 47.7056 | 56.4628 | 142.0 |
0.0139 | 25.0 | 19875 | 0.9373 | 58.1187 | 44.7233 | 47.8946 | 56.26 | 142.0 |
0.0125 | 26.0 | 20670 | 0.9300 | 57.8399 | 44.3073 | 48.4549 | 56.1325 | 141.8889 |
0.012 | 27.0 | 21465 | 0.9487 | 57.8585 | 43.8361 | 47.6488 | 56.2748 | 142.0 |
0.0095 | 28.0 | 22260 | 0.9620 | 57.5966 | 44.0481 | 46.8771 | 56.079 | 141.6852 |
0.009 | 29.0 | 23055 | 0.9526 | 57.8869 | 44.2234 | 48.0884 | 56.3158 | 141.9815 |
0.008 | 30.0 | 23850 | 0.9626 | 58.2649 | 45.0371 | 48.5288 | 56.7707 | 141.9815 |
0.0076 | 31.0 | 24645 | 0.9640 | 58.1467 | 45.0457 | 48.7258 | 56.7111 | 141.3704 |
0.0072 | 32.0 | 25440 | 0.9622 | 58.4519 | 45.6847 | 49.3188 | 57.1351 | 141.9815 |
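
If this card follows the standard Transformers summarization setup, the ROUGE columns are mid F-measures scaled to 0-100. A minimal sketch of reproducing such scores with the `rouge` metric from Datasets 2.1 is shown below; the predictions and references are placeholders, and the `rouge_score` package must be installed separately.

```python
from datasets import load_metric

rouge = load_metric("rouge")  # requires: pip install rouge_score

predictions = ["placeholder model summary"]
references = ["placeholder reference summary"]

result = rouge.compute(predictions=predictions, references=references, use_stemmer=True)

# Mid F1 scores scaled to 0-100, matching the scale used in the table above.
scores = {k: round(v.mid.fmeasure * 100, 4) for k, v in result.items()}
print(scores)  # {'rouge1': ..., 'rouge2': ..., 'rougeL': ..., 'rougeLsum': ...}
```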
### Framework versions
- Transformers 4.18.0
- Pytorch 1.11.0+cu113
- Datasets 2.1.0
- Tokenizers 0.12.1