# grizzled-interest-2023-03-29
This model is a fine-tuned version of facebook/mbart-large-cc25 on the mtc/newsum2021 dataset. It achieves the following results on the test set:
- Loss: 3.5178
- Rouge1: 31.4512
- Rouge2: 11.0965
- Rougel: 21.5021
- Rougelsum: 28.634
- Gen Len: 75.755
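The Rouge1 score above is the unigram-overlap F-measure between generated and reference summaries. As a rough illustration of what that metric computes, here is a simplified sketch in Python; the official scorer (e.g. the `rouge_score` package used by the Hugging Face `evaluate` library) additionally applies stemming and other preprocessing, so real scores will differ slightly.

```python
from collections import Counter

def rouge1_f(reference: str, candidate: str) -> float:
    """Simplified ROUGE-1: F1 over unigram overlap.

    Illustrative only -- the official scorer adds stemming
    and tokenization rules not reproduced here.
    """
    ref = Counter(reference.lower().split())
    cand = Counter(candidate.lower().split())
    # Count unigrams appearing in both, clipped to the smaller count.
    overlap = sum((ref & cand).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

print(round(rouge1_f("the cat sat on the mat",
                     "the cat lay on the mat"), 4))  # 0.8333
```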
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 3e-05
- train_batch_size: 2
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 16
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: polynomial
- lr_scheduler_warmup_steps: 500
- training_steps: 8000
- mixed_precision_training: Native AMP
- label_smoothing_factor: 0.1
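The `total_train_batch_size` of 32 follows from the other two batch settings: gradients are accumulated over 16 micro-batches of size 2 before each optimizer step. A minimal sketch of that arithmetic (variable names here are illustrative, not part of any API):

```python
# Values taken from the hyperparameter list above.
train_batch_size = 2              # per-device micro-batch size
gradient_accumulation_steps = 16  # micro-batches per optimizer step

# Effective batch size seen by each optimizer update:
total_train_batch_size = train_batch_size * gradient_accumulation_steps
print(total_train_batch_size)  # 32, matching the reported value
```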
### Training results
| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 3.6815        | 0.89  | 500  | 3.5617          | 29.5414 | 10.5201 | 20.056  | 27.2581   | 86.07   |
| 3.4132        | 1.79  | 1000 | 3.4133          | 29.6393 | 9.9545  | 19.6903 | 27.0861   | 96.545  |
| 3.198         | 2.68  | 1500 | 3.3693          | 29.8614 | 10.4517 | 20.1728 | 27.3879   | 94.31   |
| 3.0292        | 3.58  | 2000 | 3.3370          | 30.6444 | 11.5935 | 21.1955 | 28.2699   | 87.355  |
| 2.901         | 4.47  | 2500 | 3.3440          | 30.7453 | 11.111  | 21.2076 | 28.269    | 88.365  |
| 2.7832        | 5.37  | 3000 | 3.3758          | 30.4995 | 10.9025 | 20.6601 | 28.0575   | 104.655 |
| 2.6965        | 6.26  | 3500 | 3.3793          | 31.2287 | 11.5544 | 21.1909 | 28.738    | 88.47   |
| 2.6475        | 7.16  | 4000 | 3.4083          | 32.0341 | 11.9417 | 22.2785 | 29.2495   | 84.095  |
| 2.6196        | 8.05  | 4500 | 3.4007          | 30.8963 | 11.3811 | 21.3146 | 28.3222   | 90.875  |
| 2.5574        | 8.94  | 5000 | 3.4104          | 32.3867 | 12.0469 | 21.9831 | 29.5205   | 87.46   |
| 2.4977        | 9.84  | 5500 | 3.4340          | 32.5857 | 12.5072 | 22.6288 | 30.1168   | 79.87   |
| 2.4362        | 10.73 | 6000 | 3.4626          | 31.9121 | 11.8577 | 22.3647 | 29.3822   | 85.17   |
| 2.3977        | 11.63 | 6500 | 3.4737          | 32.0202 | 12.0413 | 22.5237 | 29.5166   | 77.905  |
| 2.369         | 12.52 | 7000 | 3.4890          | 31.2516 | 11.3416 | 21.5711 | 28.5465   | 85.605  |
| 2.3446        | 13.42 | 7500 | 3.4949          | 32.1277 | 11.6876 | 22.0244 | 29.2239   | 83.895  |
| 2.3295        | 14.31 | 8000 | 3.4976          | 31.8729 | 11.629  | 21.9629 | 28.9948   | 84.47   |
### Framework versions
- Transformers 4.26.1
- Pytorch 2.0.0.dev20230220+cu117
- Datasets 2.9.0
- Tokenizers 0.13.2