# synpre_sort_1M_t5-small
This model is a fine-tuned version of [t5-small](https://huggingface.co/t5-small) on the tyzhu/synpre_sort_1M dataset. It achieves the following results on the evaluation set:
- Loss: 0.0190
- Bleu: 98.0316
- Gen Len: 111.961
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
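No description of the data is provided; the dataset name suggests a synthetic sorting task. Purely as an illustration (the actual tyzhu/synpre_sort_1M format is an assumption here), seq2seq pairs for such a task might look like:

```python
import random

def make_sort_pair(n, rng):
    # Hypothetical seq2seq pair for a synthetic sorting task:
    # the source is a shuffled integer sequence and the target is
    # the same integers in ascending order. The real dataset's
    # format and vocabulary may differ.
    nums = [rng.randint(0, 999) for _ in range(n)]
    source = " ".join(map(str, nums))
    target = " ".join(map(str, sorted(nums)))
    return source, target

rng = random.Random(42)
source, target = make_sort_pair(5, rng)
print(source)
print(target)
```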
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 128
- eval_batch_size: 128
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: constant_with_warmup
- lr_scheduler_warmup_steps: 10000
- training_steps: 200000
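The `constant_with_warmup` schedule ramps the learning rate linearly from 0 to `learning_rate` over the warmup steps, then holds it constant for the remaining steps. A minimal sketch of that behavior (not the Transformers implementation itself), using the hyperparameters above:

```python
def constant_with_warmup_lr(step, base_lr=1e-4, warmup_steps=10_000):
    # Linear warmup from 0 to base_lr over warmup_steps,
    # then constant at base_lr for the rest of training.
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr

print(constant_with_warmup_lr(5_000))    # halfway through warmup: base_lr / 2
print(constant_with_warmup_lr(150_000))  # after warmup: 0.0001
```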
### Training results
| Training Loss | Epoch | Step   | Validation Loss | Bleu    | Gen Len  |
|:-------------:|:-----:|:------:|:---------------:|:-------:|:--------:|
3.741 | 0.64 | 5000 | 3.7062 | 3.4412 | 254.9636 |
3.3709 | 1.28 | 10000 | 3.2792 | 4.175 | 250.294 |
3.0019 | 1.92 | 15000 | 3.0466 | 8.6186 | 231.7038 |
2.4007 | 2.56 | 20000 | 1.8277 | 28.7633 | 143.017 |
0.657 | 3.2 | 25000 | 0.3741 | 63.4137 | 137.5676 |
0.2657 | 3.84 | 30000 | 0.1893 | 88.1786 | 107.6849 |
0.1734 | 4.48 | 35000 | 0.1292 | 91.0577 | 112.1501 |
0.1347 | 5.12 | 40000 | 0.0941 | 86.8779 | 120.4858 |
0.1084 | 5.76 | 45000 | 0.0758 | 87.4286 | 120.1785 |
0.0894 | 6.4 | 50000 | 0.0574 | 95.8102 | 112.2934 |
0.0779 | 7.04 | 55000 | 0.0511 | 96.0991 | 111.8631 |
0.0662 | 7.68 | 60000 | 0.0440 | 93.9008 | 114.9835 |
0.0588 | 8.32 | 65000 | 0.0414 | 95.7738 | 112.0711 |
0.0542 | 8.96 | 70000 | 0.0323 | 95.9925 | 113.2 |
0.0489 | 9.6 | 75000 | 0.0439 | 92.3498 | 116.3835 |
0.045 | 10.24 | 80000 | 0.0334 | 94.4857 | 114.6477 |
0.0404 | 10.88 | 85000 | 0.0297 | 94.7885 | 114.4035 |
0.0368 | 11.52 | 90000 | 0.0343 | 94.0487 | 114.9892 |
0.0341 | 12.16 | 95000 | 0.0479 | 91.8024 | 116.8868 |
0.0331 | 12.8 | 100000 | 0.0207 | 97.1429 | 112.4291 |
0.0303 | 13.44 | 105000 | 0.0212 | 96.508 | 113.036 |
0.0288 | 14.08 | 110000 | 0.0203 | 97.158 | 112.5062 |
0.0272 | 14.72 | 115000 | 0.0188 | 97.6459 | 111.7762 |
0.0255 | 15.36 | 120000 | 0.0171 | 97.6568 | 112.1423 |
0.0254 | 16.0 | 125000 | 0.0176 | 96.9753 | 112.6359 |
0.023 | 16.64 | 130000 | 0.0162 | 97.1541 | 112.5218 |
0.0222 | 17.28 | 135000 | 0.0214 | 95.6687 | 113.6831 |
0.021 | 17.92 | 140000 | 0.0182 | 97.3936 | 112.4909 |
0.0208 | 18.56 | 145000 | 0.0241 | 96.4207 | 113.2513 |
0.0192 | 19.2 | 150000 | 0.0296 | 95.2463 | 114.13 |
0.019 | 19.84 | 155000 | 0.0132 | 98.2794 | 111.7931 |
0.0175 | 20.48 | 160000 | 0.0182 | 97.3661 | 112.4856 |
0.017 | 21.12 | 165000 | 0.0208 | 96.944 | 112.8231 |
0.0162 | 21.76 | 170000 | 0.0119 | 98.2846 | 111.8194 |
0.0158 | 22.4 | 175000 | 0.0148 | 98.1654 | 111.94 |
0.0152 | 23.04 | 180000 | 0.0125 | 98.1486 | 111.882 |
0.0146 | 23.68 | 185000 | 0.0139 | 98.0184 | 111.9486 |
0.0139 | 24.32 | 190000 | 0.0265 | 96.2552 | 113.1783 |
0.0148 | 24.96 | 195000 | 0.0152 | 98.2077 | 111.9351 |
0.0139 | 25.6 | 200000 | 0.0190 | 98.0316 | 111.961 |
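Note that validation loss does not improve monotonically: the final checkpoint (0.0190 at step 200000) is not the best by validation loss. A small sketch of picking the best checkpoint, using a few rows transcribed from the table above:

```python
# (step, validation_loss, bleu) rows transcribed from the results table
rows = [
    (155_000, 0.0132, 98.2794),
    (170_000, 0.0119, 98.2846),
    (175_000, 0.0148, 98.1654),
    (200_000, 0.0190, 98.0316),
]

# Select the row with the lowest validation loss.
best_step, best_loss, best_bleu = min(rows, key=lambda r: r[1])
print(best_step, best_loss, best_bleu)  # → 170000 0.0119 98.2846
```

Among all rows in the table, step 170000 has the lowest validation loss (0.0119), so loading that intermediate checkpoint (if it was saved) may be preferable to the final one.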
### Framework versions
- Transformers 4.34.0
- Pytorch 2.1.0+cu121
- Datasets 2.14.5
- Tokenizers 0.14.1