synpre_union_1M_t5-small
This model is a fine-tuned version of t5-small on the tyzhu/synpre_union_1M dataset. It achieves the following results on the evaluation set at the final checkpoint (step 200000):
- Loss: 0.0886
- Bleu: 88.4534
- Gen Len: 50.3667
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 128
- eval_batch_size: 128
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: constant_with_warmup
- lr_scheduler_warmup_steps: 10000
- training_steps: 200000
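
To make the `constant_with_warmup` setting concrete, here is a minimal sketch of the learning rate implied by the values listed above. It assumes the usual rule for this scheduler type (linear ramp from 0 to the base rate over the warmup steps, then constant); the function name `constant_with_warmup_lr` is hypothetical and is not part of any library.

```python
def constant_with_warmup_lr(step: int,
                            base_lr: float = 1e-4,
                            warmup_steps: int = 10_000) -> float:
    """Learning rate at a given optimizer step.

    Assumption: linear warmup from 0 to base_lr over warmup_steps,
    then a constant base_lr for the rest of training.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    return base_lr


# Inspect the rate at a few points of the 200000-step run.
for step in (0, 5_000, 10_000, 200_000):
    print(step, constant_with_warmup_lr(step))
```

Under this assumption, the first logged evaluation (step 5000) falls halfway through the warmup window, and the rate stays at 0.0001 from step 10000 onward.
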
Training results
Training Loss | Epoch | Step | Validation Loss | Bleu | Gen Len |
---|---|---|---|---|---|
9.2983 | 0.64 | 5000 | 9.1434 | 0.0329 | 56.9374 |
8.1313 | 1.28 | 10000 | 7.7232 | 0.4594 | 85.0875 |
3.4773 | 1.92 | 15000 | 2.1247 | 14.433 | 41.8774 |
1.7077 | 2.56 | 20000 | 1.2042 | 29.0873 | 46.2582 |
1.1895 | 3.2 | 25000 | 0.9203 | 42.2246 | 49.5123 |
0.9788 | 3.84 | 30000 | 0.7934 | 47.8307 | 50.2281 |
0.8514 | 4.48 | 35000 | 0.7216 | 52.7369 | 50.2908 |
0.7396 | 5.12 | 40000 | 0.6212 | 56.4669 | 50.4039 |
0.673 | 5.76 | 45000 | 0.5436 | 59.5425 | 50.409 |
0.588 | 6.4 | 50000 | 0.4722 | 59.8999 | 50.3889 |
0.522 | 7.04 | 55000 | 0.4068 | 63.3246 | 50.3016 |
0.4795 | 7.68 | 60000 | 0.3772 | 65.3541 | 50.4084 |
0.4334 | 8.32 | 65000 | 0.3388 | 68.2614 | 50.3323 |
0.3952 | 8.96 | 70000 | 0.2975 | 70.2889 | 50.4226 |
0.3498 | 9.6 | 75000 | 0.2634 | 73.9835 | 50.4118 |
0.315 | 10.24 | 80000 | 0.2791 | 63.3974 | 50.3034 |
0.2962 | 10.88 | 85000 | 0.2213 | 76.1748 | 50.4519 |
0.2661 | 11.52 | 90000 | 0.1985 | 78.6865 | 50.4598 |
0.2429 | 12.16 | 95000 | 0.1819 | 80.8658 | 50.4492 |
0.2273 | 12.8 | 100000 | 0.1850 | 77.0985 | 50.4322 |
0.2115 | 13.44 | 105000 | 0.1527 | 83.8686 | 50.4352 |
0.1926 | 14.08 | 110000 | 0.1412 | 83.8982 | 50.4047 |
0.1864 | 14.72 | 115000 | 0.1468 | 78.5222 | 50.3565 |
0.1673 | 15.36 | 120000 | 0.1233 | 86.3438 | 50.3884 |
0.161 | 16.0 | 125000 | 0.1262 | 83.0453 | 50.3824 |
0.1511 | 16.64 | 130000 | 0.1239 | 83.3592 | 50.4264 |
0.1432 | 17.28 | 135000 | 0.1097 | 87.7233 | 50.3878 |
0.1373 | 17.92 | 140000 | 0.1311 | 80.1804 | 50.3788 |
0.1283 | 18.56 | 145000 | 0.1070 | 86.6683 | 50.4169 |
0.1243 | 19.2 | 150000 | 0.1204 | 82.8423 | 50.3879 |
0.1259 | 19.84 | 155000 | 0.0989 | 88.3505 | 50.4013 |
0.1172 | 20.48 | 160000 | 0.1006 | 88.0823 | 50.4004 |
0.1096 | 21.12 | 165000 | 0.0962 | 89.2558 | 50.4177 |
0.1089 | 21.76 | 170000 | 0.0933 | 88.8995 | 50.3794 |
0.1028 | 22.4 | 175000 | 0.1138 | 83.1713 | 50.4083 |
0.0974 | 23.04 | 180000 | 0.0962 | 86.6705 | 50.3565 |
0.0991 | 23.68 | 185000 | 0.1007 | 85.517 | 50.408 |
0.0959 | 24.32 | 190000 | 0.0942 | 87.3596 | 50.409 |
0.0946 | 24.96 | 195000 | 0.0923 | 87.6022 | 50.4 |
0.0893 | 25.6 | 200000 | 0.0886 | 88.4534 | 50.3667 |
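
The table can be read in two ways when choosing a checkpoint, and it also implies the size of one epoch. The sketch below uses a hand-copied subset of rows from the table; the per-epoch estimate assumes that the listed `train_batch_size` of 128 is the effective batch size (single device, no gradient accumulation), which the card does not state.

```python
# (step, epoch, validation_loss, bleu) copied from a subset of the table rows.
rows = [
    (5_000, 0.64, 9.1434, 0.0329),
    (155_000, 19.84, 0.0989, 88.3505),
    (165_000, 21.12, 0.0962, 89.2558),
    (170_000, 21.76, 0.0933, 88.8995),
    (200_000, 25.6, 0.0886, 88.4534),
]

# Checkpoint selection: highest BLEU versus lowest validation loss.
best_by_bleu = max(rows, key=lambda r: r[3])
best_by_loss = min(rows, key=lambda r: r[2])
print("best BLEU at step", best_by_bleu[0])
print("lowest validation loss at step", best_by_loss[0])

# Epoch size implied by the final row: steps / epochs, then times batch size.
# Assumption: effective batch size equals train_batch_size (128).
final_step, final_epoch = rows[-1][0], rows[-1][1]
steps_per_epoch = final_step / final_epoch
examples_per_epoch = steps_per_epoch * 128
print("approx. steps per epoch:", round(steps_per_epoch, 1))
print("approx. examples per epoch:", round(examples_per_epoch))
```

On these rows the two criteria disagree: BLEU peaks at step 165000, while validation loss is lowest at the final step 200000, which is the checkpoint whose metrics the card reports.
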
Framework versions
- Transformers 4.34.0
- Pytorch 2.1.0+cu121
- Datasets 2.14.5
- Tokenizers 0.14.1