# KoT5-test

This model is a fine-tuned version of [hyorea1/KoT5-test](https://huggingface.co/hyorea1/KoT5-test) on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 1.1671
- Rouge1: 12.2606
- Rouge2: 2.9413
- Rougel: 12.1602
- Rougelsum: 12.1171
- Gen Len: 34.7162
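
The card does not state the downstream task, but the ROUGE metrics and average generation length above suggest a Korean sequence-to-sequence setup such as summarization. Below is a minimal inference sketch under that assumption; the repository id, input text, and generation settings are illustrative placeholders, not values confirmed by this card.

```python
# Hypothetical usage sketch: assumes the fine-tuned checkpoint is published on the
# Hugging Face Hub under the same repository id as the base model named above, and
# that the model performs Korean summarization (not confirmed by this card).
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "hyorea1/KoT5-test"  # assumption: replace with the actual repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "요약할 한국어 문서를 여기에 넣으세요."  # placeholder input document
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

# An average Gen Len of ~35 tokens on the eval set suggests a short output budget.
summary_ids = model.generate(**inputs, max_length=64, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```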
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 100
- gradient_accumulation_steps: 2
- total_train_batch_size: 8
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 10
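
These settings map directly onto `Seq2SeqTrainingArguments` in Transformers 4.25.1. The sketch below mirrors the listed hyperparameters; the output directory, evaluation interval, datasets, and metric function are assumptions not specified by this card.

```python
# Hedged reproduction sketch of the configuration listed above.
# Datasets, tokenizer preprocessing, and compute_metrics must be supplied by the user.
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

model_id = "hyorea1/KoT5-test"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

args = Seq2SeqTrainingArguments(
    output_dir="KoT5-test",          # assumption: not stated in the card
    learning_rate=2e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=2,   # gives the effective train batch size of 8
    seed=100,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=10,
    predict_with_generate=True,      # needed for ROUGE / Gen Len during evaluation
    evaluation_strategy="steps",
    eval_steps=400,                  # matches the 400-step intervals in the results table
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
    # train_dataset=..., eval_dataset=..., compute_metrics=...  (not specified by the card)
)
# trainer.train()
```

Adam with betas=(0.9, 0.999) and epsilon=1e-08 is the default optimizer in this Transformers version, so it is not set explicitly.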
### Training results
| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2 | Rougel  | Rougelsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:------:|:-------:|:---------:|:-------:|
| 1.39          | 0.26  | 400  | 1.2447          | 11.4691 | 2.9613 | 11.3903 | 11.2842   | 40.4846 |
| 1.6078        | 0.52  | 800  | 1.2214          | 11.4047 | 2.8134 | 11.2911 | 11.2286   | 37.1794 |
| 1.2711        | 0.78  | 1200 | 1.2092          | 11.3608 | 2.8542 | 11.2502 | 11.1927   | 36.1625 |
| 1.1407        | 1.05  | 1600 | 1.1953          | 11.6278 | 2.6468 | 11.5164 | 11.4848   | 35.8515 |
| 1.3556        | 1.31  | 2000 | 1.1911          | 11.5258 | 3.2315 | 11.4592 | 11.4318   | 35.8926 |
| 1.2502        | 1.57  | 2400 | 1.1782          | 11.6087 | 3.0687 | 11.5359 | 11.4555   | 34.8743 |
| 1.1821        | 1.83  | 2800 | 1.1731          | 11.6414 | 3.2523 | 11.5635 | 11.4865   | 35.35   |
| 1.5721        | 2.09  | 3200 | 1.1740          | 11.9067 | 3.3382 | 11.8748 | 11.8156   | 35.5346 |
| 1.014         | 2.35  | 3600 | 1.1666          | 11.6128 | 3.1918 | 11.5348 | 11.453    | 34.1853 |
| 1.2737        | 2.61  | 4000 | 1.1711          | 12.2584 | 2.9711 | 12.2113 | 12.1541   | 35.3162 |
| 1.1664        | 2.88  | 4400 | 1.1623          | 12.4344 | 3.221  | 12.3251 | 12.2923   | 34.5096 |
| 1.0872        | 3.14  | 4800 | 1.1677          | 12.6984 | 3.1725 | 12.5901 | 12.5768   | 34.5162 |
| 0.9654        | 3.4   | 5200 | 1.1622          | 12.2024 | 3.3137 | 12.1166 | 12.0733   | 33.7537 |
| 1.2357        | 3.66  | 5600 | 1.1614          | 12.0954 | 3.0476 | 12.0709 | 12.0331   | 34.5257 |
| 1.0516        | 3.92  | 6000 | 1.1610          | 12.2234 | 3.2148 | 12.1003 | 12.0567   | 34.5478 |
| 0.9412        | 4.18  | 6400 | 1.1614          | 12.1884 | 3.1935 | 12.1168 | 12.1024   | 34.4493 |
| 1.2583        | 4.44  | 6800 | 1.1609          | 12.5444 | 3.2265 | 12.5044 | 12.4172   | 34.8132 |
| 1.122         | 4.71  | 7200 | 1.1639          | 12.2393 | 3.2752 | 12.1647 | 12.1575   | 34.2728 |
| 1.4178        | 4.97  | 7600 | 1.1629          | 12.4617 | 3.2909 | 12.3475 | 12.3123   | 34.6971 |
| 1.1506        | 5.23  | 8000 | 1.1671          | 12.2606 | 2.9413 | 12.1602 | 12.1171   | 34.7162 |
### Framework versions
- Transformers 4.25.1
- Pytorch 1.12.1+cu113
- Datasets 2.7.1
- Tokenizers 0.13.2