# DialoGPT-small-FinalFantasyDialogue
This model is a fine-tuned version of [microsoft/DialoGPT-small](https://huggingface.co/microsoft/DialoGPT-small) on a Final Fantasy dialogue dataset (per the model name; no further dataset details are provided). It achieves the following results on the evaluation set:
- Loss: 0.3930
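
Since the card reports only cross-entropy loss, converting it to perplexity can make the number easier to interpret. A minimal sketch using the standard exp(loss) relation, with the loss value taken from the line above:

```python
import math

# Perplexity is the exponential of the per-token cross-entropy loss.
eval_loss = 0.3930
perplexity = math.exp(eval_loss)
print(f"Eval perplexity: {perplexity:.2f}")  # ~1.48
```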
## Model description
More information needed
## Intended uses & limitations
More information needed
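
As with other DialoGPT checkpoints, the model can be loaded with `transformers` for multi-turn generation. A minimal sketch, assuming the checkpoint is published under the hypothetical repo id `DialoGPT-small-FinalFantasyDialogue` and follows DialoGPT's usual EOS-delimited turn format:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repo id; substitute the actual Hub path of this checkpoint.
model_name = "DialoGPT-small-FinalFantasyDialogue"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# DialoGPT conventionally separates dialogue turns with the EOS token.
prompt = "Where can I find a Phoenix Down?" + tokenizer.eos_token
input_ids = tokenizer.encode(prompt, return_tensors="pt")

# Generate a single response turn and decode only the new tokens.
output_ids = model.generate(
    input_ids,
    max_length=200,
    pad_token_id=tokenizer.eos_token_id,
)
response = tokenizer.decode(output_ids[0, input_ids.shape[-1]:], skip_special_tokens=True)
print(response)
```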
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.005
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 256
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 1000
- num_epochs: 20
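
For reference, these hyperparameters map roughly onto `transformers.TrainingArguments` as sketched below. This is an assumption about how the run was configured; the actual training script is not included in this card, and dataset loading and `Trainer` wiring are omitted.

```python
from transformers import TrainingArguments

# Sketch reconstructing the reported hyperparameters; Adam betas/epsilon
# match the TrainingArguments defaults, so they are not set explicitly.
training_args = TrainingArguments(
    output_dir="DialoGPT-small-FinalFantasyDialogue",
    learning_rate=5e-3,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    gradient_accumulation_steps=8,  # total train batch size: 32 * 8 = 256
    lr_scheduler_type="linear",
    warmup_steps=1000,
    num_train_epochs=20,
)
```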
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 3.3955        | 1.0   | 141  | 2.8517          |
| 2.5623        | 1.99  | 282  | 2.1898          |
| 1.9315        | 3.0   | 424  | 1.7076          |
| 1.5264        | 4.0   | 565  | 1.3901          |
| 1.2892        | 4.99  | 706  | 1.1884          |
| 1.1325        | 6.0   | 848  | 1.0805          |
| 1.0404        | 7.0   | 989  | 0.9933          |
| 0.8733        | 8.0   | 1131 | 0.8070          |
| 0.6344        | 9.0   | 1272 | 0.6326          |
| 0.5047        | 9.99  | 1413 | 0.5504          |
| 0.413         | 11.0  | 1555 | 0.5021          |
| 0.3457        | 12.0  | 1696 | 0.4586          |
| 0.3049        | 12.99 | 1837 | 0.4294          |
| 0.2475        | 14.0  | 1979 | 0.4154          |
| 0.2081        | 15.0  | 2120 | 0.3943          |
| 0.1808        | 16.0  | 2262 | 0.3886          |
| 0.1601        | 17.0  | 2403 | 0.3839          |
| 0.1431        | 17.99 | 2544 | 0.3850          |
| 0.1323        | 19.0  | 2686 | 0.3843          |
| 0.1221        | 19.95 | 2820 | 0.3930          |
### Framework versions
- Transformers 4.33.2
- PyTorch 2.0.1+cu118
- Datasets 2.14.5
- Tokenizers 0.13.3