banglabert_generator-finetuned-fill-in-the-blanks
This model is a fine-tuned version of csebuetnlp/banglabert_generator on the bangla_paraphrase dataset. It achieves the following results on the evaluation set:
- Loss: 4.0254
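
For readers who prefer perplexity-style numbers, the reported loss can be exponentiated. The sketch below assumes the loss is a mean cross-entropy in nats over the masked tokens, which the card does not state explicitly. The function name `pseudo_perplexity` is illustrative only.

```python
import math


def pseudo_perplexity(mean_cross_entropy: float) -> float:
    """Return exp(loss) for a mean cross-entropy loss given in nats."""
    return math.exp(mean_cross_entropy)


# Evaluation loss reported in this card. Treating it as a mean
# cross-entropy in nats is an assumption, not something the card confirms.
eval_loss = 4.0254
print(f"exp(loss) = {pseudo_perplexity(eval_loss):.2f}")
```

Because a masked language model is scored only on masked positions, this value is not directly comparable to the perplexity of a left-to-right language model.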
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 64
- eval_batch_size: 64
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 20
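
As a rough illustration of what `lr_scheduler_type: linear` implies, the sketch below computes the learning rate at a given optimizer step. It assumes no warmup steps, since the card lists none, and a total of 4700 steps, taken from the last row of the training results. The helper `linear_lr` is a hypothetical stand-in and not the Trainer's own code.

```python
def linear_lr(step: int,
              base_lr: float = 2e-5,
              total_steps: int = 4700,
              warmup_steps: int = 0) -> float:
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        # Ramp up from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr at the end of warmup to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the end of a few epochs (235 steps per epoch in this run).
for epoch in (0, 5, 10, 15, 20):
    print(epoch, linear_lr(epoch * 235))
```

Under these assumptions the rate starts at 2e-05, is halved by the end of epoch 10, and reaches zero at step 4700.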
Training results
| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| 4.2018 | 1.0 | 235 | 4.0979 |
| 4.177 | 2.0 | 470 | 4.0591 |
| 4.1532 | 3.0 | 705 | 4.0385 |
| 4.1417 | 4.0 | 940 | 4.0490 |
| 4.133 | 5.0 | 1175 | 4.0387 |
| 4.1137 | 6.0 | 1410 | 4.0716 |
| 4.1033 | 7.0 | 1645 | 4.0118 |
| 4.0874 | 8.0 | 1880 | 4.0448 |
| 4.0791 | 9.0 | 2115 | 4.0381 |
| 4.0788 | 10.0 | 2350 | 4.0457 |
| 4.061 | 11.0 | 2585 | 3.9917 |
| 4.0557 | 12.0 | 2820 | 3.9950 |
| 4.0533 | 13.0 | 3055 | 4.0131 |
| 4.0582 | 14.0 | 3290 | 4.0080 |
| 4.042 | 15.0 | 3525 | 4.0265 |
| 4.0338 | 16.0 | 3760 | 3.9908 |
| 4.0222 | 17.0 | 3995 | 3.9967 |
| 4.0343 | 18.0 | 4230 | 4.0011 |
| 4.0294 | 19.0 | 4465 | 4.0334 |
| 4.0313 | 20.0 | 4700 | 4.0099 |
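
The validation loss in the table fluctuates between epochs rather than decreasing steadily, so it can be useful to locate the lowest value. The sketch below copies the Validation Loss column from the table into a list (one entry per epoch) and finds the minimum. The variable names are illustrative, and the card does not say which checkpoint was actually kept.

```python
# Validation loss per epoch, copied from the training results table.
validation_loss = [
    4.0979, 4.0591, 4.0385, 4.0490, 4.0387,
    4.0716, 4.0118, 4.0448, 4.0381, 4.0457,
    3.9917, 3.9950, 4.0131, 4.0080, 4.0265,
    3.9908, 3.9967, 4.0011, 4.0334, 4.0099,
]

# Index of the smallest loss; epochs are 1-based in the table.
best_index = min(range(len(validation_loss)), key=validation_loss.__getitem__)
best_epoch = best_index + 1
best_loss = validation_loss[best_index]

print(f"lowest validation loss {best_loss} at epoch {best_epoch}")
```

In this run the lowest validation loss occurs before the final epoch, which suggests comparing checkpoints instead of assuming the last one is best.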
Framework versions
- Transformers 4.30.2
- Pytorch 2.0.1+cu118
- Datasets 2.13.1
- Tokenizers 0.13.3