# recipe-distilbert-upper-tIs
This model is a fine-tuned version of [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased) on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 0.8746
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 256
- eval_batch_size: 256
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 20
- mixed_precision_training: Native AMP
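
The `linear` scheduler decays the learning rate from its initial value to zero over the full run (20 epochs × 1353 steps/epoch = 27060 steps). A minimal sketch of that schedule in plain Python, assuming no warmup steps (the `warmup_steps` default when none is reported):

```python
def linear_lr(step, total_steps=27060, base_lr=2e-5, warmup_steps=0):
    """Linear schedule: ramp up over warmup_steps, then decay to 0 at total_steps."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr (end of warmup) to 0 (end of training)
    return base_lr * max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))

print(linear_lr(0))      # start of training: 2e-05
print(linear_lr(13530))  # midpoint (epoch 10): 1e-05
print(linear_lr(27060))  # end of training: 0.0
```

With zero warmup the schedule is a straight line from 2e-05 down to 0, which matches the steadily shrinking loss improvements in the table below.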
### Training results
| Training Loss | Epoch | Step  | Validation Loss |
|:-------------:|:-----:|:-----:|:---------------:|
| 1.67          | 1.0   | 1353  | 1.2945          |
| 1.2965        | 2.0   | 2706  | 1.1547          |
| 1.1904        | 3.0   | 4059  | 1.0846          |
| 1.1272        | 4.0   | 5412  | 1.0407          |
| 1.0857        | 5.0   | 6765  | 1.0039          |
| 1.0549        | 6.0   | 8118  | 0.9802          |
| 1.03          | 7.0   | 9471  | 0.9660          |
| 1.01          | 8.0   | 10824 | 0.9474          |
| 0.9931        | 9.0   | 12177 | 0.9365          |
| 0.9807        | 10.0  | 13530 | 0.9252          |
| 0.9691        | 11.0  | 14883 | 0.9105          |
| 0.9601        | 12.0  | 16236 | 0.9079          |
| 0.9503        | 13.0  | 17589 | 0.8979          |
| 0.9436        | 14.0  | 18942 | 0.8930          |
| 0.9371        | 15.0  | 20295 | 0.8875          |
| 0.9322        | 16.0  | 21648 | 0.8851          |
| 0.9279        | 17.0  | 23001 | 0.8801          |
| 0.9254        | 18.0  | 24354 | 0.8812          |
| 0.9227        | 19.0  | 25707 | 0.8768          |
| 0.9232        | 20.0  | 27060 | 0.8746          |
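
Assuming the reported loss is the usual natural-log cross-entropy from masked-language-modeling (the standard objective for a distilbert-base-uncased fine-tune), the final validation loss converts to perplexity as `exp(loss)`:

```python
import math

final_val_loss = 0.8746  # final validation loss from the table above
perplexity = math.exp(final_val_loss)
print(f"perplexity ≈ {perplexity:.2f}")  # ≈ 2.40
```

A perplexity around 2.4 means the model is, on average, about as uncertain as a uniform choice between 2–3 candidate tokens per masked position.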
### Framework versions
- Transformers 4.19.0.dev0
- Pytorch 1.11.0+cu102
- Datasets 2.3.2
- Tokenizers 0.12.1