# model_v1_complete_training_wt_init_48_tiny_emb_comp_frz

This model is a fine-tuned version of an unspecified base model on an unspecified dataset.
It achieves the following results on the evaluation set:
- Loss: 3.7746
- Accuracy: 0.3785
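If the reported loss is a mean per-token cross-entropy in nats (as is typical for language-model training with `Trainer`; an assumption here), it corresponds to a perplexity of roughly 43.6. A quick check:

```python
import math

# Assumption: the evaluation loss is mean per-token cross-entropy in nats,
# so perplexity is simply its exponential.
eval_loss = 3.7746
print(f"perplexity = {math.exp(eval_loss):.1f}")  # -> perplexity = 43.6
```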
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a configuration sketch follows the list):
- learning_rate: 1e-05
- train_batch_size: 64
- eval_batch_size: 64
- seed: 10
- distributed_type: multi-GPU
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 10000
- num_epochs: 50
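As a rough illustration, these settings map onto `transformers.TrainingArguments` along the lines below. This is a minimal sketch, not the original training script: `output_dir` is a placeholder, and treating `train_batch_size` as the per-device batch size is an assumption (the run was distributed across multiple GPUs, so it may instead be the global batch size).

```python
from transformers import TrainingArguments

# Sketch of the reported hyperparameters. output_dir is a placeholder;
# warmup_steps matches lr_scheduler_warmup_steps from the card.
training_args = TrainingArguments(
    output_dir="model_v1_complete_training_wt_init_48_tiny_emb_comp_frz",
    learning_rate=1e-5,
    per_device_train_batch_size=64,  # assumption: per-device, not global
    per_device_eval_batch_size=64,
    seed=10,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=10_000,
    num_train_epochs=50,
)
```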
### Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| 4.7364 | 0.33 | 30000 | 4.6764 | 0.2872 |
| 4.6351 | 0.66 | 60000 | 4.5672 | 0.2965 |
| 4.5638 | 0.98 | 90000 | 4.4978 | 0.3026 |
| 4.5108 | 1.31 | 120000 | 4.4419 | 0.3078 |
| 4.4634 | 1.64 | 150000 | 4.3959 | 0.3124 |
| 4.4264 | 1.97 | 180000 | 4.3567 | 0.3165 |
| 4.3912 | 2.29 | 210000 | 4.3182 | 0.3205 |
| 4.36 | 2.62 | 240000 | 4.2838 | 0.3242 |
| 4.3261 | 2.95 | 270000 | 4.2513 | 0.3278 |
| 4.295 | 3.28 | 300000 | 4.2186 | 0.3321 |
| 4.2635 | 3.6 | 330000 | 4.1912 | 0.3347 |
| 4.2496 | 3.93 | 360000 | 4.1700 | 0.3369 |
| 4.224 | 4.26 | 390000 | 4.1433 | 0.3399 |
| 4.2082 | 4.59 | 420000 | 4.1228 | 0.3419 |
| 4.1783 | 4.92 | 450000 | 4.0936 | 0.3451 |
| 4.1461 | 5.24 | 480000 | 4.0654 | 0.3481 |
| 4.1124 | 5.57 | 510000 | 4.0376 | 0.3507 |
| 4.0784 | 5.9 | 540000 | 4.0083 | 0.3538 |
| 4.0419 | 6.23 | 570000 | 3.9822 | 0.3572 |
| 4.0211 | 6.55 | 600000 | 3.9610 | 0.3588 |
| 3.9944 | 6.88 | 630000 | 3.9493 | 0.3601 |
| 3.994 | 7.21 | 660000 | 3.9389 | 0.3604 |
| 3.9794 | 7.54 | 690000 | 3.9216 | 0.3629 |
| 3.959 | 7.87 | 720000 | 3.9106 | 0.3641 |
| 3.9486 | 8.19 | 750000 | 3.8976 | 0.3657 |
| 3.939 | 8.52 | 780000 | 3.8868 | 0.3668 |
| 3.9225 | 8.85 | 810000 | 3.8778 | 0.3675 |
| 3.9115 | 9.18 | 840000 | 3.8672 | 0.3689 |
| 3.9036 | 9.5 | 870000 | 3.8573 | 0.3694 |
| 3.8884 | 9.83 | 900000 | 3.8497 | 0.3704 |
| 3.8877 | 10.16 | 930000 | 3.8422 | 0.3711 |
| 3.8735 | 10.49 | 960000 | 3.8343 | 0.3721 |
| 3.8628 | 10.81 | 990000 | 3.8277 | 0.3727 |
| 3.8572 | 11.14 | 1020000 | 3.8203 | 0.3738 |
| 3.8519 | 11.47 | 1050000 | 3.8120 | 0.3744 |
| 3.8481 | 11.8 | 1080000 | 3.8054 | 0.3752 |
| 3.8363 | 12.13 | 1110000 | 3.7997 | 0.3756 |
| 3.8305 | 12.45 | 1140000 | 3.7940 | 0.3762 |
| 3.8237 | 12.78 | 1170000 | 3.7855 | 0.3774 |
| 3.82 | 13.11 | 1200000 | 3.7804 | 0.3779 |
| 3.8083 | 13.44 | 1230000 | 3.7746 | 0.3785 |
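The linear schedule with warmup reported above can be reproduced with `transformers.get_linear_schedule_with_warmup`. A minimal sketch follows, with a toy model standing in for the real one; using 1,230,000 as the total step count is an assumption taken from the last logged step (the configured 50 epochs would imply a larger total).

```python
import torch
from transformers import get_linear_schedule_with_warmup

# Toy model standing in for the real one. Optimizer settings match the card:
# Adam with betas=(0.9, 0.999), eps=1e-08, lr=1e-05.
model = torch.nn.Linear(10, 10)
optimizer = torch.optim.Adam(
    model.parameters(), lr=1e-5, betas=(0.9, 0.999), eps=1e-8
)

# Linear warmup for 10,000 steps, then linear decay to zero.
# num_training_steps is an assumption based on the last logged step.
scheduler = get_linear_schedule_with_warmup(
    optimizer, num_warmup_steps=10_000, num_training_steps=1_230_000
)
```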
### Framework versions
- Transformers 4.30.2
- Pytorch 1.14.0a0+410ce96
- Datasets 2.13.1
- Tokenizers 0.13.3
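When reproducing this environment, a quick sanity check of the pinned versions might look like the sketch below. Note that `1.14.0a0+410ce96` is a pre-release PyTorch build that is not published on PyPI, so an exact match is unlikely outside the original training environment.

```python
import datasets
import tokenizers
import torch
import transformers

# Versions reported in this model card.
expected = {
    "transformers": "4.30.2",
    "datasets": "2.13.1",
    "tokenizers": "0.13.3",
}
installed = {
    "transformers": transformers.__version__,
    "datasets": datasets.__version__,
    "tokenizers": tokenizers.__version__,
}
for name, want in expected.items():
    have = installed[name]
    status = "OK" if have == want else f"expected {want}"
    print(f"{name}: {have} ({status})")

# The torch build is a pre-release; report it rather than assert equality.
print(f"torch: {torch.__version__} (card reports 1.14.0a0+410ce96)")
```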