
# tiny-mlm-glue-mnli-custom-tokenizer

This model is a fine-tuned version of [google/bert_uncased_L-2_H-128_A-2](https://huggingface.co/google/bert_uncased_L-2_H-128_A-2) on an unspecified dataset. It achieves the following results on the evaluation set:

- Loss: 6.1721 (final recorded validation loss; see the training results below)

## Model description

More information needed

## Intended uses & limitations

More information needed
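
For reference, the sketch below shows one way to query the checkpoint for masked-token prediction with the `transformers` library. The repository identifier is an assumption taken from the title of this card; adjust it to wherever the checkpoint is actually hosted.

```python
# Minimal usage sketch. The model identifier below is an assumption taken from
# the card title; replace it with the actual local path or Hub repository.
from transformers import AutoModelForMaskedLM, AutoTokenizer, pipeline

model_id = "tiny-mlm-glue-mnli-custom-tokenizer"  # placeholder identifier

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

# The checkpoint was trained with a masked-language-modelling objective, so the
# fill-mask pipeline is the natural way to probe it.
fill_mask = pipeline("fill-mask", model=model, tokenizer=tokenizer)
print(fill_mask(f"The weather today is {tokenizer.mask_token}."))
```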

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The hyperparameters used during training were not recorded in this card.
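
Purely as an illustration of the setup, the sketch below shows a typical masked-language-modelling fine-tuning run with the Hugging Face `Trainer` starting from `google/bert_uncased_L-2_H-128_A-2`. The dataset slice, the use of the base tokenizer rather than the custom one, and every numeric value are placeholders, not the configuration actually used for this model.

```python
# Illustrative sketch only: the dataset, tokenizer choice, and all numeric
# values below are placeholders, NOT the settings used to train this model
# (which used a custom tokenizer and unrecorded hyperparameters).
from datasets import load_dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base = "google/bert_uncased_L-2_H-128_A-2"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForMaskedLM.from_pretrained(base)

# Placeholder corpus; the card does not state what data was actually used.
dataset = load_dataset("glue", "mnli", split="train[:1%]")

def tokenize(batch):
    return tokenizer(batch["premise"], truncation=True, max_length=128)

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)

# Dynamic masking of 15% of tokens, the standard MLM objective.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

args = TrainingArguments(
    output_dir="tiny-mlm-glue-mnli-custom-tokenizer",
    per_device_train_batch_size=32,  # placeholder
    learning_rate=5e-5,              # placeholder
    num_train_epochs=3,              # placeholder
    logging_steps=500,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=collator,
)
trainer.train()
```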

### Training results

| Training Loss | Epoch | Step  | Validation Loss |
|:-------------:|:-----:|:-----:|:---------------:|
| 7.8162        | 0.4   | 500   | 7.1032          |
| 6.9567        | 0.8   | 1000  | 7.0697          |
| 6.8563        | 1.2   | 1500  | 7.0460          |
| 6.7685        | 1.6   | 2000  | 7.0131          |
| 6.6897        | 2.0   | 2500  | 6.9769          |
| 6.5455        | 2.4   | 3000  | 6.9249          |
| 6.482         | 2.8   | 3500  | 6.8552          |
| 6.4153        | 3.2   | 4000  | 6.8445          |
| 6.38          | 3.6   | 4500  | 6.7803          |
| 6.4066        | 4.0   | 5000  | 6.8070          |
| 6.2854        | 4.4   | 5500  | 6.7329          |
| 6.2966        | 4.8   | 6000  | 6.7094          |
| 6.1244        | 5.2   | 6500  | 6.6476          |
| 6.1276        | 5.6   | 7000  | 6.6118          |
| 6.0685        | 6.0   | 7500  | 6.5714          |
| 5.98          | 6.4   | 8000  | 6.5522          |
| 6.0174        | 6.8   | 8500  | 6.5093          |
| 5.9451        | 7.2   | 9000  | 6.4866          |
| 5.9681        | 7.6   | 9500  | 6.5238          |
| 5.9246        | 8.0   | 10000 | 6.5340          |
| 5.9219        | 8.4   | 10500 | 6.4727          |
| 5.8812        | 8.8   | 11000 | 6.4483          |
| 5.7815        | 9.2   | 11500 | 6.4402          |
| 5.7938        | 9.6   | 12000 | 6.4124          |
| 5.7934        | 10.0  | 12500 | 6.3908          |
| 5.7332        | 10.4  | 13000 | 6.3861          |
| 5.7628        | 10.8  | 13500 | 6.3638          |
| 5.7259        | 11.2  | 14000 | 6.3345          |
| 5.7505        | 11.6  | 14500 | 6.3117          |
| 5.6441        | 12.0  | 15000 | 6.3118          |
| 5.7058        | 12.4  | 15500 | 6.3116          |
| 5.6017        | 12.8  | 16000 | 6.2728          |
| 5.6424        | 13.2  | 16500 | 6.2790          |
| 5.5799        | 13.6  | 17000 | 6.3034          |
| 5.5625        | 14.0  | 17500 | 6.2580          |
| 5.6015        | 14.4  | 18000 | 6.2607          |
| 5.4884        | 14.8  | 18500 | 6.2535          |
| 5.5117        | 15.2  | 19000 | 6.1960          |
| 5.4919        | 15.6  | 19500 | 6.1907          |
| 5.4624        | 16.0  | 20000 | 6.1838          |
| 5.4721        | 16.4  | 20500 | 6.1461          |
| 5.4833        | 16.8  | 21000 | 6.1251          |
| 5.4404        | 17.2  | 21500 | 6.1725          |
| 5.4487        | 17.6  | 22000 | 6.1417          |
| 5.4499        | 18.0  | 22500 | 6.1721          |

### Framework versions