
<!-- This model card has been generated automatically according to the information the Trainer had access to. You should probably proofread and complete it, then remove this comment. -->

small-mlm-glue-mnli-custom-tokenizer

This model is a fine-tuned version of google/bert_uncased_L-4_H-512_A-8. The training dataset was not recorded. On the evaluation set, the last logged result in the training results is a validation loss of 5.6551 (step 22500, epoch 18.0).

Model description

More information needed
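
Until a fuller description is written, some architecture details can be read off the base checkpoint's name: in Google's BERT miniatures naming scheme, `L`, `H`, and `A` give the number of Transformer layers, the hidden size, and the number of attention heads. The sketch below decodes that name. The function name `parse_bert_miniature_name` and the `head_dim` key are hypothetical, introduced only for illustration; the other keys mirror field names of `transformers.BertConfig`. It says nothing about how the custom tokenizer of this fine-tuned model differs from the base one.

```python
import re


def parse_bert_miniature_name(name):
    """Decode an 'L-<layers>_H-<hidden>_A-<heads>' checkpoint name.

    Hypothetical helper for illustration; not part of any library.
    """
    match = re.search(r"L-(\d+)_H-(\d+)_A-(\d+)", name)
    if match is None:
        raise ValueError(f"no L/H/A pattern found in {name!r}")
    layers, hidden, heads = (int(group) for group in match.groups())
    return {
        "num_hidden_layers": layers,
        "hidden_size": hidden,
        "num_attention_heads": heads,
        # Per-head dimension: the hidden size split evenly across heads.
        "head_dim": hidden // heads,
    }


if __name__ == "__main__":
    print(parse_bert_miniature_name("google/bert_uncased_L-4_H-512_A-8"))
```

For this card's base model, the name decodes to 4 layers, a hidden size of 512, and 8 attention heads.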

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

Training results

| Training Loss | Epoch | Step  | Validation Loss |
|:-------------:|:-----:|:-----:|:---------------:|
| 7.0308        | 0.4   | 500   | 6.6001          |
| 6.346         | 0.8   | 1000  | 6.3998          |
| 6.1061        | 1.2   | 1500  | 6.3170          |
| 5.9586        | 1.6   | 2000  | 6.2799          |
| 5.8773        | 2.0   | 2500  | 6.2034          |
| 5.7403        | 2.4   | 3000  | 6.1609          |
| 5.6602        | 2.8   | 3500  | 6.1113          |
| 5.5809        | 3.2   | 4000  | 6.1267          |
| 5.5663        | 3.6   | 4500  | 6.0647          |
| 5.6266        | 4.0   | 5000  | 6.1090          |
| 5.4756        | 4.4   | 5500  | 6.0302          |
| 5.4905        | 4.8   | 6000  | 6.0292          |
| 5.3179        | 5.2   | 6500  | 5.9758          |
| 5.3375        | 5.6   | 7000  | 6.0125          |
| 5.3035        | 6.0   | 7500  | 5.9495          |
| 5.1918        | 6.4   | 8000  | 5.9537          |
| 5.2499        | 6.8   | 8500  | 5.9100          |
| 5.1905        | 7.2   | 9000  | 5.8620          |
| 5.1787        | 7.6   | 9500  | 5.9296          |
| 5.1534        | 8.0   | 10000 | 5.9442          |
| 5.1396        | 8.4   | 10500 | 5.8609          |
| 5.1272        | 8.8   | 11000 | 5.8358          |
| 4.9615        | 9.2   | 11500 | 5.8617          |
| 5.0062        | 9.6   | 12000 | 5.8043          |
| 5.0131        | 10.0  | 12500 | 5.8119          |
| 4.9326        | 10.4  | 13000 | 5.7851          |
| 4.9655        | 10.8  | 13500 | 5.7792          |
| 4.9256        | 11.2  | 14000 | 5.7843          |
| 4.9195        | 11.6  | 14500 | 5.7652          |
| 4.8299        | 12.0  | 15000 | 5.7606          |
| 4.8748        | 12.4  | 15500 | 5.7577          |
| 4.7588        | 12.8  | 16000 | 5.7048          |
| 4.8185        | 13.2  | 16500 | 5.7245          |
| 4.7679        | 13.6  | 17000 | 5.7402          |
| 4.7377        | 14.0  | 17500 | 5.7034          |
| 4.7403        | 14.4  | 18000 | 5.7054          |
| 4.6628        | 14.8  | 18500 | 5.7203          |
| 4.6801        | 15.2  | 19000 | 5.6798          |
| 4.6014        | 15.6  | 19500 | 5.6931          |
| 4.618         | 16.0  | 20000 | 5.6620          |
| 4.6037        | 16.4  | 20500 | 5.6441          |
| 4.6004        | 16.8  | 21000 | 5.6262          |
| 4.5432        | 17.2  | 21500 | 5.6726          |
| 4.576         | 17.6  | 22000 | 5.6322          |
| 4.5568        | 18.0  | 22500 | 5.6551          |
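
The log above can be summarized with a short script. This is a minimal sketch, not part of the original training code: it copies the validation-loss column from the table, finds the best checkpoint, and converts losses to perplexity. The perplexity figures assume the reported loss is a mean masked-language-modeling cross-entropy in nats, which the model name suggests but the card does not confirm. The function name `summarize` and the constants are hypothetical.

```python
import math

# Validation losses copied from the table above, one per evaluation
# (every 500 steps, starting at step 500).
VAL_LOSSES = [
    6.6001, 6.3998, 6.3170, 6.2799, 6.2034, 6.1609, 6.1113, 6.1267, 6.0647,
    6.1090, 6.0302, 6.0292, 5.9758, 6.0125, 5.9495, 5.9537, 5.9100, 5.8620,
    5.9296, 5.9442, 5.8609, 5.8358, 5.8617, 5.8043, 5.8119, 5.7851, 5.7792,
    5.7843, 5.7652, 5.7606, 5.7577, 5.7048, 5.7245, 5.7402, 5.7034, 5.7054,
    5.7203, 5.6798, 5.6931, 5.6620, 5.6441, 5.6262, 5.6726, 5.6322, 5.6551,
]
EVAL_EVERY = 500        # steps between evaluations, from the Step column
STEPS_PER_EPOCH = 1250  # 500 steps correspond to 0.4 epoch in the table


def summarize(val_losses, eval_every=EVAL_EVERY,
              steps_per_epoch=STEPS_PER_EPOCH):
    """Return the best and final validation points of a loss log."""
    best_idx = min(range(len(val_losses)), key=val_losses.__getitem__)
    best_step = (best_idx + 1) * eval_every
    return {
        "best_step": best_step,
        "best_epoch": best_step / steps_per_epoch,
        "best_loss": val_losses[best_idx],
        "final_loss": val_losses[-1],
        # Assumption: loss is mean cross-entropy in nats, so
        # perplexity = exp(loss).
        "best_perplexity": math.exp(val_losses[best_idx]),
        "final_perplexity": math.exp(val_losses[-1]),
    }


if __name__ == "__main__":
    for key, value in summarize(VAL_LOSSES).items():
        print(f"{key}: {value}")
```

On this log the lowest validation loss is 5.6262 at step 21000 (epoch 16.8), slightly below the final value of 5.6551, so the last checkpoint is not necessarily the best one by this metric.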

Framework versions