bert-base-cased fine-tuned on GLUE SST-2

This checkpoint was initialized from the pre-trained bert-base-cased checkpoint and then fine-tuned on the GLUE SST-2 task (sst2) using this notebook. Training ran for 3 epochs with a linearly decaying learning rate of 2e-05 and a total batch size of 32.
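To make the training schedule concrete, here is a minimal sketch of a linearly decaying learning rate in plain Python. It is not taken from the notebook. It assumes that 2e-05 is the initial rate, that there is no warmup, and that the rate decays to zero at the last step. The training-set size of 67,349 examples is the standard GLUE SST-2 train split and is also an assumption here, as is the name `linear_decay_lr`.

```python
import math


def linear_decay_lr(step, total_steps, base_lr=2e-5):
    """Return the learning rate at `step` for a linear decay to zero.

    Assumes no warmup phase (hypothetical; the notebook may differ).
    """
    remaining = max(0.0, 1.0 - step / total_steps)
    return base_lr * remaining


# Values from the model card.
BATCH_SIZE = 32
EPOCHS = 3

# Assumption: size of the standard GLUE SST-2 training split.
NUM_TRAIN_EXAMPLES = 67_349

# The last batch of each epoch may be partial, hence the ceiling.
steps_per_epoch = math.ceil(NUM_TRAIN_EXAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS

if __name__ == "__main__":
    for step in (0, total_steps // 2, total_steps):
        print(step, linear_decay_lr(step, total_steps))
```

Under these assumptions the run takes 2,105 optimizer steps per epoch and 6,315 in total, with the rate falling from 2e-05 at step 0 to zero at the final step.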

The model has a final training loss of 0.035 and an accuracy of 0.927.