bert-uncased-L2-H768-A12
This is one of 24 smaller BERT models (English only, uncased, trained with WordPiece masking) released by google-research/bert.
These BERT models were released as TensorFlow checkpoints; this is the version converted to PyTorch. More information can be found in google-research/bert or lyeoni/convert-tf-to-pytorch.
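As a rough sketch of how the converted checkpoint might be used, the snippet below assumes the Hugging Face `transformers` library and PyTorch are installed and that the converted weights sit in a local directory. The path is a placeholder, not a published location, and `parse_size`, `load_model`, and `demo` are hypothetical helper names introduced only for illustration. In the google-research/bert naming scheme, L is the number of layers, H the hidden size, and A the number of attention heads.

```python
import re


def parse_size(name):
    """Read layers (L), hidden size (H) and attention heads (A) from a model name."""
    match = re.search(r"L-?(\d+)[-_]H-?(\d+)[-_]A-?(\d+)", name)
    if match is None:
        raise ValueError(f"no L/H/A size suffix found in {name!r}")
    layers, hidden, heads = (int(group) for group in match.groups())
    return {"layers": layers, "hidden": hidden, "heads": heads}


def load_model(path):
    """Load tokenizer and model from a directory holding the converted weights."""
    # Imported lazily so parse_size stays usable without transformers installed.
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(path)
    model = AutoModel.from_pretrained(path)
    model.eval()  # inference mode: disables dropout
    return tokenizer, model


def demo(path):
    """Encode one sentence and return the shape of the final hidden states."""
    import torch

    tokenizer, model = load_model(path)
    expected = parse_size(path)
    # Sanity check: the loaded config should agree with the size in the name.
    assert model.config.num_hidden_layers == expected["layers"]
    assert model.config.hidden_size == expected["hidden"]
    assert model.config.num_attention_heads == expected["heads"]

    inputs = tokenizer("Hello, world!", return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    return tuple(outputs.last_hidden_state.shape)
```

Calling `demo("path/to/bert-uncased-L2-H768-A12")` with the placeholder replaced by a real directory should return a shape whose last dimension equals the hidden size.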
Evaluation
Here are the evaluation scores (F1/Accuracy) for the MRPC task.
Model | MRPC |
---|---|
BERT-Tiny | 81.22/68.38 |
BERT-Mini | 81.43/69.36 |
BERT-Small | 81.41/70.34 |
BERT-Medium | 83.33/73.53 |
BERT-Base | 85.62/78.19 |
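
To make the two numbers in each cell concrete, here is a minimal sketch of how F1 and accuracy can be computed for a binary paraphrase task such as MRPC. It assumes F1 is measured on the positive (paraphrase) class, as in the usual GLUE setup; the function name and the toy labels are illustrative and are not taken from the evaluation behind the table.

```python
def f1_and_accuracy(labels, predictions):
    """Return (F1 on the positive class, accuracy) for binary 0/1 labels."""
    pairs = list(zip(labels, predictions))
    if not pairs:
        raise ValueError("labels and predictions must be non-empty")

    true_pos = sum(1 for y, p in pairs if y == 1 and p == 1)
    false_pos = sum(1 for y, p in pairs if y == 0 and p == 1)
    false_neg = sum(1 for y, p in pairs if y == 1 and p == 0)
    correct = sum(1 for y, p in pairs if y == p)

    accuracy = correct / len(pairs)
    # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when there are no positives at all.
    denominator = 2 * true_pos + false_pos + false_neg
    f1 = 2 * true_pos / denominator if denominator else 0.0
    return f1, accuracy


# Toy example: three paraphrase pairs, one non-paraphrase, all predicted positive.
labels = [1, 1, 1, 0]
predictions = [1, 1, 1, 1]
f1, accuracy = f1_and_accuracy(labels, predictions)
print(f"F1={f1:.4f} accuracy={accuracy:.4f}")
```

In the toy example a model that labels every pair as a paraphrase reaches an F1 of about 0.86 with an accuracy of only 0.75, which shows how F1 can sit well above accuracy when positives dominate.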
References
@article{turc2019,
title={Well-Read Students Learn Better: On the Importance of Pre-training Compact Models},
author={Turc, Iulia and Chang, Ming-Wei and Lee, Kenton and Toutanova, Kristina},
journal={arXiv preprint arXiv:1908.08962v2},
year={2019}
}