BERT base for Dhivehi

Pretrained model on the Dhivehi language using a masked language modeling (MLM) objective.
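
Below is a minimal usage sketch with the `transformers` fill-mask pipeline. The checkpoint id `ashraq/bert-base-dv` is an assumed placeholder, not confirmed by this card; substitute the repository's actual model id.

```python
# Hedged usage sketch: the model id below is a placeholder assumption.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="ashraq/bert-base-dv")

# Placeholder input: any Dhivehi sentence containing one [MASK] token.
text = "<Dhivehi sentence with one [MASK] token>"

# Print the top predicted fillers for the masked position with their scores.
for prediction in fill_mask(text):
    print(prediction["token_str"], prediction["score"])
```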

Tokenizer

The WordPiece tokenizer is assembled from the standard pipeline components: a normalizer, a pre-tokenizer, the WordPiece model itself, and a post-processor. A sketch of such a pipeline is shown below.
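
This sketch assembles a WordPiece pipeline with the Hugging Face `tokenizers` library. The specific choices here (NFC normalization, whitespace pre-tokenization, a 30k vocabulary, the corpus file name `dv_corpus.txt`) are illustrative assumptions, not the settings documented for this model.

```python
# Hedged sketch of a WordPiece tokenizer pipeline; component settings are assumptions.
from tokenizers import Tokenizer, normalizers, pre_tokenizers, processors
from tokenizers.models import WordPiece
from tokenizers.trainers import WordPieceTrainer

tokenizer = Tokenizer(WordPiece(unk_token="[UNK]"))
tokenizer.normalizer = normalizers.NFC()            # assumption: Unicode NFC for Thaana script
tokenizer.pre_tokenizer = pre_tokenizers.Whitespace()

trainer = WordPieceTrainer(
    vocab_size=30_000,  # assumption: a typical BERT-base-sized vocabulary
    special_tokens=["[UNK]", "[CLS]", "[SEP]", "[PAD]", "[MASK]"],
)
tokenizer.train(files=["dv_corpus.txt"], trainer=trainer)  # hypothetical corpus file

# BERT-style post-processing: wrap sequences with [CLS] and [SEP].
cls_id = tokenizer.token_to_id("[CLS]")
sep_id = tokenizer.token_to_id("[SEP]")
tokenizer.post_processor = processors.TemplateProcessing(
    single="[CLS] $A [SEP]",
    pair="[CLS] $A [SEP] $B:1 [SEP]:1",
    special_tokens=[("[CLS]", cls_id), ("[SEP]", sep_id)],
)
```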

Training

Training was performed on 16M+ Dhivehi sentences/paragraphs put together by @ashraq. An Adam optimizer with weight decay (AdamW) was used with the following parameters:
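
The sketch below shows an AdamW-based MLM pretraining setup with the `transformers` `Trainer` (whose default optimizer is AdamW). The hyperparameter values, the tokenizer/model ids, and the corpus file name are illustrative assumptions standing in for the card's actual parameter list.

```python
# Hedged MLM pretraining sketch; all hyperparameters below are assumptions.
from datasets import load_dataset
from transformers import (
    BertConfig,
    BertForMaskedLM,
    BertTokenizerFast,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = BertTokenizerFast.from_pretrained("ashraq/bert-base-dv")  # placeholder id
model = BertForMaskedLM(BertConfig(vocab_size=tokenizer.vocab_size))

# Load and tokenize the raw text corpus (file name is hypothetical).
dataset = load_dataset("text", data_files={"train": "dv_corpus.txt"})["train"]
dataset = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True,
    remove_columns=["text"],
)

args = TrainingArguments(
    output_dir="bert-base-dv",
    learning_rate=1e-4,               # assumption
    weight_decay=0.01,                # assumption: AdamW-style decoupled weight decay
    per_device_train_batch_size=32,   # assumption
    num_train_epochs=1,               # assumption
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset,
    # Dynamic masking with the standard 15% MLM probability.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15),
)
trainer.train()
```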