dv-muril

This is a transfer-learning experiment that inserts Dhivehi word and word-piece tokens into Google's MuRIL model.
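As a rough illustration of what inserting tokens involves, the sketch below extends a toy vocabulary and adds one embedding row per new token. It is not the code used for this model: the function name `extend_vocab`, the toy vocabulary, the sample Dhivehi tokens, and the choice to initialize new rows near the mean of the existing rows are all assumptions made for the example. With Hugging Face Transformers, the analogous steps would typically be `tokenizer.add_tokens(...)` followed by `model.resize_token_embeddings(len(tokenizer))`.

```python
import random


def extend_vocab(vocab, embeddings, new_tokens, seed=0):
    """Add unseen tokens to `vocab` and append one embedding row for each.

    Assumes token ids are contiguous (0..n-1) and that `embeddings`
    has exactly one row per existing token.
    """
    rng = random.Random(seed)
    dim = len(embeddings[0])
    # Assumption for this sketch: start new rows near the mean of the
    # existing rows, plus a little noise, instead of purely at random.
    mean = [sum(row[i] for row in embeddings) / len(embeddings) for i in range(dim)]
    added = []
    for token in new_tokens:
        if token in vocab:
            continue  # already known (or a repeat), keep its existing id
        vocab[token] = len(vocab)
        embeddings.append([m + rng.gauss(0.0, 0.02) for m in mean])
        added.append(token)
    return added


# Toy stand-ins for a pretrained vocabulary and its embedding matrix.
vocab = {"[PAD]": 0, "[UNK]": 1, "hello": 2, "##s": 3}
embeddings = [[0.0, 0.0], [0.1, -0.1], [0.5, 0.2], [-0.3, 0.4]]

# Hypothetical Dhivehi word and word-piece tokens; "hello" is skipped.
added = extend_vocab(vocab, embeddings, ["ދިވެހި", "##ން", "hello"])
print(added, len(vocab), len(embeddings))
```

The new rows carry no learned meaning until the model is trained further on Dhivehi text, which is where the transfer-learning step comes in.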

This BERT-based model currently performs better than dv-wave ELECTRA on the Maldivian News Classification task: https://github.com/Sofwath/DhivehiDatasets

Training

Performance

CoLab notebook: https://colab.research.google.com/drive/113o6vkLZRkm6OwhTHrvE0x6QPpavj0fn