Taglish-Electra
Our Taglish-Electra model was pretrained on two Filipino datasets and one English dataset to improve performance on Filipino text that contains English, as happens when speakers code-switch between the two languages. The pretraining datasets are:
- OpenWebText (English)
- WikiText-TL-39 (Filipino)
- TLUnified Large Scale Corpus (Filipino)
This is the discriminator model, which is the main Transformer used for fine-tuning on downstream tasks. For generation, mask-filling, and retraining, refer to the Generator models.
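
As a rough illustration of fine-tuning the discriminator on a downstream task, the sketch below loads a checkpoint with a new classification head using the Hugging Face `transformers` Auto classes. It is a minimal sketch under assumptions: `MODEL_ID` is a hypothetical placeholder (this card does not state the checkpoint name), and the helper names `load_discriminator` and `encode`, the two-label setup, and the 128-token limit are illustrative choices, not settings taken from the authors.

```python
# MODEL_ID is a hypothetical placeholder: replace it with the actual Hub ID
# or local path of the Taglish-Electra discriminator checkpoint.
MODEL_ID = "path/to/taglish-electra-discriminator"


def load_discriminator(model_id: str = MODEL_ID, num_labels: int = 2):
    """Load the tokenizer and the discriminator with a fresh classification head.

    num_labels=2 is an assumed example value (e.g. binary classification).
    """
    # Imported lazily so this module can be imported without transformers installed.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # The classification head is newly initialized and must be fine-tuned.
    model = AutoModelForSequenceClassification.from_pretrained(
        model_id, num_labels=num_labels
    )
    return tokenizer, model


def encode(tokenizer, texts, max_length: int = 128):
    """Tokenize a batch of (possibly code-switched) sentences into PyTorch tensors."""
    return tokenizer(
        texts,
        padding=True,          # pad to the longest sentence in the batch
        truncation=True,       # cut sentences longer than max_length
        max_length=max_length,
        return_tensors="pt",
    )
```

With a real checkpoint path in place, `tokenizer, model = load_discriminator()` followed by `model(**encode(tokenizer, ["Ang ganda ng movie, super worth it!"]))` would return logits from the untrained head. Those logits only become meaningful after fine-tuning on labeled data.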