Transformer language model for Croatian and Serbian
Trained for two epochs (500k steps) on 6 GB of Croatian and Serbian text drawn from the Leipzig Corpus, OSCAR and srWac datasets.
Model | #params | Arch. | Training data |
---|---|---|---|
Andrija/SRoBERTa-L | 80M | Third | Leipzig Corpus, OSCAR and srWac (6 GB of text) |
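
A minimal usage sketch follows. It assumes the checkpoint is a RoBERTa-style masked language model that uses `<mask>` as its mask token and that it loads through the `transformers` fill-mask pipeline; neither is stated above. The helper names `mask_word` and `top_predictions` and the example sentence are hypothetical and only serve the illustration.

```python
MODEL_ID = "Andrija/SRoBERTa-L"  # model name as listed in the table


def mask_word(sentence: str, word: str, mask_token: str = "<mask>") -> str:
    """Replace the first occurrence of `word` with the mask token.

    The default "<mask>" is the usual RoBERTa mask token (an assumption here).
    """
    if word not in sentence:
        raise ValueError(f"{word!r} not found in sentence")
    return sentence.replace(word, mask_token, 1)


def top_predictions(masked_sentence: str, k: int = 5):
    """Return (token, score) pairs for the k most likely fills.

    Needs the `transformers` package and network access to fetch the weights.
    """
    # Imported lazily so mask_word() stays usable without transformers installed.
    from transformers import pipeline

    fill_mask = pipeline("fill-mask", model=MODEL_ID)
    results = fill_mask(masked_sentence, top_k=k)
    return [(r["token_str"], r["score"]) for r in results]


# Example (downloads the model on first call):
#   prompt = mask_word("Zagreb je glavni grad Hrvatske.", "grad")
#   for token, score in top_predictions(prompt):
#       print(token, round(score, 3))
```

If the tokenizer turns out to use a different mask token, pass it explicitly through the `mask_token` argument of `mask_word`.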