BERTSSON Models

The models are trained on:

Corpus size: roughly 6 billion tokens.

The following models are currently available:

All models are cased and trained with whole word masking.
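
For readers unfamiliar with the term, whole word masking means that when a word is split into several WordPiece tokens, all of its pieces are masked together rather than independently. The following is a minimal sketch of that idea, not the training code used for these models. The function name `whole_word_mask`, the 15% masking rate, the `##` continuation prefix, and the special-token names are assumptions borrowed from the usual BERT conventions, and the sketch omits the usual step of sometimes substituting a random token or keeping the original instead of `[MASK]`.

```python
import random

# Assumed special tokens that should never be masked (standard BERT names).
SPECIAL_TOKENS = {"[CLS]", "[SEP]", "[PAD]"}


def whole_word_mask(tokens, mask_prob=0.15, mask_token="[MASK]", seed=None):
    """Return a copy of `tokens` with whole words replaced by `mask_token`.

    A token starting with "##" is treated as a continuation of the previous
    word, so every piece of a selected word is masked together.
    """
    rng = random.Random(seed)

    # Group token indices into words, skipping special tokens.
    words = []
    for i, tok in enumerate(tokens):
        if tok in SPECIAL_TOKENS:
            continue
        if tok.startswith("##") and words:
            words[-1].append(i)
        else:
            words.append([i])

    # Number of tokens we aim to mask (at least one).
    n_candidates = sum(len(w) for w in words)
    budget = max(1, round(n_candidates * mask_prob))

    rng.shuffle(words)
    masked = list(tokens)
    n_masked = 0
    for word in words:
        if n_masked >= budget:
            break
        # Skip a word that would overshoot the budget, unless nothing
        # has been masked yet.
        if n_masked > 0 and n_masked + len(word) > budget:
            continue
        for i in word:
            masked[i] = mask_token
        n_masked += len(word)
    return masked


# Illustrative input: "unbelievable" is split into three pieces.
example = ["[CLS]", "an", "un", "##believ", "##able", "story", "[SEP]"]
print(whole_word_mask(example, mask_prob=0.3, seed=0))
```

In the output, the pieces `un`, `##believ`, and `##able` are either all replaced by `[MASK]` or all left intact; a word is never partially masked.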

Stay tuned for evaluations.