
PreTraining

| Architecture | Weights | PreTraining Loss | PreTraining Perplexity |
|---|---|---:|---:|
| roberta-base | huggingface/hub | 0.3488 | 3.992 |
| bert-base-uncased | huggingface/hub | 0.3909 | 6.122 |
| electra-large | huggingface/hub | 0.723 | 6.394 |
| albert-base | huggingface/hub | 0.7343 | 7.76 |
| electra-small | huggingface/hub | 0.9226 | 11.098 |
| electra-base | huggingface/hub | 0.9468 | 8.783 |
| distilbert-base-uncased | huggingface/hub | 1.082 | 7.963 |