xlm-roberta-longformer-base-16384
xlm-roberta-longformer is a multilingual Longformer initialized with XLM-RoBERTa's weights and given no further pretraining. It is intended to be fine-tuned on a downstream task rather than used as-is.
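
As a starting point for fine-tuning, the sketch below loads a checkpoint with a fresh sequence-classification head and runs one forward pass. It is a minimal sketch under assumptions, not the authors' recipe: `model_id` is a placeholder the caller must supply (a Hub repository id or local path), the checkpoint is assumed to ship TensorFlow weights and a config that the `transformers` auto classes can resolve, and `encode_and_classify` and `MAX_LENGTH` are hypothetical names introduced here.

```python
from typing import List

# Maximum sequence length, taken from the model name (16384 tokens).
MAX_LENGTH = 16384


def encode_and_classify(model_id: str, texts: List[str], num_labels: int = 2):
    """Load a checkpoint and return classification logits for `texts`.

    `model_id` is supplied by the caller; it is not defined in this card.
    """
    # Imported lazily so this module can be imported without the heavy dependencies.
    from transformers import AutoTokenizer, TFAutoModelForSequenceClassification

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # The classification head is newly initialized, so its outputs are
    # meaningless until the model has been fine-tuned.
    model = TFAutoModelForSequenceClassification.from_pretrained(
        model_id, num_labels=num_labels
    )

    # Truncate to the model's maximum length and return TensorFlow tensors.
    batch = tokenizer(
        texts,
        truncation=True,
        max_length=MAX_LENGTH,
        padding=True,
        return_tensors="tf",
    )
    return model(**batch).logits
```

Calling `encode_and_classify(model_id, ["some long document"])` downloads the weights on first use. The returned logits only become useful after the model has been trained on labeled data.
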
| Model | attention_window | hidden_size | num_hidden_layers | model_max_length |
|---|---|---|---|---|
| base | 256 | 768 | 12 | 16384 |
| large | 512 | 1024 | 24 | 16384 |
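
To give a feel for what `attention_window` buys at 16384 tokens, the sketch below compares the approximate number of query-key scores per attention head under local windowed attention (about n × w) with full self-attention (n × n). It is a rough estimate under stated assumptions: global-attention tokens and sequence-boundary effects are ignored, and `padded_length` assumes, as in the Hugging Face Longformer implementation, that inputs are padded up to a multiple of the attention window. The names `CONFIGS`, `padded_length`, and `attention_scores` are hypothetical helpers, not part of any library.

```python
# Values copied from the table above.
CONFIGS = {
    "base": {"attention_window": 256, "model_max_length": 16384},
    "large": {"attention_window": 512, "model_max_length": 16384},
}


def padded_length(seq_len, attention_window):
    """Round seq_len up to the next multiple of attention_window."""
    return seq_len + (-seq_len) % attention_window


def attention_scores(seq_len, attention_window=None):
    """Approximate query-key scores per head: n*w if windowed, n*n if full."""
    if attention_window is None:
        return seq_len * seq_len
    return seq_len * min(attention_window, seq_len)


for name, cfg in CONFIGS.items():
    n, w = cfg["model_max_length"], cfg["attention_window"]
    ratio = attention_scores(n) // attention_scores(n, w)
    print(f"{name}: full/local score ratio at {n} tokens = {ratio}")
```

Under these assumptions the base configuration computes roughly 64 times fewer scores than full attention at the maximum length, and the large configuration roughly 32 times fewer. A 1000-token input would be padded to 1024 tokens with a window of 256.
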
Framework versions
- Transformers 4.26.0
- TensorFlow 2.11.0
- Tokenizers 0.13.2