
xlm-roberta-longformer-base-16384

xlm-roberta-longformer is a multilingual Longformer initialized with XLM-RoBERTa's weights without further pretraining. It is intended to be fine-tuned on a downstream task.

| Model | attention_window | hidden_size | num_hidden_layers | model_max_length |
|-------|------------------|-------------|-------------------|------------------|
| base  | 256              | 768         | 12                | 16384            |
| large | 512              | 1024        | 24                | 16384            |
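The "base" row above can be expressed as a Longformer configuration. This is a minimal sketch assuming the Hugging Face `transformers` library; the vocabulary size is XLM-RoBERTa's (250002), and the extra two position embeddings follow the RoBERTa convention. No checkpoint is downloaded here.

```python
from transformers import LongformerConfig

# Configuration matching the "base" row of the table above.
config = LongformerConfig(
    attention_window=256,               # local attention window per layer
    hidden_size=768,
    num_hidden_layers=12,
    num_attention_heads=12,             # standard for a 768-dim base model
    max_position_embeddings=16384 + 2,  # model_max_length plus RoBERTa's offset
    vocab_size=250002,                  # XLM-RoBERTa vocabulary (assumption)
)
print(config.attention_window, config.hidden_size, config.num_hidden_layers)
```

The "large" row would use `attention_window=512`, `hidden_size=1024`, `num_hidden_layers=24`, and `num_attention_heads=16`.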
