This is a BART large model (400M parameters) pretrained from scratch.
Training ran for 3M steps on a clean 50GB Romanian text corpus, using these scripts. The maximum sequence length during training was 512 tokens.
!! IMPORTANT !! This model was pretrained only on the text corruption task, so it must be finetuned before it can be used on any downstream task!
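
To make the warning concrete, here is a rough sketch of what a text corruption objective looks like. It is not the training code for this model: the function name `corrupt`, the `<mask>` string, the 30% ratio, and the uniform span lengths are all assumptions chosen for illustration (the BART paper describes span lengths drawn from a Poisson distribution; uniform sampling is swapped in here to keep the example stdlib-only). The point is that the model only ever learned to map a corrupted input back to the original text, which is why it produces nothing useful for summarization, translation, or classification until it is finetuned.

```python
import random

MASK = "<mask>"  # assumed placeholder; the real tokenizer's mask token may differ


def corrupt(tokens, mask_ratio=0.3, max_span=5, seed=0):
    """Hide roughly `mask_ratio` of the tokens, one MASK per hidden span.

    Illustrative only: span lengths are uniform in [1, max_span] here,
    not the Poisson sampling described in the BART paper.
    """
    rng = random.Random(seed)
    budget = round(len(tokens) * mask_ratio)          # max tokens to hide
    start_prob = mask_ratio / ((1 + max_span) / 2)    # chance to open a span
    out, i = [], 0
    while i < len(tokens):
        if budget > 0 and rng.random() < start_prob:
            span = min(rng.randint(1, max_span), budget, len(tokens) - i)
            out.append(MASK)      # the whole span collapses to a single mask
            i += span
            budget -= span
        else:
            out.append(tokens[i])
            i += 1
    return out


# Pretraining pair: the model sees `source` and must reproduce `target`.
target = "Acesta este un exemplu de propoziție în limba română .".split()
source = corrupt(target, seed=1)
print("input :", " ".join(source))
print("target:", " ".join(target))
```

Because a whole span collapses into one mask, the model also has to infer how many tokens are missing. A finetuning dataset replaces these (corrupted, original) pairs with (task input, task output) pairs.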