

baseline-roberta_pre_layer_norm-model

Model description

Base model architecture: RoBERTa with pre-layer normalization (RoBERTa-PreLayerNorm)
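For readers unfamiliar with the term, the sketch below contrasts the post-layer-norm residual block used in the original RoBERTa with the pre-layer-norm ordering that this architecture is named after. It is a toy illustration in plain Python, not this model's code: `toy_sublayer` is a hypothetical stand-in for attention or a feed-forward layer, and the layer norm omits the learned scale and shift.

```python
import math


def layer_norm(x, eps=1e-5):
    """Normalize a vector to zero mean and unit variance (no learned scale/shift)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def toy_sublayer(x):
    """Hypothetical stand-in for an attention or feed-forward sublayer."""
    return [2.0 * v + 1.0 for v in x]


def post_ln_block(x, sublayer=toy_sublayer):
    """Post-LN ordering: add the residual first, then normalize the sum."""
    y = sublayer(x)
    return layer_norm([a + b for a, b in zip(x, y)])


def pre_ln_block(x, sublayer=toy_sublayer):
    """Pre-LN ordering: normalize the sublayer input; the residual path is left untouched."""
    y = sublayer(layer_norm(x))
    return [a + b for a, b in zip(x, y)]


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0]
    print("post-LN:", post_ln_block(x))
    print("pre-LN: ", pre_ln_block(x))
```

In the pre-LN variant the input `x` reaches the output through an unnormalized residual path, which is the structural difference the architecture name refers to.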

Training and evaluation data

BabyLM Dataset (CoNLL 2023 Workshop)

Training procedure

Masked language modeling
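As a rough illustration of the objective, the sketch below shows how inputs and labels can be built for masked language modeling. The 15% masking rate and the 80/10/10 split (mask token, random token, unchanged) follow the convention introduced with BERT and are assumptions here; this card does not state the values used for this model. The function name `mask_tokens` and the toy vocabulary are hypothetical.

```python
import random

MASK = "<mask>"  # RoBERTa-style mask token string


def mask_tokens(tokens, vocab, mask_prob=0.15, seed=None):
    """Return (inputs, labels) for masked language modeling.

    Each token is selected with probability `mask_prob` (assumed 0.15).
    A selected token is replaced by MASK 80% of the time, by a random
    vocabulary token 10% of the time, and left unchanged 10% of the time.
    Labels hold the original token at selected positions and None elsewhere,
    so the loss is computed only on selected positions.
    """
    rng = random.Random(seed)
    inputs, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            labels.append(tok)
            roll = rng.random()
            if roll < 0.8:
                inputs.append(MASK)
            elif roll < 0.9:
                inputs.append(rng.choice(vocab))
            else:
                inputs.append(tok)
        else:
            labels.append(None)
            inputs.append(tok)
    return inputs, labels


if __name__ == "__main__":
    sentence = "the cat sat on the mat".split()
    toy_vocab = ["the", "cat", "sat", "on", "mat", "dog"]
    print(mask_tokens(sentence, toy_vocab, seed=0))
```

The model is then trained to predict the original token at every position where the label is not `None`.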

Training hyperparameters

The following hyperparameters were used during training:

Framework versions