The foundation of this model is the RoBERTa-style model deepset/gbert-large. <br> Following Gururangan et al. (2020), we gathered a collection of narrative fiction and continued the model's pre-training task on it. <br> Training was performed for 10 epochs on 2.3 GB of text with a learning rate of 0.0001 (linearly decreasing) and a batch size of 512.
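
For readers who want to set up a comparable continued (domain-adaptive) pre-training run, the following is a minimal sketch using the Hugging Face `transformers` Trainer with the hyperparameters listed above. It is not the authors' original training script: the corpus file `fiction_corpus.txt`, the 15% masking probability, the 512-token sequence length, and the split of the effective batch size of 512 into a per-device batch size and gradient accumulation steps are all illustrative assumptions.

```python
from datasets import load_dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

MODEL_NAME = "deepset/gbert-large"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForMaskedLM.from_pretrained(MODEL_NAME)

# Hypothetical plain-text fiction corpus, one passage per line (placeholder path).
raw = load_dataset("text", data_files={"train": "fiction_corpus.txt"})

def tokenize(batch):
    # Truncate to 512 tokens per example; the exact sequence length is an assumption.
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = raw.map(tokenize, batched=True, remove_columns=["text"])

# Standard masked-language-modeling collator; the 15% masking rate is the
# common default, not a detail stated in the description above.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

args = TrainingArguments(
    output_dir="gbert-large-fiction",
    num_train_epochs=10,               # 10 epochs, as in the description
    learning_rate=1e-4,                # learning rate 0.0001
    lr_scheduler_type="linear",        # linearly decreasing learning rate
    per_device_train_batch_size=8,     # 8 * 64 accumulation steps = effective batch size 512
    gradient_accumulation_steps=64,    # (this particular split is an assumption)
    save_strategy="epoch",
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    data_collator=collator,
)

trainer.train()
trainer.save_model("gbert-large-fiction")
```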