

# gpt2-shakespeare

This model is a fine-tuned version of gpt2 on a text corpus of Shakespeare's works. It achieves the following results on the evaluation set:

- Loss: 2.5738

## Model description

The GPT-2 model is fine-tuned on a text corpus of Shakespeare's works so that it generates text in Shakespeare's style.

## Intended uses & limitations

The intended use of this model is to generate fiction in the style of Shakespeare. It is not suited to writing in the style of other authors.

## Dataset description

A text corpus was developed for fine-tuning the gpt-2 model. The books were downloaded from Project Gutenberg as plain text files. A large corpus was needed to train the model to be able to write in Shakespeare's style.

The following books were used to develop the text corpus:

The corpus has 1,078,389 word tokens in total.
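The card does not say how the token count was taken; since the corpus is plain text, a simple whitespace-delimited word count is a reasonable reading. A minimal sketch (the corpus string below is a stand-in for the concatenated Gutenberg files):

```python
# Count whitespace-delimited word tokens in a text corpus.
# The string here is a placeholder for the concatenated plain-text files.
corpus = "Shall I compare thee to a summer's day? Thou art more lovely"

def count_word_tokens(text: str) -> int:
    """Return the number of whitespace-delimited word tokens."""
    return len(text.split())

print(count_word_tokens(corpus))  # 12
```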

## Dataset preprocessing

## Training and evaluation data

The training dataset has 880,447 word tokens and the test dataset has 197,913 word tokens.
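These counts correspond to roughly an 82/18 train/test split. The actual split procedure is not documented in this card; a minimal sketch of a sequential split by word tokens, under that assumption:

```python
# Sketch of an ~82/18 sequential train/test split by word tokens.
# The real corpus and split procedure are not documented in this card;
# the short string below is only an illustration.
corpus = "one two three four five six seven eight nine ten"
tokens = corpus.split()

split_point = int(len(tokens) * 0.82)  # ~82% train, matching the counts above
train_tokens = tokens[:split_point]
test_tokens = tokens[split_point:]

print(len(train_tokens), len(test_tokens))  # 8 2
```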

## Training procedure

The model was trained with the `Trainer` API from the Hugging Face Transformers library.
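The training script itself is not included in this card. A minimal sketch of fine-tuning gpt2 on plain-text files with the `Trainer` API might look as follows; the file paths, block size, and epoch count are illustrative assumptions, not the values used for this model:

```python
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    TextDataset,
    Trainer,
    TrainingArguments,
)

def fine_tune(train_path: str, eval_path: str, output_dir: str = "./gpt2-shakespeare"):
    """Fine-tune gpt2 on plain-text files; all paths and hyperparameters are illustrative."""
    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    # Chunk each plain-text file into fixed-length blocks of token ids.
    train_dataset = TextDataset(tokenizer=tokenizer, file_path=train_path, block_size=128)
    eval_dataset = TextDataset(tokenizer=tokenizer, file_path=eval_path, block_size=128)

    # GPT-2 is a causal LM, so masked-LM collation is disabled.
    collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

    args = TrainingArguments(
        output_dir=output_dir,
        num_train_epochs=3,   # assumption: consistent with the ~2.5 epochs in the results table
        logging_steps=500,    # first loss is logged at step 500, as in the results table
    )

    trainer = Trainer(
        model=model,
        args=args,
        data_collator=collator,
        train_dataset=train_dataset,
        eval_dataset=eval_dataset,
    )
    trainer.train()
    trainer.save_model(output_dir)

if __name__ == "__main__":
    # fine_tune("train.txt", "test.txt")  # paths are placeholders
    pass
```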

### Training hyperparameters

The following hyperparameters were used during training:

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| No log        | 0.63  | 250  | 2.7133          |
| 2.8492        | 1.25  | 500  | 2.6239          |
| 2.8492        | 1.88  | 750  | 2.5851          |
| 2.3842        | 2.51  | 1000 | 2.5738          |
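The validation loss reported by the Trainer is the average cross-entropy per token in nats, so the corresponding perplexity is exp(loss). For the final checkpoint:

```python
import math

# Perplexity corresponding to the final validation loss of 2.5738.
val_loss = 2.5738
perplexity = math.exp(val_loss)
print(round(perplexity, 2))  # ≈ 13.12
```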

## Sample code using the Transformers pipeline

```python
from transformers import pipeline

# Load the fine-tuned model from the local directory, reusing the base gpt2 tokenizer
story = pipeline('text-generation', model='./gpt2-shakespeare', tokenizer='gpt2', max_length=300)
story("how art thou")
```

### Framework versions