deep-haiku-gpt-2

This model is a fine-tuned version of gpt2 on the haiku dataset.

Model description

The model is a fine-tuned version of GPT-2 for generating haikus. The model, data, and training procedure are inspired by a blog post by Robert A. Gonsalves. Instead of an 8-bit version of GPT-J 6B, we used vanilla GPT-2. From what we saw, its performance is comparable, and it is much easier to fine-tune.

We used the same multitask training approach as in the post but significantly extended the dataset (to almost double the size of the original one). A prepared version of the dataset can be found here.

Intended uses & limitations

The model is intended to generate haikus. To do so, it was trained using a multitask learning approach (see Caruana 1997) with the following four different tasks:

To use the model, provide an appropriate prompt such as "(dog rain =" and let the model generate a haiku for the given keywords.
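As a minimal sketch of the prompting step, the snippet below builds a prompt in the "(dog rain =" shape and passes it to a `transformers` text-generation pipeline. The helper names, the `MODEL_ID` placeholder, and the sampling settings (`max_new_tokens`, `top_p`) are assumptions for illustration and are not taken from this card; replace `MODEL_ID` with the actual Hub ID or local path of the checkpoint.

```python
def build_prompt(keywords):
    """Format keywords like the card's example prompt, e.g. "(dog rain ="."""
    return "(" + " ".join(keywords) + " ="


def generate_haiku(keywords, model_id, max_new_tokens=40):
    """Generate text for the keyword prompt (sampling settings are assumptions)."""
    # Imported lazily so build_prompt works without transformers installed.
    from transformers import pipeline

    generator = pipeline("text-generation", model=model_id)
    prompt = build_prompt(keywords)
    outputs = generator(
        prompt,
        max_new_tokens=max_new_tokens,
        do_sample=True,
        top_p=0.9,
    )
    # The pipeline returns a list of dicts; the text includes the prompt itself.
    return outputs[0]["generated_text"]


if __name__ == "__main__":
    # Placeholder: not a real identifier, set this to the checkpoint you use.
    MODEL_ID = "path-or-hub-id-of-deep-haiku-gpt-2"
    print(generate_haiku(["dog", "rain"], MODEL_ID))
```

Because sampling is enabled, repeated calls with the same keywords will generally produce different haikus.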

Training and evaluation data

We used a collection of existing haikus for training. All haikus were used in both their grapheme form and a phoneme form. In addition, we extracted keywords for all haikus using KeyBERT and filtered out haikus with low text quality according to the GRUEN score.
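The keyword extraction and quality filtering described above could look roughly like the sketch below. It is not the authors' preprocessing code: `top_n=2`, the single-word n-gram range, and the English stop-word list are assumptions, and `score_fn` and `min_score` are hypothetical stand-ins for a GRUEN scorer and its cutoff, neither of which is specified in this card.

```python
def extract_keywords(haiku, kw_model=None, top_n=2):
    """Return the top_n single-word keywords for one haiku using KeyBERT."""
    if kw_model is None:
        # Imported lazily so the filter below works without keybert installed.
        from keybert import KeyBERT

        kw_model = KeyBERT()
    pairs = kw_model.extract_keywords(
        haiku,
        keyphrase_ngram_range=(1, 1),
        stop_words="english",
        top_n=top_n,
    )
    # KeyBERT returns (keyword, score) tuples; keep only the keywords.
    return [keyword for keyword, _score in pairs]


def filter_by_quality(haikus, score_fn, min_score):
    """Keep haikus whose quality score is at least min_score.

    score_fn is a hypothetical callable standing in for a GRUEN scorer.
    """
    return [haiku for haiku in haikus if score_fn(haiku) >= min_score]
```

Passing one shared `kw_model` instance when processing many haikus avoids reloading the underlying embedding model on every call.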

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

Training results

Framework versions