
Model description

This model is a fine-tuned version of flax-community/gpt-2-spanish, trained on a custom dataset that is not publicly available. The dataset was crawled from 3 Spanish cooking websites and contains approximately 50,000 recipes. Validation loss per epoch is reported under Training results.

Contributors

How to use it

from transformers import AutoTokenizer, AutoModelForCausalLM

model_checkpoint = 'gastronomia-para-to2/gastronomia_para_to2'
tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)
model = AutoModelForCausalLM.from_pretrained(model_checkpoint)

The tokenizer makes use of the following special tokens to indicate the structure of the recipe:

special_tokens = [
'<INPUT_START>',
'<NEXT_INPUT>',
'<INPUT_END>',
'<TITLE_START>',
'<TITLE_END>',
'<INGR_START>',
'<NEXT_INGR>',
'<INGR_END>',
'<INSTR_START>',
'<NEXT_INSTR>',
'<INSTR_END>',
'<RECIPE_START>',
'<RECIPE_END>']
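As a rough illustration of how these tokens delimit a recipe, the sketch below pulls one section out of a decoded string. The helper name `extract_section` and the sample text are hypothetical. It assumes each section is wrapped as `<X_START> item <NEXT_X> item <X_END>`, which the token names suggest but this card does not spell out.

```python
def extract_section(text, prefix):
    """Return the items between <PREFIX_START> and <PREFIX_END>, split on <NEXT_PREFIX>."""
    start_tok = f"<{prefix}_START>"
    end_tok = f"<{prefix}_END>"
    next_tok = f"<NEXT_{prefix}>"

    start = text.find(start_tok)
    if start == -1:
        return []  # section not present in this text
    start += len(start_tok)

    # If the closing token is missing, take everything after the opening token
    end = text.find(end_tok, start)
    body = text[start:] if end == -1 else text[start:end]

    # Sections without a NEXT token (e.g. TITLE) come back as a single item
    return [item.strip() for item in body.split(next_tok) if item.strip()]


# Hypothetical decoded text, made up for illustration only
sample = ("<INGR_START> 2 huevos <NEXT_INGR> 1 patata <INGR_END> "
          "<TITLE_START> Tortilla <TITLE_END>")
ingredients = extract_section(sample, "INGR")
title = extract_section(sample, "TITLE")
print(ingredients)
print(title)
```

The same helper could be called with `"INPUT"` or `"INSTR"`, assuming those sections follow the same wrapping pattern.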

The input should be of the form:

<RECIPE_START> <INPUT_START> ingredient_1 <NEXT_INPUT> ingredient_2 <NEXT_INPUT> ... <NEXT_INPUT> ingredient_n <INPUT_END> <INGR_START>
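A minimal sketch of assembling that prompt from a list of ingredient names is shown here. The function name `build_prompt` and the example ingredients are hypothetical; only the token layout comes from the format above.

```python
def build_prompt(ingredients):
    """Join ingredient names into the prompt layout described above."""
    if not ingredients:
        raise ValueError("at least one ingredient is required")
    # Ingredients are separated by <NEXT_INPUT>; the prompt ends with <INGR_START>
    joined = " <NEXT_INPUT> ".join(item.strip() for item in ingredients)
    return f"<RECIPE_START> <INPUT_START> {joined} <INPUT_END> <INGR_START>"


# Example ingredients chosen for illustration
input_text = build_prompt(["huevos", "patatas", "cebolla"])
print(input_text)
```

The resulting string is what gets passed to the tokenizer before generation.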

We are using the following configuration to generate recipes, but feel free to change parameters as needed:

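# input_text is the prompt string in the format described above
tokenized_input = tokenizer(input_text, return_tensors='pt')
output = model.generate(**tokenized_input,
                        max_length=600,
                        do_sample=True,
                        top_p=0.92,
                        top_k=50,
                        num_return_sequences=3)
# Decode the first of the three sampled sequences, keeping the special tokens
# so that the recipe structure can still be recovered
pre_output = tokenizer.decode(output[0], skip_special_tokens=False)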

The recipe ends where the <RECIPE_END> special token appears for the first time.
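One way to apply that rule to a decoded string is sketched below. The helper name `truncate_recipe` and the sample text are hypothetical, and returning the text unchanged when no `<RECIPE_END>` is found is an assumption rather than documented behavior.

```python
RECIPE_END = "<RECIPE_END>"


def truncate_recipe(decoded):
    """Keep the text up to and including the first <RECIPE_END>, if present."""
    end = decoded.find(RECIPE_END)
    if end == -1:
        # Assumption: no end token (e.g. generation stopped early), so keep everything
        return decoded
    return decoded[:end + len(RECIPE_END)]


# Hypothetical decoded output, made up for illustration only
sample = "<RECIPE_START> receta <RECIPE_END> texto sobrante <RECIPE_END>"
recipe = truncate_recipe(sample)
print(recipe)
```

Anything the model keeps sampling after the first end token is discarded, which matters here because generation runs up to a fixed maximum length.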

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

Training results

Training Loss   Epoch   Step    Validation Loss
0.6213          1.0     5897    0.6214
0.5905          2.0     11794   0.5995
0.5777          3.0     17691   0.5893
0.574           4.0     23588   0.5837
0.5553          5.0     29485   0.5807
0.5647          6.0     35382   0.5796

Framework versions

References

The list of special tokens used to structure the generated recipes was taken from: RecipeNLG: A Cooking Recipes Dataset for Semi-Structured Text Generation.