Model description
This model is a fine-tuned version of flax-community/gpt-2-spanish on a custom dataset (not publicly available). The dataset consists of data crawled from 3 Spanish cooking websites and contains approximately 50,000 recipes. It achieves the following results on the evaluation set:
- Loss: 0.5796
Contributors
- Julián Cendrero (jucendrero)
- Silvia Duque (silBERTa)
How to use it
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the fine-tuned tokenizer and model from the Hugging Face Hub.
model_checkpoint = 'gastronomia-para-to2/gastronomia_para_to2'
tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)
model = AutoModelForCausalLM.from_pretrained(model_checkpoint)
The tokenizer makes use of the following special tokens to indicate the structure of the recipe:
special_tokens = [
    '<INPUT_START>', '<NEXT_INPUT>', '<INPUT_END>',
    '<TITLE_START>', '<TITLE_END>',
    '<INGR_START>', '<NEXT_INGR>', '<INGR_END>',
    '<INSTR_START>', '<NEXT_INSTR>', '<INSTR_END>',
    '<RECIPE_START>', '<RECIPE_END>',
]
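These tokens are already registered in the published tokenizer. As an optional sanity check (a minimal sketch, assuming only the checkpoint loaded above), you can verify that each token resolves to a known vocabulary id rather than being split into sub-word pieces:

# Each structure token should map to a known id; the unknown-token id
# would indicate a token missing from the vocabulary.
ids = tokenizer.convert_tokens_to_ids(special_tokens)
assert tokenizer.unk_token_id not in ids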
The input should be of the form:
<RECIPE_START> <INPUT_START> ingredient_1 <NEXT_INPUT> ingredient_2 <NEXT_INPUT> ... <NEXT_INPUT> ingredient_n <INPUT_END> <INGR_START>
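For example, a prompt for a recipe with three ingredients can be assembled like this (the ingredient names are purely illustrative):

# Illustrative ingredient list; replace with your own.
ingredients = ['tomate', 'cebolla', 'aceite de oliva']
input = ('<RECIPE_START> <INPUT_START> '
         + ' <NEXT_INPUT> '.join(ingredients)
         + ' <INPUT_END> <INGR_START>')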
We use the following generation configuration, but feel free to adjust the parameters as needed:
tokenized_input = tokenizer(input, return_tensors='pt')
# Sample up to 600 tokens using nucleus (top-p) and top-k sampling.
output = model.generate(**tokenized_input,
                        max_length=600,
                        do_sample=True,
                        top_p=0.92,
                        top_k=50,
                        num_return_sequences=3)
# Decode the first of the three returned candidate sequences.
pre_output = tokenizer.decode(output[0], skip_special_tokens=False)
The generated recipe ends at the first occurrence of the <RECIPE_END> special token.
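A minimal post-processing sketch under that assumption (if the token never appears, generation was cut off by max_length and the text is kept as-is):

# Truncate the decoded text at the first <RECIPE_END> marker, if present.
end_idx = pre_output.find('<RECIPE_END>')
recipe = pre_output if end_idx == -1 else pre_output[:end_idx + len('<RECIPE_END>')]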
Training procedure
Training hyperparameters
The following hyperparameters were used during training (a hypothetical TrainingArguments sketch reproducing them follows the list):
- learning_rate: 2e-05
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 8
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 6
- mixed_precision_training: Native AMP
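For reference, a hypothetical transformers.TrainingArguments configuration reproducing these values is sketched below. The output directory is an assumption, not taken from the original training script, and the Adam betas/epsilon listed above match the library defaults, so they are not set explicitly:

from transformers import TrainingArguments

# Hypothetical reconstruction of the configuration above;
# not the original training script.
training_args = TrainingArguments(
    output_dir='./gastronomia_para_to2',  # assumed path
    learning_rate=2e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=8,  # total train batch size: 1 x 8 = 8
    num_train_epochs=6,
    lr_scheduler_type='linear',
    seed=42,
    fp16=True,  # native AMP mixed precision
)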
Training results
| Training Loss | Epoch | Step  | Validation Loss |
|---------------|-------|-------|-----------------|
| 0.6213        | 1.0   | 5897  | 0.6214          |
| 0.5905        | 2.0   | 11794 | 0.5995          |
| 0.5777        | 3.0   | 17691 | 0.5893          |
| 0.574         | 4.0   | 23588 | 0.5837          |
| 0.5553        | 5.0   | 29485 | 0.5807          |
| 0.5647        | 6.0   | 35382 | 0.5796          |
Framework versions
- Transformers 4.17.0
- Pytorch 1.11.0+cu102
- Datasets 2.0.0
- Tokenizers 0.11.6
References
The list of special tokens used to mark up the recipe structure was taken from RecipeNLG: A Cooking Recipes Dataset for Semi-Structured Text Generation.