Model description
This model is a fine-tuned version of flax-community/gpt-2-spanish on a custom dataset (not publicly available). The dataset consists of data crawled from 3 Spanish cooking websites and contains approximately 50,000 recipes. It achieves the following results on the evaluation set:
- Loss: 0.5796
Contributors
- Julián Cendrero (jucendrero)
- Silvia Duque (silBERTa)
How to use it
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the fine-tuned tokenizer and model from the Hugging Face Hub.
model_checkpoint = 'gastronomia-para-to2/gastronomia_para_to2'
tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)
model = AutoModelForCausalLM.from_pretrained(model_checkpoint)
The tokenizer makes use of the following special tokens to indicate the structure of the recipe:
special_tokens = [
    '<INPUT_START>', '<NEXT_INPUT>', '<INPUT_END>',
    '<TITLE_START>', '<TITLE_END>',
    '<INGR_START>', '<NEXT_INGR>', '<INGR_END>',
    '<INSTR_START>', '<NEXT_INSTR>', '<INSTR_END>',
    '<RECIPE_START>', '<RECIPE_END>',
]
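These tokens are already registered in the published tokenizer. As an optional sanity check (a minimal sketch, assuming only the checkpoint loaded above), you can verify that each token resolves to a known vocabulary id rather than being split into sub-word pieces:

# Each structure token should map to a known id; the unknown-token id
# would indicate a token missing from the vocabulary.
ids = tokenizer.convert_tokens_to_ids(special_tokens)
assert tokenizer.unk_token_id not in ids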
The input should be of the form:
<RECIPE_START> <INPUT_START> ingredient_1 <NEXT_INPUT> ingredient_2 <NEXT_INPUT> ... <NEXT_INPUT> ingredient_n <INPUT_END> <INGR_START>
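For example, a prompt for a recipe with three ingredients can be assembled like this (the ingredient names are purely illustrative):

# Illustrative ingredient list; replace with your own.
ingredients = ['tomate', 'cebolla', 'aceite de oliva']
input = ('<RECIPE_START> <INPUT_START> '
         + ' <NEXT_INPUT> '.join(ingredients)
         + ' <INPUT_END> <INGR_START>')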
We use the following generation configuration, but feel free to adjust the parameters as needed:
tokenized_input = tokenizer(input, return_tensors='pt')
# Sample up to 600 tokens using nucleus (top-p) and top-k sampling.
output = model.generate(**tokenized_input,
                        max_length=600,
                        do_sample=True,
                        top_p=0.92,
                        top_k=50,
                        num_return_sequences=3)
# Decode the first of the three returned candidate sequences.
pre_output = tokenizer.decode(output[0], skip_special_tokens=False)
The generated recipe ends at the first occurrence of the <RECIPE_END> special token.
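A minimal post-processing sketch under that assumption (if the token never appears, generation was cut off by max_length and the text is kept as-is):

# Truncate the decoded text at the first <RECIPE_END> marker, if present.
end_idx = pre_output.find('<RECIPE_END>')
recipe = pre_output if end_idx == -1 else pre_output[:end_idx + len('<RECIPE_END>')]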
Training procedure
Training hyperparameters
The following hyperparameters were used during training (a hypothetical TrainingArguments sketch reproducing them follows the list):
- learning_rate: 2e-05
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 8
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 6
- mixed_precision_training: Native AMP
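For reference, a hypothetical transformers.TrainingArguments configuration reproducing these values is sketched below. The output directory is an assumption, not taken from the original training script, and the Adam betas/epsilon listed above match the library defaults, so they are not set explicitly:

from transformers import TrainingArguments

# Hypothetical reconstruction of the configuration above;
# not the original training script.
training_args = TrainingArguments(
    output_dir='./gastronomia_para_to2',  # assumed path
    learning_rate=2e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=8,  # total train batch size: 1 x 8 = 8
    num_train_epochs=6,
    lr_scheduler_type='linear',
    seed=42,
    fp16=True,  # native AMP mixed precision
)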
Training results
| Training Loss | Epoch | Step  | Validation Loss |
|---------------|-------|-------|-----------------|
| 0.6213        | 1.0   | 5897  | 0.6214          |
| 0.5905        | 2.0   | 11794 | 0.5995          |
| 0.5777        | 3.0   | 17691 | 0.5893          |
| 0.574         | 4.0   | 23588 | 0.5837          |
| 0.5553        | 5.0   | 29485 | 0.5807          |
| 0.5647        | 6.0   | 35382 | 0.5796          |
Framework versions
- Transformers 4.17.0
- Pytorch 1.11.0+cu102
- Datasets 2.0.0
- Tokenizers 0.11.6
References
The list of special tokens used to mark up the recipe structure was taken from RecipeNLG: A Cooking Recipes Dataset for Semi-Structured Text Generation.