poem-gen-spanish-t5-small
This model is a fine-tuned version of flax-community/spanish-t5-small on the Spanish Poetry Dataset.
The model was created during the First Spanish Hackathon, organized by Somos NLP.
The participating team was composed of:
- 🇨🇺 Alberto Carmona Barthelemy
- 🇨🇴 Jorge Henao
- 🇪🇸 Andrea Morales Garzón
- 🇮🇳 Drishti Sharma
It achieves the following results on the evaluation set:
- Loss: 2.8707
- Perplexity: 17.65
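The reported perplexity follows directly from the loss, since perplexity is the exponential of the cross-entropy:

```python
import math

# Perplexity = exp(cross-entropy loss): exp(2.8707) ≈ 17.65
print(round(math.exp(2.8707), 2))
```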
Model description
The model was trained to generate Spanish poems conditioned on parameters such as style, sentiment, words to include, and a starting phrase.
Example:
```
poema:
estilo: Pablo Neruda &&
sentimiento: positivo &&
palabras: cielo, luna, mar &&
texto: Todos fueron a verle pasar
```
How to use
You can use this model directly for conditional text-to-text generation:
```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = 'hackathon-pln-es/poem-gen-spanish-t5-small'
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Conditioning parameters: style (author), sentiment, keywords, and starting text
author, sentiment, word, start_text = 'Pablo Neruda', 'positivo', 'cielo', 'Todos fueron a la plaza'
input_text = f"""poema: estilo: {author} && sentimiento: {sentiment} && palabras: {word} && texto: {start_text} """

inputs = tokenizer(input_text, return_tensors="pt")
# Sample a continuation; repetition_penalty discourages repeated verses
outputs = model.generate(inputs["input_ids"],
                         do_sample=True,
                         max_length=30,
                         repetition_penalty=20.0,
                         top_k=50,
                         top_p=0.92)
detok_outputs = [tokenizer.decode(x, skip_special_tokens=True) for x in outputs]
res = detok_outputs[0]
```
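Equivalently, generation can go through the text2text-generation pipeline; this is a minimal sketch reusing the same prompt format and sampling parameters:

```python
from transformers import pipeline

generator = pipeline('text2text-generation',
                     model='hackathon-pln-es/poem-gen-spanish-t5-small')

prompt = ("poema: estilo: Pablo Neruda && sentimiento: positivo && "
          "palabras: cielo && texto: Todos fueron a la plaza ")
# Sampling arguments are forwarded to model.generate()
res = generator(prompt, do_sample=True, max_length=30,
                repetition_penalty=20.0, top_k=50, top_p=0.92)[0]['generated_text']
print(res)
```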
Training and evaluation data
The original dataset has the columns author, content and title.
For each poem, we generate new examples with a sliding window over its lines (a sketch of this step follows the list):
- content: line_i, generated: line_i+1
- content: concatenate(line_i, line_i+1), generated: line_i+2
- content: concatenate(line_i, line_i+1, line_i+2), generated: line_i+3
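A minimal sketch of that windowing, assuming each poem arrives as a list of lines (the build_examples helper is illustrative, not the exact hackathon script):

```python
def build_examples(author, title, lines, max_context=3):
    """Illustrative helper: pair each window of up to `max_context`
    consecutive lines with the line that follows it."""
    examples = []
    for size in range(1, max_context + 1):
        for i in range(len(lines) - size):
            examples.append({
                'author': author,
                'title': title,
                'content': '\n'.join(lines[i:i + size]),  # line_i ... line_{i+size-1}
                'generated': lines[i + size],             # the next line to predict
            })
    return examples
```

For a four-line poem this yields 3 + 2 + 1 = 6 examples.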
The resulting dataset has the columns author, content, title and generated.
For each example we compute the sentiment of the generated column and extract its nouns. For sentiment, we used the model mrm8488/electricidad-small-finetuned-restaurant-sentiment-analysis, and for noun extraction we used spaCy.
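A sketch of that annotation step, assuming a Spanish spaCy pipeline such as es_core_news_sm is installed (the annotate helper is illustrative):

```python
import spacy
from transformers import pipeline

nlp = spacy.load('es_core_news_sm')  # assumption: any Spanish spaCy pipeline works
sentiment = pipeline('sentiment-analysis',
                     model='mrm8488/electricidad-small-finetuned-restaurant-sentiment-analysis')

def annotate(generated_line):
    """Illustrative: extract the nouns and sentiment label for one generated line."""
    nouns = [tok.text for tok in nlp(generated_line) if tok.pos_ == 'NOUN']
    label = sentiment(generated_line)[0]['label']
    return {'palabras': nouns, 'sentimiento': label}
```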
Training procedure
Training hyperparameters
The following hyperparameters were used during training (see the sketch after this list):
- learning_rate: 2e-05
- train_batch_size: 6
- eval_batch_size: 6
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 6
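These settings map onto 🤗 Transformers training arguments roughly as below; this is a sketch under the assumption that the Trainer API was used, and output_dir is a placeholder:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir='./poem-gen-spanish-t5-small',  # placeholder path
    learning_rate=2e-5,
    per_device_train_batch_size=6,
    per_device_eval_batch_size=6,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type='linear',
    num_train_epochs=6,
)
```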
Training results
| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| 2.7082 | 0.73 | 30000 | 2.8878 |
| 2.6251 | 1.46 | 60000 | 2.8940 |
| 2.5796 | 2.19 | 90000 | 2.8853 |
| 2.5556 | 2.93 | 120000 | 2.8749 |
| 2.527 | 3.66 | 150000 | 2.8850 |
| 2.5024 | 4.39 | 180000 | 2.8760 |
| 2.4887 | 5.12 | 210000 | 2.8749 |
| 2.4808 | 5.85 | 240000 | 2.8707 |
Framework versions
- Transformers 4.17.0
- Pytorch 1.10.0+cu111
- Datasets 2.0.0
- Tokenizers 0.11.6