text-generation-inference

T5S (base-sized model)

T5S model pre-trained on Spanish language. It was introduced in the paper Sequence-to-Sequence Spanish Pre-trained Language Models.

Model description

T5S is a T5 Version 1.1 model (transformer encoder-decoder) with a bidirectional (BERT-like) encoder and an autoregressive (GPT-like) decoder, which includes the following improvements compared to the original T5 model:

T5S is particularly effective when fine-tuned for text generation (e.g. summarization, translation) or comprehension tasks (e.g. text classification, question answering) using text-to-text format.

How to use

Here is how to use this model in PyTorch:

from transformers import T5Tokenizer, T5Model

tokenizer = T5Tokenizer.from_pretrained("vgaraujov/t5-base-spanish")
model = T5Model.from_pretrained("vgaraujov/t5-base-spanish")

input_ids = tokenizer(
    "Estudios han demostrado que tener un perro es bueno para la salud", return_tensors="pt"
).input_ids  # Batch size 1
decoder_input_ids = tokenizer("Estudios demuestran que", return_tensors="pt").input_ids  # Batch size 1

# forward pass
outputs = model(input_ids=input_ids, decoder_input_ids=decoder_input_ids)
last_hidden_states = outputs.last_hidden_state

Citation (BibTeX)

@misc{araujo2023sequencetosequence,
      title={Sequence-to-Sequence Spanish Pre-trained Language Models}, 
      author={Vladimir Araujo and Maria Mihaela Trusca and Rodrigo TufiƱo and Marie-Francine Moens},
      year={2023},
      eprint={2309.11259},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}