question-answering

This model was trained on the SQAC (Spanish Question Answering Corpus) dataset, provided by the Barcelona Supercomputing Center (BSC). It is an extractive question-answering dataset originally developed in Spanish. The model itself is a fine-tuned version of MarIA-Roberta, a Spanish RoBERTa also developed by BSC under the MarIA project.

To train the model, we followed the recommendations of the dataset's own authors in their paper: we performed a full grid search over the hyperparameter space provided there and selected the best model based on eval_loss.
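
As a rough illustration of that procedure, here is a minimal grid-search sketch using the transformers Trainer. The hyperparameter values are hypothetical (not the grid from the paper), and `train_ds` / `eval_ds` stand for an already-tokenized SQAC train/validation split that you would prepare beforehand.

```python
import itertools

from transformers import (
    RobertaForQuestionAnswering,
    Trainer,
    TrainingArguments,
)

# Hypothetical grid; the actual values come from the paper.
LEARNING_RATES = [1e-5, 3e-5, 5e-5]
BATCH_SIZES = [16, 32]

best_loss, best_dir = float("inf"), None
for lr, bs in itertools.product(LEARNING_RATES, BATCH_SIZES):
    # Start each run from the MarIA-Roberta base checkpoint.
    model = RobertaForQuestionAnswering.from_pretrained("PlanTL-GOB-ES/roberta-base-bne")
    args = TrainingArguments(
        output_dir=f"sqac-lr{lr}-bs{bs}",
        learning_rate=lr,
        per_device_train_batch_size=bs,
        num_train_epochs=3,
        evaluation_strategy="epoch",
    )
    # `train_ds` and `eval_ds` are assumed to be tokenized SQAC splits with
    # start_positions/end_positions columns.
    trainer = Trainer(model=model, args=args, train_dataset=train_ds, eval_dataset=eval_ds)
    trainer.train()
    eval_loss = trainer.evaluate()["eval_loss"]
    if eval_loss < best_loss:
        best_loss, best_dir = eval_loss, args.output_dir
```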

You can use the model like this:

```python
import torch
from transformers import RobertaTokenizer, RobertaForQuestionAnswering

tokenizer = RobertaTokenizer.from_pretrained("IIC/roberta-base-spanish-sqac")
model = RobertaForQuestionAnswering.from_pretrained("IIC/roberta-base-spanish-sqac")

# Question: "Who is Luke Skywalker's father?"
# Context: "In the famous movie, Darth Vader says to Luke Skywalker that line
# we all remember: I am your father."
question = "¿Quién es el padre de Luke Skywalker?"
text = "En la famosa película, Darth Vader le dice a Luke Skywalker aquella frase que todos recordamos: yo soy tu padre."
inputs = tokenizer(question, text, return_tensors="pt")

# Dummy gold labels (token indices of the answer span); passing them makes the
# model also return a training loss.
start_positions = torch.tensor([1])
end_positions = torch.tensor([3])

outputs = model(**inputs, start_positions=start_positions, end_positions=end_positions)
loss = outputs.loss
start_scores = outputs.start_logits
end_scores = outputs.end_logits
```
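
At inference time you do not need the gold positions: take the argmax of the start and end logits and decode that token span. A minimal sketch (it does not guard against a predicted end earlier than the predicted start):

```python
# Most likely answer span according to the model.
answer_start = int(torch.argmax(start_scores))
answer_end = int(torch.argmax(end_scores)) + 1
answer = tokenizer.decode(inputs["input_ids"][0][answer_start:answer_end])
print(answer)
```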

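Alternatively, the high-level pipeline API wraps tokenization, inference, and span decoding in a single call:

```python
from transformers import pipeline

qa = pipeline("question-answering", model="IIC/roberta-base-spanish-sqac")
result = qa(
    question="¿Quién es el padre de Luke Skywalker?",
    context="En la famosa película, Darth Vader le dice a Luke Skywalker aquella frase que todos recordamos: yo soy tu padre.",
)
print(result["answer"], result["score"])
```
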
Contributions

Thanks to @avacaondata, @alborotis, @albarji, @Dabs, @GuillemGSubies for adding this model.