BERTimbau large for Recognizing Textual Entailment

This is the neuralmind/bert-large-portuguese-cased model (BERTimbau large) fine-tuned for Recognizing Textual Entailment (RTE) on the ASSIN dataset. It is intended for Portuguese text.


Full classification example

from transformers import AutoModelForSequenceClassification, AutoTokenizer, AutoConfig
import numpy as np
import torch
from scipy.special import softmax

model_name = "ruanchaves/bert-large-portuguese-cased-assin-entailment"
s1 = "Os homens estão cuidadosamente colocando as malas no porta-malas de um carro."
s2 = "Os homens estão colocando bagagens dentro do porta-malas de um carro."
model = AutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
config = AutoConfig.from_pretrained(model_name)
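# Encode the two sentences as a single sentence-pair input.
model_input = tokenizer([s1], [s2], padding=True, return_tensors="pt")
with torch.no_grad():
    output = model(**model_input)
# Logits for the first (and only) pair in the batch.
scores = output.logits[0].numpy()
# Convert logits to probabilities and rank labels from most to least likely.
scores = softmax(scores)
ranking = np.argsort(scores)[::-1]
for rank, label_id in enumerate(ranking, start=1):
    label = config.id2label[int(label_id)]
    score = scores[label_id]
    print(f"{rank}) Label: {label} Score: {np.round(float(score), 4)}")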
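
The ranking step in the example can be tried on its own, without downloading the model. The following is a minimal sketch, assuming a hypothetical helper named rank_labels and a made-up label mapping and logits; the real mapping should be read from config.id2label, and the values here are not outputs of this model.

```python
import numpy as np

def rank_labels(logits, id2label):
    """Return (label, probability) pairs sorted from most to least likely."""
    logits = np.asarray(logits, dtype=np.float64)
    # Numerically stable softmax: subtract the max before exponentiating.
    exp = np.exp(logits - logits.max())
    probs = exp / exp.sum()
    # Indices of the probabilities in descending order.
    order = np.argsort(probs)[::-1]
    return [(id2label[int(i)], float(probs[i])) for i in order]

# Hypothetical label mapping and logits, for illustration only.
example_id2label = {0: "NONE", 1: "ENTAILMENT", 2: "PARAPHRASE"}
example_logits = [0.2, 2.5, -1.0]

for rank, (label, prob) in enumerate(rank_labels(example_logits, example_id2label), start=1):
    print(f"{rank}) Label: {label} Score: {prob:.4f}")
```

With a real model, the logits would come from output[0][0] and the mapping from config.id2label, as in the full example.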


Our research is ongoing, and we are currently describing our experiments in a paper that will be published soon. In the meantime, if you would like to cite our work or models before the paper is published, please cite our GitHub repository:

author = {Chaves Rodrigues, Ruan and Tanti, Marc and Agerri, Rodrigo},
doi = {10.5281/zenodo.7781848},
month = {3},
title = {{Evaluation of Portuguese Language Models}},
url = {},
version = {1.0.0},
year = {2023}