CardioBERTpt - Portuguese Transformer-based Models for Clinical Language Representation in Cardiology

This model card describes CardioBERTpt, a clinical language model for the cardiology domain, intended for named entity recognition (NER) tasks in Portuguese. It is a fine-tuned version of bert-base-multilingual-cased, trained on a cardiology text dataset. It achieves the following results on the evaluation set:

Loss: 0.4495
Accuracy: 0.8864

How to use the model

Load the model via the transformers library:

from transformers import AutoTokenizer, AutoModel

# Download (or load from the local cache) the tokenizer and encoder weights
tokenizer = AutoTokenizer.from_pretrained("pucpr-br/cardiobertpt")
model = AutoModel.from_pretrained("pucpr-br/cardiobertpt")
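
Once loaded, the encoder can be used to produce contextual representations of clinical text. The following is a minimal sketch, not the authors' pipeline: the helper name `embed_sentences`, the example sentence, and the choice of masked mean pooling over the last hidden layer are assumptions made for illustration. For NER, a token-classification head would still need to be trained on labeled data.

```python
def embed_sentences(sentences, model_name="pucpr-br/cardiobertpt"):
    """Return one mean-pooled embedding per sentence (hypothetical helper)."""
    # Imports are kept local so the module can be imported without
    # triggering a model download.
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)
    model.eval()  # disable dropout for inference

    inputs = tokenizer(
        sentences, padding=True, truncation=True, return_tensors="pt"
    )
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # (batch, seq_len, hidden)

    # Assumption: masked mean pooling, so padding tokens are ignored.
    mask = inputs["attention_mask"].unsqueeze(-1).to(hidden.dtype)
    summed = (hidden * mask).sum(dim=1)
    counts = mask.sum(dim=1).clamp(min=1)
    return summed / counts


if __name__ == "__main__":
    # Illustrative Portuguese sentence, not taken from the training data.
    vectors = embed_sentences(["Paciente com dor torácica e dispneia aos esforços."])
    print(vectors.shape)
```

The returned tensor has one row per input sentence, with a width equal to the encoder's hidden size.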

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 1e-05
train_batch_size: 4
eval_batch_size: 4
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 15.0
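
To make the linear schedule concrete, the sketch below computes the learning rate at a given optimizer step, starting from the listed learning_rate of 1e-05 and decaying linearly to zero. It is an illustration under stated assumptions, not the training script: the card does not report warmup steps or dataset size, so `warmup_steps=0` and the value of `steps_per_epoch` are hypothetical.

```python
def linear_lr(step, total_steps, base_lr=1e-5, warmup_steps=0):
    """Learning rate at `step` under linear warmup followed by linear decay."""
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    steps_per_epoch = 1000  # hypothetical; depends on dataset size and batch size 4
    total_steps = steps_per_epoch * 15  # num_epochs from the list above
    for step in (0, total_steps // 2, total_steps):
        print(step, linear_lr(step, total_steps))
```

With no warmup, the rate starts at the full learning rate, reaches half of it midway through training, and is zero at the final step.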

Framework versions

Transformers 4.17.0.dev0
PyTorch 1.8.0
Datasets 1.18.3
Tokenizers 0.11.0

More Information

Refer to the original paper, CardioBERTpt - Portuguese Transformer-based Models for Clinical Language Representation in Cardiology, for additional details and performance on Portuguese NER tasks.

Acknowledgements

This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001, and by Foxconn Brazil and Zerbini Foundation as part of the research project Machine Learning in Cardiovascular Medicine.

Citation

@INPROCEEDINGS{10178779,
  author={Schneider, Elisa Terumi Rubel and Gumiel, Yohan Bonescki and de Souza, João Vitor Andrioli and Mie Mukai, Lilian and Emanuel Silva e Oliveira, Lucas and de Sa Rebelo, Marina and Antonio Gutierrez, Marco and Eduardo Krieger, Jose and Teodoro, Douglas and Moro, Claudia and Paraiso, Emerson Cabrera},
  booktitle={2023 IEEE 36th International Symposium on Computer-Based Medical Systems (CBMS)}, 
  title={CardioBERTpt: Transformer-based Models for Cardiology Language Representation in Portuguese}, 
  year={2023},
  volume={},
  number={},
  pages={378-381},
  doi={10.1109/CBMS58004.2023.00247}
}

Questions?

Post a GitHub issue on the CardioBERTpt repo.