CardioBERTpt - Portuguese Transformer-based Models for Clinical Language Representation in Cardiology
This model card describes CardioBERTpt, a clinical language model for the cardiology domain, intended for named entity recognition (NER) tasks in Portuguese. It is a fine-tuned version of bert-base-multilingual-cased, trained on a cardiology text dataset. It achieves the following results on the evaluation set:
- Loss: 0.4495
- Accuracy: 0.8864
How to use the model
Load the model via the transformers library:
from transformers import AutoTokenizer, AutoModel
tokenizer = AutoTokenizer.from_pretrained("pucpr-br/cardiobertpt")
model = AutoModel.from_pretrained("pucpr-br/cardiobertpt")
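Since the card targets NER, the snippet above would usually be followed by attaching a token-classification head. The sketch below shows one way this might look. The entity types passed in are hypothetical (the card does not list a label set), and the helper names build_bio_labels and load_for_ner are illustrative only. If the checkpoint ships no classification head, transformers initializes one with random weights, so the model would still need fine-tuning on labelled data before its predictions mean anything.

```python
from typing import Dict, List, Tuple

MODEL_NAME = "pucpr-br/cardiobertpt"  # checkpoint name taken from this card


def build_bio_labels(entity_types: List[str]) -> Tuple[Dict[int, str], Dict[str, int]]:
    """Expand entity types into a BIO tag set and return (id2label, label2id)."""
    tags = ["O"]
    for entity in entity_types:
        tags.extend([f"B-{entity}", f"I-{entity}"])
    id2label = dict(enumerate(tags))
    label2id = {tag: idx for idx, tag in id2label.items()}
    return id2label, label2id


def load_for_ner(entity_types: List[str]):
    """Load the tokenizer and the encoder with a token-classification head."""
    # Imported lazily so the label helper works without transformers installed.
    from transformers import AutoModelForTokenClassification, AutoTokenizer

    id2label, label2id = build_bio_labels(entity_types)
    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForTokenClassification.from_pretrained(
        MODEL_NAME,
        num_labels=len(id2label),
        id2label=id2label,
        label2id=label2id,
    )
    return tokenizer, model
```

Calling load_for_ner(["Problem", "Test"]) (hypothetical entity types) would download the checkpoint and return a tokenizer and a model with five output labels.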
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 15.0
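To make the linear schedule concrete, the sketch below computes the learning rate at a given optimizer step from the values listed above. It assumes zero warmup steps, a single device, and no gradient accumulation, none of which the card states. The dataset size used in the usage note is hypothetical.

```python
import math

BASE_LR = 1e-05        # learning_rate from the list above
TRAIN_BATCH_SIZE = 4   # train_batch_size from the list above
NUM_EPOCHS = 15        # num_epochs from the list above


def total_training_steps(num_examples: int,
                         batch_size: int = TRAIN_BATCH_SIZE,
                         epochs: int = NUM_EPOCHS) -> int:
    """Optimizer steps over the whole run (one device, no gradient accumulation)."""
    return math.ceil(num_examples / batch_size) * epochs


def linear_lr(step: int,
              total_steps: int,
              base_lr: float = BASE_LR,
              warmup_steps: int = 0) -> float:
    """Linear warmup to base_lr, then linear decay to zero at total_steps."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)
```

For a hypothetical training set of 1,000 examples, total_training_steps(1000) gives 3,750 steps, and linear_lr(1875, 3750) is half of the initial rate.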
Framework versions
- Transformers 4.17.0.dev0
- Pytorch 1.8.0
- Datasets 1.18.3
- Tokenizers 0.11.0
More Information
Refer to the original paper, CardioBERTpt - Portuguese Transformer-based Models for Clinical Language Representation in Cardiology, for additional details and performance on Portuguese NER tasks.
Acknowledgements
This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001, and by Foxconn Brazil and Zerbini Foundation as part of the research project Machine Learning in Cardiovascular Medicine.
Citation
@INPROCEEDINGS{10178779,
author={Schneider, Elisa Terumi Rubel and Gumiel, Yohan Bonescki and de Souza, João Vitor Andrioli and Mie Mukai, Lilian and Emanuel Silva e Oliveira, Lucas and de Sa Rebelo, Marina and Antonio Gutierrez, Marco and Eduardo Krieger, Jose and Teodoro, Douglas and Moro, Claudia and Paraiso, Emerson Cabrera},
booktitle={2023 IEEE 36th International Symposium on Computer-Based Medical Systems (CBMS)},
title={CardioBERTpt: Transformer-based Models for Cardiology Language Representation in Portuguese},
year={2023},
volume={},
number={},
pages={378-381},
doi={10.1109/CBMS58004.2023.00247}
}
Questions?
Post a GitHub issue on the CardioBERTpt repo.