FactualMedBERT-DE: Clinical Factuality Detection BERT model for German language
Model description
FactualMedBERT-DE is the first pre-trained language model to address the factuality/assertion detection problem in German clinical texts (primarily discharge summaries).
It is introduced in the paper Factuality Detection using Machine Translation - a Use Case for German Clinical Text. The model classifies tagged medical conditions based
on their factuality value, assigning one of three labels: Affirmed, Negated, or Possible.
It was initialized from the smanjil/German-MedBERT German language model and trained on a translated subset of the 2010 i2b2/VA assertion challenge data.
How to use the model
- You might need to authenticate and log in before you can download the model (see more here)
- Get the model using the transformers library
from transformers import AutoTokenizer, AutoModelForSequenceClassification
tokenizer = AutoTokenizer.from_pretrained("binsumait/factual-med-bert-de")
model = AutoModelForSequenceClassification.from_pretrained("binsumait/factual-med-bert-de")
- Predict an instance by pre-tagging the factuality target (ideally a medical condition) with the
[unused1] special token, placed before and after the target:
from transformers import TextClassificationPipeline
instance = "Der Patient hat vielleicht [unused1] Fieber [unused1]"
factuality_pipeline = TextClassificationPipeline(model=model, tokenizer=tokenizer)
print(factuality_pipeline(instance))
which should output:
[{'label': 'possible', 'score': 0.9744388461112976}]
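The instance above was tagged by hand. When the target span is already known, for example from an upstream entity recognizer, the tagging can be automated. The following is a minimal sketch, not part of the released model code: the helper name tag_target and the use of character offsets are assumptions made for illustration, and the space-separated marker layout simply mirrors the example instance above.

```python
MARKER = "[unused1]"  # special token named in this model card


def tag_target(text: str, start: int, end: int, marker: str = MARKER) -> str:
    """Wrap text[start:end] in marker tokens (hypothetical helper).

    start/end are character offsets of the factuality target; the markers
    are separated from the surrounding text by single spaces.
    """
    if not 0 <= start < end <= len(text):
        raise ValueError("start/end must delimit a non-empty span inside text")
    target = text[start:end].strip()
    if not target:
        raise ValueError("target span contains only whitespace")
    before = text[:start].rstrip()
    after = text[end:].lstrip()
    # Drop empty pieces so no stray spaces appear at the sentence edges.
    parts = [before, marker, target, marker, after]
    return " ".join(p for p in parts if p)


sentence = "Der Patient hat vielleicht Fieber"
start = sentence.index("Fieber")
instance = tag_target(sentence, start, start + len("Fieber"))
print(instance)  # Der Patient hat vielleicht [unused1] Fieber [unused1]
```

The resulting string can be passed to factuality_pipeline in the same way as the hand-tagged instance. For a sentence with several medical conditions, one tagged copy per condition would be needed, since each instance marks a single target.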
Cite
If you use our model, please cite our paper as follows:
@inproceedings{bin_sumait_2023,
title={Factuality Detection using Machine Translation - a Use Case for German Clinical Text},
author={Bin Sumait, Mohammed and Gabryszak, Aleksandra and Hennig, Leonhard and Roller, Roland},
booktitle={Proceedings of the 19th Conference on Natural Language Processing (KONVENS 2023)},
year={2023}
}