FactualMedBERT-DE: Clinical Factuality Detection BERT model for German language
Model description
FactualMedBERT-DE is the first pre-trained language model to address the factuality/assertion detection problem in German clinical texts (primarily discharge summaries).
It is introduced in the paper Factuality Detection using Machine Translation - a Use Case for German Clinical Text. The model classifies tagged medical conditions based
on their factuality value. It supports three labels: Affirmed, Negated and Possible.
It was initialized from the smanjil/German-MedBERT German language model and was trained on a machine-translated subset of the 2010 i2b2/VA assertion challenge data.
How to use the model
- You might need to authenticate and login before being able to download the model (see more here)
- Get the model using the transformers library
from transformers import AutoTokenizer, AutoModelForSequenceClassification
tokenizer = AutoTokenizer.from_pretrained("binsumait/factual-med-bert-de")
model = AutoModelForSequenceClassification.from_pretrained("binsumait/factual-med-bert-de")
- Predict an instance by pre-tagging the factuality target (ideally a medical condition) with the [unused1] special token:
from transformers import TextClassificationPipeline
instance = "Der Patient hat vielleicht [unused1] Fieber [unused1]"
factuality_pipeline = TextClassificationPipeline(model=model, tokenizer=tokenizer)
print(factuality_pipeline(instance))
which should output:
[{'label': 'possible', 'score': 0.9744388461112976}]
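Since the model expects the target condition to be wrapped in [unused1] tokens, it can help to do the tagging programmatically. The helper below is a hypothetical sketch (not part of the released code) that wraps the first occurrence of a given condition string:

```python
def tag_condition(text: str, condition: str) -> str:
    """Wrap the first occurrence of `condition` in [unused1] markers,
    the input format this model expects for the factuality target."""
    start = text.index(condition)  # raises ValueError if the condition is absent
    end = start + len(condition)
    return f"{text[:start]}[unused1] {condition} [unused1]{text[end:]}"

tagged = tag_condition("Der Patient hat vielleicht Fieber", "Fieber")
# "Der Patient hat vielleicht [unused1] Fieber [unused1]"
```

The tagged string can then be passed directly to the pipeline shown above.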
Cite
If you use our model, please cite our paper as follows:
@inproceedings{bin_sumait_2023,
title={Factuality Detection using Machine Translation - a Use Case for German Clinical Text},
author={Bin Sumait, Mohammed and Gabryszak, Aleksandra and Hennig, Leonhard and Roller, Roland},
booktitle={Proceedings of the 19th Conference on Natural Language Processing (KONVENS 2023)},
year={2023}
}