tdobrxl/ClinicBERT - AI Model Zoo

ClinicBERT has the same architecture as the RoBERTa model. It was trained on clinical text and can be used to extract features from text.

How to use

Feature Extraction

from transformers import RobertaModel, RobertaTokenizer
model = RobertaModel.from_pretrained("tdobrxl/ClinicBERT")
tokenizer = RobertaTokenizer.from_pretrained("tdobrxl/ClinicBERT")

text = "Randomized Study of Shark Cartilage in Patients With Breast Cancer."
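# Tokenize once and run a single forward pass, then read both outputs from it.
inputs = tokenizer(text, return_tensors="pt")
outputs = model(**inputs)
last_hidden_state = outputs.last_hidden_state  # shape: (batch, seq_len, hidden_size)
pooler_output = outputs.pooler_output  # shape: (batch, hidden_size)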
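
When a single fixed-size vector per text is needed, one common alternative to `pooler_output` is to average the token vectors in `last_hidden_state` while skipping padding positions. The following is a minimal sketch of that idea, not something this model card prescribes: `mean_pool` is a hypothetical helper name, and the dummy tensors stand in for real model output.

```python
import torch


def mean_pool(last_hidden_state: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    """Average token vectors per sequence, ignoring padded positions.

    last_hidden_state: (batch, seq_len, hidden_size)
    attention_mask:    (batch, seq_len), 1 for real tokens and 0 for padding
    """
    # Broadcast the mask over the hidden dimension: (batch, seq_len, 1).
    mask = attention_mask.unsqueeze(-1).to(last_hidden_state.dtype)
    # Sum only the vectors of real tokens: (batch, hidden_size).
    summed = (last_hidden_state * mask).sum(dim=1)
    # Number of real tokens per sequence; clamp guards against division by zero.
    counts = mask.sum(dim=1).clamp(min=1.0)
    return summed / counts


# Dummy tensors standing in for model output: 2 sequences, 3 tokens, hidden size 4.
hidden = torch.arange(24, dtype=torch.float32).reshape(2, 3, 4)
mask = torch.tensor([[1, 1, 1], [1, 1, 0]])  # second sequence has one padded position
embeddings = mean_pool(hidden, mask)
```

With the real model, the same helper could take the model's `last_hidden_state` together with the `attention_mask` returned by calling the tokenizer on a list of texts with `padding=True, return_tensors="pt"`.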

Masked Word Prediction

from transformers import pipeline
fill_mask = pipeline("fill-mask", model="tdobrxl/ClinicBERT", tokenizer="tdobrxl/ClinicBERT")
text = "this is the start of a beautiful <mask>."
fill_mask(text)

[{'score': 0.26558592915534973, 'token': 363, 'token_str': ' study', 'sequence': 'this is the start of a beautiful study.'},
 {'score': 0.06330082565546036, 'token': 2010, 'token_str': ' procedure', 'sequence': 'this is the start of a beautiful procedure.'},
 {'score': 0.04393036663532257, 'token': 661, 'token_str': ' trial', 'sequence': 'this is the start of a beautiful trial.'},
 {'score': 0.0363750196993351, 'token': 839, 'token_str': ' period', 'sequence': 'this is the start of a beautiful period.'},
 {'score': 0.027248281985521317, 'token': 436, 'token_str': ' treatment', 'sequence': 'this is the start of a beautiful treatment.'}]
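
Each prediction in the output above is a dictionary with `score`, `token`, `token_str`, and `sequence` keys, listed from highest to lowest score. As a small sketch of post-processing such a list, the hypothetical helper `top_candidates` below strips the leading space from each `token_str` and keeps only candidates at or above a score threshold. The threshold of 0.04 is an arbitrary assumption, and the sample list reuses the predictions above with scores rounded to four decimals.

```python
from typing import Any, Dict, List, Tuple


def top_candidates(
    predictions: List[Dict[str, Any]], min_score: float = 0.04
) -> List[Tuple[str, float]]:
    """Return (word, score) pairs whose score is at least min_score."""
    return [
        (p["token_str"].strip(), p["score"])
        for p in predictions
        if p["score"] >= min_score
    ]


# Sample predictions with the same keys as the fill-mask output (scores rounded).
sample = [
    {"score": 0.2656, "token": 363, "token_str": " study",
     "sequence": "this is the start of a beautiful study."},
    {"score": 0.0633, "token": 2010, "token_str": " procedure",
     "sequence": "this is the start of a beautiful procedure."},
    {"score": 0.0439, "token": 661, "token_str": " trial",
     "sequence": "this is the start of a beautiful trial."},
    {"score": 0.0364, "token": 839, "token_str": " period",
     "sequence": "this is the start of a beautiful period."},
    {"score": 0.0272, "token": 436, "token_str": " treatment",
     "sequence": "this is the start of a beautiful treatment."},
]

candidates = top_candidates(sample)
words = [word for word, _ in candidates]
```

In practice the list returned by `fill_mask(text)` could be passed in place of `sample`.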