NER model based on allenai/scibert_scivocab_cased, fine-tuned on the SciERC dataset to identify scientific terms.

Training

Performance

Colab

Check out how this model is used for NER-enhanced topic modelling, inspired by BERTopic.

Use

# Load the fine-tuned tokenizer and token-classification model from the Hugging Face Hub
from transformers import AutoModelForTokenClassification, AutoTokenizer, pipeline

tokenizer = AutoTokenizer.from_pretrained("RJuro/SciNERTopic")
model_trf = AutoModelForTokenClassification.from_pretrained("RJuro/SciNERTopic")

# aggregation_strategy="average" groups word pieces into entity spans, averaging token scores per word
nlp = pipeline("ner", model=model_trf, tokenizer=tokenizer, aggregation_strategy="average")
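
Once the pipeline is built, calling nlp on a string returns a list of dictionaries, one per detected entity, with keys such as entity_group, score and word. The sketch below shows one possible way to post-process that list, for example before passing the terms to a topic model. The helper group_entities, the threshold min_score, the sample entries and the label names in them are illustrative assumptions, not part of the model or of its actual output.

```python
from collections import defaultdict


def group_entities(entities, min_score=0.5):
    """Group aggregated NER output by label, dropping low-confidence spans.

    `entities` is expected to look like the list returned by a transformers
    "ner" pipeline with an aggregation strategy set: dicts holding at least
    "entity_group", "score" and "word".
    """
    grouped = defaultdict(list)
    for ent in entities:
        if ent["score"] >= min_score:
            grouped[ent["entity_group"]].append(ent["word"])
    return dict(grouped)


# In practice the list would come from the pipeline, e.g.:
#   entities = nlp("We train a convolutional neural network on ImageNet.")
# The entries below are hand-written placeholders with assumed label names
# and made-up scores, used only to show the expected shape.
sample = [
    {"entity_group": "Method", "score": 0.91, "word": "convolutional neural network"},
    {"entity_group": "Material", "score": 0.84, "word": "ImageNet"},
    {"entity_group": "Generic", "score": 0.32, "word": "We"},
]

print(group_entities(sample))
# {'Method': ['convolutional neural network'], 'Material': ['ImageNet']}
```

The threshold of 0.5 is arbitrary; a suitable value depends on how much noise the downstream step tolerates.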

Cite this model

@misc {roman_jurowetzki_2022,
	author       = { Roman Jurowetzki and Hamid Bekamiri },
	title        = { SciNERTopic - NER enhanced transformer-based topic modelling for scientific text },
	year         = 2022,
	url          = { https://huggingface.co/RJuro/SciNERTopic },
	doi          = { 10.57967/hf/0095 },
	publisher    = { Hugging Face }
}