HunFlair model for PROMOTER

HunFlair (biomedical Flair) model for recognizing promoter entities.

Predicts 1 tag:

| tag | meaning |
|---|---|
| Promoter | DNA promoter region |


Please cite the following paper when using this model.

  title={RegEl corpus: identifying DNA regulatory elements in the scientific literature},
  author={Garda, Samuele and Lenihan-Geels, Freyda and Proft, Sebastian and Hochmuth, Stefanie and Sch{\"u}lke, Markus and Seelow, Dominik and Leser, Ulf},
  publisher={Oxford Academic}

Demo: How to use in Flair


from flair.data import Sentence
from flair.models import SequenceTagger
# for biomedical-specific tokenization:
# from flair.tokenization import SciSpacyTokenizer

# load tagger
tagger = SequenceTagger.load("regel-corpus/hunflair-promoter")

text = "The upstream region of the glnA gene contained two putative extended promoter consensus sequences (p1 and p2)."

# make example sentence
sentence = Sentence(text)

# for biomedical-specific tokenization:
# sentence = Sentence(text, use_tokenizer=SciSpacyTokenizer())

# predict NER tags
tagger.predict(sentence)

# print sentence
print(sentence)

# print predicted NER spans
print('The following NER tags are found:')
# iterate over entities and print
for entity in sentence.get_spans('ner'):
    print(entity)

This yields the following output:

Span [16]: "p1"   [− Labels: Promoter (0.9878)]
Span [18]: "p2"   [− Labels: Promoter (0.9216)]

So, the entities "p1" and "p2" (both labeled as Promoter) are found in the sentence.
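
If the predictions need further processing, the printed lines can be turned into structured records. The following is a minimal sketch that assumes the exact output format shown above, which may differ between Flair versions. The names `ParsedSpan` and `parse_span_line` and the 0.95 threshold are hypothetical and not part of Flair.

```python
import re
from typing import NamedTuple, Optional


class ParsedSpan(NamedTuple):
    """One predicted span (hypothetical helper type, not part of Flair)."""
    token_index: str
    text: str
    label: str
    score: float


# Matches lines such as: Span [16]: "p1"   [− Labels: Promoter (0.9878)]
# Assumption: this mirrors the printed format shown above.
SPAN_LINE = re.compile(
    r'Span \[(?P<idx>[^\]]+)\]: "(?P<text>.*)"\s+'
    r'\[\S Labels: (?P<label>\S+) \((?P<score>[0-9.]+)\)\]'
)


def parse_span_line(line: str) -> Optional[ParsedSpan]:
    """Parse one printed span line; return None if the line does not match."""
    match = SPAN_LINE.search(line)
    if match is None:
        return None
    return ParsedSpan(
        token_index=match["idx"],
        text=match["text"],
        label=match["label"],
        score=float(match["score"]),
    )


# The two output lines from the example above.
output = '''Span [16]: "p1"   [− Labels: Promoter (0.9878)]
Span [18]: "p2"   [− Labels: Promoter (0.9216)]'''

parsed = [parse_span_line(line) for line in output.splitlines()]
spans = [span for span in parsed if span is not None]

# Keep only predictions at or above an illustrative confidence threshold.
confident = [span for span in spans if span.score >= 0.95]

for span in confident:
    print(span.text, span.label, span.score)
```

Parsing the printed text keeps the sketch independent of any particular Flair release. When working inside the same Python session, reading the values directly from the span objects is likely the more robust choice.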

Alternatively, download all models locally and use the MultiTagger class.

from flair.models import MultiTagger

# paths to the locally downloaded models
model_paths = ['./models/hunflair-']

tagger = MultiTagger.load(model_paths)
