
BiodivBERT

Model description

How to use

  1. Masked Language Model
>>> from transformers import AutoTokenizer, AutoModelForMaskedLM
>>> tokenizer = AutoTokenizer.from_pretrained("NoYo25/BiodivBERT")
>>> model = AutoModelForMaskedLM.from_pretrained("NoYo25/BiodivBERT")
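You can then query the masked-language head, for instance via the fill-mask pipeline. This is a minimal sketch: the example sentence is illustrative only, and [MASK] is the mask token inherited from bert-base-cased.

>>> from transformers import pipeline
>>> fill_mask = pipeline("fill-mask", model=model, tokenizer=tokenizer)
>>> # returns the top-scoring candidates for the [MASK] slot
>>> fill_mask("Deforestation is a major driver of [MASK] loss.")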
  2. Token Classification - Named Entity Recognition
>>> from transformers import AutoTokenizer, AutoModelForTokenClassification
>>> tokenizer = AutoTokenizer.from_pretrained("NoYo25/BiodivBERT")
>>> model = AutoModelForTokenClassification.from_pretrained("NoYo25/BiodivBERT")
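For inference, the token-classification pipeline can be wrapped around the model. Note this is only a sketch: the entity label set, and whether the classification head is already fine-tuned or randomly initialized, depends on the checkpoint you load.

>>> from transformers import pipeline
>>> ner = pipeline("token-classification", model=model, tokenizer=tokenizer, aggregation_strategy="simple")
>>> # illustrative sentence; labels come from the checkpoint's head
>>> ner("The harbour porpoise (Phocoena phocoena) inhabits coastal waters.")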
  3. Sequence Classification - Relation Extraction
>>> from transformers import AutoTokenizer, AutoModelForSequenceClassification
>>> tokenizer = AutoTokenizer.from_pretrained("NoYo25/BiodivBERT")
>>> model = AutoModelForSequenceClassification.from_pretrained("NoYo25/BiodivBERT")
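A sketch of relation-extraction inference, treated as sentence-level classification; the number and meaning of the predicted labels are whatever id2label mapping the checkpoint defines, and the example sentence is illustrative.

>>> import torch
>>> inputs = tokenizer("Phocoena phocoena feeds on small fish.", return_tensors="pt")
>>> with torch.no_grad():
...     logits = model(**inputs).logits
>>> # map the highest-scoring class index back to its label name
>>> model.config.id2label[logits.argmax(-1).item()]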

Training data

Evaluation results

BiodivBERT outperformed BERT_base_cased, biobert_v1.1, and a BiLSTM baseline on the downstream tasks.