generated_from_keras_callback named entity recognition bert-base finetuned umair akram


The model training notebook is available on my GitHub Repo.

This model is a fine-tuned version of bert-base-cased on the CoNLL-2003 dataset. Its results on the evaluation set are reported under "Training and evaluation data" and "Training results" below.

How to use this model

# Install the transformers library (the leading "!" is notebook syntax)
!pip install transformers

# Import the pipeline
from transformers import pipeline

# Load the model from the Hugging Face Hub
checkpoint = "MUmairAB/bert-ner"
model = pipeline(task="token-classification", model=checkpoint)

# Use the model
raw_text = "My name is umair and i work at Swits AI in Antarctica."
print(model(raw_text))
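
By default the token-classification pipeline returns one prediction per token rather than one per entity. The following is a minimal sketch of merging such predictions into entity spans, assuming each prediction is a dict with `entity`, `start`, and `end` keys and that the tags follow the `B-`/`I-` convention. `group_entities` is a hypothetical helper, and the sample predictions are hand-written for illustration, not real output of this model.

```python
def group_entities(text, predictions):
    """Merge token-level B-/I- predictions into (label, surface text) spans."""
    spans = []
    for pred in predictions:
        prefix, _, label = pred["entity"].partition("-")
        # Extend the current span only on an I- tag with the same label
        # that starts at, or one character after, the previous token's end.
        continues = (
            prefix == "I"
            and spans
            and spans[-1]["label"] == label
            and pred["start"] <= spans[-1]["end"] + 1
        )
        if continues:
            spans[-1]["end"] = pred["end"]
        else:
            spans.append({"label": label, "start": pred["start"], "end": pred["end"]})
    return [(s["label"], text[s["start"]:s["end"]]) for s in spans]


# Hand-written sample predictions (illustrative only, not real model output).
text = "Alice works at Acme Corp in Paris."
sample = [
    {"entity": "B-PER", "word": "Alice", "start": 0, "end": 5},
    {"entity": "B-ORG", "word": "Acme", "start": 15, "end": 19},
    {"entity": "I-ORG", "word": "Corp", "start": 20, "end": 24},
    {"entity": "B-LOC", "word": "Paris", "start": 28, "end": 33},
]
grouped = group_entities(text, sample)
print(grouped)
```

Passing `aggregation_strategy="simple"` to `pipeline(...)` offers comparable built-in grouping, so a helper like this is mainly useful when custom merging rules are needed.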

Model description

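Model: "tf_bert_for_token_classification"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 bert (TFBertMainLayer)      multiple                  107719680 
 dropout_37 (Dropout)        multiple                  0         
 classifier (Dense)          multiple                  6921      
=================================================================
Total params: 107,726,601
Trainable params: 107,726,601
Non-trainable params: 0
_________________________________________________________________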
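
The parameter counts in the summary can be checked by hand. The sketch below assumes the classifier is a single dense layer on top of the 768-dimensional bert-base hidden state, with 9 output labels ("O" plus a B- and an I- tag for each of the four entity types); the label count is inferred from the classifier size, not stated in this card.

```python
HIDDEN_SIZE = 768  # hidden width of bert-base-cased
NUM_LABELS = 9     # assumed: "O" plus B-/I- tags for 4 entity types

# Dense layer: one weight per (hidden unit, label) pair plus one bias per label.
classifier_params = HIDDEN_SIZE * NUM_LABELS + NUM_LABELS

bert_params = 107_719_680  # "bert" row of the summary
total_params = bert_params + classifier_params  # dropout adds no parameters

print(classifier_params, total_params)
```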

Intended uses & limitations

This model can be used for named entity recognition tasks. It is trained on the CoNLL-2003 dataset. The model can classify four types of named entities:

  1. persons,
  2. locations,
  3. organizations, and
  4. names of miscellaneous entities that do not belong to the previous three groups.
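
For token classification, these four entity types are usually expanded into per-token tags. The sketch below builds such a tag set under the common B-/I- convention; the ordering shown is an assumption for illustration, and the mapping this checkpoint actually uses should be read from its configuration (`config.id2label`).

```python
ENTITY_TYPES = ["PER", "ORG", "LOC", "MISC"]

# "O" marks tokens outside any entity; B- opens an entity, I- continues it.
labels = ["O"] + [
    f"{prefix}-{etype}" for etype in ENTITY_TYPES for prefix in ("B", "I")
]

# Lookup tables in both directions (ordering here is illustrative only).
id2label = dict(enumerate(labels))
label2id = {label: idx for idx, label in id2label.items()}

print(labels)
```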

Training and evaluation data

The model is evaluated with the seqeval metric, and the results are as follows:

{'LOC': {'precision': 0.9655361050328227,
  'recall': 0.9608056614044638,
  'f1': 0.9631650750341064,
  'number': 1837},
 'MISC': {'precision': 0.8789144050104384,
  'recall': 0.913232104121475,
  'f1': 0.8957446808510638,
  'number': 922},
 'ORG': {'precision': 0.9075144508670521,
  'recall': 0.9366144668158091,
  'f1': 0.9218348623853211,
  'number': 1341},
 'PER': {'precision': 0.962011771000535,
  'recall': 0.9761129207383279,
  'f1': 0.9690110482349771,
  'number': 1842},
 'overall_precision': 0.9374068554396423,
 'overall_recall': 0.9527095254123191,
 'overall_f1': 0.944996244053084,
 'overall_accuracy': 0.9864013657502796}
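
The overall precision, recall, and F1 are consistent with micro-averaging over the four entity types. As a sketch of that relationship, the code below recovers entity counts from the per-type scores (recall = TP / gold entities, precision = TP / predicted entities) and recomputes the overall figures. It is a consistency check under the assumption of micro-averaging, not the code used to produce the results.

```python
# Per-entity scores copied from the evaluation results.
per_class = {
    "LOC": {"precision": 0.9655361050328227, "recall": 0.9608056614044638, "number": 1837},
    "MISC": {"precision": 0.8789144050104384, "recall": 0.913232104121475, "number": 922},
    "ORG": {"precision": 0.9075144508670521, "recall": 0.9366144668158091, "number": 1341},
    "PER": {"precision": 0.962011771000535, "recall": 0.9761129207383279, "number": 1842},
}

true_pos = predicted = support = 0
for scores in per_class.values():
    tp = round(scores["recall"] * scores["number"])  # correctly found entities
    true_pos += tp
    predicted += round(tp / scores["precision"])     # entities the model predicted
    support += scores["number"]                      # gold entities

# Micro-average: pool the counts across types, then compute the ratios.
overall_precision = true_pos / predicted
overall_recall = true_pos / support
overall_f1 = 2 * overall_precision * overall_recall / (overall_precision + overall_recall)

print(overall_precision, overall_recall, overall_f1)
```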

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

Training results

Train Loss   Validation Loss   Epoch
0.1775       0.0635            0
0.0470       0.0559            1
0.0278       0.0603            2
0.0174       0.0603            3
0.0124       0.0615            4
0.0077       0.0722            5
0.0060       0.0731            6
0.0038       0.0757            7
0.0043       0.0731            8
0.0041       0.0735            9
0.0019       0.0724            10
0.0019       0.0786            11
0.0010       0.0843            12
0.0008       0.0814            13
0.0011       0.0867            14
0.0008       0.0883            15
0.0005       0.0861            16
0.0005       0.0869            17
0.0003       0.0880            18
0.0003       0.0880            19
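
In this table the training loss keeps falling while the validation loss bottoms out early and then drifts upward, a pattern that often points to overfitting in later epochs. A minimal sketch for locating the epoch with the lowest validation loss is shown here; `history` is a hypothetical variable holding the table's values, and whether an earlier checkpoint would score better on seqeval is not something the table alone can confirm.

```python
# (train_loss, validation_loss) per epoch, copied from the training results.
history = [
    (0.1775, 0.0635), (0.0470, 0.0559), (0.0278, 0.0603), (0.0174, 0.0603),
    (0.0124, 0.0615), (0.0077, 0.0722), (0.0060, 0.0731), (0.0038, 0.0757),
    (0.0043, 0.0731), (0.0041, 0.0735), (0.0019, 0.0724), (0.0019, 0.0786),
    (0.0010, 0.0843), (0.0008, 0.0814), (0.0011, 0.0867), (0.0008, 0.0883),
    (0.0005, 0.0861), (0.0005, 0.0869), (0.0003, 0.0880), (0.0003, 0.0880),
]

# Epoch index whose validation loss is smallest.
best_epoch = min(range(len(history)), key=lambda epoch: history[epoch][1])
best_val_loss = history[best_epoch][1]
final_val_loss = history[-1][1]

print(f"best epoch: {best_epoch}, validation loss: {best_val_loss}")
print(f"final epoch validation loss: {final_val_loss}")
```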

Framework versions