Multi-lingual BERT Bengali Named Entity Recognition
mBERT-Bengali-NER is a transformer-based Bengali NER model built with the bert-base-multilingual-uncased model and the WikiANN dataset.
How to Use
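from transformers import AutoTokenizer, AutoModelForTokenClassification
from transformers import pipeline

# Load the fine-tuned tokenizer and token-classification model from the Hub.
tokenizer = AutoTokenizer.from_pretrained("sagorsarker/mbert-bengali-ner")
model = AutoModelForTokenClassification.from_pretrained("sagorsarker/mbert-bengali-ner")

# aggregation_strategy="simple" merges word pieces into whole entities;
# it replaces the deprecated grouped_entities=True argument.
nlp = pipeline("ner", model=model, tokenizer=tokenizer, aggregation_strategy="simple")

# "I am Zahid and I live in Dhaka."
example = "আমি জাহিদ এবং আমি ঢাকায় বাস করি।"
ner_results = nlp(example)
print(ner_results)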
Label and ID Mapping
Label ID | Label
0 | O
1 | B-PER
2 | I-PER
3 | B-ORG
4 | I-ORG
5 | B-LOC
6 | I-LOC
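The mapping above follows the BIO scheme: a B- tag opens an entity, I- tags continue it, and O marks tokens outside any entity. As a minimal sketch of how raw label IDs could be turned into entity spans, assuming one label ID per whitespace token, the helper below groups consecutive tags. The function name `decode_spans` and the sample IDs are hypothetical and written by hand for illustration; they are not model output or part of the model's API.

```python
# ID-to-label mapping copied from the table in this section.
ID2LABEL = {
    0: "O",
    1: "B-PER", 2: "I-PER",
    3: "B-ORG", 4: "I-ORG",
    5: "B-LOC", 6: "I-LOC",
}


def decode_spans(tokens, label_ids):
    """Group tokens into (entity_type, text) spans using BIO tags.

    Hypothetical helper for illustration. An I- tag that does not continue
    a span of the same type is treated leniently as the start of a new span.
    """
    spans = []
    current_type, current_tokens = None, []

    def flush():
        # Store the span collected so far, if any.
        if current_tokens:
            spans.append((current_type, " ".join(current_tokens)))

    for token, label_id in zip(tokens, label_ids):
        label = ID2LABEL[label_id]
        if label == "O":
            flush()
            current_type, current_tokens = None, []
        elif label.startswith("I-") and label[2:] == current_type:
            # Continuation of the currently open entity.
            current_tokens.append(token)
        else:
            # B- tag, or an I- tag with no matching open entity.
            flush()
            current_type, current_tokens = label[2:], [token]
    flush()
    return spans


# Hand-written label IDs for illustration only (not produced by the model).
tokens = ["Zahid", "Hasan", "lives", "in", "Dhaka"]
label_ids = [1, 2, 0, 0, 5]
print(decode_spans(tokens, label_ids))
```

In practice the pipeline's aggregation step performs this grouping, so a helper like this is mainly useful when working with the model's raw logits.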
Training Details
Evaluation Results
Model | F1 | Precision | Recall | Accuracy | Loss
mBERT-Bengali-NER | 0.97105 | 0.96769 | 0.97443 | 0.97682 | 0.12511
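As a quick consistency check on these numbers, the sketch below recomputes F1 from the reported precision and recall. It assumes the reported F1 is the harmonic mean of those two values, which the source does not state explicitly. The function name `f1_score` is a local helper, not a library call.

```python
# Values copied from the evaluation table.
precision = 0.96769
recall = 0.97443
reported_f1 = 0.97105


def f1_score(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)


computed_f1 = f1_score(precision, recall)

# Compare the recomputed value with the reported one at the table's precision.
print(round(computed_f1, 5), reported_f1)
```

Under that assumption the recomputed value agrees with the reported F1 to five decimal places.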