bert-large-cased-finetuned-ner-maplestory

This model is a fine-tuned version of dbmdz/bert-large-cased-finetuned-conll03-english, trained on a dataset of MapleStory inquiries (see Training and evaluation data below). It achieves the following results on the evaluation set:

Loss: 0.0530
Precision: 0.7282
Recall: 0.7316
F1: 0.7299
Accuracy: 0.9844

Model description

Built on dbmdz/bert-large-cased-finetuned-conll03-english, this model identifies MapleStory-specific entities in addition to the general entities (person, organization, location, miscellaneous) that the base model learned from the CoNLL-2003 English dataset.

Training and evaluation data

The training and evaluation data are based on MapleStory inquiries. Tokens are labeled with the following tags:

Abbreviation	Description
O	Outside of a named entity
B-MIS	Beginning of a miscellaneous entity right after another miscellaneous entity
I-MIS	Miscellaneous entity
B-PER	Beginning of a person’s name right after another person’s name
I-PER	Person’s name
B-ORG	Beginning of an organization right after another organization
I-ORG	Organization
B-LOC	Beginning of a location right after another location
I-LOC	Location
B-INGAME_ITEM	Beginning of an in-game item name right after another in-game item name
I-INGAME_ITEM	In-game item name
B-QUEST	Beginning of a quest name right after another quest name
I-QUEST	Quest name
B-CHANNEL	Beginning of a channel (world) name right after another channel name
I-CHANNEL	Channel (world) name
B-SALE_ITEM	Beginning of a sale item (CS item) name right after another sale item name
I-SALE_ITEM	Sale item (CS item) name
B-EVENT	Beginning of an event name right after another event name
I-EVENT	Event name
B-JOB	Beginning of a job (class) name right after another job name
I-JOB	Job (class) name
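Read literally, the descriptions above mean that a B- tag appears only when an entity immediately follows another entity of the same type, so most entities begin with an I- tag (often called the IOB1 convention). The following is a minimal sketch of how such a tag sequence could be grouped into entity spans. The function name `iob1_to_spans` and the example tags are hypothetical and are not output from this model.

```python
from typing import List, Tuple


def iob1_to_spans(tags: List[str]) -> List[Tuple[str, int, int]]:
    """Group IOB1 tags into (entity_type, start, end) spans; end is exclusive."""
    spans: List[Tuple[str, int, int]] = []
    current_type = None  # type of the entity currently being read, if any
    start = 0
    for i, tag in enumerate(tags):
        prefix, _, ent_type = tag.partition("-")
        if tag == "O":
            # An O tag closes any open entity.
            if current_type is not None:
                spans.append((current_type, start, i))
            current_type = None
        elif prefix == "B" or ent_type != current_type:
            # A B- tag, or a change of entity type, starts a new entity.
            if current_type is not None:
                spans.append((current_type, start, i))
            current_type = ent_type
            start = i
        # Otherwise this is an I- tag that continues the open entity.
    if current_type is not None:
        spans.append((current_type, start, len(tags)))
    return spans


# Hypothetical tag sequence: two adjacent locations, then an in-game item.
example_tags = ["O", "I-LOC", "B-LOC", "O", "I-INGAME_ITEM", "I-INGAME_ITEM"]
print(iob1_to_spans(example_tags))
```

In this sketch the B-LOC tag is what separates the two adjacent locations; without it they would be merged into a single span.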

Training procedure

The training and evaluation datasets were prepared in three steps. First, the inquiry text was run through the base model to obtain its NER tags. Next, MapleStory-specific entities were identified in the tokens and labeled. Finally, these labels were written over the base model's tags for the same tokens. The resulting dataset merges the base model's tags with the MapleStory-specific tags, so that the fine-tuned model is meant to keep the original functionality of the base model while also identifying MapleStory entities.
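The overwrite step can be illustrated with a short sketch. It assumes the base model's tags and the MapleStory-specific labels are aligned token by token, with `None` marking tokens that received no MapleStory label. The function name `overlay_tags` and the example data are hypothetical and do not reproduce the actual preparation script.

```python
from typing import List, Optional


def overlay_tags(base_tags: List[str], domain_tags: List[Optional[str]]) -> List[str]:
    """Overwrite base-model tags with domain-specific tags wherever one is given."""
    if len(base_tags) != len(domain_tags):
        raise ValueError("base_tags and domain_tags must have the same length")
    # Keep the base tag unless a domain-specific label exists for that token.
    return [d if d is not None else b for b, d in zip(base_tags, domain_tags)]


# Hypothetical example: the base model found a location, and the annotator
# added a two-token in-game item that the base model had tagged as O.
base = ["O", "I-LOC", "O", "O", "O"]
domain = [None, None, None, "I-INGAME_ITEM", "I-INGAME_ITEM"]
print(overlay_tags(base, domain))
```

Because base tags are kept wherever no MapleStory label is supplied, the merged sequence still carries the CoNLL-2003 entity types alongside the new ones.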

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 3
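
As an illustration of what the linear scheduler does to the learning rate, here is a minimal sketch. It assumes no warmup steps (none are listed above) and takes the total of 8844 optimizer steps from the Training results table. The function `linear_lr` is a hypothetical helper written for this example, not a library call.

```python
def linear_lr(step: int, base_lr: float = 2e-5, total_steps: int = 8844,
              warmup_steps: int = 0) -> float:
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr (unused when warmup_steps is 0).
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr at the end of warmup to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the start of training and at the end of each epoch
# (2948 steps per epoch, per the Training results table).
for step in (0, 2948, 5896, 8844):
    print(step, linear_lr(step))
```

Under these assumptions the learning rate falls by one third of its initial value per epoch and reaches zero at the final step.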

Training results

Training Loss	Epoch	Step	Validation Loss	Precision	Recall	F1	Accuracy
0.0967	1.0	2948	0.0577	0.7001	0.6576	0.6782	0.9819
0.0636	2.0	5896	0.0536	0.6933	0.7285	0.7104	0.9831
0.0419	3.0	8844	0.0530	0.7282	0.7316	0.7299	0.9844
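
As a quick consistency check, the F1 column can be recomputed from the precision and recall columns as their harmonic mean. The sketch below uses the values copied from the table above. Small differences in the fourth decimal place are expected because the inputs are already rounded.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (epoch, precision, recall, reported F1) copied from the results table.
rows = [
    (1, 0.7001, 0.6576, 0.6782),
    (2, 0.6933, 0.7285, 0.7104),
    (3, 0.7282, 0.7316, 0.7299),
]

for epoch, p, r, reported in rows:
    print(f"epoch {epoch}: recomputed F1 = {f1_score(p, r):.4f}, reported = {reported:.4f}")
```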

Framework versions

Transformers 4.34.0
PyTorch 2.0.1+cu118
Datasets 2.14.5
Tokenizers 0.14.1