

distilbert-base-uncased-finetuned-ner

This model is a fine-tuned version of distilbert-base-uncased on the privy dataset. Its per-epoch results on the evaluation set are listed under Training results.

Model description

Output indices map to the following labels:

['O', 'B-O', 'I-O', 'L-O', 'U-O', 'B-PER', 'I-PER', 'L-PER', 'U-PER', 'B-LOC', 'I-LOC', 'L-LOC', 'U-LOC', 'B-ORG', 'I-ORG', 'L-ORG', 'U-ORG', 'B-NRP', 'I-NRP', 'L-NRP', 'U-NRP', 'B-DATE_TIME', 'I-DATE_TIME', 'L-DATE_TIME', 'U-DATE_TIME', 'B-CREDIT_CARD', 'I-CREDIT_CARD', 'L-CREDIT_CARD', 'U-CREDIT_CARD', 'B-URL', 'I-URL', 'L-URL', 'U-URL', 'B-IBAN_CODE', 'I-IBAN_CODE', 'L-IBAN_CODE', 'U-IBAN_CODE', 'B-US_BANK_NUMBER', 'I-US_BANK_NUMBER', 'L-US_BANK_NUMBER', 'U-US_BANK_NUMBER', 'B-PHONE_NUMBER', 'I-PHONE_NUMBER', 'L-PHONE_NUMBER', 'U-PHONE_NUMBER', 'B-US_SSN', 'I-US_SSN', 'L-US_SSN', 'U-US_SSN', 'B-US_PASSPORT', 'I-US_PASSPORT', 'L-US_PASSPORT', 'U-US_PASSPORT', 'B-US_DRIVER_LICENSE', 'I-US_DRIVER_LICENSE', 'L-US_DRIVER_LICENSE', 'U-US_DRIVER_LICENSE', 'B-US_LICENSE_PLATE', 'I-US_LICENSE_PLATE', 'L-US_LICENSE_PLATE', 'U-US_LICENSE_PLATE', 'B-IP_ADDRESS', 'I-IP_ADDRESS', 'L-IP_ADDRESS', 'U-IP_ADDRESS', 'B-US_ITIN', 'I-US_ITIN', 'L-US_ITIN', 'U-US_ITIN', 'B-EMAIL_ADDRESS', 'I-EMAIL_ADDRESS', 'L-EMAIL_ADDRESS', 'U-EMAIL_ADDRESS', 'B-TITLE', 'I-TITLE', 'L-TITLE', 'U-TITLE', 'B-COORDINATE', 'I-COORDINATE', 'L-COORDINATE', 'U-COORDINATE', 'B-IMEI', 'I-IMEI', 'L-IMEI', 'U-IMEI', 'B-PASSWORD', 'I-PASSWORD', 'L-PASSWORD', 'U-PASSWORD', 'B-LICENSE_PLATE', 'I-LICENSE_PLATE', 'L-LICENSE_PLATE', 'U-LICENSE_PLATE', 'B-CURRENCY', 'I-CURRENCY', 'L-CURRENCY', 'U-CURRENCY', 'B-FINANCIAL', 'I-FINANCIAL', 'L-FINANCIAL', 'U-FINANCIAL', 'B-ROUTING_NUMBER', 'I-ROUTING_NUMBER', 'L-ROUTING_NUMBER', 'U-ROUTING_NUMBER', 'B-SWIFT_CODE', 'I-SWIFT_CODE', 'L-SWIFT_CODE', 'U-SWIFT_CODE', 'B-MAC_ADDRESS', 'I-MAC_ADDRESS', 'L-MAC_ADDRESS', 'U-MAC_ADDRESS', 'B-AGE', 'I-AGE', 'L-AGE', 'U-AGE']
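The list follows a regular pattern: 'O' first, then B-/I-/L-/U- variants of each entity type, which reads as a BILOU tagging scheme. As a minimal sketch, the snippet below rebuilds the index-to-label mapping from that pattern and groups a tag sequence into entity spans. The names `ENTITY_TYPES`, `LABELS`, `ID2LABEL`, and `bilou_to_spans` are illustrative and not part of the model. The card does not explain the B-O/I-O/L-O/U-O labels, so the sketch treats 'O' like any other entity type.

```python
# Rebuild the label list from its regular structure: 'O' followed by
# B-/I-/L-/U- variants of each entity type, in the order listed above.
ENTITY_TYPES = [
    "O", "PER", "LOC", "ORG", "NRP", "DATE_TIME", "CREDIT_CARD", "URL",
    "IBAN_CODE", "US_BANK_NUMBER", "PHONE_NUMBER", "US_SSN", "US_PASSPORT",
    "US_DRIVER_LICENSE", "US_LICENSE_PLATE", "IP_ADDRESS", "US_ITIN",
    "EMAIL_ADDRESS", "TITLE", "COORDINATE", "IMEI", "PASSWORD",
    "LICENSE_PLATE", "CURRENCY", "FINANCIAL", "ROUTING_NUMBER", "SWIFT_CODE",
    "MAC_ADDRESS", "AGE",
]
LABELS = ["O"] + [f"{p}-{t}" for t in ENTITY_TYPES for p in "BILU"]
ID2LABEL = dict(enumerate(LABELS))


def bilou_to_spans(tags):
    """Group BILOU tags into (entity_type, start, end) token spans, end exclusive."""
    spans, start, current = [], None, None
    for i, tag in enumerate(tags):
        prefix, _, etype = tag.partition("-")
        if prefix == "U":                      # single-token entity
            spans.append((etype, i, i + 1))
            start, current = None, None
        elif prefix == "B":                    # open a multi-token entity
            start, current = i, etype
        elif prefix == "L" and current == etype:
            spans.append((etype, start, i + 1))  # close the open entity
            start, current = None, None
        elif prefix == "I" and current == etype:
            continue                           # still inside the open entity
        else:                                  # 'O' or an inconsistent tag
            start, current = None, None
    return spans


# Hypothetical per-token argmax indices, used only for illustration.
predicted_ids = [5, 6, 7, 0, 12]
tags = [ID2LABEL[i] for i in predicted_ids]
print(tags)
print(bilou_to_spans(tags))
```

The spans are token indices, not character offsets. Mapping them back to the input text depends on the tokenizer's offset information, which this sketch leaves out.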

Intended uses & limitations

Named-entity recognition (NER) for detecting personally identifiable information (PII), for example as a step in anonymization.
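As a rough illustration of the anonymization step, the sketch below replaces detected character spans with placeholder tags. It assumes predictions have already been converted to `(label, start, end)` character offsets. That format, the `redact` helper, and the sample text are assumptions made for this example, not output produced by this model.

```python
def redact(text, spans):
    """Replace each (label, start, end) character span with a <LABEL> placeholder."""
    out, cursor = [], 0
    for label, start, end in sorted(spans, key=lambda s: s[1]):
        if start < cursor:  # skip spans overlapping one already redacted
            continue
        out.append(text[cursor:start])
        out.append(f"<{label}>")
        cursor = end
    out.append(text[cursor:])
    return "".join(out)


# Hand-written sample spans (end offsets are exclusive).
text = "Contact Jane Doe at jane@example.com."
spans = [("EMAIL_ADDRESS", 20, 36), ("PER", 8, 16)]
print(redact(text, spans))  # Contact <PER> at <EMAIL_ADDRESS>.
```

Sorting by start offset and skipping overlaps keeps the output well-formed even if the detector emits spans out of order or nested.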

Training and evaluation data

The model was trained and evaluated on the beki/privy dataset.

Training procedure

Training hyperparameters

The list of hyperparameters is not included here. The results table shows that training ran for 3 epochs of 6,310 steps each.

Training results

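| Training Loss | Epoch | Step  | Validation Loss | Precision | Recall | F1     | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:---------:|:------:|:------:|:--------:|
| 0.0028        | 1.0   | 6310  | 0.0025          | 0.9977    | 0.9977 | 0.9977 | 0.9995   |
| 0.0015        | 2.0   | 12620 | 0.0017          | 0.9983    | 0.9985 | 0.9984 | 0.9996   |
| 0.001         | 3.0   | 18930 | 0.0016          | 0.9984    | 0.9986 | 0.9985 | 0.9996   |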
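As a quick sanity check on the table above, F1 is by definition the harmonic mean of precision and recall, so each reported F1 should be reproducible from its row. The sketch below recomputes it. The helper `f1_score` is written here for illustration, and because the table values are already rounded, agreement is only expected to about four decimal places.

```python
# (precision, recall, reported F1) per epoch, copied from the table above.
ROWS = [
    (0.9977, 0.9977, 0.9977),
    (0.9983, 0.9985, 0.9984),
    (0.9984, 0.9986, 0.9985),
]


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


for epoch, (p, r, reported) in enumerate(ROWS, start=1):
    print(epoch, round(f1_score(p, r), 4), reported)
```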

Framework versions