## Training procedure
- Fine-tuned the model with the Hugging Face PEFT library using the LoRA method. The per-epoch training metrics are shown below, and a minimal configuration sketch follows the list.
| Epoch | Training Loss | Validation Loss | Precision | Recall | F1 | Accuracy |
|-------|---------------|-----------------|-----------|--------|----|----------|
| 1 | 0.392600 | 0.347941 | 0.762406 | 0.631506 | 0.690810 | 0.882263 |
| 2 | 0.336300 | 0.302746 | 0.775583 | 0.702650 | 0.737317 | 0.897062 |
| 3 | 0.309500 | 0.294454 | 0.817472 | 0.701828 | 0.755249 | 0.905303 |
| 4 | 0.296700 | 0.281895 | 0.839335 | 0.695757 | 0.760831 | 0.905240 |
| 5 | 0.281700 | 0.273324 | 0.816995 | 0.752103 | 0.783207 | 0.914322 |
| 6 | 0.257300 | 0.262116 | 0.813662 | 0.758553 | 0.785142 | 0.915958 |
| 7 | 0.241200 | 0.255580 | 0.819946 | 0.764308 | 0.791150 | 0.918980 |
| 8 | 0.229900 | 0.255078 | 0.819697 | 0.771074 | 0.794643 | 0.919821 |
| 9 | 0.212800 | 0.248312 | 0.830942 | 0.776450 | 0.802772 | 0.922594 |
| 10 | 0.200900 | 0.245995 | 0.831402 | 0.780244 | 0.805011 | 0.923544 |
- The saved model is nearly 60× smaller than the full checkpoint while performing on par with distilbert-base-uncased.
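
Below is a minimal sketch of a LoRA training setup like the one described above. The card does not state the exact base checkpoint or hyperparameters, so `roberta-base`, the `r`/`lora_alpha`/`lora_dropout` values, `target_modules`, and the `id2lab` ordering are assumptions for illustration only.

```python
# Minimal LoRA setup sketch; the base checkpoint, hyperparameters, and label
# ordering below are assumptions, not the exact values used for this model.
from transformers import AutoModelForTokenClassification
from peft import LoraConfig, TaskType, get_peft_model

base_model_name = "roberta-base"  # assumed base checkpoint
id2lab = {0: "O", 1: "SRC", 2: "REL", 3: "TGT"}  # assumed id ordering of the 4 tags
lab2id = {label: idx for idx, label in id2lab.items()}

base_model = AutoModelForTokenClassification.from_pretrained(
    base_model_name, num_labels=4, id2label=id2lab, label2id=lab2id
)

lora_config = LoraConfig(
    task_type=TaskType.TOKEN_CLS,       # train LoRA adapters plus the token-classification head
    r=16,                               # low-rank dimension (assumed)
    lora_alpha=16,                      # scaling factor (assumed)
    lora_dropout=0.1,                   # adapter dropout (assumed)
    target_modules=["query", "value"],  # RoBERTa attention projections (assumed)
)

model = get_peft_model(base_model, lora_config)
# Only a small fraction of the weights is trainable, which is what makes the
# saved adapter checkpoint roughly 60x smaller than the full model.
model.print_trainable_parameters()
```

The resulting `model` can then be trained with a standard `transformers.Trainer` and a token-classification data collator to reproduce a run like the one in the table above.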
## Inference
```python
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification
from peft import PeftConfig, PeftModel

peft_model_id = "vishnun/lora-NLIGraph"
config = PeftConfig.from_pretrained(peft_model_id)

# Label mappings for the tag set; the id ordering here is an assumption.
id2lab = {0: "O", 1: "SRC", 2: "REL", 3: "TGT"}
lab2id = {label: idx for idx, label in id2lab.items()}

# Load the base model with a token-classification head, then attach the LoRA adapters.
inference_model = AutoModelForTokenClassification.from_pretrained(
    config.base_model_name_or_path, num_labels=4, id2label=id2lab, label2id=lab2id
)
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)
model = PeftModel.from_pretrained(inference_model, peft_model_id)

text = "Arsenal will win the Premier League"
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

tokens = inputs.tokens()
predictions = torch.argmax(logits, dim=2)

for token, prediction in zip(tokens, predictions[0].numpy()):
    print((token, model.config.id2label[prediction]))
```
Example output:

```
('<s>', 'O')
('Arsenal', 'SRC')
('Ġwill', 'O')
('Ġwin', 'REL')
('Ġthe', 'O')
('ĠPremier', 'TGT')
('ĠLeague', 'O')
('</s>', 'O')
```
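
For downstream use, the tagged tokens can be collected into source, relation, and target spans. The helper below is a hypothetical post-processing step (not part of the original card), applied to the output shown above.

```python
def group_spans(token_label_pairs):
    """Collect tokens per tag, stripping the Ġ word-boundary marker; special and O tokens are skipped."""
    spans = {"SRC": [], "REL": [], "TGT": []}
    for token, label in token_label_pairs:
        if label in spans:
            spans[label].append(token.lstrip("Ġ"))
    return {tag: " ".join(tokens) for tag, tokens in spans.items()}

pairs = [
    ("<s>", "O"), ("Arsenal", "SRC"), ("Ġwill", "O"), ("Ġwin", "REL"),
    ("Ġthe", "O"), ("ĠPremier", "TGT"), ("ĠLeague", "O"), ("</s>", "O"),
]
print(group_spans(pairs))  # {'SRC': 'Arsenal', 'REL': 'win', 'TGT': 'Premier'}
```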
## Framework versions