customer-service-tickets github-issues bart-large-mnli zero-shot-classification NLP

GitHub issues classifier (using zero-shot classification)

Predicts whether a statement is a feature request, an issue/bug, or a question.

This model was trained with the zero-shot classifier distillation method, using BART-large-mnli as the teacher model, to train a classifier on GitHub issues from the GitHub Issues Prediction dataset.
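As a rough illustration of the distillation idea: the teacher assigns each unlabeled issue a probability distribution over the candidate labels, and the student is trained to reproduce that distribution. The sketch below shows one common objective for this, cross-entropy against the teacher's soft labels. The label order, the probabilities, and the logits are made-up values for illustration, and the exact loss used to train this model is an assumption, not something stated in this card.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_label_cross_entropy(student_logits, teacher_probs):
    """Cross-entropy between the teacher's soft labels and the student's prediction.

    Lower values mean the student distribution is closer to the teacher's.
    """
    student_probs = softmax(student_logits)
    return -sum(t * math.log(s) for t, s in zip(teacher_probs, student_probs))


# Hypothetical example: label order is (bug, feature, question).
teacher_probs = [0.7, 0.2, 0.1]    # illustrative teacher output for one issue
student_logits = [2.0, 0.5, -1.0]  # illustrative student scores for the same issue

loss = soft_label_cross_entropy(student_logits, teacher_probs)
print(f"distillation loss: {loss:.4f}")
```

The loss is smallest when the student's distribution equals the teacher's, which is why no human-annotated labels are needed for this step.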

Labels

Following the dataset's Kaggle competition, the classifier predicts whether an issue is a bug, a feature, or a question. After experimenting with different labels before training, I used a different label mapping that yielded better predictions (see notebook here for details). The labels are:

Training data

Results

Agreement of student and teacher predictions: 94.82%

See this notebook for more information on the feature engineering choices made.
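The agreement figure above measures how often the student picks the same label as the teacher on the same inputs. A minimal sketch of that computation, assuming both models' predictions are available as parallel lists of label strings (the sample lists here are invented for illustration):

```python
def agreement(student_labels, teacher_labels):
    """Return the fraction of examples where student and teacher predict the same label."""
    if len(student_labels) != len(teacher_labels):
        raise ValueError("prediction lists must have the same length")
    if not student_labels:
        raise ValueError("prediction lists must not be empty")
    matches = sum(s == t for s, t in zip(student_labels, teacher_labels))
    return matches / len(student_labels)


# Hypothetical predictions for four issues.
student = ["bug", "feature", "question", "bug"]
teacher = ["bug", "feature", "bug", "bug"]

rate = agreement(student, teacher)
print(f"{rate:.2%}")  # prints 75.00%
```

Note that this compares the student to the teacher, not to ground-truth labels, so it indicates how faithfully the student imitates the teacher rather than how accurate either model is.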

How to train using your own dataset

Acknowledgements