This is a fine-tuned DistilBERT model for classifying Vietnamese essay sentences by essay category.
Overview
- At primary levels of education in Vietnam, students are introduced to 5 categories of essays:
  - Argumentative - Nghị luận
  - Anecdote - Biểu cảm
  - Descriptive - Miêu tả
  - Narrative - Tự sự
  - Expository - Thuyết minh
- This model classifies individual sentences into these 5 categories.
Pretrained model used in this pipeline:
- This pipeline combines the pre-trained distilbert-base-multilingual-cased model with a multi-label classification head trained on 8,000 manually labeled sentences from sample essays.
- The dataset can be found on Kaggle.
- Usage of distilbert-base-multilingual-cased can be found on Huggingface.
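The pipeline described above could be called roughly as follows. This is a minimal sketch, not the author's reference code. It assumes the checkpoint loads with `AutoModelForSequenceClassification`, that the multi-label head is decoded with an independent sigmoid per label, and that the label order matches the list in the overview. `model_id`, the `LABELS` order, and the 0.5 threshold are illustrative assumptions; check the model's own `id2label` config for the real mapping.

```python
import math

# Assumed label order, for illustration only; the real mapping lives in the
# model's config (id2label) and may differ.
LABELS = ["Argumentative", "Anecdote", "Descriptive", "Narrative", "Expository"]


def decode_multilabel(logits, threshold=0.5):
    """Apply a sigmoid to each logit and keep labels at or above the threshold."""
    probs = [1.0 / (1.0 + math.exp(-x)) for x in logits]
    return [(label, p) for label, p in zip(LABELS, probs) if p >= threshold]


def classify(sentence, model_id):
    """Score one sentence with a sequence-classification checkpoint.

    model_id is a placeholder: pass the Hub ID or local path of this model.
    """
    # Imported lazily so decode_multilabel works without transformers installed.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    inputs = tokenizer(sentence, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return decode_multilabel(logits)
```

Calling `classify("<a Vietnamese sentence>", "<model id or path>")` returns a list of `(label, probability)` pairs. Because the head is described as multi-label, a sentence may receive several labels or none at the chosen threshold.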
Citation:
@article{Sanh2019DistilBERTAD,
title={DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter},
author={Victor Sanh and Lysandre Debut and Julien Chaumond and Thomas Wolf},
journal={ArXiv},
year={2019},
volume={abs/1910.01108}
}