Persian Poem Classifier Based on ParsBERT

Model Description

This model, named "Persian Poem Classifier," is based on the ParsBERT architecture and has been fine-tuned to classify Persian poems. Specifically, the model can evaluate whether a given piece of text is poetic, whether it adheres to a valid poetic structure, and whether it captures the style of a specific poet.

Features

Intended Use

This model is intended to be used by researchers, poets, and NLP enthusiasts who are interested in the automated analysis of Persian poetry. It can be utilized in applications ranging from educational platforms to advanced poetry-generating algorithms.

Limitations

Installation & Usage

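Install the Hugging Face transformers library together with PyTorch, which the snippet below relies on for its tensors. The model weights themselves are downloaded automatically the first time they are loaded:

pip install transformers torch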

To classify a poem, you can use the following code snippet:

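import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Load the tokenizer and the fine-tuned classifier from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("jrazi/persian-poem-classifier")
model = AutoModelForSequenceClassification.from_pretrained("jrazi/persian-poem-classifier")

text = "Your Persian poem here"
# Truncate inputs that exceed the model's maximum sequence length
inputs = tokenizer(text, return_tensors="pt", truncation=True)

# Forward pass without gradient tracking; outputs.logits holds one raw score per label
with torch.no_grad():
    outputs = model(**inputs)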
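
The snippet above stops at the raw model outputs. As a minimal sketch of how those scores might be turned into yes/no decisions, the helper below applies an independent sigmoid to each logit. It assumes a multi-label head whose three outputs follow the order of the tasks described in this card; the label names, their order, and the 0.5 threshold are illustrative assumptions and should be checked against model.config.id2label.

```python
import math

# Hypothetical label names and order; verify against model.config.id2label.
ASSUMED_LABELS = ["is_poetic", "is_valid_poem", "has_poet_style"]


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def interpret_logits(logits, labels=ASSUMED_LABELS, threshold=0.5):
    """Map one row of raw logits to {label: (probability, decision)}."""
    if len(logits) != len(labels):
        raise ValueError("expected one logit per label")
    result = {}
    for name, logit in zip(labels, logits):
        prob = sigmoid(logit)
        result[name] = (prob, prob >= threshold)
    return result


# Made-up logits for illustration only; with the real model one would pass
# outputs.logits[0].tolist() instead.
example = interpret_logits([2.0, 0.0, -1.5])
for name, (prob, decision) in example.items():
    print(f"{name}: p={prob:.3f} decision={decision}")
```

If the model's head turns out to be single-label rather than multi-label, a softmax over the logits would be the appropriate replacement for the per-label sigmoid.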

Data Source

The model is fine-tuned on a curated dataset of Persian poems by various poets. The dataset carries multi-label annotations covering the poetic nature, structural validity, and style conformity of each text. Negative examples were drawn from publicly available Persian text corpora. We also applied data augmentation to diversify the training data and help the model generalize.

Evaluation Metrics

The model has been evaluated with standard classification metrics, such as accuracy, F1-score, and ROC AUC, on each of the three tasks.

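| Metric    | Is Poetic | Is Valid Poem | Has Poet Style |
|-----------|-----------|---------------|----------------|
| F1        | 0.66      | 0.66          | 0.59           |
| Precision | 0.81      | 0.77          | 0.71           |
| Accuracy  | 0.85      | 0.84          | 0.64           |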
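
For readers who want to reproduce figures of the kind shown in the table above on their own data, the following sketch computes precision, recall, F1, and accuracy for each task separately from binary labels. The function names and the toy labels are made up for illustration; they are not the evaluation code or data behind the reported numbers.

```python
def binary_metrics(y_true, y_pred):
    """Precision, recall, F1 and accuracy for one binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


def per_task_metrics(true_rows, pred_rows, task_names):
    """Score each column of a multi-label problem as its own binary task."""
    scores = {}
    for i, name in enumerate(task_names):
        column_true = [row[i] for row in true_rows]
        column_pred = [row[i] for row in pred_rows]
        scores[name] = binary_metrics(column_true, column_pred)
    return scores


# Toy single-task example (illustrative values only)
toy_true = [1, 1, 1, 0, 0, 0, 0, 0]
toy_pred = [1, 1, 0, 0, 0, 0, 0, 1]
toy_scores = binary_metrics(toy_true, toy_pred)

# Toy multi-label example: one row per text, one column per task
tasks = ["Is Poetic", "Is Valid Poem", "Has Poet Style"]
true_rows = [[1, 1, 0], [1, 0, 0], [0, 0, 0], [1, 1, 1]]
pred_rows = [[1, 1, 0], [1, 1, 0], [0, 0, 0], [1, 1, 0]]
task_scores = per_task_metrics(true_rows, pred_rows, tasks)
```

In the single-task toy data, negatives outnumber positives, so accuracy comes out higher than F1. This is one common reason the two metrics can diverge.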