
Model Details

Classifier-Bias-TahniatKhan is a prototype model that classifies text into two categories: "Biased" and "Non-Biased".

Model Architecture

The model is built on the distilbert-base-uncased checkpoint and fine-tuned on a custom dataset for bias detection.

Dataset

The model was trained on the BABE dataset, which contains news articles from various sources, each annotated with one of two bias labels. The data is balanced across the two classes:

Biased texts: 1,810
Non-Biased texts: 1,810

Training Procedure

The model was trained using the Adam optimizer for 6 epochs.

Performance

On our validation set, the model achieved:

Accuracy: 78%
F1 Score (Biased): 79%
F1 Score (Non-Biased): 78%
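For readers unfamiliar with these metrics, the following is a minimal sketch of how accuracy and per-class F1 can be computed from confusion-matrix counts. The function name `binary_metrics` and the counts passed to it are hypothetical and chosen only for illustration; they are not the model's actual validation results, and this is not necessarily how the reported numbers were produced.

```python
def binary_metrics(tp, fp, fn, tn):
    """Return (accuracy, F1 for Biased, F1 for Non-Biased).

    "Biased" is treated as the positive class:
      tp = biased texts predicted as biased
      fp = non-biased texts predicted as biased
      fn = biased texts predicted as non-biased
      tn = non-biased texts predicted as non-biased
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # F1 = 2*TP / (2*TP + FP + FN)
    f1_biased = 2 * tp / (2 * tp + fp + fn)
    # For the Non-Biased class, tn plays the role of the true positives.
    f1_non_biased = 2 * tn / (2 * tn + fp + fn)
    return accuracy, f1_biased, f1_non_biased


# Hypothetical counts for illustration only.
acc, f1_b, f1_nb = binary_metrics(tp=45, fp=15, fn=5, tn=35)
print(f"accuracy={acc:.2f} f1_biased={f1_b:.2f} f1_non_biased={f1_nb:.2f}")
```

The two F1 scores can differ even when the classes are balanced, because each score penalizes the errors made on its own class.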

How to Use

To use this model for text classification, run the following code:

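from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline

# Load the fine-tuned tokenizer and model from the Hugging Face Hub.
# Hub repository IDs take the form "user/model", with no leading slash.
model_id = "tahniat/Classifier_bias_TahniatKhan"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Build a text-classification pipeline and classify one sentence.
classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)
result = classifier("Men are better drivers")
print(result)  # a list with one dict containing "label" and "score"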
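The pipeline returns a list of dictionaries, each with a `label` and a `score`. The sketch below shows one way to turn that output into readable category names. The helper `readable_predictions` and the mapping `ASSUMED_LABEL_NAMES` are hypothetical: the mapping assumes the model exposes the default `LABEL_0`/`LABEL_1` names with `LABEL_1` meaning "Biased", which this card does not confirm. Check `model.config.id2label` before relying on it.

```python
# Assumed mapping from raw pipeline labels to readable names.
# This is an assumption; verify it against model.config.id2label.
ASSUMED_LABEL_NAMES = {"LABEL_0": "Non-Biased", "LABEL_1": "Biased"}


def readable_predictions(results, label_names=ASSUMED_LABEL_NAMES):
    """Convert pipeline output (a list of {'label', 'score'} dicts)
    into a list of (readable name, rounded score) tuples.

    Labels missing from the mapping are passed through unchanged.
    """
    return [
        (label_names.get(item["label"], item["label"]), round(item["score"], 3))
        for item in results
    ]


# A hand-written result in the pipeline's output format;
# the score is made up for illustration.
sample = [{"label": "LABEL_1", "score": 0.9731}]
print(readable_predictions(sample))
```

In practice, `sample` would be replaced by the `result` returned from the classifier.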

Caveats and Limitations

The model's training data comes from a single dataset (BABE), which may not represent all kinds of bias or content.

The performance metrics are based on a random validation split, so the model's performance might vary in real-world applications.
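To make the random validation split concrete, here is a minimal sketch of such a split using only the standard library. The function `random_split`, the 20% validation fraction, the seed, the placeholder texts, and the use of 1 for "Biased" are all assumptions for illustration; the card does not state how the actual split was made.

```python
import random


def random_split(examples, val_fraction=0.2, seed=42):
    """Shuffle examples and split them into (train, validation) lists.

    val_fraction and seed are illustrative defaults, not the values
    used to train this model.
    """
    rng = random.Random(seed)
    shuffled = list(examples)  # copy so the caller's list is not modified
    rng.shuffle(shuffled)
    n_val = int(len(shuffled) * val_fraction)
    return shuffled[n_val:], shuffled[:n_val]


# Placeholder data mirroring the class counts reported in the Dataset section.
examples = [("biased text", 1)] * 1810 + [("non-biased text", 0)] * 1810
train, val = random_split(examples)
print(len(train), len(val))
```

Because a split like this draws validation examples from the same sources as the training examples, scores measured on it tend to be optimistic compared with text from unseen outlets or domains.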