autotrain text-classification social offensive speech detection moderation

Hate Speech Detector

"Hate Speech Detector" is a text classification model based on DeBERTa that predicts whether a text contains hate speech. The model is fine-tuned on the tweet_eval dataset, which consists of seven heterogeneous Twitter tasks, all framed as multi-class tweet classification. The 'hate' subset is used for this task.

This model is part of our series of moderation models, which includes the following other models that may be of interest to you:

We believe these models can be used in tandem to support one another, for example to build a more robust moderation tool.

Intended uses & limitations

Hate Speech Detector is intended to be used as a tool for detecting hate speech in texts, which can be useful for applications such as content moderation, sentiment analysis, or social media analysis. The model can be used to filter out or flag tweets that contain hate speech, or to analyze the prevalence and patterns of hate speech.
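One way to act on the filter-or-flag use case is to compare a per-text hate score against a threshold. The sketch below is illustrative only: the helper `flag_texts`, the 0.5 threshold, and the stand-in scores are hypothetical and are not outputs of this model. In practice the scoring function would wrap the classifier.

```python
from typing import Callable, List, Tuple


def flag_texts(
    texts: List[str],
    score_fn: Callable[[str], float],
    threshold: float = 0.5,
) -> List[Tuple[str, float]]:
    """Return (text, score) pairs whose score meets or exceeds the threshold."""
    flagged = []
    for text in texts:
        score = score_fn(text)
        if score >= threshold:
            flagged.append((text, score))
    return flagged


# Stand-in scores (hypothetical values, not real model predictions).
fake_scores = {"example post a": 0.91, "example post b": 0.07}

flagged = flag_texts(list(fake_scores), fake_scores.get, threshold=0.5)
print(flagged)
```

The threshold is a policy choice: a lower value flags more content for human review, while a higher value reduces false positives at the cost of missed cases.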

However, the model has some limitations that users should be aware of:

Ethical Considerations

This is a model that deals with sensitive and potentially harmful language. Users should consider the ethical implications and potential risks of using or deploying this model in their applications or contexts. Some of the ethical issues that may arise are:

Users should carefully consider the purpose, context, and impact of using this model, and take appropriate measures to prevent or mitigate any potential harm. Users should also respect the privacy and consent of the data subjects, and adhere to the relevant laws and regulations in their jurisdictions.

License

This model is licensed under the CodeML OpenRAIL-M 0.1 license, which is a variant of the BigCode OpenRAIL-M license. This license allows you to freely access, use, modify, and distribute this model and its derivatives, for research, commercial or non-commercial purposes, as long as you comply with the following conditions:

By accessing or using this model, you agree to be bound by the terms of this license. If you do not agree with the terms of this license, you must not access or use this model.

Model Training Info

Validation Metrics

Usage

You can use cURL to access this model:

$ curl -X POST -H "Authorization: Bearer YOUR_API_KEY" -H "Content-Type: application/json" -d '{"inputs": "I love AutoTrain"}' https://api-inference.huggingface.co/models/KoalaAI/HateSpeechDetector
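The same HTTP request can also be assembled from Python using only the standard library. This is a minimal sketch that mirrors the cURL call above. The helper name `build_request` is hypothetical, and the request is only constructed here, not sent.

```python
import json
import urllib.request

# Endpoint taken from the cURL example above.
API_URL = "https://api-inference.huggingface.co/models/KoalaAI/HateSpeechDetector"


def build_request(text: str, api_key: str) -> urllib.request.Request:
    """Build a POST request equivalent to the cURL command (hypothetical helper)."""
    payload = json.dumps({"inputs": text}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=payload,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_request("I love AutoTrain", "YOUR_API_KEY")
print(request.get_method(), request.full_url)
```

Passing `request` to `urllib.request.urlopen` would send it. Replace `YOUR_API_KEY` with a valid token before doing so.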

Or load the model locally through the transformers Python API:

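from transformers import AutoModelForSequenceClassification, AutoTokenizer

# use_auth_token=True reads the token stored by `huggingface-cli login`;
# newer transformers releases deprecate it in favor of `token=`.
model = AutoModelForSequenceClassification.from_pretrained("KoalaAI/HateSpeechDetector", use_auth_token=True)
tokenizer = AutoTokenizer.from_pretrained("KoalaAI/HateSpeechDetector", use_auth_token=True)

# Tokenize the input text and return PyTorch tensors.
inputs = tokenizer("I love AutoTrain", return_tensors="pt")

# Forward pass; outputs.logits holds one raw score per class.
outputs = model(**inputs)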
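The forward pass returns raw logits rather than a label. A minimal sketch of turning logits into a label and a probability with a softmax follows. The helper `logits_to_prediction`, the example logits, and the label names are assumptions for illustration. The actual mapping should be read from `model.config.id2label`.

```python
import math
from typing import Dict, List, Tuple


def logits_to_prediction(
    logits: List[float], id2label: Dict[int, str]
) -> Tuple[str, float]:
    """Apply a numerically stable softmax and return the top label with its probability."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]


# Hypothetical logits and label names, not real model output.
example_logits = [2.0, -1.0]
example_id2label = {0: "non-hate", 1: "hate"}

label, prob = logits_to_prediction(example_logits, example_id2label)
print(label, round(prob, 4))
```

With the model loaded as above, the inputs would be `outputs.logits[0].tolist()` and `model.config.id2label`.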