text-classification

Model Card for depression-classification

DistilBERT model fine-tuned on depression-related comments for depression classification.

Model Details

Model Description

DistilBERT model fine-tuned on depression-related comments for depression classification.

Uses

This is a side project built for fun. It is not intended to diagnose or treat depression, or to inform decisions of any kind.

Direct Use

from transformers import AutoTokenizer, pipeline

# Custom classes shipped with this project (local module).
from depression_classification import (
    DepressionDetectionConfig,
    DepressionDetectionModel,
    DepressionPipeline,
)

# Local directory containing the model weights, config, and tokenizer files.
directory = "../depression-classification"

tokenizer = AutoTokenizer.from_pretrained(directory)
config = DepressionDetectionConfig.from_pretrained(directory)
model = DepressionDetectionModel.from_pretrained(
    pretrained_model_name_or_path=directory,
    config=config,
)

# Build a text-classification pipeline around the custom pipeline class.
pipe = pipeline(
    "text-classification",
    model=model,
    tokenizer=tokenizer,
    pipeline_class=DepressionPipeline,
)

result = pipe("I don't feel hopeful.")
print(result)
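
If `DepressionPipeline` follows the output convention of the stock `text-classification` pipeline in `transformers` (a list of dicts with `label` and `score` keys), the result could be post-processed along the lines below. This is only a sketch under that assumption: the custom pipeline may return a different structure, and the helper name `top_prediction` and the sample values are hypothetical.

```python
from typing import Any


def top_prediction(result: list[dict[str, Any]]) -> tuple[str, float]:
    """Return the (label, score) pair with the highest score.

    Assumes `result` looks like the stock text-classification output:
    [{"label": "...", "score": <float>}, ...]. This format is an assumption;
    check what DepressionPipeline actually returns.
    """
    if not result:
        raise ValueError("empty pipeline result")
    best = max(result, key=lambda item: item["score"])
    return best["label"], float(best["score"])


# Hypothetical sample in the stock format, not real model output.
sample = [
    {"label": "LABEL_0", "score": 0.25},
    {"label": "LABEL_1", "score": 0.75},
]
label, score = top_prediction(sample)
print(f"{label}: {score:.2f}")
```

With the pipeline above, this would be called as `top_prediction(pipe("I don't feel hopeful."))`, provided the output format matches the assumption.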

Downstream Use [Optional]

Out-of-Scope Use

Any real-life or production use case is out of scope for this model. It is a toy project.

Bias, Risks, and Limitations

Significant research has explored bias and fairness issues with language models (see, e.g., Sheng et al. (2021) and Bender et al. (2021)). Predictions generated by the model may include disturbing and harmful stereotypes across protected classes; identity characteristics; and sensitive, social, and occupational groups.

Recommendations

I do not recommend anyone use this model.

Training Details

Training Data

The dataset was originally sourced from "Depression: Reddit Dataset".

Training Procedure

Preprocessing

See preprocessing script.

Speeds, Sizes, Times

Training for one epoch takes about 20 minutes on the author's local machine.

Evaluation

Testing Data, Factors & Metrics

Testing Data

More information needed

Factors

More information needed

Metrics

More information needed

Results

More information needed