setfit sentence-transformers text-classification

Satoken

This is a SetFit model trained on multilingual datasets (mentioned below) for sentiment classification.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
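The contrastive step works on pairs of sentences rather than single labelled examples. The following is a minimal sketch of how such pairs can be built from a handful of labelled texts. It is an illustration of the idea only, not the SetFit library's own sampling code, and the function name `make_contrastive_pairs` and the example texts are hypothetical.

```python
from itertools import combinations


def make_contrastive_pairs(texts, labels):
    """Return (text_a, text_b, similarity) triples for every pair of examples.

    similarity is 1.0 when both examples share a label and 0.0 otherwise.
    """
    if len(texts) != len(labels):
        raise ValueError("texts and labels must have the same length")
    examples = list(zip(texts, labels))
    return [
        (text_a, text_b, 1.0 if label_a == label_b else 0.0)
        for (text_a, label_a), (text_b, label_b) in combinations(examples, 2)
    ]


# Hypothetical few-shot data, not taken from this model's training set.
texts = ["great service", "loved it", "terrible app", "very slow"]
labels = ["positive", "positive", "negative", "negative"]
pairs = make_contrastive_pairs(texts, labels)
```

Fine-tuning on pairs like these pulls same-label sentences together in embedding space and pushes different-label sentences apart. The classification head in step 2 is then trained on the resulting embeddings.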

Germla uses it in its feedback analysis tool, specifically for the sentiment analysis feature.

For other (language-specific) models, check here

Usage

To use this model for inference, first install the SetFit library:

python -m pip install setfit

You can then run inference as follows:

from setfit import SetFitModel

# Download the model from the Hub
model = SetFitModel.from_pretrained("germla/satoken")
# Run inference
preds = model(["i loved the spiderman movie!", "pineapple on pizza is the worst 🤮"])
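For a feedback-analysis workload it can be handy to tally predictions over many texts. The sketch below assumes only that the predictor is a callable that takes a list of strings and returns one label per string, as `model` does above. The helper `count_predictions` and the stand-in `fake_predict` are hypothetical, and the label names are placeholders, since the labels this model returns are not documented here.

```python
from collections import Counter


def count_predictions(predict, texts, batch_size=32):
    """Call `predict` on `texts` in batches and tally the predicted labels."""
    counts = Counter()
    for start in range(0, len(texts), batch_size):
        batch = list(texts[start:start + batch_size])
        for label in predict(batch):
            # Tensors and NumPy scalars expose .item(); plain labels do not.
            counts[label.item() if hasattr(label, "item") else label] += 1
    return counts


def fake_predict(batch):
    """Stand-in for the real model so the sketch runs offline (hypothetical)."""
    return ["positive" if "love" in text else "negative" for text in batch]


feedback = ["i loved the spiderman movie!", "pineapple on pizza is the worst 🤮"]
counts = count_predictions(fake_predict, feedback, batch_size=1)
```

With the real model loaded as shown above, the call would be `count_predictions(model, feedback)`.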

Training Details

Training Data

Training Procedure

We made sure the dataset was balanced. The model was trained on only 35% (50% for Chinese) of the train split of each dataset.
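One way to draw a class-balanced fraction of a train split is sketched below. The exact sampling procedure used for this model is not described, so this is an assumption for illustration: the helper `balanced_subsample`, the seed, and the toy data are all hypothetical.

```python
import random
from collections import Counter, defaultdict


def balanced_subsample(examples, fraction, seed=42):
    """Sample roughly `fraction` of `examples` with equal counts per label.

    `examples` is a list of (text, label) tuples.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    if not examples:
        return []
    by_label = defaultdict(list)
    for text, label in examples:
        by_label[label].append((text, label))
    # Split the target size evenly across classes, capped by the rarest class.
    target_total = round(len(examples) * fraction)
    per_label = min(
        target_total // len(by_label),
        min(len(items) for items in by_label.values()),
    )
    rng = random.Random(seed)
    sample = []
    for label in sorted(by_label, key=str):
        sample.extend(rng.sample(by_label[label], per_label))
    rng.shuffle(sample)
    return sample


# Toy, imbalanced train split (hypothetical data).
train = [(f"good {i}", "positive") for i in range(100)]
train += [(f"bad {i}", "negative") for i in range(60)]
subset = balanced_subsample(train, fraction=0.35)
label_counts = Counter(label for _, label in subset)
```

Because the per-class count is capped by the rarest class, the result can be smaller than the requested fraction when a split is heavily imbalanced.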

Preprocessing

Speeds, Sizes, Times

Training took 6 hours on an NVIDIA T4 GPU.
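The training time gives a rough upper bound on GPU energy use. The sketch below assumes the T4 drew its full 70 W rated board power for the whole run, and ignores CPU, memory, and cooling overhead. The grid carbon intensity is a placeholder parameter, not a measured value for this training run.

```python
T4_BOARD_POWER_WATTS = 70  # rated maximum board power of the NVIDIA T4
TRAINING_HOURS = 6         # training time reported above


def gpu_energy_kwh(power_watts, hours):
    """Energy in kWh if the GPU ran at `power_watts` for `hours`."""
    return power_watts * hours / 1000


def emissions_kg(energy_kwh, grid_kg_co2_per_kwh):
    """CO2-equivalent emissions for a given grid carbon intensity."""
    return energy_kwh * grid_kg_co2_per_kwh


energy = gpu_energy_kwh(T4_BOARD_POWER_WATTS, TRAINING_HOURS)

# Placeholder intensity (assumption); substitute the value for the actual region.
PLACEHOLDER_GRID_KG_CO2_PER_KWH = 0.4
estimate = emissions_kg(energy, PLACEHOLDER_GRID_KG_CO2_PER_KWH)
```

Under these assumptions the GPU used at most about 0.42 kWh. The emissions figure depends entirely on the intensity value supplied.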

Evaluation

Testing Data, Factors & Metrics

Environmental Impact