Fine-tuned DistilBERT model for stock news classification

This DistilBERT model was fine-tuned on 50,000 stock news articles using the HuggingFace adapter from Kern AI refinery. Each training example consisted of an article's headline plus its abstract. Fine-tuning took about four hours on a single NVIDIA K80.

Join our Discord if you have questions about this model: https://discord.gg/MdZyqSxKbe

DistilBERT is a smaller, faster, and lighter version of BERT. It was trained by distilling BERT base and has 40% fewer parameters than bert-base-uncased. It runs 60% faster while preserving over 95% of BERT's performance as measured on the GLUE language understanding benchmark. DistilBERT has no token-type embeddings or pooler, and retains only half of the layers of Google's BERT.

Usage

To use the model, you need to install the HuggingFace Transformers library:

pip install transformers

Then you can load the model and the tokenizer from the HuggingFace Hub:

from transformers import AutoModelForSequenceClassification, AutoTokenizer

model = AutoModelForSequenceClassification.from_pretrained("KernAI/stock-news-distilbert")
tokenizer = AutoTokenizer.from_pretrained("KernAI/stock-news-distilbert")
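If you want more control than the pipeline offers, you can run the forward pass manually and read the label names from the model config. This is a minimal sketch (the headline text is illustrative, and truncation is enabled in case a headline plus abstract exceeds DistilBERT's 512-token limit):

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Repo id taken from this model card.
model = AutoModelForSequenceClassification.from_pretrained("KernAI/stock-news-distilbert")
tokenizer = AutoTokenizer.from_pretrained("KernAI/stock-news-distilbert")

# Illustrative headline; truncate long headline-plus-abstract inputs.
inputs = tokenizer(
    "Shares rally after strong quarterly earnings",
    return_tensors="pt",
    truncation=True,
)

with torch.no_grad():
    logits = model(**inputs).logits

# Convert logits to probabilities and map the top index to its label name.
probs = logits.softmax(dim=-1)
predicted = model.config.id2label[probs.argmax(dim=-1).item()]
print(predicted, probs.max().item())
```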

To classify a piece of stock news, you can use the HuggingFace pipeline API:

from transformers import pipeline

classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)
result = classifier("Tesla stock is rising after the earnings report.")
print(result)
# e.g. [{'label': 'POSITIVE', 'score': 0.99}]