text-classification fake-news pytorch

Model description:

DistilBERT is trained with knowledge distillation during the pre-training phase, which reduces the size of a BERT model by 40% while retaining 97% of its language-understanding capabilities. It is smaller and faster than BERT.

distilbert-base-uncased was fine-tuned on the Fake News dataset with the hyperparameters below (a fine-tuning sketch follows the list):

 learning_rate = 5e-5
 batch_size = 32
 num_train_epochs = 2
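
A minimal fine-tuning sketch with those hyperparameters, using the Hugging Face `Trainer` API; the CSV path, column names, and output directory are placeholder assumptions, not details taken from the linked repo:

```python
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2  # fake vs. real
)

# Hypothetical file with "text" and "label" columns; swap in the real dataset.
dataset = load_dataset("csv", data_files="fake_news.csv")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length")

dataset = dataset.map(tokenize, batched=True)

# Hyperparameters from this model card: lr 5e-5, batch size 32, 2 epochs.
args = TrainingArguments(
    output_dir="distilbert-fakenews",  # placeholder output directory
    learning_rate=5e-5,
    per_device_train_batch_size=32,
    num_train_epochs=2,
)

trainer = Trainer(model=model, args=args, train_dataset=dataset["train"])
trainer.train()
```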

Full code available @ DistilBert-FakeNews

Dataset available @ Fake News dataset
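
Once fine-tuned (or pulled from the Hub), the model can be used through the standard `text-classification` pipeline; the model id below is a placeholder, not the actual Hub path:

```python
from transformers import pipeline

# Replace the placeholder id with the real checkpoint path.
classifier = pipeline("text-classification", model="user/distilbert-fakenews")
print(classifier("Breaking: scientists confirm the moon is made of cheese."))
```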