Tweet disaster type classification model
This model was trained on part of the Disaster Tweet Corpus 2020 dataset (Analysis of Filtering Models for Disaster-Related Tweets, Wiegmann, M. et al., 2020). It achieves the following results on the training and validation sets:
- Train Loss: 0.0875
- Train Accuracy: 0.8783
- Validation Loss: 0.2980
- Validation Accuracy: 0.8133
- Epoch: 5
Model description
Labels:
- 0: haze
- 1: disease
- 2: earthquake
- 3: flood
- 4: hurricane & tornado
- 5: wildfire
- 6: industrial accident
- 7: societal crime
- 8: transportation accident
- 9: meteor crash
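The label IDs listed above can be kept as a small lookup table when post-processing predictions. The sketch below is illustrative only: the names `ID2LABEL` and `LABEL2ID` are assumptions made for this example, and it has not been checked whether the published model config already carries these label names.

```python
# Label IDs as listed in this model card.
# ID2LABEL / LABEL2ID are illustrative names, not part of the published model.
ID2LABEL = {
    0: "haze",
    1: "disease",
    2: "earthquake",
    3: "flood",
    4: "hurricane & tornado",
    5: "wildfire",
    6: "industrial accident",
    7: "societal crime",
    8: "transportation accident",
    9: "meteor crash",
}

# Reverse lookup: label name -> class ID.
LABEL2ID = {name: class_id for class_id, name in ID2LABEL.items()}
```

With this table, `ID2LABEL[3]` gives `"flood"` and `LABEL2ID["wildfire"]` gives `5`.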
Intended uses & limitations
This model can detect 10 different types of disaster (natural and human-made), but it has trouble detecting type 0 (haze) because haze tweets are scarce in the training dataset and resemble type 5 (wildfire) tweets.
Training hyperparameters
The following hyperparameters were used during training:
- batch_size: 16
- num_epochs: 5
- batches_per_epoch: `len(tokenized_tweet["train"]) // batch_size`
- total_train_steps: `int(batches_per_epoch * num_epochs)`
- optimizer: `optimizer, schedule = create_optimizer(init_lr=2e-5, num_warmup_steps=0, num_train_steps=total_train_steps)`
- training_precision: float32
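To make the step arithmetic above concrete, here is a minimal pure-Python sketch of how the total step count and the learning rate could evolve. It assumes that `create_optimizer` was used with its default settings, which, to the best of our knowledge, give a linear decay from `init_lr` to zero once warmup ends (here there is no warmup). The training-set size of 8000 is a hypothetical placeholder, since the actual size is not stated in this card, and the function names are made up for this example.

```python
def total_train_steps(num_train_examples: int, batch_size: int = 16, num_epochs: int = 5) -> int:
    """Number of optimizer steps, mirroring the formula in the hyperparameter list."""
    batches_per_epoch = num_train_examples // batch_size
    return int(batches_per_epoch * num_epochs)


def linear_decay_lr(step: int, total_steps: int, init_lr: float = 2e-5, end_lr: float = 0.0) -> float:
    """Learning rate at a given step under linear decay with no warmup (assumed schedule)."""
    step = min(step, total_steps)  # hold the final value once training is over
    return (init_lr - end_lr) * (1 - step / total_steps) + end_lr


# Hypothetical dataset size, for illustration only.
steps = total_train_steps(num_train_examples=8000)
halfway_lr = linear_decay_lr(steps // 2, steps)
```

Under these assumptions the learning rate starts at 2e-5, is halved by the midpoint of training, and reaches zero at the last step.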
Framework versions
- Transformers 4.16.2
- TensorFlow 2.9.2
- Datasets 2.4.0
- Tokenizers 0.12.1
How to use it
from transformers import AutoTokenizer, TFAutoModelForSequenceClassification
tokenizer = AutoTokenizer.from_pretrained("sacculifer/dimbat_disaster_type_distilbert")
model = TFAutoModelForSequenceClassification.from_pretrained("sacculifer/dimbat_disaster_type_distilbert")
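Once the tokenizer and model are loaded, a prediction could be obtained roughly as sketched below. This is a hedged example rather than a tested recipe for this checkpoint: `top_class` and `classify` are helper names invented here, and the sketch assumes that the model's output index matches the label IDs listed in this card (for example, index 0 for haze).

```python
import math

MODEL_ID = "sacculifer/dimbat_disaster_type_distilbert"


def top_class(logits):
    """Return (class_id, softmax probability) for one row of raw logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]  # subtract the max for numerical stability
    best = max(range(len(exps)), key=exps.__getitem__)
    return best, exps[best] / sum(exps)


def classify(tweets):
    """Load the model and return one (class_id, probability) pair per tweet."""
    # Imported lazily so that top_class() stays usable without TensorFlow installed.
    from transformers import AutoTokenizer, TFAutoModelForSequenceClassification

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = TFAutoModelForSequenceClassification.from_pretrained(MODEL_ID)
    inputs = tokenizer(tweets, padding=True, truncation=True, return_tensors="tf")
    logits = model(**inputs).logits.numpy()
    return [top_class(row.tolist()) for row in logits]
```

Calling `classify(["Heavy rain has flooded the city centre"])` would download the checkpoint and return a list with one `(class_id, probability)` pair; the class ID can then be read against the label list in this card.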