TACO-Plus -- TACO with extended pre-training
Introducing TACO-Plus, an advanced classification model built upon AutoModelForSequenceClassification, designed to identify tweets belonging to four distinct classes of the TACO dataset: Reason, Statement, Notification, and None.
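As a minimal sketch of how the four classes could be mapped to the indices of a classification head, the snippet below builds the two lookup tables that Hugging Face configurations conventionally call `id2label` and `label2id`. The index order is an assumption taken from the order in which the classes are listed here; it is not confirmed by this card, and `decode` is a hypothetical helper.

```python
# Class names as listed in this card; the index order is an ASSUMPTION,
# not a documented property of the released model.
LABELS = ["Reason", "Statement", "Notification", "None"]

# Lookup tables in the shape commonly stored in a model configuration.
id2label = dict(enumerate(LABELS))
label2id = {label: idx for idx, label in id2label.items()}


def decode(class_index: int) -> str:
    """Hypothetical helper: turn a predicted class index into its name."""
    return id2label[class_index]


if __name__ == "__main__":
    for idx in range(len(LABELS)):
        print(idx, decode(idx))
```

Keeping both directions of the mapping in one place avoids mismatches between training labels and decoded predictions.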
Designed specifically for extracting information and inferences from Twitter data, this specialized classification model uses BERTweet, further pre-trained on 555 augmented tweets for 5 epochs with a batch size of 32.
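To make the scale of the extended pre-training concrete, the following sketch works out how many optimizer steps the stated settings imply. It assumes the final partial batch is kept and that no gradient accumulation is used; neither detail is stated in this card, and `schedule` is a hypothetical helper.

```python
import math

# Values stated in this card.
NUM_TWEETS = 555
BATCH_SIZE = 32
NUM_EPOCHS = 5


def schedule(num_examples: int, batch_size: int, num_epochs: int) -> dict:
    """Hypothetical helper: derive step counts for a simple training loop.

    ASSUMPTIONS: the last, smaller batch is not dropped and every batch
    triggers exactly one optimizer step (no gradient accumulation).
    """
    steps_per_epoch = math.ceil(num_examples / batch_size)
    last_batch_size = num_examples - (steps_per_epoch - 1) * batch_size
    return {
        "steps_per_epoch": steps_per_epoch,
        "total_steps": steps_per_epoch * num_epochs,
        "last_batch_size": last_batch_size,
    }


if __name__ == "__main__":
    print(schedule(NUM_TWEETS, BATCH_SIZE, NUM_EPOCHS))
```

Under these assumptions the run comprises 18 steps per epoch and 90 steps in total, with a final batch of 11 tweets in each epoch.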
During the extended pre-training of BERTweet, an AutoModelForSequenceClassification model was trained.
This procedure first fine-tuned AutoModelForSequenceClassification on the augmented tweets to optimize the embeddings, and then retrained the TACO-Plus embeddings and classification head on the original TACO tweets.
<a href="https://huggingface.co/TomatenMarc/WRAP"> <blockquote style="border-left: 5px solid grey; background-color: #bf0a30; padding: 10px;"> Notice: This model was only used as a reference model for WRAP and should not be used! </blockquote> </a>
Environmental Impact
- Hardware Type: A100 PCIe 40GB
- Hours used: 10 min
- Cloud Provider: Google Cloud Platform
- Compute Region: asia-southeast1 (Singapore)
- Carbon Emitted: 0.02 kg CO2
Licensing
TACO-Plus © 2023 is licensed under CC BY-NC-SA 4.0