Dataset
- Dataset used: tweets_hate_speech_detection
- Size: 3,196 examples (10% of the full dataset)
- batch_size = 32
- num_epochs = 20
- learning_rate = 3e-4
- num_warmup_steps = 0.06 * (3196 * num_epochs)
- num_training_steps = (3196 * num_epochs)
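The step counts above can be reproduced directly from the listed hyperparameters. Note that the card computes them from the number of examples times epochs (not divided by `batch_size`); the sketch below reproduces the formulas exactly as given, with `dataset_size` standing in for the 3,196 examples:

```python
# Hyperparameters as listed in the card (sketch; the actual training loop is not shown)
dataset_size = 3196
batch_size = 32
num_epochs = 20
learning_rate = 3e-4

# Formulas copied from the card: steps = examples * epochs, 6% linear warmup
num_training_steps = dataset_size * num_epochs
num_warmup_steps = int(0.06 * num_training_steps)

print(num_training_steps)  # 63920
print(num_warmup_steps)    # 3835
```

These values would typically be passed to a warmup scheduler such as `transformers.get_linear_schedule_with_warmup(optimizer, num_warmup_steps, num_training_steps)`.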
Training procedure
LoRA configuration

```python
from peft import LoraConfig

config = LoraConfig(
    r=8,                  # LoRA rank (dimension of the low-rank update matrices)
    lora_alpha=16,        # alpha scaling factor
    lora_dropout=0.1,
    bias="none",
    task_type="SEQ_CLS",  # sequence classification; use "CAUSAL_LM" or "SEQ_2_SEQ_LM" for those tasks
)
```
Framework versions
- PEFT 0.6.0.dev0