Fine-tuning Llama 2 on a generated dataset to respond sarcastically

The main idea behind this model is to add a behaviour to an LLM so that, for a given input (news headline), it responds with an output (sarcastic_headline) in a funny, sarcastic way.<br> The existing open datasets related to sarcasm are either extracted from social media such as Twitter or Reddit, where most entries are replies to a parent post, or are labelled datasets that only mark sentences as sarcastic or non-sarcastic. What we need is a dataset that pairs a normal sentence with its corresponding sarcastic version, so the model can learn the mapping. We can generate such a dataset with an LLM by giving it a random sentence and asking it to produce a sarcastic version. Once the dataset is generated, we can fine-tune an LLM on it to give sarcastic responses.
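For illustration, each training example pairs an ordinary sentence with its generated sarcastic version. The field names below mirror the prompt template shown later in this card; the values are made-up placeholders, not rows from the actual dataset:

```python
# One hypothetical record of the generated dataset (illustrative values only)
example_pair = {
    "headline": "city council approves new bike lanes downtown",
    "sarcastic_headline": "Local Heroes Bravely Paint Lines On Road, Await Nobel Prize",
}
```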

Model Details

We use the Llama 2 13B model to generate the sarcastic sentences with an appropriate prompt template; for the input sentences we draw on a news headline category dataset. Once the dataset is generated, we format it and run PEFT on the pretrained Llama 2 7B weights. The fine-tuned model can then behave sarcastically and generate satirical responses. To ensure the quality and diversity of the training data, we picked a news headline category dataset so that we cover many different sentences without worrying about grammatical mistakes in the input sentence.
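A minimal sketch of the generation step, assuming a Hugging Face transformers text-generation pipeline; the 13B checkpoint name and sampling settings are assumptions, and the prompt reuses the template shown under Results:

```python
from transformers import pipeline

# Assumed 13B chat checkpoint; the original generation script may use a different one.
generator = pipeline("text-generation", model="meta-llama/Llama-2-13b-chat-hf", device_map="auto")

PROMPT = (
    "You are a savage, disrespectful and witty agent. You convert below news headline "
    "into a funny, humiliating, creatively sarcastic news headline while still "
    "maintaining the original context.\n"
    "### headline: {headline}\n"
    "### sarcastic_headline:"
)

def generate_sarcastic(headline: str) -> str:
    """Ask the 13B model for a sarcastic version of one news headline."""
    prompt = PROMPT.format(headline=headline)
    out = generator(prompt, max_new_tokens=64, do_sample=True, temperature=0.9)
    # The pipeline returns prompt + completion; keep only the completion.
    return out[0]["generated_text"][len(prompt):].strip()
```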

Model Fine tuning code

The Hugging Face team developed a Python library, autotrain-advanced, with which we can fine-tune an LLM with a single command. The Python code for generating the data and fine-tuning the model is available in the repo linked below.

Uses

Direct Use

Refer to the inference code available in the repo: https://github.com/SriRamGovardhanam/Sarcastic-Headline-Llama2
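For a quick look at how inference works, here is a minimal sketch assuming the PEFT adapter from the training run below was saved under the autotrain project name `sarcastic-headline-gen` (that local path is an assumption; the repo above has the exact script):

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

BASE_MODEL = "TinyPixel/Llama-2-7B-bf16-sharded"  # base weights used for fine-tuning
ADAPTER_PATH = "sarcastic-headline-gen"           # assumed local path of the trained PEFT adapter

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
base = AutoModelForCausalLM.from_pretrained(BASE_MODEL, torch_dtype=torch.bfloat16, device_map="auto")
model = PeftModel.from_pretrained(base, ADAPTER_PATH)

prompt = (
    "You are a savage, disrespectful and witty agent. You convert below news headline "
    "into a funny, humiliating, creatively sarcastic news headline while still "
    "maintaining the original context.\n"
    "### headline: mansoons are best for mosquitoes\n"
    "### sarcastic_headline:"
)

inputs = tokenizer(prompt, return_tensors="pt").to(base.device)
output = model.generate(**inputs, max_new_tokens=64, do_sample=True, temperature=0.9)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```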

How to Get Started with the Model

Training Details

```bash
autotrain llm --train --project_name 'sarcastic-headline-gen' --model TinyPixel/Llama-2-7B-bf16-sharded \
--data_path '/content/sarcastic-headline' \
--use_peft \
--use_int4 \
--learning_rate 2e-4 \
--train_batch_size 8 \
--num_train_epochs 8 \
--trainer sft \
--model_max_length 340 > training.log &
```

Training Data

(Image: preview of the generated training data.)
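The SFT trainer in autotrain-advanced reads a single text column from the data in --data_path; a plausible way to build it (the column name `text`, the file names, and the helper below are assumptions, not the original preprocessing script) is to fold each generated pair into the same prompt template shown under Results:

```python
import pandas as pd

TEMPLATE = (
    "You are a savage, disrespectful and witty agent. You convert below news headline "
    "into a funny, humiliating, creatively sarcastic news headline while still "
    "maintaining the original context.\n"
    "### headline: {headline}\n"
    "### sarcastic_headline: {sarcastic_headline}"
)

def to_training_text(row: pd.Series) -> str:
    """Fold one (headline, sarcastic_headline) pair into a single SFT training string."""
    return TEMPLATE.format(headline=row["headline"], sarcastic_headline=row["sarcastic_headline"])

# Assumed file of generated pairs with columns: headline, sarcastic_headline
df = pd.read_csv("sarcastic_headline_pairs.csv")
df["text"] = df.apply(to_training_text, axis=1)
df[["text"]].to_csv("/content/sarcastic-headline/train.csv", index=False)
```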

Results

Input headline: mansoons are best for mosquitoes <br>Formatted input template given to the fine-tuned LLM:

```
You are a savage, disrespectful and witty agent. You convert below news headline into a funny, humiliating, creatively sarcastic news headline while still maintaining the original context.
### headline: mansoons are best for mosquitoes
### sarcastic_headline:
```

<br>Output after inference:

```
You are a savage, disrespectful and witty agent. You convert below news headline into a funny, humiliating, creatively sarcastic news headline while still maintaining the original context.
### headline: mansoons are best for mosquitoes
### sarcastic_headline:  Another Study Proves That Men's Sweaty Bums Are The Best Repellent Against Mosquitoes
```

Model Objective

This model is not intended to target any specific race, gender, or region. Its sole purpose is to explore LLMs and tap their ability to entertain and engage.

Compute Infrastructure

Google Colab Pro is needed if you plan to train for more than 5 epochs on ~2,100 samples with model_max_length < 650.

Citation

The source news headlines are taken from the News Category Dataset: https://www.kaggle.com/datasets/rmisra/news-category-dataset <br> Misra, Rishabh. "News Category Dataset." arXiv preprint arXiv:2209.11429 (2022).

Model Card Authors

Sriram Govardhanam <br> http://www.linkedin.com/in/SriRamGovardhanam

Model Card Contact

sriramgov3@gmail.com