mistral-7b-instruct-peft

This instruction model was built via parameter-efficient QLoRA finetuning of Mistral-7B-v0.1 on the first 5k rows of ehartford/dolphin and the first 5k rows of garage-bAInd/Open-Platypus. Finetuning was executed on 1x A100 (40 GB SXM) for roughly 2 hours on the Lambda Labs platform.
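To give a rough sense of why LoRA-style finetuning counts as parameter-efficient, the sketch below tallies the adapter weights that would be trained. The rank (`r=16`) and the choice of `q_proj` and `v_proj` as target modules are illustrative assumptions, since this card does not state the values actually used. The layer shapes are those of the Mistral-7B-v0.1 architecture.

```python
# Rough count of trainable LoRA parameters for a Mistral-7B-style model.
# ASSUMPTIONS (not stated in this card): rank 16, adapters on q_proj and v_proj only.

HIDDEN_SIZE = 4096  # Mistral-7B-v0.1 hidden size
KV_DIM = 1024       # 8 key/value heads x 128 head dim (grouped-query attention)
NUM_LAYERS = 32     # number of transformer blocks


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """A LoRA adapter adds two matrices: A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)


def total_lora_params(rank: int) -> int:
    """Sum the adapter weights over all layers for the assumed target modules."""
    per_layer = (
        lora_params(HIDDEN_SIZE, HIDDEN_SIZE, rank)  # q_proj: 4096 -> 4096
        + lora_params(HIDDEN_SIZE, KV_DIM, rank)     # v_proj: 4096 -> 1024
    )
    return per_layer * NUM_LAYERS


trainable = total_lora_params(rank=16)
print(f"trainable adapter parameters: {trainable:,}")
```

Under these assumptions the adapters hold about 6.8 million weights, a small fraction of a percent of the roughly 7 billion frozen base weights. A different rank or set of target modules would change the count proportionally.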

Benchmark metrics

| Metric | Value |
|---|---|
| MMLU (5-shot) | Coming |
| ARC (25-shot) | Coming |
| HellaSwag (10-shot) | Coming |
| TruthfulQA (0-shot) | Coming |
| Avg. | Coming |

We use EleutherAI's Language Model Evaluation Harness to run the benchmark tests above, using the same version as Hugging Face's Open LLM Leaderboard.
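Once the four scores are available, the Avg. row can be filled in. The sketch below assumes Avg. is the unweighted mean of the four benchmarks listed in the table. The function name and the numbers in the usage lines are placeholders for illustration only, not results for this model.

```python
from statistics import mean

# Benchmark names as they appear in the metrics table.
BENCHMARKS = (
    "MMLU (5-shot)",
    "ARC (25-shot)",
    "HellaSwag (10-shot)",
    "TruthfulQA (0-shot)",
)


def leaderboard_average(scores: dict) -> float:
    """Return the unweighted mean of the four benchmark scores, rounded to 2 places."""
    missing = [name for name in BENCHMARKS if name not in scores]
    if missing:
        raise ValueError(f"missing scores for: {', '.join(missing)}")
    return round(mean(scores[name] for name in BENCHMARKS), 2)


# PLACEHOLDER values, chosen only to show the arithmetic.
placeholder_scores = {
    "MMLU (5-shot)": 10.0,
    "ARC (25-shot)": 20.0,
    "HellaSwag (10-shot)": 30.0,
    "TruthfulQA (0-shot)": 40.0,
}
print(leaderboard_average(placeholder_scores))  # 25.0
```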

Helpful links

Loss curve

[Figure: loss curve]

The above loss curve was generated from the run's private wandb.ai log.

Example prompts and responses

Example 1:

User:

Write me a numbered list of things to do in New York City.

mistral-7b-instruct-peft:

  1. Visit the Statue of Liberty - This iconic symbol of freedom stands tall on Liberty Island in New York Harbor. Visitors can take a ferry ride over to see it up close and learn about its history.

  2. Take a Walk Through Central Park - One of the most beautiful parks in the world, Central Park offers plenty of activities for visitors, including boating, ice skating, and picnicking.

  3. Catch a Broadway Show - The Great White Way is home to some of the best theater productions in the world. From musicals to plays, there's something for everyone.

  4. Explore Chinatown - A vibrant neighborhood filled with authentic Chinese restaurants, shops, and cultural landmarks, Chinatown is a must-visit destination for anyone interested in experiencing another culture within the city.

  5. Shop at Popular Stores and Markets - New York City has countless shopping opportunities, ranging from high-end designer boutiques to bustling street vendors selling everything imaginable. Some notable places to check out include Macy's Herald Square, Fifth Avenue, and Chelsea Market.

  6. Try Delicious Local Food Options - No visit to NYC is complete without sampling some of their signature dishes. Pizza from Lombardi's (the oldest pizzeria in America), bagels from Ess-a-Bagel, and cheesecake from Junior's are just a few examples of what you should try while in the Big Apple!

In conclusion, these six items make up an excellent starting point for your adventure in New York City.

<br>

Example 2:

User:

Write a short email inviting my friends to a dinner party on Friday. Respond succinctly.

mistral-7b-instruct-peft:

Dear Friends,

Please join me for an evening of delicious food and great company at my home this Friday night! Bring your favorite dish or beverage to share with everyone else. I can't wait to see you all there!

Best regards, Your Name

P.S.: Don't forget to RSVP by Wednesday so we know how much food to prepare! 😋

<br>

Example 3:

User:

What is a good recipe for vegan banana bread?

mistral-7b-instruct-peft:

Let's start by gathering all of our ingredients: 2 cups flour, 1 teaspoon baking soda, 1/4 teaspoon salt, and 3 very ripe bananas (the more brown spots they have, the better). Then we need some wet ingredients to mix with the dry ones: 1 cup sugar, 1/2 cup vegetable oil or applesauce, and 1 tablespoon vanilla extract. Now let's make sure everything is ready before we begin mixing it together. Preheat your oven to 350 degrees Fahrenheit and grease an 8x4 inch loaf pan with cooking spray or butter. Next, in a large bowl, whisk together the flour, baking soda, and salt until well combined. In another smaller bowl, mash up those bananas really well using a fork or potato masher - you want them to be almost liquid-y. Add the sugar, oil or applesauce, and vanilla extract to the bananas and stir until smooth. Pour this mixture into the larger bowl with the dry ingredients and use a wooden spoon or spatula to gently fold everything together until just combined - don't overmix! The batter will still look slightly lumpy but that's okay. Finally, pour the batter into your prepared loaf pan and bake for about one hour or until a toothpick inserted into the center comes out clean. Allow the bread to cool completely before slicing and serving. Enjoy your delicious vegan banana bread! 🍌😋

Note: If you prefer a sweeter taste, feel free to add chocolate chips or nuts like walnuts or pecans to the batter before baking. Just remember not to overdo it as too many additions can affect how the bread rises during baking. Happy baking! 😊

<br>

Limitations and biases

The following language is modified from EleutherAI's GPT-NeoX-20B model card.

This model can produce factually incorrect output, and should not be relied on to produce factually accurate information. This model was trained on various public datasets. While great efforts have been taken to clean the pretraining data, it is possible that this model could generate lewd, biased or otherwise offensive outputs.

Basic usage

!pip install -q -U huggingface_hub peft transformers torch accelerate bitsandbytes
from huggingface_hub import notebook_login
import torch
from peft import PeftModel, PeftConfig
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
)

notebook_login()
peft_model_id = "dfurman/mistral-7b-instruct-peft"
config = PeftConfig.from_pretrained(peft_model_id)

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    config.base_model_name_or_path,
    quantization_config=bnb_config,
    token=True,  # replaces the deprecated use_auth_token argument
    device_map="auto",
)

tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path, use_fast=True)
tokenizer.pad_token = tokenizer.eos_token

model = PeftModel.from_pretrained(model, peft_model_id)

# First, format the prompt
format_template = "You are a helpful assistant. Write a response that appropriately completes the request. {query}\n"
query = "Tell me a recipe for vegan banana bread."
prompt = format_template.format(query=query)

# Inference can be done using model.generate
print("\n\n*** Generate:")

input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)
with torch.autocast("cuda", dtype=torch.bfloat16):
    output = model.generate(
        input_ids=input_ids,
        max_new_tokens=512,
        do_sample=True,
        temperature=0.7,
        return_dict_in_generate=True,
        eos_token_id=tokenizer.eos_token_id,
        pad_token_id=tokenizer.pad_token_id,
        repetition_penalty=1.2,
    )

print(tokenizer.decode(output["sequences"][0], skip_special_tokens=True))
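The prompt formatting above can be wrapped in a small helper so that every query goes through the same template. This is a minimal sketch: `build_prompt` and `strip_prompt` are hypothetical names, not part of any library, and `strip_prompt` assumes the decoded text begins with the prompt verbatim, which may not hold exactly after a tokenizer round trip.

```python
# Template copied from the usage example; the helper names are hypothetical.
FORMAT_TEMPLATE = (
    "You are a helpful assistant. "
    "Write a response that appropriately completes the request. {query}\n"
)


def build_prompt(query: str) -> str:
    """Insert a user query into the instruction template."""
    query = query.strip()
    if not query:
        raise ValueError("query must not be empty")
    return FORMAT_TEMPLATE.format(query=query)


def strip_prompt(prompt: str, decoded: str) -> str:
    """Drop the echoed prompt from decoded output when it appears verbatim."""
    head = prompt.rstrip()
    if decoded.startswith(head):
        return decoded[len(head):].lstrip()
    return decoded  # prompt not found verbatim; return the text unchanged


prompt = build_prompt("Tell me a recipe for vegan banana bread.")
print(repr(prompt))
```

Because `model.generate` returns the prompt tokens followed by the new tokens, passing the decoded string through `strip_prompt(prompt, decoded)` would leave only the model's response when the prefix matches.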

Runtime tests

| runtime / 50 tokens (sec) | GPU | attn | torch dtype | VRAM (GB) |
|---|---|---|---|---|
| 3.1 | 1x A100 (40 GB SXM) | torch | fp16 | 13 |
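The runtime figure above can be converted into a throughput and a rough latency estimate for longer generations. This sketch assumes generation time scales linearly with the number of new tokens, which is only an approximation since per-token cost tends to grow with context length. The function names are illustrative.

```python
# Figures taken from the runtime table: 3.1 seconds per 50 generated tokens.
RUNTIME_SEC = 3.1
TOKENS = 50


def tokens_per_second(tokens: int, seconds: float) -> float:
    """Average generation throughput."""
    return tokens / seconds


def estimated_seconds(n_tokens: int, tps: float) -> float:
    """Linear extrapolation of generation time (an approximation)."""
    return n_tokens / tps


tps = tokens_per_second(TOKENS, RUNTIME_SEC)
print(f"throughput: {tps:.1f} tokens/sec")  # 16.1
print(f"512 new tokens: about {estimated_seconds(512, tps):.0f} sec")  # about 32
```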

Acknowledgements

This model was finetuned by Daniel Furman on Sep 27, 2023 and is for research applications only.

Disclaimer

The license on this model does not constitute legal advice. We are not responsible for the actions of third parties who use this model. Please consult an attorney before using this model for commercial purposes.

mistralai/Mistral-7B-v0.1 citation

coming

Training procedure

The following bitsandbytes quantization config was used during training:

Framework versions