Mistral-7B-Instruct-SQL
- Model creator: Mistral AI
- Original model: Mistral 7B Instruct v0.1
Description
This repo contains LoRA-finetuned model files for Mistral AI's Mistral 7B Instruct v0.1.
<!-- prompt-template start -->
Prompt template: Mistral
<s>[INST] {prompt} [/INST]
<!-- prompt-template end -->
Instruction format
In order to leverage instruction fine-tuning, your prompt should be surrounded by [INST] and [/INST] tokens. The very first instruction should begin with the begin-of-sentence token (<s>); subsequent instructions should not. The assistant generation will be ended by the end-of-sentence token (</s>).
For example:
from transformers import AutoTokenizer, AutoModelForCausalLM
import transformers
import torch

model = "machinists/Mistral-7B-Instruct-SQL"
tokenizer = AutoTokenizer.from_pretrained(model)

# Text-generation pipeline loading the model in bf16 across available devices.
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map="auto",
)

# Build the Mistral instruction prompt from the table schema and the question.
table_schema = "CREATE TABLE head (age INTEGER)"
question = "How many heads of the departments are older than 56 ?"
system_msg = f"Generate a correct SQL query from the following database schema.\n{table_schema}"
prompt = f"<s>[INST] {system_msg} \n{question} [/INST]"

sequences = pipeline(
    prompt,
    max_length=1000,
    do_sample=True,
    top_k=10,
    num_return_sequences=1,
    eos_token_id=tokenizer.eos_token_id,
)
for seq in sequences:
    print(f"Result: {seq['generated_text']}")
Model Architecture
This instruction model is based on Mistral-7B-v0.1, a transformer model with the following architecture choices:
- Grouped-Query Attention
- Sliding-Window Attention
- Byte-fallback BPE tokenizer
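These attention choices can be read off the model config. A minimal sketch, assuming the base mistralai/Mistral-7B-Instruct-v0.1 checkpoint with its standard MistralConfig:

from transformers import AutoConfig

config = AutoConfig.from_pretrained("mistralai/Mistral-7B-Instruct-v0.1")
# Grouped-query attention: fewer key/value heads than query heads (32 query vs 8 KV heads on Mistral-7B-v0.1).
print(config.num_attention_heads, config.num_key_value_heads)
# Sliding-window attention span in tokens (4096 on Mistral-7B-v0.1).
print(config.sliding_window)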
Finetuning
LoRA (Low-Rank Adaptation) freezes the base model weights and trains small low-rank update matrices instead, which reduces the computational and memory requirements of finetuning. Read more: LoRA Hugging Face article
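As a concrete illustration (not the exact training script), this is roughly how such an adapter could be attached with the peft library; the rank, alpha, and dropout values are assumptions, and the "all-linear" shortcut needs a recent peft release:

from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base_model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-Instruct-v0.1")
lora_config = LoraConfig(
    r=16,                         # assumed adapter rank, not confirmed for this checkpoint
    lora_alpha=32,                # assumed scaling factor
    lora_dropout=0.05,            # assumed dropout
    target_modules="all-linear",  # place low-rank adapters on every linear layer
    task_type="CAUSAL_LM",
)
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # only the small adapter matrices are trainable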
- Epochs: 10
- Dataset: b-mc2/sql-create-context
- No. of records: 78.6k
- Model loading: bf16 with Flash Attention 2
- Finetuning technique: LoRA on all linear layers
- Max sequence length: 2048
- Effective batch size: 4
- Mixed precision training: tf32
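A hedged sketch of Hugging Face TrainingArguments mirroring the settings above (10 epochs, bf16 compute, tf32 matmuls, effective batch size 4); the per-device/accumulation split, learning rate, and output path are assumptions, and the 2048-token limit is applied at tokenization time rather than here:

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="mistral-7b-instruct-sql-lora",  # hypothetical output path
    num_train_epochs=10,
    per_device_train_batch_size=1,              # assumed split; 1 x 4 accumulation steps = effective batch size 4
    gradient_accumulation_steps=4,
    bf16=True,                                  # bf16 compute
    tf32=True,                                  # tf32 matmuls on the A100
    learning_rate=2e-4,                         # assumed; the logged values below sit near this scale
    logging_steps=10,
)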
Hardware and Software
- Training Hardware: 1x NVIDIA A100 80GB GPU
Logs
{'loss': 0.1378, 'learning_rate': 0.00019648798715270855, 'epoch': 9.05}
{'loss': 0.1302, 'learning_rate': 0.00019647224292296874, 'epoch': 9.05}
{'loss': 0.1256, 'learning_rate': 0.00019645649869322896, 'epoch': 9.05}
{'train_runtime': 22039.452, 'train_samples_per_second': 35.653, 'train_steps_per_second': 8.913, 'train_loss': 0.3001254962614568, 'epoch': 9.05}
The Machinists Team
Manish Kumar, Aakash Sarin
Troubleshooting
- If you see the following error:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/transformers/models/auto/auto_factory.py", line 482, in from_pretrained
    config, kwargs = AutoConfig.from_pretrained(
  File "/transformers/models/auto/configuration_auto.py", line 1022, in from_pretrained
    config_class = CONFIG_MAPPING[config_dict["model_type"]]
  File "/transformers/models/auto/configuration_auto.py", line 723, in __getitem__
    raise KeyError(key)
KeyError: 'mistral'
Installing transformers from source should solve the issue:
pip install git+https://github.com/huggingface/transformers
This should not be required after transformers-v4.33.4.
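A quick way to check whether the workaround applies is to print the installed version; the from-source install is only needed on releases at or below the version noted above:

import transformers
print(transformers.__version__)  # Mistral support ships in releases newer than v4.33.4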
README References
https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1
https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.1-AWQ