
Falcon 7B Fine-Tuned LLM

Model description

This model is a fine-tuned version of the tiiuae/falcon-7b model, trained with QLoRA (4-bit quantization via bitsandbytes) and the PEFT library.
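
Because the fine-tuning was done with PEFT, this repository stores LoRA adapter weights that are applied on top of the base model at load time. If a standalone checkpoint is preferred, the adapters can be merged into a non-quantized copy of the base model. The snippet below is a minimal sketch, assuming enough memory for the bfloat16 base weights; the output directory name is hypothetical.

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load the base model in (non-quantized) bfloat16 and attach the fine-tuned adapters
base = AutoModelForCausalLM.from_pretrained(
    "tiiuae/falcon-7b",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)
model = PeftModel.from_pretrained(base, "hipnologo/falcon-7b-qlora-finetune-chatbot")

# Fold the adapter weights into the base weights and save a standalone checkpoint
merged = model.merge_and_unload()
merged.save_pretrained("falcon-7b-faq-merged")  # hypothetical output directory
AutoTokenizer.from_pretrained("tiiuae/falcon-7b").save_pretrained("falcon-7b-faq-merged")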

Intended uses & limitations

How to use

# Import necessary classes and functions
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftConfig, PeftModel

# Specify the model
PEFT_MODEL = "hipnologo/falcon-7b-qlora-finetune-chatbot"

# Load the PEFT config
config = PeftConfig.from_pretrained(PEFT_MODEL)

# 4-bit quantization config for loading the base model
# (the exact values used during training are not shown in this card; these are typical QLoRA settings)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Load the base model and tokenizer
model = AutoModelForCausalLM.from_pretrained(
    config.base_model_name_or_path,
    return_dict=True,
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)

# Set the padding token to be the same as the EOS token
tokenizer.pad_token = tokenizer.eos_token

# Load the PEFT model
model = PeftModel.from_pretrained(model, PEFT_MODEL)

# Set the generation parameters
generation_config = model.generation_config
generation_config.max_new_tokens = 200
generation_config.temperature = 0.7
generation_config.top_p = 0.7
generation_config.num_return_sequences = 1
generation_config.pad_token_id = tokenizer.eos_token_id
generation_config.eos_token_id = tokenizer.eos_token_id

# Define the prompt
prompt = """
<human>: How can I create an account?
<assistant>:
""".strip()
print(prompt)

# Encode the prompt
encoding = tokenizer(prompt, return_tensors="pt").to(model.device)

# Generate a response
with torch.inference_mode():
    outputs = model.generate(
        input_ids=encoding.input_ids,
        attention_mask=encoding.attention_mask,
        generation_config=generation_config,
    )

# Print the generated response
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Training procedure

The model was fine-tuned on the Ecommerce-FAQ-Chatbot-Dataset, with the base model loaded in 4-bit using a bitsandbytes quantization config. A representative setup is sketched below.
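
The exact quantization and LoRA hyperparameters are not reproduced in this card. The snippet below is a representative QLoRA setup for Falcon-7B with bitsandbytes and PEFT; every value shown is an assumption for illustration, not a record of the actual training run.

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Representative 4-bit (QLoRA) quantization config (values are assumptions)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Load the base model in 4-bit and prepare it for k-bit training
base_model = AutoModelForCausalLM.from_pretrained(
    "tiiuae/falcon-7b",
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)
base_model = prepare_model_for_kbit_training(base_model)

# Attach LoRA adapters to Falcon's fused attention projection (hyperparameters are illustrative)
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["query_key_value"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # only the adapter parameters are trainable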

Framework versions

Evaluation results

The model was trained for 80 steps, with the training loss decreasing from 0.184 to a final value of approximately 0.031.

License

This model is licensed under the Apache 2.0 license. Please see the LICENSE file for more information.