Model Card for OpenOrca-Phi
A fine-tune of the Phi model on the OpenOrca-Best dataset. The model was trained with full fine-tuning, not LoRA or QLoRA.
Model Sources
This model was trained on a 300k-sample subset of the OpenOrca-Best dataset, created by shahules786. The dataset was slightly modified by removing all samples longer than 2k tokens.
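The length filter described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the script actually used for training: the card does not say which tokenizer counted the tokens, whether "2k" means 2000 or 2048, or which fields were counted, so the threshold, the field names, and the whitespace counter below are all hypothetical stand-ins.

```python
from typing import Callable, Iterable, List

MAX_TOKENS = 2048  # assumption: the card only says "2k tokens"


def filter_by_token_length(
    samples: Iterable[dict],
    count_tokens: Callable[[str], int],
    max_tokens: int = MAX_TOKENS,
) -> List[dict]:
    """Keep only the samples whose combined text is at most max_tokens long."""
    kept = []
    for sample in samples:
        # Hypothetical field names; adjust them to the dataset's actual schema.
        text = " ".join(
            sample.get(key, "") for key in ("system_prompt", "question", "response")
        )
        if count_tokens(text) <= max_tokens:
            kept.append(sample)
    return kept


def whitespace_token_count(text: str) -> int:
    """Stand-in counter for illustration only; it splits on whitespace."""
    return len(text.split())
```

In a real run, `count_tokens` would more plausibly wrap the model's own tokenizer, for example `lambda text: len(tokenizer(text)["input_ids"])`, so that the limit is measured in the same units the model sees.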
How to Get Started with the Model
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

# Fill the prompt template (see "Prompt formatting") with a system and user message.
prompt = "### System:\n{system_prompt}\n\n### User:\n{user_prompt}\n\n### Assistant:\n"
system_prompt = "The assistant gives helpful, detailed, and polite answers to the user's questions."
user_prompt = "Who was the president of the United States in 2014?"
prompt = prompt.format(system_prompt=system_prompt, user_prompt=user_prompt)

# Load the model in bfloat16 on the GPU; the tokenizer needs no dtype argument.
model = AutoModelForCausalLM.from_pretrained("pansophic/OpenOrca-Phi", trust_remote_code=True, torch_dtype=torch.bfloat16).to("cuda")
tokenizer = AutoTokenizer.from_pretrained("pansophic/OpenOrca-Phi", trust_remote_code=True)

# Tokenize the prompt, then sample a completion and stream it to stdout as it is generated.
inputs = tokenizer(prompt, return_tensors="pt", return_attention_mask=False).to("cuda")
streamer = TextStreamer(tokenizer)
_ = model.generate(**inputs, max_length=512, top_k=40, top_p=0.9, do_sample=True, temperature=0.25, repetition_penalty=1.2, use_cache=True, eos_token_id=tokenizer.eos_token_id, streamer=streamer)
Prompt formatting
The prompt format is the following:
### System:
<system_prompt>

### User:
<user_prompt>

### Assistant:
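As a convenience, the format can be wrapped in two small helpers: one that builds the prompt and one that pulls the answer back out of the decoded output. This is a minimal sketch; `build_prompt`, `extract_answer`, and the constant names are hypothetical and not part of the model repository. The template string is the one used in the getting-started snippet, with a blank line between sections.

```python
PROMPT_TEMPLATE = (
    "### System:\n{system_prompt}\n\n"
    "### User:\n{user_prompt}\n\n"
    "### Assistant:\n"
)
ASSISTANT_MARKER = "### Assistant:\n"
DEFAULT_SYSTEM_PROMPT = (
    "The assistant gives helpful, detailed, and polite answers "
    "to the user's questions."
)


def build_prompt(user_prompt: str, system_prompt: str = DEFAULT_SYSTEM_PROMPT) -> str:
    """Fill the template with a user message and an optional system message."""
    return PROMPT_TEMPLATE.format(
        system_prompt=system_prompt, user_prompt=user_prompt
    )


def extract_answer(generated_text: str) -> str:
    """Return the text after the last assistant marker, with whitespace trimmed.

    If the marker is absent, the whole input is returned (trimmed).
    """
    _, _, answer = generated_text.rpartition(ASSISTANT_MARKER)
    return answer.strip()
```

`build_prompt("Who was the president of the United States in 2014?")` yields the same string as the getting-started snippet. `extract_answer` only splits on the marker; any end-of-text token present in the decoded output would still need to be removed separately, for instance by decoding with `skip_special_tokens=True`.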