
This model uses the Kimiko LoRA, but merges it with Llama-2-chat 7B instead of the base Llama-2 model.

It performs pretty well and could be thought of as an uncensored Llama-2-chat model.

The prompt template is similar to the normal Kimiko one. I haven't tested every possible prompt, but the one below works best for me. The system prompt should just describe a character, not tell the model to act like a character. An example system prompt is:

John is a businessman in Paris. He is tired and in his home right now.

<<SYSTEM>>
##system prompt for the AI (where you would put personas as well)

<<HUMAN>>
##Chat with the bot

<<Character Name>>
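
The template above can be assembled with a small helper. This is only a sketch: the function name `build_prompt` is hypothetical, and it assumes that the character's name replaces `Character Name` in the final tag and that each section is separated by a blank line. Adjust the whitespace if your results differ.

```python
def build_prompt(system_prompt: str, user_message: str, character_name: str) -> str:
    """Assemble a prompt in the <<SYSTEM>> / <<HUMAN>> / <<Character Name>> layout.

    Assumption: the last tag carries the character's name and is left open
    so the model continues the text as that character.
    """
    return (
        f"<<SYSTEM>>\n{system_prompt}\n\n"
        f"<<HUMAN>>\n{user_message}\n\n"
        f"<<{character_name}>>\n"
    )


prompt = build_prompt(
    "John is a businessman in Paris. He is tired and in his home right now.",
    "Hi John, how was your day?",
    "John",
)
print(prompt)
```

Note that the system prompt here only describes the character, as recommended above, rather than telling the model to act like him.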

To use it with Hugging Face Transformers, just follow the Llama docs; it should work as-is since it's the same architecture.
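
As a rough sketch of what that could look like with the standard `transformers` auto classes: the function name `generate_reply`, the `model_path` argument, and the sampling settings are all assumptions for illustration, not values taken from this model card. Pass the repository ID or local directory where the merged weights actually live.

```python
def generate_reply(model_path: str, prompt: str, max_new_tokens: int = 128) -> str:
    """Load a Llama-architecture checkpoint and generate a continuation.

    model_path is a placeholder: a Hub repository ID or a local directory.
    """
    # Imported here so the helper can be defined without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_path)
    model = AutoModelForCausalLM.from_pretrained(model_path)

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,      # sampling settings are illustrative, not tuned
        temperature=0.7,
    )

    # Drop the prompt tokens so only the newly generated text is returned.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate_reply("path/to/model", prompt)` with a prompt in the template format should return the character's reply. A 7B model in full precision needs a lot of memory, so in practice you may want to pass options such as `torch_dtype` or `device_map` to `from_pretrained`.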