What is PetrolLM?
PetrolLM is a fine-tune of the Mistral-7B-v0.1 model, trained with QLoRA (4-bit precision) for creative writing and roleplay.
The dataset consists of 5800 samples, with the composition as follows:
- AICG Logs (~17%)
- PygmalionAI/PIPPA (~17%)
- Squish42/bluemoon-fandom-1-1-rp-cleaned (~13%)
- OpenLeecher/Teatime (~2%)
- Norquinal/claude_multiround_chat_1k (~17%)
- jondurbin/airoboros-gpt4-1.4 (~17%)
- totally-not-an-llm/EverythingLM-data-V2-sharegpt (~17%)
These samples were then back-filled using gpt-4/gpt-3.5-turbo-16k or otherwise converted to fit the prompt format.
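As a rough check on the composition above, the percentages can be turned into approximate per-source sample counts. This is only a sketch: the listed shares are rounded (note the "~"), so the counts below are estimates, not exact figures from the training set.

```python
TOTAL_SAMPLES = 5800

# Approximate shares as listed in the composition above (percent).
composition_pct = {
    "AICG Logs": 17,
    "PygmalionAI/PIPPA": 17,
    "Squish42/bluemoon-fandom-1-1-rp-cleaned": 13,
    "OpenLeecher/Teatime": 2,
    "Norquinal/claude_multiround_chat_1k": 17,
    "jondurbin/airoboros-gpt4-1.4": 17,
    "totally-not-an-llm/EverythingLM-data-V2-sharegpt": 17,
}

# Estimated sample count per source; rounded because the shares are approximate.
approx_counts = {
    name: round(TOTAL_SAMPLES * pct / 100)
    for name, pct in composition_pct.items()
}

for name, count in approx_counts.items():
    print(f"{name}: ~{count} samples")
```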
Prompt Format
The model was finetuned with a prompt format similar to the original SuperHOT prototype:
---
style: roleplay
characters:
[char]: [description]
summary: [scenario]
---
<chat_history>
Format:
[char]: [message]
Human: [message]
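The layout above can be assembled programmatically. The following is a minimal sketch, not the author's tooling: the function name `build_prompt`, its parameters, and the example character are hypothetical, and the character lines are emitted without indentation, exactly as the format is shown above.

```python
from typing import Dict, List, Tuple


def build_prompt(
    characters: Dict[str, str],
    scenario: str,
    history: List[Tuple[str, str]],
    next_speaker: str,
) -> str:
    """Assemble a prompt: header block first, then the chat history."""
    lines = ["---", "style: roleplay", "characters:"]
    for name, description in characters.items():
        lines.append(f"{name}: {description}")
    lines.append(f"summary: {scenario}")
    lines.append("---")
    # Chat history: one "[speaker]: [message]" line per turn.
    for speaker, message in history:
        lines.append(f"{speaker}: {message}")
    # Leave the last turn open so the model continues as next_speaker.
    lines.append(f"{next_speaker}:")
    return "\n".join(lines)


prompt = build_prompt(
    characters={"Aria": "A sardonic ship engineer."},
    scenario="Aria and the user are stranded on a derelict station.",
    history=[("Human", "Can you get the reactor running?")],
    next_speaker="Aria",
)
print(prompt)
```

Ending the prompt with an open `Aria:` line is a common way to cue the character's reply; adjust it if your frontend already appends the speaker name.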
Use in Text Generation Web UI
Install the bleeding-edge version of transformers from source:
pip install git+https://github.com/huggingface/transformers
Or, alternatively, change model_type in config.json from mistral to llama.
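The config.json change can also be scripted. This is a minimal sketch under stated assumptions: the helper name `set_model_type` is hypothetical, and the demo writes to a temporary file rather than a real model directory, so point it at your own copy of config.json (and keep a backup) when using it for real.

```python
import json
import tempfile
from pathlib import Path


def set_model_type(config_path: Path, old: str = "mistral", new: str = "llama") -> bool:
    """Rewrite model_type in a config.json; return True if the file was changed."""
    config = json.loads(config_path.read_text(encoding="utf-8"))
    if config.get("model_type") != old:
        # Nothing to do: the field is missing or already has another value.
        return False
    config["model_type"] = new
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return True


# Demo on a throwaway file standing in for the model's config.json.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "config.json"
    demo_path.write_text(json.dumps({"model_type": "mistral"}), encoding="utf-8")
    changed = set_model_type(demo_path)
    patched_type = json.loads(demo_path.read_text(encoding="utf-8"))["model_type"]

print(changed, patched_type)
```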
Use in SillyTavern UI
As an addendum, you can include one of the following as the Last Output Sequence:
Human: In your next reply, write at least two paragraphs. Be descriptive and immersive, providing vivid details about {{char}}'s actions, emotions, and the environment.
{{char}}:

{{char}} (2 paragraphs, engaging, natural, authentic, descriptive, creative):

[System note: Write at least two paragraphs. Be descriptive and immersive, providing vivid details about {{char}}'s actions, emotions, and the environment.]
{{char}}:
The third one seems to work best. I recommend experimenting with your own variants to find what suits your needs.
Finetuning Parameters
- LoRA Rank: 64
- LoRA Alpha: 16
- LoRA Dropout: 0.1
- BF16 Training
- Cutoff Length: 2048
- Training Epoch(s): 2
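To make the rank and alpha values more concrete, the sketch below works out two quantities under the standard LoRA formulation, where the low-rank update is scaled by alpha / rank. The 4096 x 4096 layer shape is an assumption for illustration (4096 is the hidden size of Mistral-7B); which modules were actually adapted is not stated here.

```python
LORA_RANK = 64
LORA_ALPHA = 16

# Standard LoRA multiplies the low-rank update B @ A by alpha / rank.
scaling = LORA_ALPHA / LORA_RANK


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Parameters added by one adapter: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


# Assumption: one square 4096 x 4096 projection is adapted.
params_per_square_proj = lora_param_count(4096, 4096, LORA_RANK)

print(f"scaling = {scaling}")
print(f"trainable parameters per 4096x4096 projection = {params_per_square_proj}")
```

With these settings the update is scaled by 0.25, and each adapted 4096 x 4096 projection adds 524,288 trainable parameters.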