PygmalionCoT-7b

Model description

Warning: THIS model is NOT suitable for use by minors. The model will output X-rated content.

This is a merge of PygmalionAI's pygmalion-7b (https://huggingface.co/PygmalionAI/pygmalion-7b)

and kaiokendev's 7b SuperCOT-LoRA, a chain-of-thought LoRA (https://huggingface.co/kaiokendev/SuperCOT-LoRA),

saved in safetensors format. The merged model may be less repetitive and follow events more logically, but its outputs can be shorter and it relies heavily on example dialogues. Using anything other than Pygmalion's prompt format may exacerbate this.
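Since the model is sensitive to prompt format, here is a small illustrative sketch of assembling a prompt in Pygmalion's persona/`<START>`/dialogue style. The helper function, character name, and dialogue text are examples for this card, not part of the model release.

```python
def build_pygmalion_prompt(character, persona, history, user_message):
    """Assemble a prompt in Pygmalion's expected format.

    `history` is a list of (speaker, text) tuples. This is an
    illustrative sketch; adapt it to your frontend of choice.
    """
    lines = [f"{character}'s Persona: {persona}", "<START>"]
    for speaker, text in history:
        lines.append(f"{speaker}: {text}")
    lines.append(f"You: {user_message}")
    # End with the character tag so the model completes their reply.
    lines.append(f"{character}:")
    return "\n".join(lines)

# Hypothetical example character and dialogue:
prompt = build_pygmalion_prompt(
    "Aster",
    "A cheerful librarian who loves mystery novels.",
    [("You", "Hi!"), ("Aster", "Hello! Looking for a book?")],
    "Got any recommendations?",
)
print(prompt)
```

Frontends such as text-generation-webui apply this format for you when a Pygmalion character card is loaded; the sketch is only meant to show what the model expects to see.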

Quantization Information

The 4-bit GPTQ (CUDA) quantization was produced with https://github.com/0cc4m/GPTQ-for-LLaMa:

python llama.py --wbits 4 models/PygmalionCoT-7b c4 --true-sequential --groupsize 128 --save_safetensors models/PygmalionCoT-7b/PygmalionCoT-7b-4bit-128g.safetensors

llama.cpp quantizations (produced from the f16 GGML conversion): ggml-q4_2, ggml-q5_1, ggml-q8_0

./quantize ./models/PygmalionCoT-7b/ggml-model-f16.bin ./models/PygmalionCoT-7b/PygmalionCoT-7b-ggml-q4_2.bin q4_2
./quantize ./models/PygmalionCoT-7b/ggml-model-f16.bin ./models/PygmalionCoT-7b/PygmalionCoT-7b-ggml-q5_1.bin q5_1
./quantize ./models/PygmalionCoT-7b/ggml-model-f16.bin ./models/PygmalionCoT-7b/PygmalionCoT-7b-ggml-q8_0.bin q8_0