This is a 4-bit GPTQ-quantised version (group size 128), in safetensors format, of the oasst-llama-13b-2-epochs model from dvruette/oasst-llama-13b-2-epochs.

It gives a significant inference speed-up when used with oobabooga's text-generation-webui.

Run with:

python server.py --model oasst-llama-13b-2-epochs-GPTQ-4bit-128g --wbits 4 --groupsize 128
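The --wbits and --groupsize values must match the quantisation of the weights, and here they mirror the "4bit-128g" suffix of the model directory name. As a minimal sketch, the launch command could be assembled from that name so the two never drift apart. The helper name build_command and the "<N>bit-<M>g" naming convention are assumptions made for illustration; they are not part of the web UI.

```python
import re
import shlex


def build_command(model_dir: str) -> list[str]:
    """Build the server.py argument list for a GPTQ model directory.

    Assumes (hypothetically) that the directory name contains a
    '<N>bit-<M>g' suffix, e.g. '4bit-128g', giving bits and group size.
    """
    match = re.search(r"(\d+)bit-(\d+)g", model_dir)
    if match is None:
        raise ValueError(f"cannot find '<N>bit-<M>g' in {model_dir!r}")
    wbits, groupsize = match.groups()
    return [
        "python", "server.py",
        "--model", model_dir,
        "--wbits", wbits,
        "--groupsize", groupsize,
    ]


command = build_command("oasst-llama-13b-2-epochs-GPTQ-4bit-128g")
# Print the command as a single shell-safe string.
print(shlex.join(command))
```

The resulting list can be passed to subprocess.run from the directory that contains server.py.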