This is a 4-bit GPTQ version of airoboros-65b-gpt4-1.2

It was quantized with GPTQ-for-LLaMA using group size 32 and act-order enabled, to keep perplexity as close as possible to the FP16 model.
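For reference, here is a minimal sketch of how those same settings would be expressed in AutoGPTQ's quantization config. This is only an illustrative equivalent: the actual quantization was done with GPTQ-for-LLaMA, and the paths and calibration data below are placeholders.

```python
# Illustrative only: shows how "4-bit, group size 32, act order" map onto
# AutoGPTQ's BaseQuantizeConfig. The released model was quantized with
# GPTQ-for-LLaMA, not with this script.
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

quantize_config = BaseQuantizeConfig(
    bits=4,          # 4-bit quantization
    group_size=32,   # group size 32
    desc_act=True,   # act-order (activation order) enabled
)

# Placeholder path to the FP16 source model (airoboros-65b-gpt4-1.2).
model = AutoGPTQForCausalLM.from_pretrained(
    "airoboros-65b-gpt4-1.2", quantize_config
)

# `calibration_examples` would be a list of tokenized samples
# (dicts with input_ids / attention_mask); omitted here.
# model.quantize(calibration_examples)
# model.save_quantized("airoboros-65b-gpt4-1.2-GPTQ", use_safetensors=True)
```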

It may not fit on a system with 2x24 GB VRAM cards if using GPTQ-for-LLaMA or AutoGPTQ at maximum context. It works fine on a single 48 GB VRAM card (RTX A6000).
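For the single 48 GB card case, a loading sketch with AutoGPTQ might look like the following. The local directory name and the prompt are placeholders; adjust them to wherever the quantized files actually live.

```python
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

# Placeholder path to the downloaded quantized model (safetensors + config).
model_dir = "airoboros-65b-gpt4-1.2-GPTQ"

tokenizer = AutoTokenizer.from_pretrained(model_dir, use_fast=False)
model = AutoGPTQForCausalLM.from_quantized(
    model_dir,
    device="cuda:0",       # single 48 GB GPU (e.g. RTX A6000)
    use_safetensors=True,
)

prompt = "Write a haiku about quantization."
inputs = tokenizer(prompt, return_tensors="pt").to("cuda:0")
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```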

It works fine on 2x24 GB VRAM cards when using exllama/exllama_HF at 2048 context.
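A rough sketch of loading across two 24 GB cards with the exllama Python code is shown below, assuming the exllama repository's example layout (model.py, tokenizer.py, generator.py importable from a checkout) and its set_auto_map helper for splitting layers between GPUs. The model path and the per-GPU split values are placeholders, not tested numbers.

```python
import glob, os

# Assumes this script runs inside a checkout of the exllama repository,
# so model.py / tokenizer.py / generator.py are importable.
from model import ExLlama, ExLlamaCache, ExLlamaConfig
from tokenizer import ExLlamaTokenizer
from generator import ExLlamaGenerator

model_dir = "airoboros-65b-gpt4-1.2-GPTQ"  # placeholder path

config = ExLlamaConfig(os.path.join(model_dir, "config.json"))
config.model_path = glob.glob(os.path.join(model_dir, "*.safetensors"))[0]
config.max_seq_len = 2048          # the 2048-token context mentioned above
config.set_auto_map("17.5,24")     # approximate GB split across the two 24 GB cards

model = ExLlama(config)
tokenizer = ExLlamaTokenizer(os.path.join(model_dir, "tokenizer.model"))
cache = ExLlamaCache(model)
generator = ExLlamaGenerator(model, tokenizer, cache)

print(generator.generate_simple("Hello, how are you?", max_new_tokens=64))
```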