This is a 4-bit quant of https://huggingface.co/MetaIX/GPT4-X-Alpasta-30b

My secret sauce:

Using commit <a href="https://github.com/0cc4m/GPTQ-for-LLaMa/tree/3c16fd9c7946ebe85df8d951cb742adbc1966ec7">3c16fd9</a> of 0cc4m's GPTQ fork
Using C4 as the calibration dataset
Act-order, True-sequential, percdamp 0.1 (<i>the default percdamp is 0.01</i>)
No groupsize
Runs with CUDA; does not need Triton.
Quant completed on a Google Colab instance with the 'Premium GPU' and 'High Memory' options.
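
For anyone wanting to reproduce the settings above, here is a rough sketch of how they might map onto a command line. It assumes the fork exposes the same `llama.py` entry point and flag names as upstream GPTQ-for-LLaMa (`--wbits`, `--act-order`, `--true-sequential`, `--percdamp`, `--save_safetensors`); check them against the pinned commit before relying on it. `MODEL_DIR` and `OUTPUT_FILE` are hypothetical placeholders, not paths used for this quant.

```python
import shlex

# Hypothetical placeholders: replace with your own paths.
MODEL_DIR = "path/to/GPT4-X-Alpasta-30b"
OUTPUT_FILE = "4bit.safetensors"

# Flag names are assumed to match upstream GPTQ-for-LLaMa's llama.py.
args = [
    "python", "llama.py", MODEL_DIR, "c4",  # C4 as the calibration dataset
    "--wbits", "4",                         # 4-bit quant
    "--act-order",
    "--true-sequential",
    "--percdamp", "0.1",                    # the default is 0.01
    "--save_safetensors", OUTPUT_FILE,
]
# No --groupsize flag is passed, matching "No groupsize" above.

# Print a shell-safe command string to copy into a terminal or Colab cell.
print(shlex.join(args))
```

Building the command as a list keeps each setting on its own line, so it is easy to compare against the list above.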

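<table>
<tr><th>Model</th><th>C4</th><th>WikiText2</th><th>PTB</th></tr>
<tr><td>MetaIX's FP16</td><td>6.98400259</td><td>4.607768536</td><td>9.414786339</td></tr>
<tr><td>This Quant</td><td>7.292364597</td><td>4.954069614</td><td>9.754593849</td></tr>
</table>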
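
To put the numbers above in context, this small sketch computes how far the quant's scores sit from the FP16 scores on each dataset. The values are copied from the table; the table does not name the metric, so the "lower is better" reading assumes these are perplexities, as is usual for these datasets.

```python
# Scores copied from the table above.
fp16 = {"C4": 6.98400259, "WikiText2": 4.607768536, "PTB": 9.414786339}
quant = {"C4": 7.292364597, "WikiText2": 4.954069614, "PTB": 9.754593849}


def relative_increase(base, new):
    """Return the change from base to new as a percentage of base."""
    return (new - base) / base * 100.0


# Percentage gap between the quant and FP16 for each dataset.
deltas = {name: relative_increase(fp16[name], quant[name]) for name in fp16}

for name, pct in deltas.items():
    print(f"{name}: {quant[name] - fp16[name]:+.3f} ({pct:+.2f}%)")
```

The gap is a few percent on C4 and PTB and somewhat larger on WikiText2.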