4-bit quantized files (group size 32) for ICTNLP/bayling-13b-v1.1
BayLing (百聆, bǎi líng) is an instruction-following LLM equipped with advanced language alignment, showing superior capability in English/Chinese generation, instruction following and multi-turn interaction.
Quantized using GPTQ-for-LLaMa.
Command used to quantize: python llama.py /my/model/directory c4 --wbits 4 --true-sequential --act-order --groupsize 32 --save_safetensors /my/output/file.safetensors
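
To illustrate what the --wbits 4 and --groupsize 32 settings mean, here is a minimal sketch of group-wise quantization in plain Python. It is a simplification under stated assumptions: it uses simple round-to-nearest with one scale and one minimum per group, not the GPTQ algorithm that GPTQ-for-LLaMa runs (GPTQ also uses calibration data, here c4, to compensate for rounding error). The function names are hypothetical and do not come from that repository.

```python
def quantize_group(weights, bits=4):
    """Round-to-nearest quantization of one group of floats (illustrative only)."""
    levels = (1 << bits) - 1            # 15 integer steps for 4-bit codes
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / levels or 1.0   # fall back to 1.0 if the group is constant
    codes = [round((w - lo) / scale) for w in weights]
    return codes, scale, lo


def dequantize_group(codes, scale, lo):
    """Map integer codes back to approximate float weights."""
    return [c * scale + lo for c in codes]


def quantize_row(row, groupsize=32, bits=4):
    """Split a row of weights into groups; each group gets its own scale and minimum."""
    return [
        quantize_group(row[start:start + groupsize], bits)
        for start in range(0, len(row), groupsize)
    ]


def dequantize_row(groups):
    """Reassemble the approximate row from its quantized groups."""
    out = []
    for codes, scale, lo in groups:
        out.extend(dequantize_group(codes, scale, lo))
    return out
```

A smaller group size means more scales are stored per row, which costs some file size but lets each scale fit a narrower range of weights. In this sketch, the reconstruction error of any weight is at most half of its group's scale.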