# tiny_starcoder_py-GGML
Quantized GGML version of https://huggingface.co/bigcode/tiny_starcoder_py.
## Which one should I use?

- `fp16` (best quality)
- `q8_0` (~80% faster)
## How to Use

Until https://github.com/ggerganov/ggml/pull/311 is merged, use the fork at https://github.com/the-crypt-keeper/ggml/tree/starcoder_repeat_penalty.

For best results, run inference with `--top_k 1 --repeat-penalty 1.176`. For example:
```
$ ./bin/starcoder -m ~/ai/models/tiny_starcoder_py-fp16.bin -p 'def fibonnaci' --repeat-penalty 1.176 --top_k 1
main: seed = 1687866970
starcoder_model_load: loading model from '/home/miner/ai/models/tiny_starcoder_py-fp16.bin'
starcoder_model_load: n_vocab = 49152
starcoder_model_load: n_ctx   = 8192
starcoder_model_load: n_embd  = 768
starcoder_model_load: n_head  = 12
starcoder_model_load: n_layer = 20
starcoder_model_load: ftype   = 1
starcoder_model_load: qntvr   = 0
starcoder_model_load: ggml ctx size = 1398.89 MB
starcoder_model_load: memory size = 960.00 MB, n_mem = 163840
starcoder_model_load: model size = 438.77 MB
extract_tests_from_file : No test file found.
test_gpt_tokenizer : 0 tests failed out of 0 tests.
main: temp           = 0.900
main: top_k          = 1
main: top_p          = 0.900
main: repeat_last_n  = 64
main: repeat_penalty = 1.176
main: prompt: 'def fibonnaci'
main: number of tokens in prompt = 5
main: token[0] =    589, def
main: token[1] =  28176,  fib
main: token[2] =    267, on
main: token[3] =  46278, nac
main: token[4] =     91, i

def fibonnaci_2(n):
    """Fibonacci series of n."""
    if n == 0:
        return 1
    else:
        return fibonnaci_2(n-1) + fibonnaci_2(n-2)

if __name__ == '__main__':
    print(fibonnaci_2(5))<|endoftext|>

main: mem per token =   290312 bytes
main:     load time =   203.32 ms
main:   sample time =    69.94 ms
main:  predict time =  1262.22 ms / 15.98 ms per token
main:    total time =  1560.19 ms
```
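The `--repeat-penalty` flag discourages looping output by scaling down the logits of recently sampled tokens before each sampling step; with `--top_k 1`, generation is then a greedy argmax over the penalized logits. A minimal Python sketch of that mechanism, using hypothetical function and variable names (the real implementation is the C++ code in the fork linked above):

```python
def apply_repeat_penalty(logits, recent_tokens, penalty=1.176):
    """Scale down the logits of recently generated tokens.

    Positive logits are divided by the penalty and negative logits are
    multiplied by it, so a penalized token always becomes less likely.
    """
    out = list(logits)
    for t in set(recent_tokens):
        out[t] = out[t] / penalty if out[t] > 0 else out[t] * penalty
    return out

# With --top_k 1, sampling reduces to an argmax over the penalized logits:
logits = [1.0, 0.5, 2.0]
penalized = apply_repeat_penalty(logits, recent_tokens=[2])
next_token = max(range(len(penalized)), key=penalized.__getitem__)
```

The `repeat_last_n = 64` line in the log above shows how far back the window of "recent" tokens reaches.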
## Memory Usage

### fp16

```
starcoder_model_load: ggml ctx size = 1398.89 MB
starcoder_model_load: memory size = 960.00 MB, n_mem = 163840
starcoder_model_load: model size = 438.77 MB
```

### q8_0

```
starcoder_model_load: ggml ctx size = 1204.83 MB
starcoder_model_load: memory size = 960.00 MB, n_mem = 163840
starcoder_model_load: model size = 244.71 MB
```
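From the load logs above, q8_0 shrinks the model weights by roughly 44% relative to fp16, while the KV-cache memory (960 MB) is unchanged. A quick check of that figure:

```python
fp16_model_mb = 438.77  # fp16 model size from the log above
q8_0_model_mb = 244.71  # q8_0 model size from the log above

reduction = 1 - q8_0_model_mb / fp16_model_mb
print(f"q8_0 weights are {reduction:.1%} smaller than fp16")  # roughly 44%
```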