Introduction

Extending the LLaMA 13B context window from 2k to 8k tokens without fine-tuning, using the NTK-aware scaled RoPE recipe.
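The NTK-aware recipe leaves the model weights untouched and instead raises the base frequency of the rotary position embeddings (RoPE): the base is multiplied by alpha^(dim / (dim - 2)), where dim is the head dimension and alpha is the context scale factor (here alpha = 4 for 2k to 8k). As a minimal sketch of the idea, assuming alpha = 4 and the standard RoPE base of 10000 (the function name and defaults are illustrative, not taken from this repo):

import torch

def ntk_scaled_inv_freq(dim, alpha=4.0, base=10000.0):
    # NTK-aware scaling: raise the RoPE base by alpha^(dim / (dim - 2)).
    # High-frequency components stay nearly unchanged, while low-frequency
    # components are stretched to cover the longer context.
    scaled_base = base * alpha ** (dim / (dim - 2))
    return 1.0 / (scaled_base ** (torch.arange(0, dim, 2).float() / dim))

The scaled inverse frequencies replace the standard RoPE frequencies when building the position embeddings, which is why no fine-tuning is required.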

Example Usage

from transformers import AutoModelForCausalLM, LlamaTokenizer

# Load the NTK-scaled 8k-context model and the matching LLaMA tokenizer.
model = AutoModelForCausalLM.from_pretrained("kz919/ntk_scaled_llama_13b_8k")
tokenizer = LlamaTokenizer.from_pretrained("kz919/llama_13b")
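
Once loaded, generation works as with any causal LM; prompts up to roughly 8k tokens should fit in the extended window. A minimal example (the prompt is a placeholder):

# Encode a prompt and generate a continuation.
inputs = tokenizer("Summarize the following document: ...", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))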