# Model Card

Llama 13B finetuned with position interpolation to process longer (4K-token) contexts, and then further instruct finetuned.

## Model Details

### Model Description

This is one of the models trained and evaluated as part of the experiments described in the repo http://github.com/abacusai/Long-Context. This version was trained with a scaling factor of 4 on 4K-token samples. The results of evaluating it on contexts of up to 16K tokens are available in the GitHub repository.

Note: the model is supplied as delta weights with respect to Llama 13B. Use the code in the repository as a starting point for loading the model, since the standard loading code needs to be patched.
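
As a rough illustration of the loading recipe, the sketch below applies delta weights on top of base Llama 13B and enables linear position interpolation via the `rope_scaling` option in Hugging Face `transformers` (available from v4.31). The checkpoint names are placeholders, and the repository's patched code is authoritative; it may configure the attention scaling differently.

```python
# Minimal sketch: reconstruct the finetuned model from delta weights.
# BASE and DELTA are illustrative placeholders, not official names.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

BASE = "huggyllama/llama-13b"        # assumed source of base Llama 13B weights
DELTA = "path/to/this-models-delta"  # placeholder: this model's delta checkpoint

# Load base Llama 13B with linear position interpolation; "factor": 4.0 matches
# the scaling factor mentioned above. The repo's patched code may differ.
base = AutoModelForCausalLM.from_pretrained(
    BASE,
    torch_dtype=torch.float16,
    rope_scaling={"type": "linear", "factor": 4.0},
)
delta = AutoModelForCausalLM.from_pretrained(DELTA, torch_dtype=torch.float16)

# Recover the finetuned weights: add the delta to the base, tensor by tensor.
base_sd = base.state_dict()
delta_sd = delta.state_dict()
with torch.no_grad():
    for name, tensor in base_sd.items():
        tensor.add_(delta_sd[name])

tokenizer = AutoTokenizer.from_pretrained(BASE)
```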

## Uses

The model can be further finetuned on training samples of 4K+ tokens to adapt it to tasks that require longer contexts.
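
A hedged sketch of what such a finetuning run might look like with the Hugging Face `Trainer` follows; the dataset path and hyperparameters are illustrative only, and the training scripts in the repository should be preferred.

```python
# Illustrative further finetuning on 4K-token samples; all values are
# placeholders, not the repository's actual training configuration.
from datasets import load_dataset
from transformers import (DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

tokenizer.pad_token = tokenizer.eos_token  # Llama has no pad token by default

dataset = load_dataset("json", data_files="long_samples.jsonl")["train"]  # placeholder data

def tokenize(example):
    # Truncate to the extended 4K context window.
    return tokenizer(example["text"], truncation=True, max_length=4096)

tokenized = dataset.map(tokenize, remove_columns=dataset.column_names)

trainer = Trainer(
    model=base,  # the model assembled in the loading sketch above
    args=TrainingArguments(
        output_dir="long-context-finetune",
        per_device_train_batch_size=1,   # long sequences are memory-hungry
        gradient_accumulation_steps=8,
        learning_rate=2e-5,
        num_train_epochs=1,
        fp16=True,
    ),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```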

### Direct Use

Since the model is instruct finetuned, it can also be used directly for various prompted tasks. We have tested it on open-book question answering, using the long context to supply search results.
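
For example, an open-book QA prompt might pack retrieved passages into the extended context before asking the question. The prompt template below is an illustrative assumption, not the exact format used in our evaluation.

```python
# Illustrative open-book QA usage; the prompt format is an assumption.
passages = ["<search result 1>", "<search result 2>"]  # retrieved context
question = "When was the first transatlantic telegraph cable completed?"

prompt = (
    "Answer the question using the search results below.\n\n"
    + "\n\n".join(passages)
    + f"\n\nQuestion: {question}\nAnswer:"
)

inputs = tokenizer(prompt, return_tensors="pt").to(base.device)
output = base.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:],
                       skip_special_tokens=True))
```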

## Bias, Risks, and Limitations

The model has not been evaluated for safety and is only intended for research and experiments.

## How to Get Started with the Model

See the repository for instructions on how to load the model and for scripts that can be used for further training.