# Model Card
Llama 13B fine-tuned with positional interpolation to handle longer (4K-token) contexts, then further instruction fine-tuned.
## Model Details

### Model Description
This is one of the models trained and evaluated as part of the experiments described in the repository http://github.com/abacusai/Long-Context. This version was trained with a scaling factor of 16 on 4K-token samples. Results from evaluating it on contexts of up to 22K tokens are available in the GitHub repository.
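For illustration, here is a minimal sketch of the linear positional interpolation idea that the scaling factor refers to. The repository contains the actual patched attention code; the function and constant names below are assumptions, not the repository's API.

```python
import torch

SCALE = 16  # scaling factor used for this model, per the description above

def interpolated_rope_frequencies(dim: int, seq_len: int, base: float = 10000.0,
                                  scale: float = SCALE) -> torch.Tensor:
    """Return RoPE position angles with positions compressed by `scale`."""
    inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2).float() / dim))
    # Dividing positions by the scale factor squeezes longer sequences into
    # the position range the base model saw during pretraining.
    positions = torch.arange(seq_len).float() / scale
    return torch.outer(positions, inv_freq)  # shape: (seq_len, dim // 2)
```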
Note: the model is supplied as delta weights with respect to Llama 13B. Use the code in the repository as a starting point for loading the model, since the model code needs to be patched before the weights can be used.
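For orientation, below is a hedged sketch of how delta weights are typically merged into a base model. The paths and the delta checkpoint layout are assumptions; the repository's loading code (which also applies the required patch) should be treated as authoritative.

```python
import torch
from transformers import AutoModelForCausalLM

# Hypothetical paths; substitute your local base weights and delta checkpoint.
base = AutoModelForCausalLM.from_pretrained("path/to/llama-13b")
delta = torch.load("path/to/delta_weights.pt", map_location="cpu")  # assumed flat state dict of deltas

state = base.state_dict()
for name, diff in delta.items():
    state[name] = state[name] + diff  # add each delta to recover the finetuned weight
base.load_state_dict(state)
```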
- Developed by: Abacus.AI
- Model type: Transformer based autoregressive causal language model
- License: Non-commercial use
- Finetuned from model: Llama V1 13B
### Model Sources
- Repository: http://github.com/abacusai/Long-Context
## Uses
The model can be further fine-tuned on training samples of 4K+ tokens to adapt it to tasks that require longer contexts.
### Direct Use
Since the model is instruction fine-tuned, it can also be used directly for various prompted tasks. We have tested it on open-book question answering, using the long context to supply search results.
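For example, a generic prompted open-book QA call might look like the following. The prompt template and paths are assumptions, and `model` refers to a model loaded per the repository's instructions (see the merging sketch above).

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("path/to/llama-13b")  # hypothetical path

context = "...long search results pasted here..."
prompt = (
    "Answer the question using only the context below.\n\n"
    f"Context:\n{context}\n\n"
    "Question: <your question here>\nAnswer:"
)
inputs = tokenizer(prompt, return_tensors="pt")
# `model` is the merged and patched model from the loading steps above
output = model.generate(**inputs, max_new_tokens=128)
# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```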
## Bias, Risks, and Limitations
The model has not been evaluated for safety and is only intended for research and experiments.
## How to Get Started with the Model
See the repository for instructions on how to load the model and for scripts that can be used for further training.