llm-rs ggml

GGML-converted versions of EleutherAI's GPT-J model

Description

GPT-J 6B is a transformer model trained using Ben Wang's Mesh Transformer JAX. "GPT-J" refers to the class of model, while "6B" represents the number of trainable parameters.

<figure>

| Hyperparameter | Value |
|---|---|
| \(n_{parameters}\) | 6053381344 |
| \(n_{layers}\) | 28* |
| \(d_{model}\) | 4096 |
| \(d_{ff}\) | 16384 |
| \(n_{heads}\) | 16 |
| \(d_{head}\) | 256 |
| \(n_{ctx}\) | 2048 |
| \(n_{vocab}\) | 50257/50400† (same tokenizer as GPT-2/3) |
| Positional Encoding | Rotary Position Embedding (RoPE) |
| RoPE Dimensions | 64 |

<figcaption><p><strong>*</strong> Each layer consists of one feedforward block and one self attention block.</p>
<p><strong>†</strong> Although the embedding matrix has a size of 50400, only 50257 entries are used by the GPT-2 tokenizer.</p></figcaption></figure>

The model consists of 28 layers with a model dimension of 4096 and a feedforward dimension of 16384. The model dimension is split into 16 heads, each with a dimension of 256 (16 × 256 = 4096). Rotary Position Embedding (RoPE) is applied to 64 dimensions of each head. The model was trained with a tokenization vocabulary of 50257, using the same set of byte-pair encodings (BPEs) as GPT-2/GPT-3.
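To make the head layout concrete, here is a minimal pure-Python sketch of a rotary embedding applied to only the first 64 of the 256 dimensions of one head. It is an illustration rather than the model's actual code. The function name `apply_partial_rope`, the base frequency of 10000, and the choice to rotate adjacent pairs of entries are assumptions made for this example.

```python
import math

# Values taken from the hyperparameter table.
N_HEADS = 16
D_HEAD = 256
D_MODEL = 4096
D_FF = 16384
ROPE_DIMS = 64

# Assumption: base frequency 10000, as in the original RoPE formulation.
ROPE_BASE = 10000.0


def apply_partial_rope(head_vec, position, rope_dims=ROPE_DIMS, base=ROPE_BASE):
    """Rotate the first `rope_dims` entries of one head vector; copy the rest."""
    out = list(head_vec)
    for i in range(0, rope_dims, 2):
        # Each adjacent pair (i, i + 1) is rotated by its own position-dependent angle.
        theta = position * base ** (-i / rope_dims)
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        x0, x1 = head_vec[i], head_vec[i + 1]
        out[i] = x0 * cos_t - x1 * sin_t
        out[i + 1] = x0 * sin_t + x1 * cos_t
    return out


# Sanity checks on the stated dimensions.
assert N_HEADS * D_HEAD == D_MODEL
assert D_FF == 4 * D_MODEL

vec = [1.0] * D_HEAD
rotated = apply_partial_rope(vec, position=5)
# Entries beyond the first 64 carry no positional rotation.
print(rotated[ROPE_DIMS:] == vec[ROPE_DIMS:])
```

Because each pair is only rotated, the length of the vector is preserved, and the remaining 192 entries of every head pass through unchanged.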

Converted Models

| Name | Based on | Type | Container | GGML Version |
|---|---|---|---|---|
| gpt-j-6b-f16.bin | EleutherAI/gpt-j-6b | F16 | GGML | V3 |
| gpt-j-6b-q4_0.bin | EleutherAI/gpt-j-6b | Q4_0 | GGML | V3 |
| gpt-j-6b-q4_0-ggjt.bin | EleutherAI/gpt-j-6b | Q4_0 | GGJT | V3 |
| gpt-j-6b-q5_1.bin | EleutherAI/gpt-j-6b | Q5_1 | GGML | V3 |
| gpt-j-6b-q5_1-ggjt.bin | EleutherAI/gpt-j-6b | Q5_1 | GGJT | V3 |
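When choosing between these files, a rough size estimate can help. The sketch below derives bits per weight and an approximate size for each type from the parameter count in the hyperparameter table. The block layouts are assumptions based on ggml's quantization formats for this file version (blocks of 32 weights), and the helper names are made up for this example. Real files will differ somewhat, since some tensors stay unquantized and the file also stores the vocabulary and metadata.

```python
# Parameter count from the hyperparameter table.
N_PARAMETERS = 6_053_381_344

# Assumed bytes per block of 32 weights:
#   Q4_0: 2-byte fp16 scale + 16 bytes of packed 4-bit values
#   Q5_1: 2-byte fp16 scale + 2-byte fp16 minimum
#         + 4 bytes of high bits + 16 bytes of packed low 4-bit values
BLOCK_SIZE = 32
BYTES_PER_BLOCK = {"Q4_0": 18, "Q5_1": 24}


def bits_per_weight(type_name):
    """Average storage cost of one weight, in bits."""
    if type_name == "F16":
        return 16.0
    return BYTES_PER_BLOCK[type_name] * 8 / BLOCK_SIZE


def approx_size_gib(type_name, n_params=N_PARAMETERS):
    """Rough size in GiB if every parameter were stored in this type."""
    return n_params * bits_per_weight(type_name) / 8 / 2**30


for name in ("F16", "Q4_0", "Q5_1"):
    print(f"{name}: {bits_per_weight(name):.1f} bits/weight, "
          f"~{approx_size_gib(name):.1f} GiB")
```

Under these assumptions, Q4_0 costs about 4.5 bits per weight and Q5_1 about 6 bits per weight, compared with 16 for F16.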

Usage

Python via llm-rs:

Installation

Via pip: `pip install llm-rs`

Run inference

from llm_rs import AutoModel

# Load the model; set `model_file` to any file name from the list above
model = AutoModel.from_pretrained("rustformers/gpt-j-ggml", model_file="gpt-j-6b-q4_0-ggjt.bin")

# Generate text from a prompt
print(model.generate("The meaning of life is"))
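The file names in the Converted Models list follow a regular pattern, so the `model_file` argument can be built from a quantization type and a container. The helper below is a hypothetical convenience for this README and is not part of llm-rs. It only constructs and validates the string, which would then be passed as `model_file=` in the `AutoModel.from_pretrained` call shown above.

```python
# File names copied from the Converted Models list.
AVAILABLE_FILES = {
    "gpt-j-6b-f16.bin",
    "gpt-j-6b-q4_0.bin",
    "gpt-j-6b-q4_0-ggjt.bin",
    "gpt-j-6b-q5_1.bin",
    "gpt-j-6b-q5_1-ggjt.bin",
}


def model_file_name(quant_type, container="ggml"):
    """Hypothetical helper: build a file name such as 'gpt-j-6b-q4_0-ggjt.bin'."""
    container = container.lower()
    if container not in ("ggml", "ggjt"):
        raise ValueError(f"unknown container {container!r}")
    # GGJT files carry a '-ggjt' suffix; plain GGML files have none.
    suffix = "-ggjt" if container == "ggjt" else ""
    name = f"gpt-j-6b-{quant_type.lower()}{suffix}.bin"
    if name not in AVAILABLE_FILES:
        raise ValueError(f"no converted file named {name!r} in the list")
    return name


model_file = model_file_name("Q5_1", container="GGJT")
print(model_file)  # gpt-j-6b-q5_1-ggjt.bin
```

Validating against the known names means a combination that was never converted, such as F16 in a GGJT container, fails early instead of at download time.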

Rust via Rustformers/llm:

Installation

git clone --recurse-submodules https://github.com/rustformers/llm.git
cd llm
cargo build --release

Run inference

cargo run --release -- gptj infer -m path/to/model.bin -p "Tell me how cool the Rust programming language is:"