Ava small

Training Details

Fine-tuning used the following hardware and learning-rate settings.

The model was trained on an NVIDIA P100 GPU to take advantage of the hardware's parallel processing and shorten training time. The learning rate was set to 1e-3 to balance fast convergence against the risk of overshooting.
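
The convergence-versus-overshooting trade-off can be illustrated with a toy example. The sketch below is not the training code for this model: it runs plain gradient descent on f(x) = x**2, and the function name `gradient_descent`, the step count, and the two comparison learning rates (1e-1 and 1.1) are assumptions chosen only for illustration.

```python
def gradient_descent(lr: float, steps: int = 20, x0: float = 1.0) -> float:
    """Minimise the toy function f(x) = x**2 from x0 and return the final x."""
    x = x0
    for _ in range(steps):
        grad = 2.0 * x   # derivative of x**2
        x -= lr * grad   # standard gradient-descent update
    return x


# Compare a small, a moderate, and a too-large learning rate (illustrative values).
for lr in (1e-3, 1e-1, 1.1):
    print(f"lr={lr:g}: x after 20 steps = {gradient_descent(lr):.4g}")
```

On this toy problem a very small rate moves slowly toward the minimum at 0, a moderate rate gets close quickly, and a rate above 1 makes every step overshoot so the iterate grows instead of shrinking. Which rate works best for a real model depends on the optimizer, batch size, and data.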

Model Performance

After 10 epochs of training, the model generated more coherent and contextually relevant responses in conversations. Its responses may still contain occasional inaccuracies or inconsistencies.

Custom Tokens and Contextualization

To facilitate structured conversations and improve response generation, custom tokens were added to mark the start and end of each sample and to delimit the user's message and Ava's reply.

Here is an example prompt:

<startoftext><user>Hello</user><ava>Hello there, How can i assist you today?</ava></endoftext>
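
As a minimal sketch of how this format can be assembled and parsed in plain Python, the helpers below build the prompt prefix, build a full sample, and extract Ava's reply from generated text. The function names `build_prompt`, `build_sample`, and `extract_reply` are hypothetical and not part of the model's code; the marker strings are copied from the example above.

```python
import re

# Marker strings copied verbatim from the example prompt.
START = "<startoftext>"
END = "</endoftext>"


def build_prompt(user_text: str) -> str:
    """Return the prefix that the model is expected to complete with Ava's reply."""
    return f"{START}<user>{user_text}</user><ava>"


def build_sample(user_text: str, ava_text: str) -> str:
    """Return one complete conversation sample in the format shown above."""
    return f"{build_prompt(user_text)}{ava_text}</ava>{END}"


def extract_reply(generated: str) -> str:
    """Return the text after <ava>, up to </ava> or the end of the string."""
    match = re.search(r"<ava>(.*?)(?:</ava>|$)", generated, flags=re.DOTALL)
    return match.group(1).strip() if match else ""


print(build_sample("Hello", "Hello there, How can i assist you today?"))
```

`extract_reply` tolerates a missing closing `</ava>` marker, which can happen when generation stops at a length limit before the model emits it.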

Use Cases and Applications

Given its training on dialogues and conversations, this fine-tuned model is particularly well-suited for conversational use cases.

Inference script

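import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

def inference(text, model, tokenizer):
    # Wrap the user message in the conversation markers the model was trained on
    data = tokenizer.encode(f'<startoftext><user>{text}</user><ava>', return_tensors='pt')
    input_ids = data.to(device)
    # NOTE: the original generation arguments are not shown in this document;
    # the values below are assumed placeholders
    output = model.generate(
        input_ids,
        max_new_tokens=50,
        pad_token_id=tokenizer.eos_token_id,
    )

    decoded_output = tokenizer.decode(output[0], skip_special_tokens=True)
    # Keep only the text between the <ava> markers
    ava_response = decoded_output.split('<ava>')[1].split('</ava>')[0]
    # Return just the first sentence of the reply
    clean_response = ava_response.split('.')[0].strip()
    return clean_response

# Select the device before the model is loaded and before inference() is called
device = 'cuda' if torch.cuda.is_available() else 'cpu'

model_name = 'Kuduxaaa/ava-small'
model = GPT2LMHeadModel.from_pretrained(model_name).to(device)
tokenizer = GPT2Tokenizer.from_pretrained(model_name)

user_input = "What's the weather like today?"
response = inference(user_input, model, tokenizer)

print('Ava:', response)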