

# gpt-neo-1.3B-sft-peft-oasst1history

This model is a fine-tuned version of EleutherAI/gpt-neo-1.3B on a reworked version of the oasst1 dataset. The base data is unchanged, but each training example includes the thread history, as recommended by the OpenAssistant GitHub repository. Validation loss on the evaluation set is reported under Training results below.

## Model description

A GPT-based model derived from EleutherAI/gpt-neo-1.3B, supervised fine-tuned on the oasst1 thread-history format.
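Since the model name indicates a PEFT adapter on top of the base checkpoint, loading it for inference would typically mean loading the base model first and then attaching the adapter. A minimal sketch, assuming the adapter is published under the hypothetical hub id `gpt-neo-1.3B-sft-peft-oasst1history` (the imports are deferred so the helper can be defined without `transformers`/`peft` installed):

```python
# Base checkpoint named in this card; the adapter id below is a
# hypothetical placeholder for wherever this adapter is hosted.
BASE_ID = "EleutherAI/gpt-neo-1.3B"
ADAPTER_ID = "gpt-neo-1.3B-sft-peft-oasst1history"  # hypothetical hub id


def load_sft_model(base_id: str = BASE_ID, adapter_id: str = ADAPTER_ID):
    """Load the base causal LM and attach the PEFT adapter on top of it."""
    # Deferred imports: transformers and peft are only needed at load time.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base_model = AutoModelForCausalLM.from_pretrained(base_id)
    # PeftModel.from_pretrained wraps the base model with the adapter weights.
    model = PeftModel.from_pretrained(base_model, adapter_id)
    return tokenizer, model
```

The adapter-on-base pattern keeps the download small: only the adapter weights are specific to this fine-tune, while the 1.3B base checkpoint is shared.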

## Intended uses & limitations

This model is only supervised fine-tuned on the OpenAssistant/oasst1 dataset, so it is not ready for deployment. It should be aligned with techniques such as reinforcement learning from human feedback (RLHF) and adapted to the target domain before being used in production.

## Training and evaluation data

This model uses the OpenAssistant/oasst1 dataset. Each example is formatted to include the preceding steps of its chat thread, so the model learns to use contextual information, as recommended here.
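The thread-history formatting can be sketched as a simple prompt builder. The `<|prompter|>`/`<|assistant|>` special tokens follow the convention used in the OpenAssistant repository; the exact tokens used for this particular checkpoint are an assumption:

```python
# Special tokens per the OpenAssistant convention (assumed for this model).
PROMPTER = "<|prompter|>"
ASSISTANT = "<|assistant|>"
EOS = "<|endoftext|>"


def build_prompt(history, user_message):
    """Flatten a chat thread into a single prompt string.

    history: list of (user_text, assistant_text) turns preceding the
    new user message, oldest first.
    """
    parts = []
    for user_text, assistant_text in history:
        parts.append(f"{PROMPTER}{user_text}{EOS}{ASSISTANT}{assistant_text}{EOS}")
    # End with an open assistant turn for the model to complete.
    parts.append(f"{PROMPTER}{user_message}{EOS}{ASSISTANT}")
    return "".join(parts)
```

For example, `build_prompt([("Hi", "Hello!")], "How are you?")` produces one string containing the full two-turn context followed by an open `<|assistant|>` tag for generation.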

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 1.3141        | 0.04  | 100  | 1.2298          |
| 1.1594        | 0.09  | 200  | 1.1669          |
| 1.6568        | 0.13  | 300  | 1.1491          |
| 1.239         | 0.18  | 400  | 1.1396          |
| 1.3717        | 0.22  | 500  | 1.1339          |
| 1.215         | 0.26  | 600  | 1.1302          |
| 1.1518        | 0.31  | 700  | 1.1278          |
| 1.0241        | 0.35  | 800  | 1.1263          |
| 1.0264        | 0.4   | 900  | 1.1250          |
| 0.9591        | 0.44  | 1000 | 1.1244          |
| 0.9054        | 0.48  | 1100 | 1.1243          |
| 1.146         | 0.53  | 1200 | 1.1245          |

### Framework versions