
Mistral-11B-OmniMix-pippa-sharegpt-11b-qlora

This repository holds my QLoRA checkpoints of Mistral-11B-OmniMix trained on the PIPPA-ShareGPT dataset.

You can read more about the dataset on its relevant page. It's a ShareGPT reformat of the PIPPA dataset by PygmalionAI. The reformat was done to allow for axolotl compatibility.

Architecture

Training Details

Instruct Format

The ShareGPT data is converted to the Vicuna format. The dataset uses the modified roles USER and CHARACTER in place of the usual USER and ASSISTANT.

SYSTEM: Enter roleplay mode...
USER: {prompt}
CHARACTER:
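For clarity, here is a minimal sketch of how such a conversion could look. `to_vicuna_prompt` is a hypothetical helper, not part of the dataset tooling, and the exact whitespace and system prompt used during training may differ.

```python
def to_vicuna_prompt(conversations, system="Enter roleplay mode..."):
    """Build the modified Vicuna prompt (USER/CHARACTER) from a
    ShareGPT-style list of {"from": ..., "value": ...} turns."""
    role_map = {"human": "USER", "gpt": "CHARACTER"}
    lines = [f"SYSTEM: {system}"]
    for turn in conversations:
        lines.append(f"{role_map[turn['from']]}: {turn['value']}")
    # Trailing role tag cues the model to write the character's next reply.
    lines.append("CHARACTER:")
    return "\n".join(lines)

print(to_vicuna_prompt([{"from": "human", "value": "Hi there!"}]))
```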

Notes

This QLoRA was produced as an experiment to see how the public version of PIPPA affects a model. Mistral is also fairly new, and training/finetuning support for it may still be broken. As a result, I have no idea whether this LoRA is of great quality or absolute garbage; it was meant to be used only with OmniMix.
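If you want to try the adapter anyway, a minimal loading sketch with transformers + peft might look like the following. The repository ids shown are assumptions, so substitute the actual base-model and adapter paths.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

base = "Undi95/Mistral-11B-OmniMix"  # assumed base-model repo id
adapter = "Undi95/Mistral-11B-OmniMix-pippa-sharegpt-11b-qlora"  # assumed adapter repo id

# Load the base model in 4-bit and attach the QLoRA adapter on top.
model = AutoModelForCausalLM.from_pretrained(
    base,
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
    device_map="auto",
)
model = PeftModel.from_pretrained(model, adapter)
tokenizer = AutoTokenizer.from_pretrained(base)
```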

Acknowledgments

Thanks to:

Donate?

If you'd like to donate to Kingbri, you can do so here: https://ko-fi.com/kingbri

If you'd like to donate to me, you can also do it here: https://ko-fi.com/undiai

You should not feel obligated to donate, but if you do, we'd appreciate it.

Axolotl stuff

Training procedure

The following bitsandbytes quantization config was used during training:
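The exact config values are not listed on this card. As a rough sketch, a typical 4-bit QLoRA quantization setup in bitsandbytes looks like the following; treat every value here as an assumption rather than the settings actually used for this run.

```python
import torch
from transformers import BitsAndBytesConfig

# Illustrative 4-bit NF4 double-quant config commonly used for QLoRA.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
```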

Training hyperparameters

The following hyperparameters were used during training:
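The hyperparameter list itself is missing from this card. For illustration only, a comparable run could be configured as below: the ~3 epochs and the 50-step eval interval are inferred from the results table that follows, while everything else is a placeholder guess.

```python
from transformers import TrainingArguments

# Hypothetical values; the run's real hyperparameters were not recorded here.
training_args = TrainingArguments(
    output_dir="./qlora-out",
    num_train_epochs=3,             # inferred from the results table (last epoch ~2.7)
    evaluation_strategy="steps",
    eval_steps=50,                  # inferred from the 50-step eval cadence below
    learning_rate=2e-4,             # placeholder
    per_device_train_batch_size=2,  # placeholder
    gradient_accumulation_steps=4,  # placeholder
    bf16=True,
)
```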

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 1.6447        | 0.34  | 50   | 1.6321          |
| 1.6243        | 0.68  | 100  | 1.5702          |
| 1.527         | 1.01  | 150  | 1.5406          |
| 1.4873        | 1.35  | 200  | 1.5275          |
| 1.5005        | 1.69  | 250  | 1.5196          |
| 1.4054        | 2.03  | 300  | 1.5153          |
| 1.4145        | 2.36  | 350  | 1.5149          |
| 1.4867        | 2.7   | 400  | 1.5138          |

Framework versions