Overview

This is a test of qlora fine-tuning of the mpt-30b model, trained for 3 epochs.

qlora compatible model: https://huggingface.co/jondurbin/mpt-30b-qlora-compatible

My fork of qlora with mpt-30b support: https://github.com/jondurbin/qlora
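For reference, here is a minimal sketch of what a QLoRA setup for the compatible base model looks like with Hugging Face transformers, bitsandbytes, and peft. The LoRA rank, dropout, and target module names below are illustrative assumptions, not the exact settings used for this run; see the fork and the base model card for the actual configuration.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "jondurbin/mpt-30b-qlora-compatible"

# 4-bit NF4 quantization of the frozen base model -- the core of QLoRA.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,  # MPT uses custom modeling code
)
model = prepare_model_for_kbit_training(model)

# LoRA adapters on the MPT attention projections; rank/alpha/targets are
# illustrative placeholders, not the values used for this release.
lora_config = LoraConfig(
    r=64,
    lora_alpha=16,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["Wqkv", "out_proj"],
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```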

Differences in the qlora scripts:

I think there's a bug in gradient accumulation, so if you try this, you may want to set gradient accumulation steps to 1.
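As an illustration, in terms of the Hugging Face TrainingArguments the qlora script builds on, the workaround amounts to pinning gradient accumulation to 1 and, memory permitting, raising the per-device batch size instead. All other values here are placeholders, not the settings used for this run:

```python
from transformers import TrainingArguments

# Workaround for the suspected gradient accumulation bug: keep accumulation
# at 1 and, if memory allows, compensate with a larger per-device batch size.
training_args = TrainingArguments(
    output_dir="./mpt-30b-qlora-out",  # hypothetical output path
    num_train_epochs=3,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=1,     # avoid accumulating gradients across steps
    learning_rate=1e-4,
    bf16=True,
    logging_steps=10,
)
```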

See the mpt-30b-qlora-compatible model card for training details.

Unfortunately, this is not as high quality as the llama-33b versions, and I don't have a great answer as to why. Perhaps there are fewer feed-forward layers that can be tuned?

License and usage

This is a real gray area; here's why:

I am purposely not placing a license on this model because I am not a lawyer and won't attempt to interpret all of the relevant terms. Your best bet is probably to avoid using this commercially, especially since it didn't perform quite as well as expected using qlora.