StableLManticore 7B
Yeah, don't use this. It was mostly an experiment to see if it's even plausible. Unfortunately, StableLM has poor support for SFT with the Hugging Face trainer, so there's no flash attention or similar optimizations. The end result is that this is nearly impossible to train efficiently. Yes, it's plausible to try to train this with LoRA, but it's not very usable at all.
WandB: https://wandb.ai/wing-lian/stable-manticore-7b/runs/b1qqzf2s