LLaMA-7b The Pit Project/Cesspit
LoRA Details
- Backbone Model: LLaMA
- Language(s): English
- Library: HuggingFace Transformers (see the loading sketch after this list)
- License: This LoRA is under a Non-commercial Bespoke License and governed by the Meta license. You should only use this repository if you have been granted access to the model by filling out this form.
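A minimal loading sketch, assuming the base LLaMA-7b weights are available locally; both paths below are placeholders, since the actual repository ids are not stated here:

```python
# Minimal sketch of loading the base model and attaching this LoRA.
# Both paths are placeholders; substitute your own.
from transformers import LlamaForCausalLM, LlamaTokenizer
from peft import PeftModel

base_model = LlamaForCausalLM.from_pretrained("path/to/llama-7b")
tokenizer = LlamaTokenizer.from_pretrained("path/to/llama-7b")

# PeftModel layers the adapter weights on top of the frozen base model.
model = PeftModel.from_pretrained(base_model, "path/to/cesspit-lora")
```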
Dataset Details
- Scraped posts on a particular subject from an image board.
- The dataset was heavily filtered in several ways to improve coherency and relevance to the source material and to our goals.
- The Cesspit dataset contains 272,637 entries.
Prompt Template
The model was not trained in an instruction or chat format. Ensure your inference program does not inject anything beyond your raw input; simply type whatever comes to mind and the model will attempt to complete it.
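For illustration, a plain completion-style call with the Transformers API, assuming the `model` and `tokenizer` objects from the loading sketch above; the prompt and sampling settings are arbitrary:

```python
# Plain completion: the prompt goes in untouched, with no instruction
# or chat template wrapped around it.
prompt = "The thread started like any other, but"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=128,   # arbitrary cap for the completion
    do_sample=True,       # sampled continuation rather than greedy
    temperature=0.8,      # illustrative value, not a tuned setting
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```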
Hardware and Software
- Hardware: We used 20 NVIDIA RTX 4090 GPU-hours to train this LoRA.
- Training Factors: We created this LoRA using the HuggingFace Trainer.
Training Details
- We used a LoRA rank of 128 and an alpha of 256.
- Our learning rate was 3e-4 with 125 warmup steps and a cosine scheduler, trained for 1 epoch.
- Our micro-batch size was 25 with gradient accumulation of 2, for an effective batch size of 50 (see the sketch after this list).
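These hyperparameters map onto a PEFT and Trainer configuration roughly as follows. This is an illustrative reconstruction, not our training script; in particular, `target_modules` and the output path are assumptions:

```python
from peft import LoraConfig
from transformers import TrainingArguments

# LoRA shape matching the rank/alpha reported above.
lora_config = LoraConfig(
    r=128,
    lora_alpha=256,
    target_modules=["q_proj", "v_proj"],  # assumption; actual targets unstated
    task_type="CAUSAL_LM",
)

# Schedule as reported: micro-batch 25 x grad accumulation 2 = batch 50.
training_args = TrainingArguments(
    output_dir="cesspit-lora",            # placeholder path
    per_device_train_batch_size=25,
    gradient_accumulation_steps=2,
    learning_rate=3e-4,
    warmup_steps=125,
    lr_scheduler_type="cosine",
    num_train_epochs=1,
)
```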
Limitations
It is strongly recommended not to deploy this model in a real-world environment unless its behavior is well understood and explicit, strict limits on the scope, impact, and duration of the deployment are enforced.