This is the result of training with a mixed style/object dataset kindly provided by @Pashahlis at a learning rate of 1e-6 for 30 epochs (70 steps/epoch) with batch size 18.
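
For a sense of scale, the figures above can be multiplied out. This is only a back-of-the-envelope sketch: it assumes every step processes a full batch of 18, which is not stated above, so the per-epoch figure is an upper bound rather than the dataset size.

```python
# Training-length arithmetic from the figures quoted above.
EPOCHS = 30
STEPS_PER_EPOCH = 70
BATCH_SIZE = 18

# Total number of training steps taken to reach the epoch 30 model.
total_steps = EPOCHS * STEPS_PER_EPOCH

# Upper bound on images processed per epoch.
# Assumes every step uses a full batch (an assumption, not stated above).
max_images_per_epoch = STEPS_PER_EPOCH * BATCH_SIZE

print(f"total training steps: {total_steps}")
print(f"images processed per epoch (at most): {max_images_per_epoch}")
```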

validation

This model was trained using Victor C Hall's excellent Stable Diffusion finetuner EveryDream2. EveryDream2 configuration files for this training session are in this repo, here.

The configuration files enable a validation pass using a 15% split of the dataset with the noise seed held fixed during validation, to give the following loss curve (stitched together from two runs of 60 epochs each):

validation graph
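
The validation setup described above (a held-out 15% split, with the noise seed held fixed) can be sketched roughly as follows. This is not EveryDream2's code: the function names, the seed value and the file names are all hypothetical, and in real training the noise would be a tensor the shape of the latents rather than a list of floats. The point is only that a deterministic split plus identical noise on every validation pass makes the loss comparable from one epoch to the next.

```python
import random


def split_dataset(items, val_fraction=0.15, seed=555):
    """Shuffle deterministically, then hold out val_fraction of the items."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_val = round(len(shuffled) * val_fraction)
    # Returns (training items, validation items).
    return shuffled[n_val:], shuffled[:n_val]


def validation_noise(n_values, seed=555):
    """Draw the same Gaussian noise on every call, because the seed is fixed."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n_values)]


# Hypothetical file names, for illustration only.
images = [f"image_{i:03d}.jpg" for i in range(100)]
train, val = split_dataset(images)
print(len(train), len(val))

# Two validation passes (for example at different epochs) see identical noise,
# so any change in validation loss comes from the model, not from the noise.
noise_epoch_a = validation_noise(4)
noise_epoch_b = validation_noise(4)
print(noise_epoch_a == noise_epoch_b)
```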

Although the training ran for 120 epochs in total, the validation graph suggests that the best results are going to be at some point between epoch 30 and epoch 40:

validation graph

This repository contains a diffusers format model for epoch 30:

validation graph

It's available in InvokeAI by adding the diffusers repo id damian0815/pashahlis-val-test-1e-6-ep30, or for manual download in .ckpt format if you're using a clumsier web UI: pashahlis-1e-6-ep30.ckpt.

... but is it finished training?

Training an SD model is subjective. Picking when to stop is a trade-off between how well the model reproduces the training data the way you want it to and how flexibly it can apply what it has learned to novel outputs.

There are some generated image samples from each epoch to look at (generated with my python tool grate). For example, this one (warning: huuuge image, 20,000x10,000 pixels):

grid of images

I'm satisfied that the training quality roughly follows the shape of the validation graph, but you might want to look at this image closely to verify for yourself that the best model is probably somewhere between epoch 30 and epoch 40.

try them yourself

If you want to try them out for yourself, other epochs are available at damian0815/pashahlis-val-test-1e-6-ep40, damian0815/pashahlis-val-test-1e-6-ep80, damian0815/pashahlis-val-test-1e-6-ep110.
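
If you'd rather compare the epochs from Python than from a UI, something like the following should work. It is a sketch under a few assumptions: that the `diffusers` and `torch` packages are installed, that each repo loads with the generic `DiffusionPipeline` loader as a text-to-image pipeline, and that you supply your own prompt. The helper names `repo_id_for_epoch` and `generate` are made up for this example. Using the same seed for every epoch keeps the comparison fair.

```python
# Epochs published as diffusers repos (epoch 30 is this repo).
EPOCHS = (30, 40, 80, 110)


def repo_id_for_epoch(epoch: int) -> str:
    """Build the Hugging Face repo id for a given epoch."""
    return f"damian0815/pashahlis-val-test-1e-6-ep{epoch}"


def generate(epoch: int, prompt: str, seed: int = 0):
    """Download the model for one epoch and generate a single image."""
    # Imported lazily so the repo id helper works without these packages.
    import torch
    from diffusers import DiffusionPipeline

    pipe = DiffusionPipeline.from_pretrained(repo_id_for_epoch(epoch))
    pipe = pipe.to("cuda" if torch.cuda.is_available() else "cpu")
    # Fixed seed so every epoch starts from the same noise.
    generator = torch.Generator(device="cpu").manual_seed(seed)
    return pipe(prompt, generator=generator).images[0]


for epoch in EPOCHS:
    print(repo_id_for_epoch(epoch))

# Example usage (downloads several GB per epoch):
#   generate(40, "your prompt here").save("ep40.png")
```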


license: openrail
pipeline_tag: text-to-image