
<style> .title-container { display: flex; justify-content: center; align-items: center; height: 100vh; /* Adjust this value to position the title vertically */ } .title { font-size: 3em; text-align: center; color: #333; font-family: Arial, sans-serif; text-transform: uppercase; letter-spacing: 0.05em; padding: 0.5em 0; box-shadow: 0px 0px 20px 0px rgba(0,0,0,0.15); background: transparent; } .title span { background: -webkit-linear-gradient(45deg, #fe6b8b 30%, #ff8e53 90%); -webkit-background-clip: text; -webkit-text-fill-color: transparent; } .image-grid { display: grid; grid-template-columns: repeat(3, 1fr); gap: 0.5em; } .image-item { box-shadow: 0px 0px 10px 0px rgba(0,0,0,0.15); padding: 10px; } .image-item img { width: 100%; height: 100%; object-fit: cover; border-radius: 10px; transition: transform .2s; } .image-item img:hover { transform: scale(1.1); } .custom-table { table-layout: fixed; width: 100%; border-collapse: collapse; } .custom-table td { width: 50%; vertical-align: top; padding: 10px; box-shadow: 0px 0px 10px 0px rgba(0,0,0,0.15); } .custom-image { width: 100%; height: 100%; object-fit: cover; border-radius: 10px; transition: transform .2s; } .custom-image:hover { transform: scale(1.1); } </style>

<h1 class="title"><span>Hermitage XL</span></h1>

<div class="image-grid"> <div class="image-item"> <a href="https://huggingface.co/Linaqruf/hermitage-xl/blob/main/sample_images/sample1.png"> <img src="https://huggingface.co/Linaqruf/hermitage-xl/resolve/main/sample_images/sample1.png"> </a> </div> <div class="image-item"> <a href="https://huggingface.co/Linaqruf/hermitage-xl/blob/main/sample_images/sample2.png"> <img src="https://huggingface.co/Linaqruf/hermitage-xl/resolve/main/sample_images/sample2.png"> </a> </div> <div class="image-item"> <a href="https://huggingface.co/Linaqruf/hermitage-xl/blob/main/sample_images/sample3.png"> <img src="https://huggingface.co/Linaqruf/hermitage-xl/resolve/main/sample_images/sample3.png"> </a> </div> <div class="image-item"> <a href="https://huggingface.co/Linaqruf/hermitage-xl/blob/main/sample_images/sample4.png"> <img src="https://huggingface.co/Linaqruf/hermitage-xl/resolve/main/sample_images/sample4.png"> </a> </div> <div class="image-item"> <a href="https://huggingface.co/Linaqruf/hermitage-xl/blob/main/sample_images/sample5.png"> <img src="https://huggingface.co/Linaqruf/hermitage-xl/resolve/main/sample_images/sample5.png"> </a> </div> <div class="image-item"> <a href="https://huggingface.co/Linaqruf/hermitage-xl/blob/main/sample_images/sample6.png"> <img src="https://huggingface.co/Linaqruf/hermitage-xl/resolve/main/sample_images/sample6.png"> </a> </div> </div>

<hr>

Overview

Hermitage XL is a high-resolution latent text-to-image diffusion model derived from Stable Diffusion XL 1.0. It was fine-tuned for 5,000 steps with a learning rate of 4e-7 and a batch size of 16 on a curated dataset of high-quality anime-style images.

Example prompt: 1girl, white hair, golden eyes, beautiful eyes, detail, flower meadow, cumulonimbus clouds, lighting, detailed sky, garden

<hr>

Features

  1. High-Resolution Images: The model was trained at 1024x1024 resolution. Training used the NovelAI Aspect Ratio Bucketing Tool, so the model could also be trained on non-square resolutions.
  2. Anime-styled Generation: Given a text prompt, the model can create high-quality anime-styled images.
  3. Fine-Tuned Diffusion Process: The model uses a fine-tuned diffusion process aimed at high-quality, distinctive image output.
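
The aspect ratio bucketing mentioned in item 1 can be sketched roughly as follows. This is not the NovelAI tool itself: the function names, the 64-pixel step, and the 512 to 2048 side limits are assumptions chosen for illustration. The idea is to enumerate resolutions whose pixel count stays within the 1024x1024 budget, then assign each training image to the bucket with the closest aspect ratio.

```python
def make_buckets(max_area=1024 * 1024, step=64, min_side=512, max_side=2048):
    """Enumerate (width, height) buckets whose area does not exceed max_area.

    All parameter defaults are illustrative assumptions, not values
    taken from the actual training run.
    """
    buckets = set()
    width = min_side
    while width <= max_side:
        # Largest height (a multiple of `step`) that keeps the area in budget.
        height = min(max_side, (max_area // width) // step * step)
        if height >= min_side:
            buckets.add((width, height))
            buckets.add((height, width))  # mirrored portrait/landscape bucket
        width += step
    return sorted(buckets)


def nearest_bucket(image_width, image_height, buckets):
    """Pick the bucket whose aspect ratio is closest to the image's."""
    ratio = image_width / image_height
    return min(buckets, key=lambda wh: abs(wh[0] / wh[1] - ratio))


buckets = make_buckets()
print(nearest_bucket(1920, 1080, buckets))  # a landscape bucket
print(nearest_bucket(1000, 1000, buckets))  # the square bucket
```

Images in the same bucket can then be resized or cropped to that bucket's resolution and batched together, which is what allows non-square training without padding everything to a square.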

<hr>

Model Details

How to Use:

Negative prompt: lowres, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry
Quality tags for the prompt: masterpiece, best quality, illustration, beautiful detailed, finely detailed, dramatic light, intricate details
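
As a small illustration of how the two tag lists above might be combined, here is a sketch that places the quality tags at the start of a prompt and skips duplicates. The helper name `build_prompt` is hypothetical, and putting the quality tags first is an assumption based on the Diffusers example prompt, which opens with "masterpiece, best quality".

```python
QUALITY_TAGS = (
    "masterpiece, best quality, illustration, beautiful detailed, "
    "finely detailed, dramatic light, intricate details"
)
NEGATIVE_PROMPT = (
    "lowres, bad anatomy, bad hands, text, error, missing fingers, "
    "extra digit, fewer digits, cropped, worst quality, low quality, "
    "normal quality, jpeg artifacts, signature, watermark, username, blurry"
)


def build_prompt(subject_tags, quality_tags=QUALITY_TAGS):
    """Prepend quality tags to subject tags, dropping repeated tags.

    Hypothetical helper: tags are compared case-insensitively and the
    first occurrence of each tag is kept in its original position.
    """
    seen = set()
    merged = []
    for tag in (quality_tags + ", " + subject_tags).split(","):
        tag = tag.strip()
        if tag and tag.lower() not in seen:
            seen.add(tag.lower())
            merged.append(tag)
    return ", ".join(merged)


prompt = build_prompt("1girl, green hair, sweater, masterpiece")
print(prompt)
```

The resulting string and `NEGATIVE_PROMPT` can be passed as `prompt` and `negative_prompt` to the pipeline call.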

<hr>

🧨 Diffusers

Make sure to upgrade diffusers to >= 0.18.2:

pip install diffusers --upgrade

In addition, make sure to install transformers, safetensors, and accelerate, as well as invisible_watermark:

pip install invisible_watermark transformers accelerate safetensors
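
To confirm the environment after installing, a quick check along these lines may help. It is a minimal sketch using only the standard library: the helper names are hypothetical, and the version comparison is a simplified one that looks only at the leading numeric parts (it does not implement full PEP 440 ordering).

```python
from importlib import metadata


def version_tuple(version):
    """Turn '0.18.2' or '0.19.0.dev0' into a tuple of leading integers."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first non-numeric piece such as 'dev0'
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Simplified comparison on the numeric release parts only."""
    return version_tuple(installed) >= version_tuple(minimum)


def installed_version(package):
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def report():
    """Print the status of the packages named in this section."""
    found = installed_version("diffusers")
    ok = found is not None and meets_minimum(found, "0.18.2")
    print(f"diffusers: {found} ({'ok' if ok else 'needs >= 0.18.2'})")
    for name in ("transformers", "accelerate", "safetensors", "invisible_watermark"):
        print(f"{name}: {installed_version(name) or 'not installed'}")


report()
```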

Running the pipeline (if you don't swap the scheduler, it will run with the default EulerDiscreteScheduler; in this example we swap it to EulerAncestralDiscreteScheduler):

import torch
from diffusers.models import AutoencoderKL
from diffusers import StableDiffusionXLPipeline, EulerAncestralDiscreteScheduler

# Model repository on the Hugging Face Hub
model = "Linaqruf/hermitage-xl"
# Load the SDXL VAE separately and pass it to the pipeline below
vae = AutoencoderKL.from_pretrained("stabilityai/sdxl-vae")

pipe = StableDiffusionXLPipeline.from_pretrained(
    model, 
    torch_dtype=torch.float16, 
    use_safetensors=True, 
    variant="fp16",
    vae=vae
    )

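# Replace the default scheduler, reusing its configuration
pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(pipe.scheduler.config)
# Move the pipeline to the GPU (requires a CUDA-capable device)
pipe.to('cuda')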

prompt = "masterpiece, best quality, 1girl, green hair, sweater, looking at viewer, upper body, beanie, outdoors, watercolor, night, turtleneck"
negative_prompt = "lowres, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry"

image = pipe(
    prompt, 
    negative_prompt=negative_prompt, 
    width=1024,
    height=1024,
    guidance_scale=12,
    target_size=(1024,1024),
    original_size=(4096,4096),
    num_inference_steps=50
    ).images[0]

image.save("anime_girl.png")

<hr>

Limitations

  1. This model inherits the limitations of Stable Diffusion XL 1.0.
  2. This model is overfitted and cannot follow prompts well, because it was fine-tuned for 5000 steps on a small dataset.
  3. It is only a preview model, intended to find good hyperparameters and a training configuration for Stable Diffusion XL 1.0.

<hr>

Example

Here are some cherry-picked samples and a comparison between available models:

<table class="custom-table"> <tr> <td> <a href="https://huggingface.co/Linaqruf/hermitage-xl/blob/main/sample_images/image1.png"> <img class="custom-image" src="https://huggingface.co/Linaqruf/hermitage-xl/resolve/main/sample_images/image1.png" alt="sample1"> </a> <a href="https://huggingface.co/Linaqruf/hermitage-xl/blob/main/sample_images/image3.png"> <img class="custom-image" src="https://huggingface.co/Linaqruf/hermitage-xl/resolve/main/sample_images/image3.png" alt="sample3"> </a> </td> <td> <a href="https://huggingface.co/Linaqruf/hermitage-xl/blob/main/sample_images/image2.png"> <img class="custom-image" src="https://huggingface.co/Linaqruf/hermitage-xl/resolve/main/sample_images/image2.png" alt="sample2"> </a> <a href="https://huggingface.co/Linaqruf/hermitage-xl/blob/main/sample_images/image4.png"> <img class="custom-image" src="https://huggingface.co/Linaqruf/hermitage-xl/resolve/main/sample_images/image4.png" alt="sample4"> </a> </td> </tr> </table>