coreml stable-diffusion text-to-image

Core ML Converted SDXL Model:

<br>

SDXL-v10-Base+Refiner:

Source(s): CivitAI<br>

SDXL v1.0 Base + Refiner

These are the official SDXL base and refiner models, converted to Core ML and packaged as a single combined model.

The base model components run the first 80% of the specified steps, and the refiner model components then complete the remaining 20%. Mochi Diffusion manages this handoff when it finds this type of combined SDXL model.
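As a rough illustration of the 80/20 split described above, the sketch below computes how many steps each stage would get for a given step count. The function name `split_steps` and the round-to-nearest rule are assumptions made for this example; Mochi Diffusion's actual rounding behavior is not documented here and may differ.

```python
def split_steps(total_steps: int, base_percent: int = 80) -> tuple[int, int]:
    """Return (base_steps, refiner_steps) for a combined base + refiner run.

    Hypothetical helper: rounding to the nearest whole step is an assumption,
    not necessarily what Mochi Diffusion does internally.
    """
    if total_steps < 1:
        raise ValueError("total_steps must be at least 1")
    if not 0 <= base_percent <= 100:
        raise ValueError("base_percent must be between 0 and 100")
    # Integer arithmetic (round half up) avoids floating-point surprises.
    base_steps = (total_steps * base_percent + 50) // 100
    return base_steps, total_steps - base_steps


print(split_steps(30))  # (24, 6)
print(split_steps(25))  # (20, 5)
```

With 30 steps, for example, the base components would handle 24 steps and the refiner components the last 6.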

The individual model files in this repo differ in bit depth and resolution, both of which are noted in the file names. The nominal model is 16-bit at 1024x1024. The other versions use reduced bit depth, lower resolution, or both.

These are large models, so each one is zipped into several smaller parts. Be sure to download all of the parts for a particular model and combine the pieces into a single folder on your end. The 3-part full-size model zip files may not unzip correctly with Apple's native archive tool. They were made with "BetterZip". The 2-part 8-bit model files will unzip with Apple's archive tool.<br>
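If the native archive tool has trouble, one alternative is to extract the parts with a short script. The sketch below assumes that each downloaded part is a standalone zip archive holding a subset of the model's files (not a spanned multi-disk archive, which Python's `zipfile` module does not support). The helper name `extract_parts` is hypothetical.

```python
import zipfile
from pathlib import Path


def extract_parts(part_paths, dest_dir):
    """Extract every zip part into one destination folder.

    Assumes each part is an independent zip archive. Returns the list of
    archive member names that were extracted, in processing order.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    extracted = []
    # Sort so the parts are processed in a predictable order.
    for part in sorted(Path(p) for p in part_paths):
        with zipfile.ZipFile(part) as archive:
            archive.extractall(dest)
            extracted.extend(archive.namelist())
    return extracted
```

A call such as `extract_parts(Path("Downloads").glob("*.zip"), "SDXL-model")` would merge all parts found in a folder. The folder name and glob pattern here are placeholders, so substitute the actual file names from this repo.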

Model Description

SDXL consists of a two-step pipeline for latent diffusion: First, we use a base model to generate latents of the desired output size. In the second step, we use a specialized high-resolution model and apply a technique called SDEdit (https://arxiv.org/abs/2108.01073, also known as "img2img") to the latents generated in the first step, using the same prompt.

Model Sources

Uses
