🎨 Image generation

[Image: anime_girl (generated with AnimagineXL)]

LocalAI supports generating images with Stable Diffusion through two backends: a C++ implementation that runs on CPU, Stable-Diffusion-NCNN (binding), and 🧨 Diffusers.

Usage

OpenAI docs: https://platform.openai.com/docs/api-reference/images/create

To generate an image, send a POST request to the /v1/images/generations endpoint with the prompt in the JSON request body:

# 512x512 is supported too
curl http://localhost:8080/v1/images/generations -H "Content-Type: application/json" -d '{
  "prompt": "A cute baby sea otter",
  "size": "256x256"
}'

Available additional parameters: mode, step.
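The same request can be built from a script. The following is a minimal Python sketch using only the standard library, not an official client. The function names are hypothetical, the base URL is the localhost address from the curl example, and passing mode and step as integers is an assumption of this sketch.

```python
import json
import urllib.request


def build_generation_request(prompt, size="256x256", mode=None, step=None,
                             base_url="http://localhost:8080"):
    """Build (but do not send) a POST request for /v1/images/generations."""
    payload = {"prompt": prompt, "size": size}
    # mode and step are the optional parameters named above. They are only
    # included when set; integer values are an assumption of this sketch.
    if mode is not None:
        payload["mode"] = mode
    if step is not None:
        payload["step"] = step
    return urllib.request.Request(
        f"{base_url}/v1/images/generations",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate_image(prompt, **kwargs):
    """Send the request and return the decoded JSON response as a dict."""
    request = build_generation_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a LocalAI instance running, generate_image("A cute baby sea otter") sends the same body as the curl command. Building the request separately from sending it makes the payload easy to inspect before any network call.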

Note: To set a negative prompt, you can split the prompt with |, for instance: a cute baby sea otter|malformed.

curl http://localhost:8080/v1/images/generations -H "Content-Type: application/json" -d '{
  "prompt": "floating hair, portrait, ((loli)), ((one girl)), cute face, hidden hands, asymmetrical bangs, beautiful detailed eyes, eye shadow, hair ornament, ribbons, bowties, buttons, pleated skirt, (((masterpiece))), ((best quality)), colorful|((part of the head)), ((((mutated hands and fingers)))), deformed, blurry, bad anatomy, disfigured, poorly drawn face, mutation, mutated, extra limb, ugly, poorly drawn hands, missing limb, blurry, floating limbs, disconnected limbs, malformed hands, blur, out of focus, long neck, long body, Octane renderer, lowres, bad anatomy, bad hands, text",
  "size": "256x256"
}'
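Long prompts like the one above are easier to manage when the positive and negative parts are kept separate and joined only at request time. The helpers below are a hypothetical sketch of the | convention. The source does not say how more than one | is handled, so this sketch rejects parts that already contain the separator.

```python
def combine_prompt(positive, negative=""):
    """Join a positive and a negative prompt with the | separator."""
    if "|" in positive or "|" in negative:
        raise ValueError("prompt parts must not contain '|'")
    positive = positive.strip()
    negative = negative.strip()
    # Without a negative prompt, the positive prompt is sent unchanged.
    return f"{positive}|{negative}" if negative else positive


def split_prompt(prompt):
    """Split a combined prompt into (positive, negative) at the first |."""
    positive, _, negative = prompt.partition("|")
    return positive.strip(), negative.strip()
```

The result of combine_prompt can be used directly as the "prompt" value of the request body.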

stablediffusion-cpp

[Image comparison: outputs generated with mode=0 and with mode=1 (winograd/sgemm)]

Note: The image generator supports images up to 512x512. To get larger images, you can upscale the output with another tool, for instance https://github.com/upscayl/upscayl.
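A client can check the requested size against this limit before sending a request. The sketch below is a hypothetical helper, not part of LocalAI. It only enforces the 512x512 upper bound stated in the note; the examples in this document use 256x256 and 512x512, and whether other sizes are accepted is not covered here.

```python
MAX_SIDE = 512  # upper bound stated in the note above


def parse_size(size):
    """Parse a 'WIDTHxHEIGHT' string and check it against the 512x512 limit."""
    width_text, sep, height_text = size.lower().partition("x")
    if not sep or not width_text.isdecimal() or not height_text.isdecimal():
        raise ValueError(f"expected WIDTHxHEIGHT, got {size!r}")
    width, height = int(width_text), int(height_text)
    if width < 1 or height < 1:
        raise ValueError(f"width and height must be positive, got {size!r}")
    if width > MAX_SIDE or height > MAX_SIDE:
        raise ValueError(f"{size} exceeds the {MAX_SIDE}x{MAX_SIDE} limit")
    return width, height
```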

Setup

Note: To use the /v1/images/generations endpoint with the stablediffusion C++ backend, you need to build LocalAI with GO_TAGS=stablediffusion. If you are using the container images, it is already enabled.

While the API is running, you can install the model by sending a request to the /models/apply endpoint that points to the stablediffusion model in the model-gallery:

curl http://localhost:8080/models/apply -H "Content-Type: application/json" -d '{
  "url": "github:go-skynet/model-gallery/stablediffusion.yaml"
}'

Alternatively, to install the model at startup, set the PRELOAD_MODELS environment variable:

PRELOAD_MODELS=[{"url": "github:go-skynet/model-gallery/stablediffusion.yaml"}]

or pass it as a command-line argument:

local-ai --preload-models '[{"url": "github:go-skynet/model-gallery/stablediffusion.yaml"}]'

or point LocalAI to a YAML file:

local-ai --preload-models-config "/path/to/yaml"

The YAML file lists the models to preload:

- url: github:go-skynet/model-gallery/stablediffusion.yaml
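Quoting the JSON value by hand is error-prone, so it can help to generate all three preload forms from one list of URLs. The sketch below is a hypothetical helper. It only reproduces the formats shown above (the PRELOAD_MODELS value, the --preload-models argument, and the YAML list) and does not start LocalAI.

```python
import json
import shlex

GALLERY_URL = "github:go-skynet/model-gallery/stablediffusion.yaml"


def preload_models_json(urls):
    """Return the JSON list used by PRELOAD_MODELS and --preload-models."""
    return json.dumps([{"url": url} for url in urls])


def preload_command(urls):
    """Return a local-ai command line with the JSON safely shell-quoted."""
    return "local-ai --preload-models " + shlex.quote(preload_models_json(urls))


def preload_models_yaml(urls):
    """Return the equivalent YAML list for --preload-models-config."""
    return "".join(f"- url: {url}\n" for url in urls)
```

For example, preload_command([GALLERY_URL]) yields the command line shown above, with the JSON wrapped in single quotes by shlex.quote.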

To install the model manually instead:

1. Create a model file stablediffusion.yaml in the models folder:

name: stablediffusion
backend: stablediffusion
parameters:
  model: stablediffusion_assets

2. Create a stablediffusion_assets directory inside your models directory.
3. Download the ncnn assets from https://github.com/EdVince/Stable-Diffusion-NCNN#out-of-box and place them in stablediffusion_assets.

The models directory should look like the following:

models
├── stablediffusion_assets
│   ├── AutoencoderKL-256-256-fp16-opt.param
│   ├── AutoencoderKL-512-512-fp16-opt.param
│   ├── AutoencoderKL-base-fp16.param
│   ├── AutoencoderKL-encoder-512-512-fp16.bin
│   ├── AutoencoderKL-fp16.bin
│   ├── FrozenCLIPEmbedder-fp16.bin
│   ├── FrozenCLIPEmbedder-fp16.param
│   ├── log_sigmas.bin
│   ├── tmp-AutoencoderKL-encoder-256-256-fp16.param
│   ├── UNetModel-256-256-MHA-fp16-opt.param
│   ├── UNetModel-512-512-MHA-fp16-opt.param
│   ├── UNetModel-base-MHA-fp16.param
│   ├── UNetModel-MHA-fp16.bin
│   └── vocab.txt
└── stablediffusion.yaml
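After a manual install, the layout above can be checked with a short script. This is a hypothetical sketch. The file list is copied from the tree above, and the script only tests that each file exists, not that its contents are valid.

```python
from pathlib import Path

# File names copied from the directory listing above.
EXPECTED_ASSETS = [
    "AutoencoderKL-256-256-fp16-opt.param",
    "AutoencoderKL-512-512-fp16-opt.param",
    "AutoencoderKL-base-fp16.param",
    "AutoencoderKL-encoder-512-512-fp16.bin",
    "AutoencoderKL-fp16.bin",
    "FrozenCLIPEmbedder-fp16.bin",
    "FrozenCLIPEmbedder-fp16.param",
    "log_sigmas.bin",
    "tmp-AutoencoderKL-encoder-256-256-fp16.param",
    "UNetModel-256-256-MHA-fp16-opt.param",
    "UNetModel-512-512-MHA-fp16-opt.param",
    "UNetModel-base-MHA-fp16.param",
    "UNetModel-MHA-fp16.bin",
    "vocab.txt",
]


def missing_model_files(models_dir):
    """Return the expected files (relative paths) missing under models_dir."""
    models_dir = Path(models_dir)
    expected = [Path("stablediffusion.yaml")]
    expected += [Path("stablediffusion_assets") / name for name in EXPECTED_ASSETS]
    return [path.as_posix() for path in expected
            if not (models_dir / path).is_file()]
```

An empty list from missing_model_files("models") means every file in the listing is present.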

Diffusers

This is an extra backend. It is already available in the container images, so no additional setup is required.

Model setup

Models are downloaded automatically from Hugging Face the first time you use the backend.

Create a model configuration file in the models directory. For instance, to use Linaqruf/animagine-xl on CPU:

name: animagine-xl
parameters:
  model: Linaqruf/animagine-xl
backend: diffusers

# Force CPU usage - set to true for GPU
f16: false
diffusers:
  pipeline_type: StableDiffusionXLPipeline
  cuda: false # Enable for GPU usage (CUDA)
  scheduler_type: euler_a
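The comments in the configuration indicate that f16 and cuda are both switched on for GPU use. The sketch below renders the same file for either case from a single flag. The function name is hypothetical, the output is built with plain string formatting, and values are assumed to need no YAML escaping.

```python
def diffusers_config(name, model, use_gpu=False,
                     pipeline_type="StableDiffusionXLPipeline",
                     scheduler_type="euler_a"):
    """Render a diffusers model configuration as YAML text."""
    # f16 and cuda are toggled together, following the comments in the
    # configuration above (false for CPU, true for GPU).
    flag = "true" if use_gpu else "false"
    return (
        f"name: {name}\n"
        "parameters:\n"
        f"  model: {model}\n"
        "backend: diffusers\n"
        f"f16: {flag}\n"
        "diffusers:\n"
        f"  pipeline_type: {pipeline_type}\n"
        f"  cuda: {flag}\n"
        f"  scheduler_type: {scheduler_type}\n"
    )
```

Writing the result of diffusers_config("animagine-xl", "Linaqruf/animagine-xl") to a file in the models directory gives the CPU configuration shown above, without the comments.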