Basic image generation and the Gallery

yownas edited this page May 29, 2023 · 5 revisions

There are a couple of different ways to generate images. They are slightly different and you can choose (or even mix) depending on what you want to do.

Simplest

This is the simplest way to generate an image. sd.process() takes a text prompt and sends it to the webui's function for processing. It returns an image that we then add to the Gallery with ui.gallery.add(). It can also take a "Processing object" instead of a prompt (see below).

img = sd.process('a cute cat')
ui.gallery.add(img)

Using a Processing object

A slightly more advanced example is sd.pipeline(). The function is basically the same as sd.process() but takes a "Processing object", letting you change a lot more settings before generating the image. There are two functions because sd.process() uses the builtin function directly, while sd.pipeline() is a deconstructed version that gives us access to more details, as in the example further down.

In this example we start by getting a Processing object, then set the prompt, negative prompt and steps before sending it to sd.pipeline(). This time, when we add the image to the Gallery, we also set a caption for it by using ui.gallery.addc().

p = sd.getp()
p.prompt = 'a cute puppy'
p.negative_prompt = 'angry bear'
p.steps = 25
img = sd.pipeline(p)
ui.gallery.addc(img, 'Not an angry bear')

Do all the steps manually

It is also possible to do most of the steps manually. We start by getting a Processing object, but this time we do not use it for the prompts; it only carries settings such as image size, steps and sampler. We then get the conditional and unconditional (negative) encodings. The functions sd.cond() and sd.negcond() use functions from the webui, so things like weights and AND will work. sd.sample() takes these and generates a latent image.

First we convert the raw latent directly into an image and add it to the Gallery. To get something closer to an image, we add 1.0 to the latent, divide it by 2 and clamp the result between 0.0 and 1.0, which maps values in the range -1.0 to 1.0 onto 0.0 to 1.0. This image will probably look mostly like noise. To get an image for humans, we run the latent through the Variational Autoencoder with sd.vae(). The last step is to convert the resulting array into an image format with sd.toimage() and add that to the Gallery as well.

p = sd.getp()
c = sd.cond('bunny')
uc = sd.negcond('banana')

latent = sd.sample(p, c, uc)
tmp = torch.clamp(torch.div(torch.add(latent, 1.0), 2.0), 0.0, 1.0)
img = sd.toimage(tmp)
ui.gallery.add(img)

vae = sd.vae(latent)
img = sd.toimage(vae)
ui.gallery.add(img)
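
The rescaling applied to the raw latent above is plain arithmetic, so it can be tried outside the extension. The following is a minimal standalone Python sketch, assuming ordinary floats in place of a tensor; the function name latent_to_unit_range and the sample values are made up for illustration and are not part of the sd or ui objects used on this page.

```python
def latent_to_unit_range(values):
    """Map numbers from roughly [-1.0, 1.0] into [0.0, 1.0], clamping outliers."""
    result = []
    for v in values:
        # Shift from [-1, 1] to [0, 2], then halve to land in [0, 1].
        scaled = (v + 1.0) / 2.0
        # Clamp anything that still falls outside [0, 1].
        result.append(min(max(scaled, 0.0), 1.0))
    return result


# Hypothetical sample values, including two that lie outside [-1, 1].
samples = [-1.5, -1.0, 0.0, 0.5, 1.0, 2.0]
print(latent_to_unit_range(samples))  # prints [0.0, 0.0, 0.5, 0.75, 1.0, 1.0]
```

Values already inside -1.0 to 1.0 are only shifted and scaled; the clamp matters just for outliers, which end up pinned to 0.0 or 1.0.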

sd.sample(p, prompt, negative_prompt) takes three arguments: a Processing object, a prompt and a negative prompt. The argument for the prompt can be one of the following:

  • A normal text string.
  • Output from sd.textencode(), which runs the prompt through CLIP. It does not support weights or other prompt tricks.
  • Output from sd.negcond(), which supports some webui prompt tricks.
  • Output from sd.cond(), which supports all the webui prompt tricks (like AND). This will not work for the negative prompt.

Please look in the examples folder for more examples.