Stable Diffusion - TextToImage Python SDK Inferences

Stable Diffusion

For examples of the types of outputs to expect, you can visit the Stable Diffusion Demo at OctoAI.

As a next step, Stable Diffusion is an excellent candidate for Asynchronous Inference Using the Python SDK.

Requirements

Please follow How to create an OctoAI API token if you don't have one already.

Stable Diffusion: Text2Image

Let's use another pre-accelerated QuickStart template example. Stable Diffusion is a model that generates images from a text prompt. The endpoint returns the generated image as a base64-encoded string, which we'll decode and write to a file. You can create an image using the code snippet below.

Please reference the Overview: QuickStart Templates on SDK to Run Inferences for details on finding endpoint URLs for QuickStart and cloned templates and the SDK Reference for more information on classes.

from octoai.client import Client
from octoai.types import Image

if __name__ == "__main__":
    OCTOAI_TOKEN = "API Token goes here from guide on creating OctoAI API token"
    # The client will also identify if OCTOAI_TOKEN is set as an environment variable
    client = Client(token=OCTOAI_TOKEN)

    stable_diffusion_url = "https://stable-diffusion-demo-kk0powt97tmb.octoai.run/text2img"
    sd_health_check = "https://stable-diffusion-demo-kk0powt97tmb.octoai.run/healthcheck"
    # These are the inputs we'll send to the endpoint.
    inputs = {
        # The prompt input is required, while the rest are optional.
        "prompt": "A photo of an octopus in space",
        # What we don't want to see
        "negative_prompt": "Blurry photo, distortion, low-res, bad quality",
        # Classifier free guidance, 1 to 20
        "guidance_scale": 7.5,
        # Number of denoising steps, 1 to 500
        "num_inference_steps": 40,
        # Algorithm used for denoising.  Also accepts PNDM, KLMS, DDIM, etc.
        # Please view the QuickStart templates page for stable diffusion for more information.
        "scheduler": "DPMSolverMultistep"
    }

    if client.health_check(sd_health_check) == 200:
        outputs = client.infer(endpoint_url=stable_diffusion_url, inputs=inputs)
        # The output image is a base64 encoded string.
        # This can easily be passed to other models for inference as well.
        sd_image_base64 = outputs.get("completion").get("image_0")
        # Let's write it to a file so we can see our astro octopus.
        Image.from_base64(sd_image_base64).to_file("astro.png")
(Generated image: astro.png)

Once you've run this program, you'll have made an AI-generated image, converted it from a base64 string, and written it to a file. You can open astro.png to see the results!
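If you prefer not to use the SDK's Image helper, the same decode-and-write step can be done with Python's standard base64 module. This is a minimal sketch; the function name save_base64_image is our own, not part of the OctoAI SDK:

```python
import base64


def save_base64_image(b64_string: str, path: str) -> None:
    """Decode a base64-encoded image string and write the raw bytes to a file."""
    image_bytes = base64.b64decode(b64_string)
    with open(path, "wb") as f:
        f.write(image_bytes)
```

You could call this as save_base64_image(sd_image_base64, "astro.png") in place of the Image.from_base64 line above.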

Stable Diffusion Outputs

In the example above, if you log the outputs, you'll receive an object like the one below. We recommend defining interfaces for your expected outputs to help guide the behavior of your application.

completion.image_0 contains the base64 string that represents the generated image.

{
  "ckpt_load_time_ms": 0.00371,
  "run_pipeline_time_ms": 1278.74618,
  "completion": {
    "image_0": "base64 string representing the image generated"
  },
  "prediction_time_ms": 1371.803945
}
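In Python, the interface suggested above can be sketched with typing.TypedDict. The field names follow the sample output; the class names SDCompletion and SDOutputs are our own and not part of the OctoAI SDK:

```python
from typing import TypedDict


class SDCompletion(TypedDict):
    image_0: str  # base64-encoded generated image


class SDOutputs(TypedDict):
    ckpt_load_time_ms: float
    run_pipeline_time_ms: float
    completion: SDCompletion
    prediction_time_ms: float


def get_image_base64(outputs: SDOutputs) -> str:
    """Pull the base64 image string out of a typed outputs object."""
    return outputs["completion"]["image_0"]
```

Annotating the return value of client.infer with a type like this lets static checkers catch typos in key names before they become runtime errors.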