
Deploying a PyTorch Stable Diffusion model as a Vertex AI Endpoint

Deploying large models, like Stable Diffusion, can be challenging and time-consuming.

In this recipe, we will show how you can streamline the deployment of a PyTorch Stable Diffusion model by leveraging Vertex AI.

PyTorch is the framework used by Stability AI on Stable Diffusion v1.5. Vertex AI is a fully-managed machine learning platform with tools and infrastructure designed to help ML practitioners accelerate and scale ML in production with the benefit of open-source frameworks like PyTorch.

Deploying your PyTorch Stable Diffusion model (v1.5) on a Vertex AI Endpoint can be done in four steps:

  • Create a custom TorchServe handler.

  • Upload model artifacts to Google Cloud Storage (GCS).

  • Create a Vertex AI model with the model artifacts and a prebuilt PyTorch container image.

  • Deploy the Vertex AI model onto an endpoint.

Let’s have a look at each step in more detail. You can follow and implement the steps using the Notebook example.

NOTE: Please keep in mind that this recipe requires a billable Vertex AI project, as explained in more detail in the notebook example.

Create a custom TorchServe handler

TorchServe is an easy and flexible tool for serving PyTorch models. The model deployed to Vertex AI uses TorchServe to handle requests and return responses from the model. You must create a custom TorchServe handler to include in the model artifacts uploaded to Vertex AI. Include the handler file in the directory with the other model artifacts, like this: model_artifacts/handler.py.
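The handler's job is to load the model once at startup, turn incoming JSON requests into prompts, run the pipeline, and encode the generated images for the response. A minimal sketch of what `model_artifacts/handler.py` could look like is below; the class name and the pipeline loading are illustrative, and a real handler would subclass TorchServe's `BaseHandler` (from `ts.torch_handler.base_handler`):

```python
import base64
import io


class StableDiffusionHandler:
    """Sketch of a custom TorchServe handler for Stable Diffusion.

    A production handler would subclass ts.torch_handler.base_handler.BaseHandler;
    the pipeline loading in initialize() is a hypothetical placeholder.
    """

    def initialize(self, context):
        # In practice: load the Stable Diffusion pipeline from the model
        # directory, e.g. diffusers.StableDiffusionPipeline.from_pretrained(...).
        self.pipe = None
        self.initialized = True

    def preprocess(self, requests):
        # TorchServe passes a list of requests; each body carries the prompt.
        prompts = []
        for req in requests:
            data = req.get("data") or req.get("body")
            prompts.append(data["prompt"])
        return prompts

    def inference(self, prompts):
        # Hypothetical: run the pipeline and collect one image per prompt.
        return [self.pipe(p).images[0] for p in prompts]

    def postprocess(self, images):
        # Encode each generated image as base64 so it fits in a JSON response.
        out = []
        for img in images:
            buf = io.BytesIO()
            img.save(buf, format="JPEG")
            out.append(base64.b64encode(buf.getvalue()).decode("utf-8"))
        return out
```

The base64 encoding in `postprocess` is what lets the prediction client later decode the response bytes back into an image file.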

After creating the handler file, you must package the handler as a model archiver (MAR) file. The output file must be named model.mar.

!torch-model-archiver \
    -f \
    --model-name <your_model_name> \
    --version 1.0 \
    --handler model_artifacts/handler.py \
    --export-path model_artifacts

Upload model artifacts to Google Cloud Storage (GCS)

In this step, we upload the model artifacts, such as the model file and the handler, to GCS. The advantage of storing your artifacts in GCS is that you can track them in a central bucket.

BUCKET_NAME = "your-bucket-name-unique"  # @param {type:"string"}
BUCKET_URI = f"gs://{BUCKET_NAME}/"

# Copy the model artifacts into the bucket
!gsutil cp -r model_artifacts $BUCKET_URI

Create a Vertex AI model with the model artifacts and a prebuilt PyTorch container image

Once you've uploaded the model artifacts into a GCS bucket, you can upload your PyTorch model to the Vertex AI Model Registry. From the Vertex AI Model Registry, you have an overview of your models, so you can better organize, track, and train new versions. For this you can use the Vertex AI SDK and a pre-built PyTorch serving container.

from google.cloud import aiplatform as vertexai

MODEL_DISPLAY_NAME = "stable_diffusion_1_5-unique"
MODEL_DESCRIPTION = "stable_diffusion_1_5 container"

# Pre-built PyTorch prediction container; check the Vertex AI docs for the
# image matching your PyTorch version.
PYTORCH_PREDICTION_IMAGE_URI = (
    "us-docker.pkg.dev/vertex-ai/prediction/pytorch-gpu.1-12:latest"
)

vertexai.init(project='your_project', location='us-central1', staging_bucket=BUCKET_NAME)

model = vertexai.Model.upload(
    display_name=MODEL_DISPLAY_NAME,
    description=MODEL_DESCRIPTION,
    serving_container_image_uri=PYTORCH_PREDICTION_IMAGE_URI,
    artifact_uri=BUCKET_URI,
)

Deploy the Vertex AI model onto an endpoint

Once the model has been uploaded to the Vertex AI Model Registry, you can deploy it to a Vertex AI Endpoint. For this you can use the Console or the Vertex AI SDK. In this example, you will deploy the model on an NVIDIA Tesla P100 GPU with an n1-standard-8 machine. You can also specify a different machine type.

ENDPOINT_DISPLAY_NAME = "stable_diffusion_1_5-endpoint"  # illustrative name
endpoint = vertexai.Endpoint.create(display_name=ENDPOINT_DISPLAY_NAME)
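The endpoint starts out empty; deploying the uploaded model onto it takes one more call. A sketch of deployment settings matching the hardware mentioned above is shown here; the keyword names follow the Vertex AI SDK's `Model.deploy`, and the exact values are an example, not a requirement:

```python
# Deployment settings for the machine described above: one NVIDIA Tesla P100
# attached to an n1-standard-8 machine. These map onto Model.deploy kwargs.
DEPLOY_KWARGS = dict(
    machine_type="n1-standard-8",
    accelerator_type="NVIDIA_TESLA_P100",
    accelerator_count=1,
    traffic_percentage=100,  # route all traffic to this deployed model
)

# Runs only with GCP credentials and the model/endpoint created above:
# model.deploy(endpoint=endpoint, **DEPLOY_KWARGS)
```

Deployment can take several minutes, since Vertex AI provisions the machine and GPU before the endpoint starts serving.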


If you follow this notebook, you can also get online predictions using the Vertex AI SDK, as shown in the following snippet.

import base64

instances = [{"prompt": "An examplePup dog with a baseball jersey."}]
response = endpoint.predict(instances=instances)

# The handler returns each image base64-encoded; decode and save the first one.
with open("img.jpg", "wb") as g:
    g.write(base64.b64decode(response.predictions[0]))



More resources

This tutorial was created using the vendor documentation. For the original documentation on the vendor site, please see the torchserve example.


© Copyright 2018-2024, PyTorch & PyTorch Korea User Group.
