
Deploy the Kubeflow Pipelines Service

Pavel Dournov edited this page Nov 8, 2018 · 10 revisions

This page guides you through deploying Kubeflow, including the Kubeflow Pipelines service.

Requirements

This guide assumes that you already have a GCP project. You can use Cloud Shell to run all the commands in this guide.

Alternatively, if you prefer to install and interact with GKE from your local machine, make sure you have the gcloud CLI and kubectl installed locally.
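On a local machine, it may help to confirm that both tools are on your PATH before continuing. The following is a minimal sketch rather than part of the official setup; `missing_tools` is a hypothetical helper name, and the check relies only on the POSIX `command -v` builtin.

```shell
# Print the name of every listed tool that is not found on PATH.
# `missing_tools` is a hypothetical helper, not part of gcloud or kubectl.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Report which of the two tools this guide relies on still need installing.
missing=$(missing_tools gcloud kubectl)
if [ -n "$missing" ]; then
  echo "Not installed:" $missing
else
  echo "gcloud and kubectl are both installed"
fi
```

In Cloud Shell both tools are preinstalled, so this check mainly matters on a workstation.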

Set up a GKE cluster

Follow the instructions to create a GCP project.

Enable the GKE API on this page. The linked documentation has more details about enabling billing and activating the GKE API.

We recommend that you use Cloud Shell from the GCP console to run the commands below. Cloud Shell starts with an environment that is already logged in to your account and set to the currently selected project. The following two commands are required only in a workstation shell environment; they are not needed in Cloud Shell.

gcloud auth login
gcloud config set project [your-project-id]

You need a GKE cluster to run Kubeflow pipelines. To start a new GKE cluster, first set a default compute zone (us-central1-a in this case):

gcloud config set compute/zone us-central1-a
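To double-check which defaults gcloud will use, a sketch along these lines prints the active project and compute zone. `show_gcloud_defaults` is a hypothetical helper name; it assumes the gcloud CLI is installed and uses `gcloud config get-value`, the same subcommand this guide uses later to look up the account.

```shell
# Print the project and compute zone that gcloud currently treats as defaults.
# `show_gcloud_defaults` is a hypothetical helper; empty values mean the
# property is unset (or that gcloud is not installed).
show_gcloud_defaults() {
  echo "project: $(gcloud config get-value project 2>/dev/null)"
  echo "zone: $(gcloud config get-value compute/zone 2>/dev/null)"
}

show_gcloud_defaults
```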

Then start a GKE cluster:

# Specify your cluster name
CLUSTER_NAME=[YOUR-CLUSTER-NAME]
gcloud container clusters create "$CLUSTER_NAME" \
  --zone us-central1-a \
  --scopes cloud-platform \
  --enable-cloud-logging \
  --enable-cloud-monitoring \
  --machine-type n1-standard-2 \
  --num-nodes 4

Here we choose the cloud-platform scope so that the cluster can invoke GCP APIs. You can find all the options for creating a cluster here.

Next, grant your user account permission to create new cluster roles. This step is necessary because installing Kubeflow Pipelines creates a few cluster roles.

kubectl create clusterrolebinding ml-pipeline-admin-binding --clusterrole=cluster-admin --user=$(gcloud config get-value account)
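One way to confirm that the binding took effect is to ask the API server directly. This is a sketch under the assumption that kubectl is already pointed at the new cluster; `can_create_clusterroles` is a hypothetical helper name, and it relies on `kubectl auth can-i`, which prints yes or no.

```shell
# Succeed only if the API server says the current user may create cluster roles.
# `can_create_clusterroles` is a hypothetical helper around `kubectl auth can-i`.
can_create_clusterroles() {
  [ "$(kubectl auth can-i create clusterroles 2>/dev/null)" = "yes" ]
}

if can_create_clusterroles; then
  echo "Current user can create cluster roles"
else
  echo "Could not confirm permission to create cluster roles"
fi
```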

Deploy Kubeflow Pipelines

Go to the release page to find a version of the pipelines library. Deploy Kubeflow pipelines to your cluster.

For example:

PIPELINE_VERSION=0.1.2
kubectl create -f https://storage.googleapis.com/ml-pipeline/release/$PIPELINE_VERSION/bootstrapper.yaml
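Because the same URL pattern is reused later in this guide, it can be convenient to build it in one place. The sketch below copies the URL layout from the command above; `bootstrapper_url` is a hypothetical helper name, and the three-number version check is an assumption based only on the versions that appear in this guide.

```shell
# Build the bootstrapper manifest URL for a release version.
# `bootstrapper_url` is a hypothetical helper. The version format check
# (three dot-separated numbers) is an assumption, not a documented rule.
bootstrapper_url() {
  version=$1
  if ! printf '%s\n' "$version" | grep -Eq '^[0-9]+\.[0-9]+\.[0-9]+$'; then
    echo "unexpected version format: $version" >&2
    return 1
  fi
  printf 'https://storage.googleapis.com/ml-pipeline/release/%s/bootstrapper.yaml\n' "$version"
}

# Prints the URL used in the example above.
bootstrapper_url 0.1.2
```

With this helper, the deployment command could be written as kubectl create -f "$(bootstrapper_url "$PIPELINE_VERSION")".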

Run kubectl get job to see the job that deploys Kubeflow Pipelines and all its dependencies in the cluster. Wait for the number of successful job runs to reach 1:

NAME                      DESIRED   SUCCESSFUL   AGE
deploy-ml-pipeline-wjqwt  1         1            9m
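If you would rather not read the table by eye, the check can be sketched as a small filter over that output. This assumes the column layout shown above (NAME, DESIRED, SUCCESSFUL, AGE); other kubectl versions may print different columns, so treat it as illustrative. `finished_jobs` is a hypothetical helper name.

```shell
# Read `kubectl get job` output on stdin and print the names of jobs whose
# SUCCESSFUL count has reached DESIRED. Assumes the four-column layout
# shown in this guide; `finished_jobs` is a hypothetical helper.
finished_jobs() {
  awk 'NR > 1 && $3 >= $2 { print $1 }'
}

# Demonstration using the sample output from this guide.
printf '%s\n' \
  'NAME                      DESIRED   SUCCESSFUL   AGE' \
  'deploy-ml-pipeline-wjqwt  1         1            9m' | finished_jobs
```

Against a live cluster, the equivalent call would be kubectl get job | finished_jobs.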

If the deployment fails, check the deployment log:

kubectl logs $(kubectl get pods -l job-name=[JOB_NAME] -o jsonpath='{.items[0].metadata.name}')

By default, the Kubeflow Pipelines service is deployed with usage collection turned on. Usage is collected with Spartakus, which does not report any personally identifiable information (PII).

When deployment is successful, forward a local port to visit the Kubeflow pipelines UI dashboard:

export NAMESPACE=kubeflow
kubectl port-forward -n ${NAMESPACE} $(kubectl get pods -n ${NAMESPACE} --selector=service=ambassador -o jsonpath='{.items[0].metadata.name}') 8080:80

Open your browser and go to localhost:8080/pipeline.

Run your first pipeline

Navigate to the Pipelines section in the UI, create a new experiment, and run a sample pipeline. For the project name parameter, use your GCP project name.

To build your own pipelines, see the SDK guide.

Disable usage reporting

To turn off usage reporting, download the bootstrapper file and change the arguments to the deployment job.

For example, download the bootstrapper file:

PIPELINE_VERSION=0.0.42
curl https://storage.googleapis.com/ml-pipeline/release/$PIPELINE_VERSION/bootstrapper.yaml --output bootstrapper.yaml

Then update the arguments in the file:

        args: [
          ... 
          # uncomment the following line
          "--report_usage", "false",
          ...
        ]

Then create the job from the updated YAML by running kubectl create -f bootstrapper.yaml.
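The manual edit described above could also be scripted. The sketch below assumes the flag appears in bootstrapper.yaml as a commented-out line such as `# "--report_usage", "false",`; that layout is an assumption, so inspect the file before relying on it. `uncomment_arg` is a hypothetical helper name.

```shell
# Print FILE with any commented-out line that begins with "FLAG" uncommented.
# Assumes the line looks like:   # "--report_usage", "false",
# `uncomment_arg` is a hypothetical helper; it writes to stdout and leaves
# the input file unchanged.
uncomment_arg() {
  file=$1
  flag=$2
  sed -E "s|^([[:space:]]*)#[[:space:]]*(\"${flag}\")|\1\2|" "$file"
}

# Demonstration on a small sample that mimics the assumed layout.
sample=$(mktemp)
printf '%s\n' '        args: [' '          # "--report_usage", "false",' '        ]' > "$sample"
uncomment_arg "$sample" "--report_usage"
rm -f "$sample"
```

Writing the result to a new file and reviewing it before running kubectl create -f keeps the original download intact. The same helper could be pointed at "--uninstall" for the uninstall steps, under the same assumption about how the line is commented out.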

Uninstall

To uninstall Kubeflow Pipelines, download the bootstrapper file and change the arguments to the deployment job.

For example, download the bootstrapper file:

PIPELINE_VERSION=0.0.42
curl https://storage.googleapis.com/ml-pipeline/release/$PIPELINE_VERSION/bootstrapper.yaml --output bootstrapper.yaml

Then update the arguments in the file:

        args: [
          ... 
          # uncomment the following line
          "--uninstall",
          ...
        ]

Then create the job from the updated YAML by running kubectl create -f bootstrapper.yaml.

