Manage machine learning models with Seldon Core
==========

* [Install Seldon Core service](#install-seldon-core-service)
* [Deploy your model](#deploy-your-model)
  * [1. Package your model](#1-package-your-model)
  * [2. Create your inference graph](#2-create-your-inference-graph)
  * [3. Deploy the model to the Kubernetes cluster](#3-deploy-the-model-to-the-kubernetes-cluster)

<p align="left">
  <a href="https://www.seldon.io/tech/products/core/" title="Seldon Core">
    <img src="./images/logos/seldon_logo.jpg" align="center" alt="Seldon Core Logo" width="200px" />
  </a>
</p>

[Seldon Core](https://www.seldon.io/tech/products/core/) is an open source platform for deploying machine learning models on a Kubernetes cluster. It extends Kubernetes with **its own custom resource, `SeldonDeployment`**, in which you define a runtime inference graph made up of models and other components that Seldon will manage.

## Install Seldon Core service

To deploy the Seldon Core service inside your FADI installation, set the `seldon-core-operator.enabled` option to `true` in your FADI `values.yaml` configuration file and reapply the chart:

```yaml
seldon-core-operator:
  enabled: true
  usageMetrics:
    enabled: false
```

## Deploy your model

### 1. Package your model
|
To allow a component (model, router, etc.) to be managed by Seldon Core, it needs to be built into a **Docker container** that exposes the appropriate [microservice APIs over REST or gRPC](https://docs.seldon.io/projects/seldon-core/en/latest/reference/apis/internal-api.html).

To wrap your model, follow the [official Seldon instructions](https://docs.seldon.io/projects/seldon-core/en/v1.1.0/python/index.html).

NB: currently only the Python wrapper is ready for production use, but wrappers for other languages ([Java, R, Go, ...](https://docs.seldon.io/projects/seldon-core/en/latest/wrappers/language_wrappers.html)) are also available.
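
As a minimal sketch, a model class for the Seldon Python wrapper only needs a constructor and a `predict` method. The file/class name `MyModel` and the identity "model" below are illustrative; see the Seldon instructions linked above for the real packaging steps:

```python
# MyModel.py -- minimal sketch of a model class for the Seldon Python wrapper.
# The class name "MyModel" is an illustrative choice; the wrapper loads the
# class you name when building the image (e.g. `seldon-core-microservice MyModel`).

class MyModel:
    def __init__(self):
        # Load your trained model artifact here (e.g. via joblib or pickle).
        self.loaded = True

    def predict(self, X, features_names=None):
        # X arrives as an array-like payload; return predictions in the same
        # array-like form. A real model would call e.g. self.model.predict(X);
        # this trivial sketch echoes its input back.
        return X
```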
|
### 2. Create your inference graph
|
Seldon Core's `SeldonDeployment` custom resource is where you define your runtime [inference graph](https://docs.seldon.io/projects/seldon-core/en/latest/graph/inference-graph.html) made up of models and other components that Seldon will manage.

A `SeldonDeployment` is a JSON or YAML file in which you define your graph of component images and the resources each of those images needs to run (using a Kubernetes `PodTemplateSpec`). Below is a minimal example for a single model, in YAML (from the [Seldon inference graph documentation](https://docs.seldon.io/projects/seldon-core/en/v1.1.0/graph/inference-graph.html)):

```yaml
apiVersion: machinelearning.seldon.io/v1alpha2
kind: SeldonDeployment
metadata:
  name: seldon-model
spec:
  name: test-deployment
  predictors:
  - componentSpecs:
    - spec:
        containers:
        - name: classifier
          image: seldonio/mock_classifier:1.0
    graph:
      children: []
      endpoint:
        type: REST
      name: classifier
      type: MODEL
    name: example
    replicas: 1
```

The key components are:

* A list of **`predictors`**, each with a specification for the number of replicas.
  * Each predictor defines a graph and its set of deployments. Having multiple predictors is useful when you want to split traffic between a main graph and a [canary](https://martinfowler.com/bliki/CanaryRelease.html), or for other production rollout scenarios.
* For each predictor, a **list of `componentSpecs`**. Each `componentSpec` is a Kubernetes `PodTemplateSpec` that Seldon will build into a Kubernetes Deployment. Place here the images from your graph and their requirements, e.g. `Volumes`, `ImagePullSecrets`, resource requests, etc.
* A **`graph`** specification that describes how the components are joined together.

To understand the inference graph definition in detail, see the [Seldon Deployment reference](https://docs.seldon.io/projects/seldon-core/en/latest/reference/seldon-deployment.html).

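For instance, a canary rollout can be expressed as two predictors sharing the traffic; the image tags and traffic percentages below are illustrative, and a sketch rather than a production configuration:

```yaml
apiVersion: machinelearning.seldon.io/v1alpha2
kind: SeldonDeployment
metadata:
  name: seldon-model
spec:
  name: canary-example
  predictors:
  # Main predictor receives most of the traffic.
  - name: main
    traffic: 75
    replicas: 1
    componentSpecs:
    - spec:
        containers:
        - name: classifier
          image: seldonio/mock_classifier:1.0
    graph:
      children: []
      endpoint:
        type: REST
      name: classifier
      type: MODEL
  # Canary predictor receives the remainder, to validate a new image version.
  - name: canary
    traffic: 25
    replicas: 1
    componentSpecs:
    - spec:
        containers:
        - name: classifier
          image: seldonio/mock_classifier:1.1
    graph:
      children: []
      endpoint:
        type: REST
      name: classifier
      type: MODEL
```
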
### 3. Deploy the model to the Kubernetes cluster

Once the inference graph is written as a JSON or YAML `SeldonDeployment` resource, you can deploy it to the Kubernetes cluster:

```bash
kubectl apply -f my_deployment.yaml
```

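Once deployed (and exposed through your cluster's ingress gateway), the model can be queried over Seldon's REST prediction API. The gateway host below is an assumption for your installation; the helper only builds the request, so the actual call is left commented out:

```python
import json
import urllib.request

# Hypothetical ingress address -- replace with your cluster's gateway host/port.
HOST = "http://localhost:8003"

def prediction_request(deployment, namespace, ndarray):
    """Build the URL and JSON body for Seldon's REST prediction endpoint."""
    url = f"{HOST}/seldon/{namespace}/{deployment}/api/v1.0/predictions"
    payload = {"data": {"ndarray": ndarray}}
    return url, json.dumps(payload).encode()

url, body = prediction_request("seldon-model", "default", [[1.0, 2.0]])
# To actually send it (requires a running cluster and exposed gateway):
# req = urllib.request.Request(url, data=body,
#                              headers={"Content-Type": "application/json"})
# print(urllib.request.urlopen(req).read())
```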
To manage (or delete) your `SeldonDeployment`, use `kubectl` with the `SeldonDeployment` custom resource. For example, to list the deployed models:

```bash
kubectl get seldondeployment
```

To delete the model `seldon-model`:

```bash
kubectl delete seldondeployment seldon-model
```