Commit 27b7d2a

Update docs and pipeline status badge (#303)
* docs
* fix pipeline status badge and tf naming uniqueness
* add a note about how to change the name of the pipeline
* extra clarification on workspace connection
1 parent 6d02555 commit 27b7d2a

9 files changed, +36 -89 lines changed

README.md

+3 -1
@@ -11,7 +11,9 @@ description: "Code which demonstrates how to set up and operationalize an MLOps

 # MLOps with Azure ML

-[![Build Status](https://aidemos.visualstudio.com/MLOps/_apis/build/status/microsoft.MLOpsPython?branchName=master)](https://aidemos.visualstudio.com/MLOps/_build/latest?definitionId=151&branchName=master)
+CI: [![Build Status](https://aidemos.visualstudio.com/MLOps/_apis/build/status/Model-Train-Register-CI?branchName=master)](https://aidemos.visualstudio.com/MLOps/_build/latest?definitionId=160&branchName=master)
+
+CD: [![Build Status](https://aidemos.visualstudio.com/MLOps/_apis/build/status/microsoft.MLOpsPython-CD?branchName=master)](https://aidemos.visualstudio.com/MLOps/_build/latest?definitionId=161&branchName=master)

 MLOps will help you to understand how to build a Continuous Integration and Continuous Delivery pipeline for an ML/AI project. We will be using the Azure DevOps Project for build and release/deployment pipelines along with Azure ML services for model retraining pipeline, model management and operationalization.

bootstrap/bootstrap.py

-1
@@ -98,7 +98,6 @@ def replace_project_name(project_dir, project_name, rename_name):
 r"ml_service/pipelines/diabetes_regression_build_train_pipeline_with_r_on_dbricks.py",  # NOQA: E501
 r"ml_service/pipelines/diabetes_regression_build_train_pipeline_with_r.py",  # NOQA: E501
 r"ml_service/pipelines/diabetes_regression_build_train_pipeline.py",  # NOQA: E501
-r"ml_service/pipelines/diabetes_regression_verify_train_pipeline.py",  # NOQA: E501
 r"ml_service/util/create_scoring_image.py",
 r"diabetes_regression/conda_dependencies.yml",
 r"diabetes_regression/evaluate/evaluate_model.py",

data/README.md

+3
@@ -0,0 +1,3 @@
+This folder is used for example data, and it is not meant to be used for storing training data.
+
+Follow steps to [Configure Training Data]('docs/custom_model.md#configure-training-data.md') to use your own data for training.

docs/code_description.md

+5 -3
@@ -8,7 +8,7 @@ High level directory structure for this repository:
 ├── .pipelines <- Azure DevOps YAML pipelines for CI, PR and model training and deployment.
 ├── bootstrap <- Python script to initialize this repository with a custom project name.
 ├── charts <- Helm charts to deploy resources on Azure Kubernetes Service(AKS).
-├── data <- Initial set of data to train and evaluate model.
+├── data <- Initial set of data to train and evaluate model. Not for use to store data.
 ├── diabetes_regression <- The top-level folder for the ML project.
 │ ├── evaluate <- Python script to evaluate trained ML model.
 │ ├── register <- Python script to register trained ML model with Azure Machine Learning Service.
@@ -52,7 +52,10 @@ The repository provides a template with folders structure suitable for maintaini
 - `.pipelines/code-quality-template.yml` : a pipeline template used by the CI and PR pipelines. It contains steps performing linting, data and unit testing.
 - `.pipelines/diabetes_regression-ci-image.yml` : a pipeline building a scoring image for the diabetes regression model.
 - `.pipelines/diabetes_regression-ci.yml` : a pipeline triggered when the code is merged into **master**. It performs linting, data integrity testing, unit testing, building and publishing an ML pipeline.
-- `.pipelines/diabetes_regression-get-model-version-template.yml` : a pipeline template used by the `.pipelines/diabetes_regression-ci.yml` pipeline. It finds out if a new model was registered and retrieves a version of the new model.
+- `.pipelines/diabetes_regression-cd.yml` : a pipeline triggered when the code is merged into **master** and the `.pipelines/diabetes_regression-ci.yml` completes. It performs linting, data integrity testing, unit testing, building and publishing an ML pipeline.
+- `.pipelines/diabetes_regression-package-model-template.yml` : a pipeline triggered when the code is merged into **master**. It deploys the registered model to a target.
+- `.pipelines/diabetes_regression-get-model-id-artifact-template.yml` : a pipeline template used by the `.pipelines/diabetes_regression-cd.yml` pipeline. It takes the model metadata artifact published by the previous pipeline and gets the model ID.
+- `.pipelines/diabetes_regression-publish-model-artifact-template.yml` : a pipeline template used by the `.pipelines/diabetes_regression-ci.yml` pipeline. It finds out if a new model was registered and publishes a pipeline artifact containing the model metadata.
 - `.pipelines/helm-*.yml` : pipeline templates used by the `.pipelines/abtest.yml` pipeline.
 - `.pipelines/pr.yml` : a pipeline triggered when a **pull request** to the **master** branch is created. It performs linting, data integrity testing and unit testing only.

@@ -62,7 +65,6 @@ The repository provides a template with folders structure suitable for maintaini
 - `ml_service/pipelines/diabetes_regression_build_train_pipeline_with_r.py` : builds and publishes an ML training pipeline. It uses R on ML Compute.
 - `ml_service/pipelines/diabetes_regression_build_train_pipeline_with_r_on_dbricks.py` : builds and publishes an ML training pipeline. It uses R on Databricks Compute.
 - `ml_service/pipelines/run_train_pipeline.py` : invokes a published ML training pipeline (Python on ML Compute) via REST API.
-- `ml_service/pipelines/diabetes_regression_verify_train_pipeline.py` : determines whether the evaluate_model.py step of the training pipeline registered a new model.
 - `ml_service/util` : contains common utility functions used to build and publish an ML training pipeline.

 ### Environment Definitions

docs/custom_container.md

+5 -1
@@ -61,7 +61,11 @@ Edit the [environment_setup/docker-image-pipeline.yml](../environment_setup/dock
 and modify the string `'public/mlops/python'` with an name suitable to describe your environment,
 e.g. `'mlops/diabetes_regression'`.

-Save and run the pipeline. This will build and push a container image to your Azure Container Registry with
+Save and run the pipeline, making sure to set the these runtime variables: `amlsdkversion` and `githubrelease`. The values are up to you to set depending on your environment. These will show as tags on your image.
+
+![Custom Container Vars](./images/custom-container-variables.png)
+
+This will build and push a container image to your Azure Container Registry with
 the name you have just edited. The next step is to modify the build pipeline to run the CI job on a container
 run from that image.

docs/getting_started.md

+19 -3
@@ -73,7 +73,7 @@ More variables are available for further tweaking, but the above variables are a

 ### Variable Descriptions

-**BASE_NAME** is used as a prefix for naming Azure resources. When sharing an Azure subscription, the prefix allows you to avoid naming collisions for resources that require unique names, for example, Azure Blob Storage and Registry DNS. Make sure to set BASE_NAME to a unique name so that created resources will have unique names, for example, MyUniqueMLamlcr, MyUniqueML-AML-KV, and so on. The length of the BASE_NAME value shouldn't exceed 10 characters and must contain letters and numbers only.
+**BASE_NAME** is used as a prefix for naming Azure resources and should be unique. When sharing an Azure subscription, the prefix allows you to avoid naming collisions for resources that require unique names, for example, Azure Blob Storage and Registry DNS. Make sure to set BASE_NAME to a unique name so that created resources will have unique names, for example, MyUniqueMLamlcr, MyUniqueML-AML-KV, and so on. The length of the BASE_NAME value shouldn't exceed 10 characters and must contain letters and numbers only.

 **LOCATION** is the name of the [Azure location](https://azure.microsoft.com/en-us/global-infrastructure/locations/) for your resources. There should be no spaces in the name. For example, central, westus, westus2.
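The two BASE_NAME rules stated in this hunk (at most 10 characters, letters and numbers only) could be checked before the variable group is saved. A minimal sketch, assuming a hypothetical helper `is_valid_base_name` that is not part of the repository and encodes only those two documented rules:

```python
import re

# Hypothetical helper, not part of the repository.
# Encodes only the documented rules: 1 to 10 characters, letters and digits only.
_BASE_NAME_RE = re.compile(r"[A-Za-z0-9]{1,10}")


def is_valid_base_name(name: str) -> bool:
    """Return True if `name` satisfies the documented BASE_NAME rules."""
    return _BASE_NAME_RE.fullmatch(name) is not None
```

Uniqueness across a shared subscription cannot be verified locally, so this check covers the format only.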

@@ -133,7 +133,7 @@ Check that the newly created resources appear in the [Azure Portal](https://port

 At this point, you should have an Azure ML Workspace created. Similar to the Azure Resource Manager service connection, you need to create an additional one for the Azure ML Workspace.

-Create a new service connection to your Azure ML Workspace using the [Machine Learning Extension](https://marketplace.visualstudio.com/items?itemName=ms-air-aiagility.vss-services-azureml) instructions to enable executing the Azure ML training pipeline. The connection name needs to match `WORKSPACE_SVC_CONNECTION` that you set in the variable group above.
+Create a new service connection to your Azure ML Workspace using the [Machine Learning Extension](https://marketplace.visualstudio.com/items?itemName=ms-air-aiagility.vss-services-azureml) instructions to enable executing the Azure ML training pipeline. The connection name needs to match `WORKSPACE_SVC_CONNECTION` that you set in the variable group above (eg. 'aml-workspace-connection').

 ![Created resources](./images/ml-ws-svc-connection.png)

@@ -213,9 +213,25 @@ In order to use these pipelines:

 These pipelines rely on the model CI pipeline and reference it by name.

+If you would like to change the name of your model CI pipeline, you must edit this section of yml for the CD and batch scoring pipeline, where it says `source: Model-Train-Register-CI` to use your own name.
+```
+trigger: none
+resources:
+  containers:
+  - container: mlops
+    image: mcr.microsoft.com/mlops/python:latest
+  pipelines:
+  - pipeline: model-train-ci
+    source: Model-Train-Register-CI # Name of the triggering pipeline
+    trigger:
+      branches:
+        include:
+        - master
+```
+
 ---

-These pipelines have the following behaviors:
+The release deployment and batch scoring pipelines have the following behaviors:

 - The pipeline will **automatically trigger** on completion of the Model-Train-Register-CI pipeline for the master branch.
 - The pipeline will default to using the latest successful build of the Model-Train-Register-CI pipeline. It will deploy the model produced by that build.
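The rename described in this hunk has to be applied to the `source:` entry of each dependent pipeline file. A minimal sketch of automating that edit on the YAML text, assuming a hypothetical helper `rename_ci_source` (not part of the repository) and plain text substitution rather than a YAML parser, so that comments and indentation are left untouched:

```python
import re


def rename_ci_source(yaml_text: str, old_name: str, new_name: str) -> str:
    """Replace `source: <old_name>` with `source: <new_name>`.

    Hypothetical helper: only lines whose key is `source` and whose value is
    exactly `old_name` are changed; indentation and trailing comments survive.
    """
    pattern = re.compile(
        r"^([ \t]*source:[ \t]*)" + re.escape(old_name) + r"(?=\s|$)",
        flags=re.MULTILINE,
    )
    # A function replacement avoids any backslash interpretation in new_name.
    return pattern.sub(lambda m: m.group(1) + new_name, yaml_text)
```

The helper could be run over the text of the CD and batch scoring pipeline files; which files those are is left to the caller, since this commit does not list their paths.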
Binary file (15.8 KB)

environment_setup/iac-create-environment-pipeline-tf.yml

+1 -1
@@ -37,7 +37,7 @@ steps:
 ensureBackend: true
 backendAzureRmResourceGroupLocation: $(LOCATION)
 backendAzureRmResourceGroupName: $(RESOURCE_GROUP)
-backendAzureRmStorageAccountName: 'statestor'
+backendAzureRmStorageAccountName: '$(BASE_NAME)statestor'
 backendAzureRmStorageAccountSku: 'Standard_LRS'
 backendAzureRmContainerName: 'tfstate-cont'
 backendAzureRmKey: 'mlopsinfra.tfstate'
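With this change the Terraform state storage account name is built from BASE_NAME plus the `statestor` suffix. Azure storage account names must be 3 to 24 characters long and contain only lowercase letters and digits, so a BASE_NAME with uppercase letters would produce a name that Azure rejects. A minimal sketch of that check, assuming a hypothetical helper `state_storage_account_name` that is not part of the repository:

```python
import re

# Suffix taken from the pipeline YAML changed in this commit.
STATE_SUFFIX = "statestor"

# Azure storage account naming rule: 3-24 characters, lowercase letters and digits.
_STORAGE_NAME_RE = re.compile(r"[a-z0-9]{3,24}")


def state_storage_account_name(base_name: str) -> str:
    """Build the state storage account name and validate it (hypothetical helper)."""
    name = f"{base_name}{STATE_SUFFIX}"
    if _STORAGE_NAME_RE.fullmatch(name) is None:
        raise ValueError(f"invalid storage account name: {name!r}")
    return name
```

A BASE_NAME of up to 10 characters keeps the combined name within the 24-character limit; only the character set needs attention.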

ml_service/pipelines/diabetes_regression_verify_train_pipeline.py

-79
This file was deleted.
