
update monitoring samples (Azure#2970)
* update monitoring samples

* remove sensitive information, make general for the public

* reformat python

* small changes

* small change data column names

* missed one
ahughes-msft authored Jan 31, 2024
1 parent 7346ebe commit 2d22516
Showing 15 changed files with 1,591 additions and 82 deletions.
24 changes: 15 additions & 9 deletions cli/monitoring/README.md
@@ -1,24 +1,28 @@
# AzureML Model Monitoring

**AzureML model monitoring** enables you to track the performance of your models from a data science perspective whilst in production. This directory contains YAML configuration samples for different scenarios you may encounter when trying to monitor your models. Comprehensive documentation on model monitoring, its capabilities, and a list of supported signals & metrics can be found [here](https://learn.microsoft.com/en-us/azure/machine-learning/concept-model-monitoring?view=azureml-api-2).
**AzureML model monitoring** enables you to track the performance of your models from a data science perspective whilst in production. This directory contains YAML configuration samples and an E2E notebook with a Python SDK monitoring configuration for different scenarios you may encounter when trying to monitor your models. Comprehensive documentation on model monitoring, its capabilities, and a list of supported signals & metrics can be found [here](https://learn.microsoft.com/en-us/azure/machine-learning/concept-model-monitoring?view=azureml-api-2).

> **Note**: For monitoring your models deployed with AzureML online endpoints (kubernetes or online), you can use **Model Data Collector (MDC)** to collect production inference data from your deployed model with ease. Documentation for data collection can be found [here](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-collect-production-data?view=azureml-api-2&tabs=azure-cli).
> **Note**: Comprehensive configuration schema information can be found [here](https://learn.microsoft.com/en-us/azure/machine-learning/reference-yaml-monitor?view=azureml-api-2).
## End-to-end monitoring example

Please see the notebook in the [azureml-e2e-model-monitoring/notebooks](azureml-e2e-model-monitoring/notebooks/model-monitoring-e2e.ipynb) folder to try out **AzureML model monitoring** end to end.

## Scenario coverage

**AzureML model monitoring** supports multiple different scenarios so you can monitor your models regardless of your deployment approach. Below, we detail each scenario and the necessary steps to configure your model monitor in each specific case.

### 1. Deploy model with AzureML online endpoints; out-of-box configuration
### 1. Deploy model with AzureML online endpoints; out-of-box monitoring configuration

In this scenario, you have deployed your model to AzureML online endpoints (managed or kubernetes). You have enabled production inference data collection (documentation for it can be found [here](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-collect-production-data?view=azureml-api-2&tabs=azure-cli)) for your deployment. With the `out-of-box-monitoring.yaml`, you can create a model monitor with the default signals (data drift, prediction drift, data quality), metrics, and thresholds - all of which can be adjusted later.

Schedule your model monitor with this command: `az ml schedule create -f out-of-box-monitoring.yaml`
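
For orientation, a minimal out-of-box configuration looks roughly like the sketch below. The schedule name is hypothetical; the trigger, compute, and monitoring target values mirror the samples elsewhere in this commit, and omitting `monitoring_signals` is what makes it "out-of-box" (the default signals, metrics, and thresholds are applied).

```yaml
# Sketch only - see out-of-box-monitoring.yaml for the actual sample.
name: credit_default_model_monitoring   # hypothetical schedule name

trigger:
  type: recurrence
  frequency: day
  interval: 1
  schedule:
    hours: 3
    minutes: 15   # run daily at 3:15am

create_monitor:
  compute:
    instance_type: standard_e4s_v3
    runtime_version: "3.3"
  monitoring_target:
    ml_task: classification
    endpoint_deployment_id: azureml:credit-default:main
  # no monitoring_signals block: data drift, prediction drift, and data quality
  # are monitored with default metrics and thresholds
```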

### 2. Deploy model with AzureML online endpoints; advanced configuration with feature importance
### 2. Deploy model with AzureML online endpoints; advanced monitoring configuration with feature importance

In this scenario, you have deployed your model to AzureML online endpoints (managed or kubernetes). You have enabled production inference data collection (documentation for it can be found [here](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-collect-production-data?view=azureml-api-2&tabs=azure-cli)) for your deployment. With the `advanced-model-monitoring.yaml`, you can create a model monitor with configurable signals, metrics, and thresholds. You can adjust the configuration to only monitor for the signals (data drift, prediction drift, data quality) and respective metrics/thresholds you are interested in monitoring for. The provided sample also determines the most important features and only computes the metrics for those features (feature importance).
In this scenario, you have deployed your model to AzureML online endpoints (managed or kubernetes). You have enabled production inference data collection (documentation for it can be found [here](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-collect-production-data?view=azureml-api-2&tabs=azure-cli)) for your deployment. With the `advanced-model-monitoring.yaml`, you can create a model monitor with configurable signals, metrics, and thresholds. You can adjust the configuration to monitor only the signals (data drift, prediction drift, data quality) and the respective metrics/thresholds you are interested in. The provided sample also determines the most important features and computes the metrics only for those features (feature importance). You can adjust the details in the configuration file to suit the needs of your scenario.

Schedule your model monitor with this command: `az ml schedule create -f advanced-model-monitoring.yaml`

@@ -28,16 +32,18 @@ In this scenario, you are interested in continuously monitoring your deployed flow

Schedule your model monitor with this command: `az ml schedule create -f generation-safety-quality-monitoring.yaml`

### 4. Deploy model with AzureML batch endpoints, AKS with CLI v1, or outside of AzureML
### 4. Monitoring configuration for non-MDC collected data; deployment with AzureML batch, AKS v1, or outside of AzureML

In this scenario, you can bring your own data to use as input to your monitoring job. As long as this data is maintained and kept up to date as your model is used in production, you can monitor it to extract insights. When you bring your own production data, you need to provide a custom preprocessing component to tabularize the data into MLTable format with a timestamp for each row. First, bring your data to Azure Blob storage and create an AzureML data asset for your model inputs and model outputs.

In this scenario, you can bring your own data to use as input to your monitoring job. When you bring your own production data, you need to provide a custom preprocessing component to get the data into MLTable format for the monitoring job to use. An example custom preprocessing component can be found in the `components/custom_preprocessing` directory. From that directory, you can use the command `az ml component create -f spec.yaml --subscription <subscription_id> --workspace <workspace> --resource-group <resource_group>` to register the component to Azure Machine Learning, which is a required prerequisite.
Next, create a custom preprocessing component to get the data into MLTable format for the monitoring job to use. An example custom preprocessing component can be found in the `components/custom_preprocessing` directory. From that directory, you can use the command `az ml component create -f spec.yaml --subscription <subscription_id> --workspace <workspace> --resource-group <resource_group>` to register the component to Azure Machine Learning, which is a required prerequisite. Ensure that you modify the component code based on how your data is stored.

Schedule your model monitor with this command: `az ml schedule create -f model-monitoring-with-collected-data.yaml --subscription <subscription_id> --workspace <workspace> --resource-group <resource_group>`
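
For reference, the production data input in such a configuration might look roughly like the fragment below; the data asset names and versions are hypothetical, and the `pre_processing_component` reference matches the component registered above.

```yaml
# Illustrative fragment only - see model-monitoring-with-collected-data.yaml for the full sample.
monitoring_signals:
  data_drift_on_collected_data:
    type: data_drift
    production_data:
      input_data:
        path: azureml:my_model_inputs:1        # hypothetical data asset created from your Blob data
        type: uri_folder
      data_context: model_inputs
      pre_processing_component: azureml:production_data_preprocessing:1
    reference_data:
      input_data:
        path: azureml:my_reference_data:1      # hypothetical baseline dataset, e.g. training data
        type: mltable
      data_context: training
    metric_thresholds:
      numerical:
        jensen_shannon_distance: 0.01
      categorical:
        pearsons_chi_squared_test: 0.02
```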

### 5. Create your own custom monitoring signal

In this scenario, you can create your own custom monitoring signal. For example, say you would like to implement your own metric, such as standard deviation. To start, you will need to create and register a custom signal component to Azure Machine Learning. The custom signal component can be found in the `components/custom_signal/` directory. From that directory, you can use the command `az ml component create -f spec.yaml --subscription <subscription_id> --workspace <workspace> --resource-group <resource_group>` to register the component.

Schedule your monitoring job (found in the main `monitoring/` directory) with the following command: `az ml schedule create -f custom-monitoring.yaml --subscription <subscription_id> --workspace <workspace> --resource-group <resource_group>`.

**Note**: The `custom-monitoring.yaml` configuration file uses both a custom preprocessing component and a custom monitoring signal. If you only want to use a custom signal (e.g., your data is being collected with the Model Data Collector (MDC) from an online endpoint), you can remove the custom preprocessing component line and AzureML model monitoring will use the default data preprocessor.
**Note**: The `custom-monitoring.yaml` configuration file uses both a custom preprocessing component and a custom monitoring signal. If you only want to use a custom signal (e.g., your data is being collected with the Model Data Collector (MDC) from an online endpoint), you can remove the custom preprocessing component line (`pre_processing_component: azureml:production_data_preprocessing:1`) and AzureML model monitoring will use the default MDC data preprocessor.
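
For orientation, a custom signal entry in the configuration might look roughly like the sketch below; the component name, version, data asset, metric name, and threshold are all hypothetical.

```yaml
# Illustrative fragment only - see custom-monitoring.yaml for the full sample.
monitoring_signals:
  my_custom_signal:
    type: custom
    component_id: azureml:my_custom_signal:1.0.0   # the custom signal component registered above
    input_data:
      production_data:
        input_data:
          path: azureml:my_production_data:1
          type: uri_folder
        data_context: model_inputs
        pre_processing_component: azureml:production_data_preprocessing:1   # remove this line to use the default MDC preprocessor
    metric_thresholds:
      - metric_name: std_deviation
        threshold: 2.0
```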
27 changes: 15 additions & 12 deletions cli/monitoring/advanced-model-monitoring.yaml
@@ -14,42 +14,44 @@ trigger:
minutes: 15 # at 15 mins after 3am

create_monitor:

compute:
instance_type: standard_e4s_v3
runtime_version: 3.2
runtime_version: "3.3"

monitoring_target:
ml_task: classification
endpoint_deployment_id: azureml:fraud-detection-endpoint:fraud-detection-deployment
endpoint_deployment_id: azureml:credit-default:main

monitoring_signals:
advanced_data_drift: # monitoring signal name, any user defined name works
type: data_drift
# reference_dataset is optional. By default reference_dataset is the production inference data associated with Azure Machine Learning online endpoint
reference_data:
input_data:
path: azureml:my_model_training_data:1 # use training data as comparison reference dataset
path: azureml:credit-reference:1 # use training data as comparison reference dataset
type: mltable
data_context: training
target_column_name: fraud_detected
data_column_names:
target_column: DEFAULT_NEXT_MONTH
features:
top_n_feature_importance: 20 # monitor drift for top 20 features
top_n_feature_importance: 10 # monitor drift for top 10 features
metric_thresholds:
numerical:
jensen_shannon_distance: 0.01
categorical:
pearsons_chi_squared_test: 0.02
advanced_data_quality:
type: data_quality
# reference_dataset is optional. By default referece_dataset is the production inference data associated with Azure Machine Learning online endpoint
# reference_dataset is optional. By default reference_dataset is the production inference data associated with Azure Machine Learning online endpoint
reference_data:
input_data:
path: azureml:my_model_training_data:1
path: azureml:credit-reference:1
type: mltable
data_context: training
features: # monitor data quality for 2 individual features only
- feature_A
- feature_B
- feature_C
- SEX
- EDUCATION
metric_thresholds:
numerical:
null_value_rate: 0.05
@@ -63,10 +65,11 @@ create_monitor:
# Azure Machine Learning model monitoring will automatically join both model_inputs and model_outputs data and use it for computation
reference_data:
input_data:
path: azureml:my_model_training_data:1
path: azureml:credit-reference:1
type: mltable
data_context: training
target_column_name: is_fraud
data_column_names:
target_column: DEFAULT_NEXT_MONTH
metric_thresholds:
normalized_discounted_cumulative_gain: 0.9

138 changes: 138 additions & 0 deletions cli/monitoring/azureml-e2e-model-monitoring/.gitignore
@@ -0,0 +1,138 @@
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
pip-wheel-metadata/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
.python-version

# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock

# PEP 582; used by e.g. github.com/David-OConnor/pyflow
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# Other
.vscode
.DS_Store
.amlignore
*.amltmp
notebooks/.ipynb_checkpoints/
data/reference
data/production
10 changes: 10 additions & 0 deletions cli/monitoring/azureml-e2e-model-monitoring/README.md
@@ -0,0 +1,10 @@
# AzureML E2E Model Monitoring

In this sample notebook, you will walk through the end-to-end lifecycle of the Machine Learning (ML) operationalization process. Follow these steps to train your ML model, deploy it to production, and monitor it to ensure its continued performance:

1) Setup environment
2) Register data assets
3) Train the model
4) Deploy the model
5) Simulate inference requests
6) Monitor the model
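
If you prefer the CLI over the SDK for step 2, a data asset definition for the reference dataset might look roughly like the sketch below; the local path is hypothetical (the repository's `.gitignore` expects it under `data/reference`), and you would register it with `az ml data create -f <file>.yaml`.

```yaml
# Hypothetical data asset definition for the reference (training) dataset.
name: credit-reference
version: "1"
type: mltable
path: ../data/reference   # local folder containing the MLTable file and data
```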
98 changes: 98 additions & 0 deletions cli/monitoring/azureml-e2e-model-monitoring/code/score.py
@@ -0,0 +1,98 @@
"""Script for an azureml online deployment"""
import json
import logging
import os
import uuid
from typing import Dict, List

import mlflow
import pandas as pd
from azureml.ai.monitoring import Collector
from inference_schema.parameter_types.standard_py_parameter_type import (
StandardPythonParameterType,
)
from inference_schema.schema_decorators import input_schema, output_schema

# define global variables
MODEL = None
INPUTS_COLLECTOR = None
OUTPUTS_COLLECTOR = None
INPUTS_OUTPUTS_COLLECTOR = None

INPUT_SAMPLE = [
{
"LIMIT_BAL": 20000,
"SEX": 2,
"EDUCATION": 2,
"MARRIAGE": 1,
"AGE": 24,
"PAY_0": 2,
"PAY_2": 2,
"PAY_3": -1,
"PAY_4": -1,
"PAY_5": -2,
"PAY_6": -2,
"BILL_AMT1": 3913,
"BILL_AMT2": 3102,
"BILL_AMT3": 689,
"BILL_AMT4": 0,
"BILL_AMT5": 0,
"BILL_AMT6": 0,
"PAY_AMT1": 0,
"PAY_AMT2": 689,
"PAY_AMT3": 0,
"PAY_AMT4": 0,
"PAY_AMT5": 0,
"PAY_AMT6": 0,
}
]

# define sample response for inference
OUTPUT_SAMPLE = {"DEFAULT_NEXT_MONTH": [0]}


def init() -> None:
"""Startup event handler to load an MLFLow model."""
global MODEL, INPUTS_COLLECTOR, OUTPUTS_COLLECTOR, INPUTS_OUTPUTS_COLLECTOR

# instantiate collectors
INPUTS_COLLECTOR = Collector(name="model_inputs")
OUTPUTS_COLLECTOR = Collector(name="model_outputs")
INPUTS_OUTPUTS_COLLECTOR = Collector(name="model_inputs_outputs")

# Load MLflow model
MODEL = mlflow.sklearn.load_model(os.getenv("AZUREML_MODEL_DIR") + "/model_output")


@input_schema("data", StandardPythonParameterType(INPUT_SAMPLE))
@output_schema(StandardPythonParameterType(OUTPUT_SAMPLE))
def run(data: List[Dict]) -> str:
"""Perform scoring for every invocation of the endpoint"""

# Convert the request payload into a DataFrame
input_df = pd.DataFrame(data)

# Preprocess payload and get model prediction
model_output = MODEL.predict(input_df).tolist()
output_df = pd.DataFrame(model_output, columns=["DEFAULT_NEXT_MONTH"])

# Make response payload
response_payload = json.dumps({"DEFAULT_NEXT_MONTH": model_output})

# --- Azure ML Data Collection ---

# collect inputs data
context = INPUTS_COLLECTOR.collect(input_df)

# collect outputs data
OUTPUTS_COLLECTOR.collect(output_df, context)

# create a dataframe with inputs/outputs joined - this creates a URI folder (not mltable)
input_output_df = input_df.join(output_df)

# collect both your inputs and output
INPUTS_OUTPUTS_COLLECTOR.collect(input_output_df, context)

# ----------------------------------

return response_payload
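
The collector names used in this script (`model_inputs`, `model_outputs`, `model_inputs_outputs`) must also be enabled on the deployment itself. Below is a hedged sketch of the corresponding `data_collector` section of a managed online deployment YAML; the rest of the deployment definition is omitted.

```yaml
# Fragment of an online deployment definition (sketch); the collection names
# must match the Collector names instantiated in score.py above.
data_collector:
  collections:
    model_inputs:
      enabled: 'True'
    model_outputs:
      enabled: 'True'
    model_inputs_outputs:
      enabled: 'True'
```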
