
update monitoring samples (Azure#2970)
* update monitoring samples

* remove sensitive information, make general for the public

* reformat python

* small changes

* small change data column names

* missed one
ahughes-msft authored Jan 31, 2024
1 parent 7346ebe commit 2d22516
Showing 15 changed files with 1,591 additions and 82 deletions.
24 changes: 15 additions & 9 deletions cli/monitoring/README.md
@@ -1,24 +1,28 @@
# AzureML Model Monitoring

**AzureML model monitoring** enables you to track the performance of your models from a data science perspective whilst in production. This directory contains YAML configuration samples for different scenarios you may encounter when trying to monitor your models. Comprehensive documentation on model monitoring, its capabilities, and a list of supported signals & metrics can be found [here](https://learn.microsoft.com/en-us/azure/machine-learning/concept-model-monitoring?view=azureml-api-2).
**AzureML model monitoring** enables you to track the performance of your models from a data science perspective whilst in production. This directory contains YAML configuration samples and an E2E notebook with a Python SDK monitoring configuration for different scenarios you may encounter when trying to monitor your models. Comprehensive documentation on model monitoring, its capabilities, and a list of supported signals & metrics can be found [here](https://learn.microsoft.com/en-us/azure/machine-learning/concept-model-monitoring?view=azureml-api-2).

> **Note**: For monitoring your models deployed with AzureML online endpoints (kubernetes or online), you can use **Model Data Collector (MDC)** to collect production inference data from your deployed model with ease. Documentation for data collection can be found [here](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-collect-production-data?view=azureml-api-2&tabs=azure-cli).
> **Note**: Comprehensive configuration schema information can be found [here](https://learn.microsoft.com/en-us/azure/machine-learning/reference-yaml-monitor?view=azureml-api-2).
## End-to-end monitoring example

Please see the notebook in the [azureml-e2e-model-monitoring/notebooks](azureml-e2e-model-monitoring/notebooks/model-monitoring-e2e.ipynb) folder to try out **AzureML model monitoring** end to end.

## Scenario coverage

**AzureML model monitoring** supports multiple different scenarios so you can monitor your models regardless of your deployment approach. Below, we detail each scenario and the necessary steps to configure your model monitor in each specific case.

### 1. Deploy model with AzureML online endpoints; out-of-box configuration
### 1. Deploy model with AzureML online endpoints; out-of-box monitoring configuration

In this scenario, you have deployed your model to AzureML online endpoints (managed or kubernetes). You have enabled production inference data collection (documentation for it can be found [here](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-collect-production-data?view=azureml-api-2&tabs=azure-cli)) for your deployment. With the `out-of-box-monitoring.yaml`, you can create a model monitor with the default signals (data drift, prediction drift, data quality), metrics, and thresholds - all of which can be adjusted later.

Schedule your model monitor with this command: `az ml schedule create -f out-of-box-monitoring.yaml`
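
For orientation, a minimal out-of-box configuration looks roughly like the sketch below. The schedule name is hypothetical; the trigger, compute, and monitoring target values mirror the samples elsewhere in this commit, and omitting `monitoring_signals` is what makes it "out-of-box" (the default signals, metrics, and thresholds are applied).

```yaml
# Sketch only - see out-of-box-monitoring.yaml for the actual sample.
name: credit_default_model_monitoring   # hypothetical schedule name

trigger:
  type: recurrence
  frequency: day
  interval: 1
  schedule:
    hours: 3
    minutes: 15   # run daily at 3:15am

create_monitor:
  compute:
    instance_type: standard_e4s_v3
    runtime_version: "3.3"
  monitoring_target:
    ml_task: classification
    endpoint_deployment_id: azureml:credit-default:main
  # no monitoring_signals block: data drift, prediction drift, and data quality
  # are monitored with default metrics and thresholds
```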

### 2. Deploy model with AzureML online endpoints; advanced configuration with feature importance
### 2. Deploy model with AzureML online endpoints; advanced monitoring configuration with feature importance

In this scenario, you have deployed your model to AzureML online endpoints (managed or kubernetes). You have enabled production inference data collection (documentation for it can be found [here](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-collect-production-data?view=azureml-api-2&tabs=azure-cli)) for your deployment. With the `advanced-model-monitoring.yaml`, you can create a model monitor with configurable signals, metrics, and thresholds. You can adjust the configuration to only monitor for the signals (data drift, prediction drift, data quality) and respective metrics/thresholds you are interested in monitoring for. The provided sample also determines the most important features and only computes the metrics for those features (feature importance).
In this scenario, you have deployed your model to AzureML online endpoints (managed or kubernetes). You have enabled production inference data collection (documentation for it can be found [here](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-collect-production-data?view=azureml-api-2&tabs=azure-cli)) for your deployment. With the `advanced-model-monitoring.yaml`, you can create a model monitor with configurable signals, metrics, and thresholds. You can adjust the configuration to monitor only the signals (data drift, prediction drift, data quality) and the respective metrics/thresholds you are interested in. The provided sample also determines the most important features and computes the metrics only for those features (feature importance). You can adjust the details in the configuration file to suit the needs of your scenario.

Schedule your model monitor with this command: `az ml schedule create -f advanced-model-monitoring.yaml`

@@ -28,16 +32,18 @@ In this scenario, you are interested in continuously monitoring your deployed flow

Schedule your model monitor with this command: `az ml schedule create -f generation-safety-quality-monitoring.yaml`

### 4. Deploy model with AzureML batch endpoints, AKS with CLI v1, or outside of AzureML
### 4. Monitoring configuration for non-MDC collected data; deployment with AzureML batch, AKS v1, or outside of AzureML

In this scenario, you can bring your own data to use as input to your monitoring job. As long as this data is maintained and kept up to date as your model is used in production, you can monitor it to extract insights. When you bring your own production data, you need to provide a custom preprocessing component to tabularize the data into MLTable format with a timestamp for each row. First, bring your data to Azure Blob storage and create an AzureML data asset for your model inputs and model outputs.

In this scenario, you can bring your own data to use as input to your monitoring job. When you bring your own production data, you need to provide a custom preprocessing component to get the data into MLTable format for the monitoring job to use. An example custom preprocessing component can be found in the `components/custom_preprocessing` directory. From that directory, you can use the command `az ml component create -f spec.yaml --subscription <subscription_id> --workspace <workspace> --resource-group <resource_group>` to register the component to Azure Machine Learning, which is a required prerequisite.
Next, create a custom preprocessing component to get the data into MLTable format for the monitoring job to use. An example custom preprocessing component can be found in the `components/custom_preprocessing` directory. From that directory, you can use the command `az ml component create -f spec.yaml --subscription <subscription_id> --workspace <workspace> --resource-group <resource_group>` to register the component to Azure Machine Learning, which is a required prerequisite. Ensure that you modify the component code based on how your data is stored.

Schedule your model monitor with this command: `az ml schedule create -f model-monitoring-with-collected-data.yaml --subscription <subscription_id> --workspace <workspace> --resource-group <resource_group>`
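
For reference, the production data input in such a configuration might look roughly like the fragment below; the data asset names and versions are hypothetical, and the `pre_processing_component` reference matches the component registered above.

```yaml
# Illustrative fragment only - see model-monitoring-with-collected-data.yaml for the full sample.
monitoring_signals:
  data_drift_on_collected_data:
    type: data_drift
    production_data:
      input_data:
        path: azureml:my_model_inputs:1        # hypothetical data asset created from your Blob data
        type: uri_folder
      data_context: model_inputs
      pre_processing_component: azureml:production_data_preprocessing:1
    reference_data:
      input_data:
        path: azureml:my_reference_data:1      # hypothetical baseline dataset, e.g. training data
        type: mltable
      data_context: training
    metric_thresholds:
      numerical:
        jensen_shannon_distance: 0.01
      categorical:
        pearsons_chi_squared_test: 0.02
```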

### 5. Create your own custom monitoring signal

In this scenario, you can create your own custom monitoring signal. For example, say you would like to implement your own metric, such as standard deviation. To start, you will need to create and register a custom signal component to Azure Machine Learning. The custom signal component can be found in the `components/custom_signal/` directory. From that directory, you can use the command `az ml component create -f spec.yaml --subscription <subscription_id> --workspace <workspace> --resource-group <resource_group>` to register the component.

Schedule your monitoring job (found in the main `monitoring/` directory) with the following command: `az ml schedule create -f custom-monitoring.yaml --subscription <subscription_id> --workspace <workspace> --resource-group <resource_group>`.

**Note**: The `custom-monitoring.yaml` configuration file uses both a custom preprocessing component and a custom monitoring signal. If you only want to use a custom signal (e.g., your data is being collected with the Model Data Collector (MDC) from an online endpoint), you can remove the custom preprocessing component line and AzureML model monitoring will use the default data preprocessor.
**Note**: The `custom-monitoring.yaml` configuration file uses both a custom preprocessing component and a custom monitoring signal. If you only want to use a custom signal (e.g., your data is being collected with the Model Data Collector (MDC) from an online endpoint), you can remove the custom preprocessing component line (`pre_processing_component: azureml:production_data_preprocessing:1`) and AzureML model monitoring will use the default MDC data preprocessor.
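
For orientation, a custom signal entry in the configuration might look roughly like the sketch below; the component name, version, data asset, metric name, and threshold are all hypothetical.

```yaml
# Illustrative fragment only - see custom-monitoring.yaml for the full sample.
monitoring_signals:
  my_custom_signal:
    type: custom
    component_id: azureml:my_custom_signal:1.0.0   # the custom signal component registered above
    input_data:
      production_data:
        input_data:
          path: azureml:my_production_data:1
          type: uri_folder
        data_context: model_inputs
        pre_processing_component: azureml:production_data_preprocessing:1   # remove this line to use the default MDC preprocessor
    metric_thresholds:
      - metric_name: std_deviation
        threshold: 2.0
```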
27 changes: 15 additions & 12 deletions cli/monitoring/advanced-model-monitoring.yaml
@@ -14,42 +14,44 @@ trigger:
minutes: 15 # at 15 mins after 3am

create_monitor:

compute:
instance_type: standard_e4s_v3
runtime_version: 3.2
runtime_version: "3.3"

monitoring_target:
ml_task: classification
endpoint_deployment_id: azureml:fraud-detection-endpoint:fraud-detection-deployment
endpoint_deployment_id: azureml:credit-default:main

monitoring_signals:
advanced_data_drift: # monitoring signal name, any user defined name works
type: data_drift
# reference_dataset is optional. By default reference_dataset is the production inference data associated with Azure Machine Learning online endpoint
reference_data:
input_data:
path: azureml:my_model_training_data:1 # use training data as comparison reference dataset
path: azureml:credit-reference:1 # use training data as comparison reference dataset
type: mltable
data_context: training
target_column_name: fraud_detected
data_column_names:
target_column: DEFAULT_NEXT_MONTH
features:
top_n_feature_importance: 20 # monitor drift for top 20 features
top_n_feature_importance: 10 # monitor drift for top 10 features
metric_thresholds:
numerical:
jensen_shannon_distance: 0.01
categorical:
pearsons_chi_squared_test: 0.02
advanced_data_quality:
type: data_quality
# reference_dataset is optional. By default referece_dataset is the production inference data associated with Azure Machine Learning online endpoint
# reference_dataset is optional. By default reference_dataset is the production inference data associated with Azure Machine Learning online endpoint
reference_data:
input_data:
path: azureml:my_model_training_data:1
path: azureml:credit-reference:1
type: mltable
data_context: training
features: # monitor data quality for 2 individual features only
- feature_A
- feature_B
- feature_C
- SEX
- EDUCATION
metric_thresholds:
numerical:
null_value_rate: 0.05
@@ -63,10 +65,11 @@ create_monitor:
# Azure Machine Learning model monitoring will automatically join both model_inputs and model_outputs data and use it for computation
reference_data:
input_data:
path: azureml:my_model_training_data:1
path: azureml:credit-reference:1
type: mltable
data_context: training
target_column_name: is_fraud
data_column_names:
target_column: DEFAULT_NEXT_MONTH
metric_thresholds:
normalized_discounted_cumulative_gain: 0.9

138 changes: 138 additions & 0 deletions cli/monitoring/azureml-e2e-model-monitoring/.gitignore
@@ -0,0 +1,138 @@
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
pip-wheel-metadata/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
.python-version

# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock

# PEP 582; used by e.g. github.com/David-OConnor/pyflow
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# Other
.vscode
.DS_Store
.amlignore
*.amltmp
notebooks/.ipynb_checkpoints/
data/reference
data/production
10 changes: 10 additions & 0 deletions cli/monitoring/azureml-e2e-model-monitoring/README.md
@@ -0,0 +1,10 @@
# AzureML E2E Model Monitoring

In this sample notebook, you will walk through the end-to-end lifecycle of the Machine Learning (ML) operationalization process. Follow these steps to train your ML model, deploy it to production, and monitor it to ensure its continued performance:

1) Setup environment
2) Register data assets
3) Train the model
4) Deploy the model
5) Simulate inference requests
6) Monitor the model
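
If you prefer the CLI over the SDK for step 2, a data asset definition for the reference dataset might look roughly like the sketch below; the local path is hypothetical (the repository's `.gitignore` expects it under `data/reference`), and you would register it with `az ml data create -f <file>.yaml`.

```yaml
# Hypothetical data asset definition for the reference (training) dataset.
name: credit-reference
version: "1"
type: mltable
path: ../data/reference   # local folder containing the MLTable file and data
```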
98 changes: 98 additions & 0 deletions cli/monitoring/azureml-e2e-model-monitoring/code/score.py
@@ -0,0 +1,98 @@
"""Script for an azureml online deployment"""
import json
import logging
import os
import uuid
from typing import Dict, List

import mlflow
import pandas as pd
from azureml.ai.monitoring import Collector
from inference_schema.parameter_types.standard_py_parameter_type import (
StandardPythonParameterType,
)
from inference_schema.schema_decorators import input_schema, output_schema

# define global variables
MODEL = None
INPUTS_COLLECTOR = None
OUTPUTS_COLLECTOR = None
INPUTS_OUTPUTS_COLLECTOR = None

INPUT_SAMPLE = [
{
"LIMIT_BAL": 20000,
"SEX": 2,
"EDUCATION": 2,
"MARRIAGE": 1,
"AGE": 24,
"PAY_0": 2,
"PAY_2": 2,
"PAY_3": -1,
"PAY_4": -1,
"PAY_5": -2,
"PAY_6": -2,
"BILL_AMT1": 3913,
"BILL_AMT2": 3102,
"BILL_AMT3": 689,
"BILL_AMT4": 0,
"BILL_AMT5": 0,
"BILL_AMT6": 0,
"PAY_AMT1": 0,
"PAY_AMT2": 689,
"PAY_AMT3": 0,
"PAY_AMT4": 0,
"PAY_AMT5": 0,
"PAY_AMT6": 0,
}
]

# define sample response for inference
OUTPUT_SAMPLE = {"DEFAULT_NEXT_MONTH": [0]}


def init() -> None:
"""Startup event handler to load an MLFLow model."""
global MODEL, INPUTS_COLLECTOR, OUTPUTS_COLLECTOR, INPUTS_OUTPUTS_COLLECTOR

# instantiate collectors
INPUTS_COLLECTOR = Collector(name="model_inputs")
OUTPUTS_COLLECTOR = Collector(name="model_outputs")
INPUTS_OUTPUTS_COLLECTOR = Collector(name="model_inputs_outputs")

# Load MLflow model
MODEL = mlflow.sklearn.load_model(os.getenv("AZUREML_MODEL_DIR") + "/model_output")


@input_schema("data", StandardPythonParameterType(INPUT_SAMPLE))
@output_schema(StandardPythonParameterType(OUTPUT_SAMPLE))
def run(data: List[Dict]) -> str:
"""Perform scoring for every invocation of the endpoint"""

# Convert the request payload into a DataFrame
input_df = pd.DataFrame(data)

# Preprocess payload and get model prediction
model_output = MODEL.predict(input_df).tolist()
output_df = pd.DataFrame(model_output, columns=["DEFAULT_NEXT_MONTH"])

# Make response payload
response_payload = json.dumps({"DEFAULT_NEXT_MONTH": model_output})

# --- Azure ML Data Collection ---

# collect inputs data
context = INPUTS_COLLECTOR.collect(input_df)

# collect outputs data
OUTPUTS_COLLECTOR.collect(output_df, context)

# create a dataframe with inputs/outputs joined - this creates a URI folder (not mltable)
input_output_df = input_df.join(output_df)

# collect both your inputs and output
INPUTS_OUTPUTS_COLLECTOR.collect(input_output_df, context)

# ----------------------------------

return response_payload
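
The collector names used in this script (`model_inputs`, `model_outputs`, `model_inputs_outputs`) must also be enabled on the deployment itself. Below is a hedged sketch of the corresponding `data_collector` section of a managed online deployment YAML; the rest of the deployment definition is omitted.

```yaml
# Fragment of an online deployment definition (sketch); the collection names
# must match the Collector names instantiated in score.py above.
data_collector:
  collections:
    model_inputs:
      enabled: 'True'
    model_outputs:
      enabled: 'True'
    model_inputs_outputs:
      enabled: 'True'
```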
