Commit 17c67f3 (1 parent: f0477e8)

Update distillation job with sdk example (Azure#3440)

* Update distillation job with sdk example
* Update cell type
* Minor edit
* Change import name
* Add temp file
* Update chat completion notebook
* Add additional steps to math notebook
* Add cli examples
* Add all cli examples
* Finalize cli examples
* Update nlu_qa notebooks
* Update the inference messages
* Reformat
* Address comment
* Update mslearn readme

27 files changed: +3775 −1235 lines
@@ -0,0 +1,33 @@
# Distillation with CLI (Conversation)

## 1. Create the Job

Ensure you have the proper setup:

1. Run `az version` and confirm the `ml` extension is installed. The `ml` version should be greater than or equal to 2.32.0.
2. If the `ml` extension is not installed, run `az extension add -n ml`.
Run the distillation CLI command, pointing at the YAML file in this folder, and fill in the required Azure ML IDs:

```text
az ml job create --file distillation_conversation.yaml --workspace-name [YOUR_AZURE_WORKSPACE] --resource-group [YOUR_AZURE_RESOURCE_GROUP] --subscription [YOUR_AZURE_SUBSCRIPTION]
```
**Note:** To see how the train and validation files were created, see section 2 of this [notebook](/sdk/python/foundation-models/system/distillation/conversation/distillation_conversational_task.ipynb).
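For orientation, records in the training and validation JSONL files follow a chat-style `messages` layout. The sketch below writes two purely hypothetical records in that shape; the actual field contents used by this job come from the linked notebook, not from this example.

```python
import json

# Hypothetical conversation records in a chat-style "messages" layout.
# These are illustrative placeholders, not the data used by this job.
records = [
    {
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "What is the capital of France?"},
            {"role": "assistant", "content": "The capital of France is Paris."},
        ]
    },
    {
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Name a primary color."},
            {"role": "assistant", "content": "Red is a primary color."},
        ]
    },
]

# JSONL: one JSON object per line.
with open("train_conversation.jsonl", "w", encoding="utf-8") as f:
    for record in records:
        f.write(json.dumps(record) + "\n")
```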
## 2. Deploy to Endpoint

Once the distilled model is ready, you can deploy it through the UI or the CLI.

### UI Deployment

1. Navigate to the `model` tab in [ml studio](https://ml.azure.com), or to the `Finetuning` tab in the [ai platform](https://ai.azure.com).
2. In ml studio, locate the model using the `name` of the `registered_model` in the YAML file used to create this job, then select Deploy to create a serverless endpoint. In the ai platform, search for the name of the job (in this example, `Distillation-conversation-llama`), click that name, and select Deploy to create a serverless endpoint.
### CLI Deployment

Fill out the serverless_endpoint.yaml file in this folder. To find the necessary information:

1. Navigate to the `model` tab in [ml studio](https://ml.azure.com).
2. Select the model whose `name` matches the `registered_model` name in the YAML file used to create this job. In this example, the name to use is `llama-conversation-distilled`.
3. Use the model's `asset_id` to fill out the `model_id` in the serverless_endpoint.yaml.

With the information filled out, run the command:

```text
az ml serverless-endpoint create -f serverless_endpoint.yaml
```
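After the endpoint is created, it can be called over REST with a chat-completions payload. The sketch below only assembles the request and does not send it; the endpoint URL is a made-up placeholder, and the `api-key` header scheme is an assumption to verify on your endpoint's consumption details page (some deployments expect an `Authorization: Bearer` header instead).

```python
import json

def build_chat_request(endpoint_url, api_key, messages, max_tokens=256):
    """Assemble (url, headers, body) for a chat-completions call.

    The "api-key" header is an assumption; check your endpoint's
    consumption details for the exact authentication header.
    """
    headers = {
        "Content-Type": "application/json",
        "api-key": api_key,  # some endpoints use "Authorization": f"Bearer {api_key}"
    }
    body = json.dumps({"messages": messages, "max_tokens": max_tokens})
    return endpoint_url, headers, body

# Hypothetical endpoint URL and key, for illustration only.
url, headers, body = build_chat_request(
    "https://llama-conversation-distilled.example.inference.ml.azure.com/chat/completions",
    "YOUR_API_KEY",
    [{"role": "user", "content": "Summarize this conversation."}],
)
# To send: urllib.request.Request(url, data=body.encode(), headers=headers, method="POST")
```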
@@ -0,0 +1,55 @@
distillation_conversation.yaml:

```yaml
type: distillation

name: "Distillation-conversation-llama"
description: "Distill student model using a teacher model"
experiment_name: "Distillation-Conversation"

# Data Generation Properties
data_generation_type: label_generation
data_generation_task_type: conversation

# Input data
training_data:
  type: uri_file
  path: ./train_conversation.jsonl
validation_data:
  type: uri_file
  path: ./validation_conversation.jsonl

# Teacher model serverless endpoint information
# REPLACE WITH YOUR ENDPOINT INFORMATION
teacher_model_endpoint_connection:
  type: serverless
  name: Meta-Llama-3-1-405B-Instruct-vkn
  endpoint: https://Meta-Llama-3-1-405B-Instruct-vkn.westus3.models.ai.azure.com/chat/completions
  api_key: EXAMPLE_API_KEY

# Model ID
student_model: azureml://registries/azureml-meta/models/Meta-Llama-3.1-8B-Instruct/versions/2

# Output distilled model
outputs:
  registered_model:
    type: mlflow_model
    name: llama-conversation-distilled

# Teacher model related properties (OPTIONAL)
teacher_model_settings:
  inference_parameters:
    temperature: 0.1
    max_tokens: 100
    top_p: 0.95
  endpoint_request_settings:
    request_batch_size: 10
    min_endpoint_success_ratio: 0.7

# For finetuning (OPTIONAL)
hyperparameters:
  learning_rate_multiplier: "0.2"
  n_epochs: "5"
  batch_size: "2"

# Resource for Data Generation Step (OPTIONAL)
resources:
  instance_type: Standard_D2_v2
```
@@ -0,0 +1,2 @@
serverless_endpoint.yaml:

```yaml
name: llama-conversation-distilled
model_id: azureml://locations/{AI_PROJECT_LOCATION}/workspaces/{WORKSPACE_ID}/models/llama-conversation-distilled/versions/{VERSION}
```
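The `{AI_PROJECT_LOCATION}`, `{WORKSPACE_ID}`, and `{VERSION}` placeholders must be replaced with values from your own workspace before running the create command. A minimal sketch of the substitution, using purely hypothetical values:

```python
# Template from serverless_endpoint.yaml; braces mark values you must supply.
MODEL_ID_TEMPLATE = (
    "azureml://locations/{AI_PROJECT_LOCATION}/workspaces/{WORKSPACE_ID}"
    "/models/llama-conversation-distilled/versions/{VERSION}"
)

# Hypothetical values for illustration only; use your workspace's actual
# location, workspace ID, and registered model version.
model_id = MODEL_ID_TEMPLATE.format(
    AI_PROJECT_LOCATION="westus3",
    WORKSPACE_ID="00000000-0000-0000-0000-000000000000",
    VERSION="1",
)
print(model_id)
```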

cli/foundation-models/system/distillation/conversation/train_conversation.jsonl (+100 lines; large diff not rendered)

cli/foundation-models/system/distillation/conversation/validation_conversation.jsonl (+100 lines; large diff not rendered)
@@ -0,0 +1,32 @@
# Distillation with CLI (Math)

## 1. Create the Job

Ensure you have the proper setup:

1. Run `az version` and confirm the `ml` extension is installed. The `ml` version should be greater than or equal to 2.32.0.
2. If the `ml` extension is not installed, run `az extension add -n ml`.
Run the distillation CLI command, pointing at the YAML file in this folder, and fill in the required Azure ML IDs:

```text
az ml job create --file distillation_math.yaml --workspace-name [YOUR_AZURE_WORKSPACE] --resource-group [YOUR_AZURE_RESOURCE_GROUP] --subscription [YOUR_AZURE_SUBSCRIPTION]
```
**Note:** To see how the train and validation files were created, see section 2 of this [notebook](/sdk/python/foundation-models/system/distillation/math/distillation_math.ipynb).
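Before submitting the job, it can help to sanity-check that the train and validation files are valid JSONL with the expected `messages` field. A minimal check, assuming the chat-style record layout described in the linked notebook (the sample file and its contents below are hypothetical):

```python
import json

def check_jsonl(path):
    """Return the number of records, raising if any line is not valid JSON
    or lacks a non-empty "messages" list."""
    count = 0
    with open(path, encoding="utf-8") as f:
        for lineno, line in enumerate(f, start=1):
            if not line.strip():
                continue  # tolerate trailing blank lines
            record = json.loads(line)  # raises on malformed JSON
            messages = record.get("messages")
            if not isinstance(messages, list) or not messages:
                raise ValueError(f"line {lineno}: missing 'messages' list")
            count += 1
    return count

# Example with a tiny temporary file (hypothetical data):
with open("sample_math.jsonl", "w", encoding="utf-8") as f:
    f.write(json.dumps({"messages": [{"role": "user", "content": "What is 2 + 2?"}]}) + "\n")
print(check_jsonl("sample_math.jsonl"))
```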
## 2. Deploy to Endpoint

Once the distilled model is ready, you can deploy it through the UI or the CLI.

### UI Deployment

1. Navigate to the `model` tab in [ml studio](https://ml.azure.com), or to the `Finetuning` tab in the [ai platform](https://ai.azure.com).
2. In ml studio, locate the model using the `name` of the `registered_model` in the YAML file used to create this job, then select Deploy to create a serverless endpoint. In the ai platform, search for the name of the job (in this example, `Distillation-math-llama`), click that name, and select Deploy to create a serverless endpoint.
### CLI Deployment

Fill out the serverless_endpoint.yaml file in this folder. To find the necessary information:

1. Navigate to the `model` tab in [ml studio](https://ml.azure.com).
2. Select the model whose `name` matches the `registered_model` name in the YAML file used to create this job. In this example, the name to use is `llama-math-distilled`.
3. Use the model's `asset_id` to fill out the `model_id` in the serverless_endpoint.yaml.

With the information filled out, run the command:

```text
az ml serverless-endpoint create -f serverless_endpoint.yaml
```
@@ -0,0 +1,59 @@
distillation_math.yaml:

```yaml
type: distillation

name: "Distillation-math-llama"
description: "Distill student model using a teacher model"
experiment_name: "Distillation-Math"

# Data Generation Properties
data_generation_type: label_generation
data_generation_task_type: math

# Input data
training_data:
  type: uri_file
  path: ./train_math.jsonl
validation_data:
  type: uri_file
  path: ./validation_math.jsonl

# Teacher model serverless endpoint information
# REPLACE WITH YOUR ENDPOINT INFORMATION
teacher_model_endpoint_connection:
  type: serverless
  name: Meta-Llama-3-1-405B-Instruct-vkn
  endpoint: https://Meta-Llama-3-1-405B-Instruct-vkn.westus3.models.ai.azure.com/chat/completions
  api_key: EXAMPLE_API_KEY

# Model ID
student_model: azureml://registries/azureml-meta/models/Meta-Llama-3.1-8B-Instruct/versions/2

# Output distilled model
outputs:
  registered_model:
    type: mlflow_model
    name: llama-math-distilled

# Teacher model related properties (OPTIONAL)
teacher_model_settings:
  inference_parameters:
    temperature: 0.1
    max_tokens: 1024
    top_p: 0.95
  endpoint_request_settings:
    request_batch_size: 10
    min_endpoint_success_ratio: 0.7

# System prompt settings (OPTIONAL)
prompt_settings:
  enable_chain_of_thought: true

# For finetuning (OPTIONAL)
hyperparameters:
  learning_rate_multiplier: "0.2"
  n_epochs: "5"
  batch_size: "2"

# Resource for Data Generation Step (OPTIONAL)
resources:
  instance_type: Standard_D2_v2
```
@@ -0,0 +1,2 @@
serverless_endpoint.yaml:

```yaml
name: llama-math-distilled
model_id: azureml://locations/{AI_PROJECT_LOCATION}/workspaces/{WORKSPACE_ID}/models/llama-math-distilled/versions/{VERSION}
```
