Mar25/evals/aoai integration #40630

MilesHolland · 2025-04-21T14:34:42Z

WIP PR to integrate AOAI features into the AI SDK:

Add several new classes that serve as handles around grader configurations.
Add logic to hydrate EvaluationConfiguration objects from the azure.ai.projects SDK back into usable grader classes for remote evaluation.
Modify the evaluate method to accept these new classes as part of the evaluators dictionary and handle them separately via OAI SDK API calls. The results are then merged back into the normal evaluation results.
Include a dictionary in uploaded evaluation results that maps user-defined eval names to built-in evaluator enums or grader config IDs.
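
A minimal sketch of the split-and-merge flow described above, not the code in this PR. `GraderHandle`, `split_evaluators_and_graders`, and `merge_results` are hypothetical names introduced only for illustration, and the grader configuration shown is a placeholder. The idea is that entries in the evaluators dictionary are separated by type, each group is run through its own path, and the per-row results are merged afterwards.

```python
from typing import Any, Callable, Dict, List, Tuple, Union


class GraderHandle:
    """Hypothetical stand-in for a grader wrapper class (not the PR's class)."""

    def __init__(self, grader_config: Dict[str, Any]) -> None:
        self.grader_config = grader_config


Evaluator = Callable[..., Dict[str, Any]]


def split_evaluators_and_graders(
    evaluators: Dict[str, Union[Evaluator, GraderHandle]],
) -> Tuple[Dict[str, Evaluator], Dict[str, GraderHandle]]:
    """Separate callable evaluators from grader handles, keeping user-defined names."""
    callables: Dict[str, Evaluator] = {}
    graders: Dict[str, GraderHandle] = {}
    for name, item in evaluators.items():
        if isinstance(item, GraderHandle):
            graders[name] = item
        else:
            callables[name] = item
    return callables, graders


def merge_results(
    local_rows: List[Dict[str, Any]],
    grader_rows: List[Dict[str, Any]],
) -> List[Dict[str, Any]]:
    """Merge the two result sets row by row; both lists must be aligned by row."""
    return [{**local, **remote} for local, remote in zip(local_rows, grader_rows)]


# Illustrative input: one callable evaluator and one grader handle.
evaluators = {
    "length": lambda response: {"length": len(response)},
    "label": GraderHandle({"type": "label_model"}),  # placeholder config
}
callables, graders = split_evaluators_and_graders(evaluators)
```

Keeping the user-defined names as dictionary keys on both sides means the merged rows can still be attributed to the name the caller chose.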

Remaining TODO:

Add support for column mapping for graders.
Support target/generated inputs for graders.
Add changelog details.
Add more testing.
Improve error handling for API call failures.

Copilot

Pull Request Overview

This WIP PR integrates AOAI features into the AI SDK by introducing new grader classes, hydrating EvaluationConfiguration objects into grader configurations, and modifying the evaluation workflow to support both built-in evaluators and AOAI graders. Key changes include:

Adding new AOAI grader classes and related utilities.
Extending evaluation result metadata with a name map and new error targets.
Updating the evaluate method to split and process both callable evaluators and AOAI grader instances.

Reviewed Changes

Copilot reviewed 38 out of 38 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| `sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/autogen/raiclient/*` | Code-generated client and configuration classes updated for both async and sync pipelines. |
| `sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_exceptions.py` | Added AOAI_GRADER as a new error target. |
| `sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_evaluate/_evaluate_aoai.py` | Introduces AOAI evaluation helper functions and polling logic. |
| `sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_evaluate/_evaluate.py` | Modified evaluation workflow to support AOAI grader instances and split evaluators from graders. |
| `sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_aoai/*` | New AOAI grader wrapper classes implemented for different grading strategies. |
| Other files | Supporting changes to versioning, patching, constants, and mapping to integrate AOAI features. |
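
The summary mentions polling logic in `_evaluate_aoai.py`, and the commit history mentions a polling interval and timeout handling. The following is a generic sketch of polling with a deadline, not the PR's implementation. The function name `poll_until_done`, the status strings, and the default interval and timeout are all assumptions made for illustration.

```python
import time
from typing import Callable

# Assumed terminal states; the real service may use different status names.
TERMINAL_STATUSES = {"completed", "failed", "canceled"}


def poll_until_done(
    get_status: Callable[[], str],
    interval_s: float = 5.0,
    timeout_s: float = 600.0,
) -> str:
    """Call get_status repeatedly until it returns a terminal status.

    Raises TimeoutError if no terminal status is seen before the deadline.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"run still '{status}' after {timeout_s} seconds")
        time.sleep(interval_s)


# Usage with a fake status source standing in for a remote API call.
statuses = iter(["queued", "in_progress", "completed"])
final_status = poll_until_done(lambda: next(statuses), interval_s=0.0)
```

Using `time.monotonic()` for the deadline keeps the timeout unaffected by system clock adjustments.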

sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_evaluate/_evaluate_aoai.py

Copilot

Pull Request Overview

This PR integrates AOAI evaluation features into the AI SDK by adding new AOAI grader classes and extending the evaluation pipeline to handle them alongside existing evaluators.

Introduces new classes and methods for AOAI grader integration (e.g. AoaiGrader, LabelGrader, StringCheckGrader, TextSimilarityGrader).
Modifies the evaluation flow in _evaluate modules to support splitting evaluators and graders and merging AOAI evaluation results.
Updates exception handling, constants, and mapping utilities to support AOAI grader identifiers.

Reviewed Changes

Copilot reviewed 38 out of 38 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| `sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/autogen/raiclient/*` | Adds generated asynchronous and synchronous client configuration and client files for evaluation. |
| `sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_exceptions.py` | Adds a new AOAI_GRADER enum value for error targeting. |
| `sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_evaluate/*` | Introduces AOAI evaluation flow via new _evaluate_aoai.py and updates the main _evaluate module to split evaluators vs. graders. |
| `sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_eval_mapping.py` | Adds mapping for built-in evaluator identifiers including AOAI grader IDs. |
| `sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_constants.py` | Adds NAME_MAP property to support mapping evaluator names. |
| `sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_aoai/*` | Adds new AOAI grader wrapper classes (LabelGrader, StringCheckGrader, TextSimilarityGrader, AoaiGrader) for asynchronous evaluation. |
| `sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/__init__.py` | Updates module exports to include AOAI grader classes. |
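
The name map described in this PR (user-defined eval names mapped to built-in evaluator identifiers or grader IDs) might be assembled along these lines. This is only a sketch under assumptions: the classes, the `BUILTIN_IDS` table, the `id` attribute, and the `"custom"` fallback are hypothetical and are not taken from `_eval_mapping.py`.

```python
from typing import Any, Dict


class FakeBuiltinEvaluator:
    """Hypothetical built-in evaluator class."""


class FakeGrader:
    """Hypothetical grader wrapper exposing an identifier."""

    id = "hypothetical_grader_id"


# Hypothetical table from evaluator class to a stable identifier.
BUILTIN_IDS: Dict[type, str] = {FakeBuiltinEvaluator: "builtin_fake"}


def build_name_map(evaluators: Dict[str, Any]) -> Dict[str, str]:
    """Map each user-defined name to a built-in ID, a grader ID, or 'custom'."""
    name_map: Dict[str, str] = {}
    for name, evaluator in evaluators.items():
        builtin_id = BUILTIN_IDS.get(type(evaluator))
        if builtin_id is not None:
            name_map[name] = builtin_id
        elif hasattr(evaluator, "id"):
            name_map[name] = evaluator.id
        else:
            # User-supplied callables have no registered identifier.
            name_map[name] = "custom"
    return name_map


name_map = build_name_map(
    {
        "my_builtin": FakeBuiltinEvaluator(),
        "my_grader": FakeGrader(),
        "my_function": lambda response: {"ok": True},
    }
)
```

Storing such a map next to the uploaded results lets a consumer recover which evaluator or grader produced a column even when the caller chose an arbitrary name.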

sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_evaluate/_evaluate_aoai.py

azure-sdk · 2025-04-21T14:53:34Z

API Change Check

APIView identified API level changes in this PR and created the following API reviews

azure-ai-evaluation

azure-sdk · 2025-04-25T19:23:02Z

API change check

APIView has identified API level changes in this PR and created the following API reviews.

azure-ai-evaluation

MilesHolland added 9 commits March 3, 2025 10:17

add typespec autogen files

7bc5db8

initial integration

82d26bc

Merge branch 'main' into mar25/evals/aoai-integration

af2c3c2

add sub grader classes

09b2b1b

Merge branch 'main' into mar25/evals/aoai-integration

acf86b8

remove extra print statement

89479be

Merge branch 'mar25/evals/aoai-integration' of github.com:MilesHollan…

f3d9c6a

…d/azure-sdk-for-python into mar25/evals/aoai-integration

change polling interval

66ea5ec

add name to id property dictionary

4214b6a

Copilot AI review requested due to automatic review settings April 21, 2025 14:34

MilesHolland requested a review from a team as a code owner April 21, 2025 14:34

github-actions bot added the Evaluation label (Issues related to the client library for Azure AI Evaluation) Apr 21, 2025

Copilot AI reviewed Apr 21, 2025

sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_evaluate/_evaluate_aoai.py (outdated, resolved)

sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_evaluate/_evaluate_aoai.py (outdated, resolved)

Merge branch 'main' into mar25/evals/aoai-integration

0a82aff

nagkumar91 requested a review from Copilot April 21, 2025 14:52

Copilot AI reviewed Apr 21, 2025

sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_evaluate/_evaluate_aoai.py (outdated, resolved)

sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_evaluate/_evaluate_aoai.py (outdated, resolved)

nagkumar91 approved these changes Apr 21, 2025

singankit reviewed Apr 21, 2025

sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_aoai/aoai_grader.py (outdated, resolved)

MilesHolland and others added 11 commits April 22, 2025 19:30

column mapping logic

2437d36

Merge branch 'main' into mar25/evals/aoai-integration

f0bc95a

Add a that maps to pass or fail

498858c

better error handling and timeout logic

d22010b

nits

3000b75

Merge pull request #2 from nagkumar91/pass_fail_mapping_aoai

51fd22a

Add a that maps to pass or fail

Update sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_aoai/a…

9cc66ef

…oai_grader.py Co-authored-by: Nagkumar Arkalgud <[email protected]>

default AOAI version and experimental tag

0e78b80

Merge branch 'mar25/evals/aoai-integration' of github.com:MilesHollan…

0c7870a

…d/azure-sdk-for-python into mar25/evals/aoai-integration

log nits

bde2e30

testing

a7a8c32

MilesHolland added 4 commits April 25, 2025 14:29

Merge branch 'main' into mar25/evals/aoai-integration

f1987ee

recordings

4a13e64

rename graders

09642ed

CL

c56e09e

MilesHolland and others added 2 commits April 25, 2025 16:11

cspell

fa41127

Merge branch 'main' into mar25/evals/aoai-integration

d48527c
