Mar25/evals/aoai integration #40630

Open
wants to merge 27 commits into
base: main
from

Conversation

MilesHolland
Member

WIP PR to integrate AOAI features into the AI SDK:

  • Add several new classes that serve as handles around grader configurations.
  • Add logic to hydrate EvaluationConfiguration objects from the azure.ai.projects SDK back into usable grader classes for remote evaluation.
  • Modify the evaluate method to accept these new classes as part of the evaluators dictionary, and handle them separately via OAI SDK API calls. The results are then merged back into the normal evaluation results.
  • Include a dictionary in uploaded evaluation results that maps user-defined eval names to built-in evaluator enums or grader config IDs.
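
The split-and-merge flow described above can be sketched roughly as follows. This is a minimal illustration, not the PR's implementation: `AoaiGraderStub`, `split_evaluators_and_graders`, and `merge_row_results` are hypothetical names, and the per-row dictionary shape is an assumption.

```python
from typing import Any, Callable, Dict, List, Tuple


class AoaiGraderStub:
    """Hypothetical stand-in for a grader handle; not the SDK class."""

    def __init__(self, grader_config: Dict[str, Any]) -> None:
        self.grader_config = grader_config


def split_evaluators_and_graders(
    evaluators: Dict[str, Any],
) -> Tuple[Dict[str, Callable[..., Any]], Dict[str, AoaiGraderStub]]:
    """Partition a mixed evaluators dictionary by the kind of each value."""
    callables: Dict[str, Callable[..., Any]] = {}
    graders: Dict[str, AoaiGraderStub] = {}
    for name, value in evaluators.items():
        if isinstance(value, AoaiGraderStub):
            graders[name] = value  # would be sent to the remote grading API
        elif callable(value):
            callables[name] = value  # would be run locally as before
        else:
            raise TypeError(f"Evaluator '{name}' is neither callable nor a grader")
    return callables, graders


def merge_row_results(
    local_rows: List[Dict[str, Any]],
    grader_rows: List[Dict[str, Any]],
) -> List[Dict[str, Any]]:
    """Merge per-row results from both paths; assumes identical row order."""
    if len(local_rows) != len(grader_rows):
        raise ValueError("Both result sets must cover the same rows")
    return [{**local, **remote} for local, remote in zip(local_rows, grader_rows)]
```

Checking for grader instances first keeps the existing callable path unchanged, which is presumably why the graders are "handled separately" rather than being wrapped as ordinary callables.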

Remaining TODO:

  • Add support for column mapping for graders
  • Support target/generated inputs for graders
  • Changelog details
  • More testing
  • Better error handling for API call failures

@Copilot Copilot AI review requested due to automatic review settings April 21, 2025 14:34
@MilesHolland MilesHolland requested a review from a team as a code owner April 21, 2025 14:34
@github-actions github-actions bot added the Evaluation Issues related to the client library for Azure AI Evaluation label Apr 21, 2025
Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

This WIP PR integrates AOAI features into the AI SDK by introducing new grader classes, hydrating EvaluationConfiguration objects into grader configurations, and modifying the evaluation workflow to support both built-in evaluators and AOAI graders. Key changes include:

  • Adding new AOAI grader classes and related utilities.
  • Extending evaluation result metadata with a name map and new error targets.
  • Updating the evaluate method to split and process both callable evaluators and AOAI grader instances.
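
As a rough illustration of the name map mentioned above, the sketch below maps user-chosen evaluator names to an identifier for each entry. Everything here is hypothetical: the stand-in classes, the `BUILTIN_IDS` lookup, the identifier strings, and the choice to leave custom evaluators out of the map are assumptions, not details taken from the PR.

```python
from typing import Any, Dict


class FakeRelevanceEvaluator:
    """Hypothetical stand-in for a built-in evaluator class."""

    def __call__(self, *, response: str) -> Dict[str, float]:
        return {"relevance": 1.0}


class FakeGrader:
    """Hypothetical stand-in for a grader handle that carries a config ID."""

    id = "hypothetical/grader/label"


# Assumed lookup from built-in evaluator class to its identifier.
BUILTIN_IDS: Dict[type, str] = {
    FakeRelevanceEvaluator: "hypothetical/evaluator/relevance",
}


def build_name_map(evaluators: Dict[str, Any]) -> Dict[str, str]:
    """Map each user-defined name to a built-in or grader identifier."""
    name_map: Dict[str, str] = {}
    for name, evaluator in evaluators.items():
        if isinstance(evaluator, FakeGrader):
            name_map[name] = evaluator.id
        elif type(evaluator) in BUILTIN_IDS:
            name_map[name] = BUILTIN_IDS[type(evaluator)]
        # Custom callables have no known identifier and are left out here.
    return name_map
```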

Reviewed Changes

Copilot reviewed 38 out of 38 changed files in this pull request and generated 2 comments.

Show a summary per file
| File | Description |
| --- | --- |
| sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/autogen/raiclient/* | Code-generated client and configuration classes updated for both async and sync pipelines. |
| sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_exceptions.py | Added AOAI_GRADER as a new error target. |
| sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_evaluate/_evaluate_aoai.py | Introduces AOAI evaluation helper functions and polling logic. |
| sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_evaluate/_evaluate.py | Modified evaluation workflow to support AOAI grader instances and split evaluators from graders. |
| sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_aoai/* | New AOAI grader wrapper classes implemented for different grading strategies. |
| Other files | Supporting changes to versioning, patching, constants, and mapping to integrate AOAI features. |
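
The review summary mentions polling logic in `_evaluate_aoai.py` without showing it. A generic poll-until-terminal loop might look like the sketch below; the function name, status strings, and timeout behavior are assumptions for illustration and are not taken from the PR or from the OpenAI SDK.

```python
import time
from typing import Callable

# Assumed terminal states; the real service may use different names.
TERMINAL_STATUSES = {"completed", "failed", "canceled"}


def poll_until_done(
    get_status: Callable[[], str],
    *,
    interval_s: float = 1.0,
    timeout_s: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Call get_status repeatedly until a terminal status or the timeout."""
    deadline = clock() + timeout_s
    while True:
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"Run still '{status}' after {timeout_s} seconds")
        sleep(interval_s)
```

Injecting `sleep` and `clock` keeps the loop testable without real delays, and raising on timeout gives the caller one place to attach the better failure handling listed in the remaining TODOs.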

@nagkumar91 nagkumar91 requested a review from Copilot April 21, 2025 14:52
Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

This PR integrates AOAI evaluation features into the AI SDK by adding new AOAI grader classes and extending the evaluation pipeline to handle them alongside existing evaluators.

  • Introduces new classes and methods for AOAI grader integration (e.g. AoaiGrader, LabelGrader, StringCheckGrader, TextSimilarityGrader).
  • Modifies the evaluation flow in _evaluate modules to support splitting evaluators and graders and merging AOAI evaluation results.
  • Updates exception handling, constants, and mapping utilities to support AOAI grader identifiers.

Reviewed Changes

Copilot reviewed 38 out of 38 changed files in this pull request and generated 2 comments.

Show a summary per file
| File | Description |
| --- | --- |
| sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/autogen/raiclient/* | Adds generated asynchronous and synchronous client configuration and client files for evaluation. |
| sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_exceptions.py | Adds a new AOAI_GRADER enum value for error targeting. |
| sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_evaluate/* | Introduces AOAI evaluation flow via new _evaluate_aoai.py and updates the main _evaluate module to split evaluators vs. graders. |
| sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_eval_mapping.py | Adds mapping for built-in evaluator identifiers including AOAI grader IDs. |
| sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_constants.py | Adds NAME_MAP property to support mapping evaluator names. |
| sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_aoai/* | Adds new AOAI grader wrapper classes (LabelGrader, StringCheckGrader, TextSimilarityGrader, AoaiGrader) for asynchronous evaluation. |
| sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/__init__.py | Updates module exports to include AOAI grader classes. |

@azure-sdk
Collaborator

azure-sdk commented Apr 21, 2025

API Change Check

APIView identified API-level changes in this PR and created the following API reviews:

azure-ai-evaluation

@azure-sdk
Collaborator

API change check

APIView has identified API-level changes in this PR and created the following API reviews:

azure-ai-evaluation

Labels
Evaluation Issues related to the client library for Azure AI Evaluation
Projects
None yet
4 participants