
feat: Add new LLMEvaluator component #7401

Merged: 18 commits merged into main from the llmevaluator branch on Mar 25, 2024

Conversation

@julian-risch (Member) commented Mar 21, 2024

Related Issues

Proposed Changes:

  • Add a new component LLMEvaluator, limited to the OpenAI API for now, that evaluates inputs based on provided instructions and examples

How did you test it?

  • New unit tests
  • I used an example that can be adjusted into an integration test or end-to-end test.

Notes for the reviewer

An open question currently is how to enforce proper JSON formatting in the rendered prompt template. The main issue I see is `"` or `'` being used inconsistently, which may degrade the quality of the generated results.
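To illustrate the quoting concern, a minimal sketch in plain Python (not the component's actual template code): the default `str()`/`repr()` of a dict uses single quotes, which is not valid JSON, whereas `json.dumps` always emits double quotes. Rendering dicts into a prompt via their repr is one way the inconsistency could creep in.

```python
import json

example = {"score": 1, "name": "llm"}

# str() on a dict uses single quotes -- not valid JSON
as_repr = str(example)

# json.dumps guarantees double-quoted, spec-compliant JSON
as_json = json.dumps(example)

print(as_repr)  # {'score': 1, 'name': 'llm'}
print(as_json)  # {"score": 1, "name": "llm"}

# A strict JSON parser rejects the single-quoted variant
try:
    json.loads(as_repr)
except json.JSONDecodeError:
    print("single-quoted repr is not parseable as JSON")
```

Serializing examples with `json.dumps` before interpolating them into the prompt would at least keep the quoting consistent on the input side.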

Current output is {'score': 1, 'name': 'llm'}, in the style of the evaluation framework integrations.
We can make llm customizable as an additional parameter of LLMEvaluator later if that makes sense, or leave it out completely.
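A hedged sketch of how a raw LLM reply could be parsed defensively into that output shape. The helper name `parse_evaluator_reply` is hypothetical and not part of this PR; it only assumes the reply is expected to contain a JSON object with a `score` key.

```python
import json
from typing import Optional


def parse_evaluator_reply(reply: str, name: str = "llm") -> Optional[dict]:
    """Hypothetical helper: parse an LLM reply expected to be a JSON object
    with a 'score' key into the {'score': ..., 'name': name} output shape.

    Returns None if the reply is not valid JSON or lacks the expected key,
    e.g. when the model used single quotes instead of double quotes.
    """
    try:
        data = json.loads(reply)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict) or "score" not in data:
        return None
    return {"score": data["score"], "name": name}


print(parse_evaluator_reply('{"score": 1}'))  # {'score': 1, 'name': 'llm'}
print(parse_evaluator_reply("{'score': 1}"))  # None -- single quotes are invalid JSON
```

Returning None (rather than raising) leaves it to the caller to decide how to handle malformed generations, which keeps the evaluator usable in batch runs.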

Checklist

@github-actions github-actions bot added topic:tests 2.x Related to Haystack v2.0 type:documentation Improvements on the docs labels Mar 21, 2024
@julian-risch julian-risch changed the title LLMEvaluator feat: Add new LLMEvaluator component Mar 21, 2024
@coveralls (Collaborator) commented Mar 21, 2024

Pull Request Test Coverage Report for Build 8411689560

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.1%) to 89.399%

Totals Coverage Status
  • Change from base Build 8400580788: 0.1%
  • Covered Lines: 5490
  • Relevant Lines: 6141

💛 - Coveralls

@julian-risch julian-risch marked this pull request as ready for review March 21, 2024 21:16
@julian-risch julian-risch requested review from a team as code owners March 21, 2024 21:16
@julian-risch julian-risch requested review from dfokina, davidsbatista and shadeMe and removed request for a team and davidsbatista March 21, 2024 21:16
haystack/components/evaluators/llm_evaluator.py — 9 review threads (outdated, resolved)
@julian-risch julian-risch requested a review from shadeMe March 22, 2024 15:10
@shadeMe (Contributor) left a comment

LGTM! Great work 🎉 Good to merge after fixing the lint.

@julian-risch julian-risch merged commit bfd0d3e into main Mar 25, 2024
23 checks passed
@julian-risch julian-risch deleted the llmevaluator branch March 25, 2024 06:05
Development

Successfully merging this pull request may close these issues.

LLM Eval - Implement custom LLM evaluator component in core
3 participants