
feat: Add new LLMEvaluator component #7401

Merged: 18 commits merged into main from the llmevaluator branch on Mar 25, 2024

Conversation

@julian-risch (Member) commented Mar 21, 2024

Related Issues

Proposed Changes:

  • Add a new component LLMEvaluator, limited to the OpenAI API for now, that evaluates inputs based on provided instructions and examples

How did you test it?

  • New unit tests
  • I used an example that can be adjusted into an integration test or end-to-end test.

Notes for the reviewer

An open question currently is how to enforce proper JSON formatting in the rendered prompt template. The main issue I see is `"` or `'` being used inconsistently, which may degrade the quality of the generated results.
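To illustrate the quoting concern, a minimal sketch in plain Python (not the component's actual template code): the default `str()`/`repr()` of a dict uses single quotes, which is not valid JSON, whereas `json.dumps` always emits double quotes. Rendering dicts into a prompt via their repr is one way the inconsistency could creep in.

```python
import json

example = {"score": 1, "name": "llm"}

# str() on a dict uses single quotes -- not valid JSON
as_repr = str(example)

# json.dumps guarantees double-quoted, spec-compliant JSON
as_json = json.dumps(example)

print(as_repr)  # {'score': 1, 'name': 'llm'}
print(as_json)  # {"score": 1, "name": "llm"}

# A strict JSON parser rejects the single-quoted variant
try:
    json.loads(as_repr)
except json.JSONDecodeError:
    print("single-quoted repr is not parseable as JSON")
```

Serializing examples with `json.dumps` before interpolating them into the prompt would at least keep the quoting consistent on the input side.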

Current output is {'score': 1, 'name': 'llm'}, in the style of the evaluation framework integrations.
We can make llm customizable as an additional parameter of LLMEvaluator later if that makes sense, or leave it out completely.
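A hedged sketch of how a raw LLM reply could be parsed defensively into that output shape. The helper name `parse_evaluator_reply` is hypothetical and not part of this PR; it only assumes the reply is expected to contain a JSON object with a `score` key.

```python
import json
from typing import Optional


def parse_evaluator_reply(reply: str, name: str = "llm") -> Optional[dict]:
    """Hypothetical helper: parse an LLM reply expected to be a JSON object
    with a 'score' key into the {'score': ..., 'name': name} output shape.

    Returns None if the reply is not valid JSON or lacks the expected key,
    e.g. when the model used single quotes instead of double quotes.
    """
    try:
        data = json.loads(reply)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict) or "score" not in data:
        return None
    return {"score": data["score"], "name": name}


print(parse_evaluator_reply('{"score": 1}'))  # {'score': 1, 'name': 'llm'}
print(parse_evaluator_reply("{'score': 1}"))  # None -- single quotes are invalid JSON
```

Returning None (rather than raising) leaves it to the caller to decide how to handle malformed generations, which keeps the evaluator usable in batch runs.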

Checklist

@github-actions github-actions bot added topic:tests 2.x Related to Haystack v2.0 type:documentation Improvements on the docs labels Mar 21, 2024
@julian-risch julian-risch changed the title LLMEvaluator feat: Add new LLMEvaluator component Mar 21, 2024
@coveralls (Collaborator) commented Mar 21, 2024

Pull Request Test Coverage Report for Build 8411689560

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.1%) to 89.399%

Totals Coverage Status
  • Change from base Build 8400580788: 0.1%
  • Covered Lines: 5490
  • Relevant Lines: 6141

💛 - Coveralls

@julian-risch julian-risch marked this pull request as ready for review March 21, 2024 21:16
@julian-risch julian-risch requested review from a team as code owners March 21, 2024 21:16
@julian-risch julian-risch requested review from dfokina, davidsbatista and shadeMe and removed request for a team and davidsbatista March 21, 2024 21:16
haystack/components/evaluators/llm_evaluator.py — 9 review threads (outdated, resolved)
@julian-risch julian-risch requested a review from shadeMe March 22, 2024 15:10
@shadeMe (Contributor) left a comment

LGTM! Great work 🎉 Good to merge after fixing the lint.

@julian-risch julian-risch merged commit bfd0d3e into main Mar 25, 2024
23 checks passed
@julian-risch julian-risch deleted the llmevaluator branch March 25, 2024 06:05
Development

Successfully merging this pull request may close these issues.

LLM Eval - Implement custom LLM evaluator component in core
3 participants