
feat: Add AnswerF1Evaluator #7073

Closed
wants to merge 6 commits from the f1-evaluator branch

Conversation

silvanocerza
Contributor

@silvanocerza silvanocerza commented Feb 23, 2024

Related Issues

Proposed Changes:

Add `AnswerF1Evaluator`, a component that calculates the F1 score given a list of questions, the expected answers for each question, and the predicted answers for each question.
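Purely for illustration, here is a minimal sketch of the kind of computation described above: an exact-match, set-overlap F1 per question, macro-averaged over all questions. The function name and the matching strategy are assumptions for this sketch, not this PR's actual implementation.

```python
from typing import List


def answer_f1(
    expected: List[List[str]],   # gold answers, one list per question
    predicted: List[List[str]],  # predicted answers, one list per question
) -> float:
    """Macro-averaged F1 over questions, using exact-match set overlap."""
    scores = []
    for gold, pred in zip(expected, predicted):
        gold_set, pred_set = set(gold), set(pred)
        overlap = len(gold_set & pred_set)
        if overlap == 0:
            scores.append(0.0)
            continue
        precision = overlap / len(pred_set)
        recall = overlap / len(gold_set)
        scores.append(2 * precision * recall / (precision + recall))
    # Average the per-question scores; 0.0 if there are no questions.
    return sum(scores) / len(scores) if scores else 0.0
```

For example, `answer_f1([["Paris"]], [["Paris", "Lyon"]])` yields precision 0.5 and recall 1.0, so F1 ≈ 0.67.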

How did you test it?

Added unit tests.

Notes for the reviewer

I didn't add the component to the package `__init__.py` on purpose, to avoid conflicts with future PRs. I'll update it once all the evaluators are done.

Checklist

@silvanocerza silvanocerza self-assigned this Feb 23, 2024
@silvanocerza silvanocerza requested review from a team as code owners February 23, 2024 11:45
@silvanocerza silvanocerza requested review from dfokina and davidsbatista and removed request for a team February 23, 2024 11:45
@github-actions github-actions bot added the topic:tests, 2.x (Related to Haystack v2.0), and type:documentation (Improvements on the docs) labels Feb 23, 2024
@silvanocerza silvanocerza requested review from shadeMe and julian-risch and removed request for davidsbatista February 23, 2024 11:46
@coveralls
Collaborator

coveralls commented Feb 23, 2024

Pull Request Test Coverage Report for Build 8360549906

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • 6 unchanged lines in 1 file lost coverage.
  • Overall coverage increased (+0.03%) to 89.265%

| Files with Coverage Reduction | New Missed Lines | % |
| --- | --- | --- |
| dataclasses/document.py | 6 | 93.4% |

Totals
  • Change from base Build 8346207287: +0.03%
  • Covered Lines: 5430
  • Relevant Lines: 6083

💛 - Coveralls

@silvanocerza silvanocerza marked this pull request as draft February 23, 2024 12:03
@julian-risch julian-risch removed their request for review March 1, 2024 09:36
@silvanocerza silvanocerza marked this pull request as ready for review March 19, 2024 17:09
Two review threads on haystack/components/evaluators/answer_f1.py (outdated, resolved)
@shadeMe shadeMe requested a review from julian-risch March 20, 2024 10:50
@silvanocerza silvanocerza requested a review from shadeMe March 20, 2024 11:23
@silvanocerza silvanocerza requested a review from shadeMe March 20, 2024 14:21
@julian-risch
Member

Similar to my comment on the AnswerRecall PRs, we need the tokenized versions of answers to calculate F1 on answers. #7394 (review)
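For reference, "tokenized" here points at SQuAD-style token-level F1, which compares the token multisets of a predicted and a gold answer rather than the whole strings. A minimal sketch, assuming simple whitespace tokenization; this is not the code from this PR or #7394:

```python
from collections import Counter


def token_f1(prediction: str, gold: str) -> float:
    # SQuAD-style token-level F1: overlap of token multisets,
    # not exact string equality. Whitespace tokenization is an
    # assumption made for this sketch.
    pred_tokens = prediction.split()
    gold_tokens = gold.split()
    common = Counter(pred_tokens) & Counter(gold_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```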

@silvanocerza
Contributor Author

Closing this, as we're going in a different direction for calculating F1 scores.

We'll probably have a component that takes as input not the answers themselves, but the results of other components that calculate answer precision and recall.
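In that design, the downstream component would only need the harmonic mean of the two upstream scores. A purely illustrative sketch; the function name is hypothetical:

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    # F1 is the harmonic mean of precision and recall,
    # conventionally defined as 0.0 when both are 0.
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```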

@silvanocerza silvanocerza deleted the f1-evaluator branch March 21, 2024 16:08
Labels
2.x (Related to Haystack v2.0) · topic:tests · type:documentation (Improvements on the docs)
Development

Successfully merging this pull request may close these issues.

Implement function to calculate F1 metric
4 participants