
feat: Add AnswerF1Evaluator #7073

Closed
wants to merge 6 commits from the f1-evaluator branch

Conversation

silvanocerza
Contributor

@silvanocerza silvanocerza commented Feb 23, 2024

Related Issues

Proposed Changes:

Add `AnswerF1Evaluator`, a component that calculates the F1 score given a list of questions, the expected answers for each question, and the predicted answers for each question.
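Purely for illustration, here is a minimal sketch of the kind of computation described above: an exact-match, set-overlap F1 per question, macro-averaged over all questions. The function name and the matching strategy are assumptions for this sketch, not this PR's actual implementation.

```python
from typing import List


def answer_f1(
    expected: List[List[str]],   # gold answers, one list per question
    predicted: List[List[str]],  # predicted answers, one list per question
) -> float:
    """Macro-averaged F1 over questions, using exact-match set overlap."""
    scores = []
    for gold, pred in zip(expected, predicted):
        gold_set, pred_set = set(gold), set(pred)
        overlap = len(gold_set & pred_set)
        if overlap == 0:
            scores.append(0.0)
            continue
        precision = overlap / len(pred_set)
        recall = overlap / len(gold_set)
        scores.append(2 * precision * recall / (precision + recall))
    # Average the per-question scores; 0.0 if there are no questions.
    return sum(scores) / len(scores) if scores else 0.0
```

For example, `answer_f1([["Paris"]], [["Paris", "Lyon"]])` yields precision 0.5 and recall 1.0, so F1 ≈ 0.67.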

How did you test it?

Added unit tests.

Notes for the reviewer

I didn't add the component to the package `__init__.py` on purpose, to avoid conflicts with future PRs. I'll update it once all the evaluators are done.

Checklist

@silvanocerza silvanocerza self-assigned this Feb 23, 2024
@silvanocerza silvanocerza requested review from a team as code owners February 23, 2024 11:45
@silvanocerza silvanocerza requested review from dfokina and davidsbatista and removed request for a team February 23, 2024 11:45
@github-actions github-actions bot added the topic:tests, 2.x (Related to Haystack v2.0), and type:documentation (Improvements on the docs) labels Feb 23, 2024
@silvanocerza silvanocerza requested review from shadeMe and julian-risch and removed request for davidsbatista February 23, 2024 11:46
@coveralls
Collaborator

coveralls commented Feb 23, 2024

Pull Request Test Coverage Report for Build 8360549906

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • 6 unchanged lines in 1 file lost coverage.
  • Overall coverage increased (+0.03%) to 89.265%

| Files with Coverage Reduction | New Missed Lines | % |
| --- | --- | --- |
| dataclasses/document.py | 6 | 93.4% |

Totals
  • Change from base Build 8346207287: +0.03%
  • Covered Lines: 5430
  • Relevant Lines: 6083

💛 - Coveralls

@silvanocerza silvanocerza marked this pull request as draft February 23, 2024 12:03
@julian-risch julian-risch removed their request for review March 1, 2024 09:36
@silvanocerza silvanocerza marked this pull request as ready for review March 19, 2024 17:09
Two review threads on haystack/components/evaluators/answer_f1.py (outdated, resolved)
@shadeMe shadeMe requested a review from julian-risch March 20, 2024 10:50
@silvanocerza silvanocerza requested a review from shadeMe March 20, 2024 11:23
@silvanocerza silvanocerza requested a review from shadeMe March 20, 2024 14:21
@julian-risch
Member

Similar to my comment on the AnswerRecall PRs, we need the tokenized versions of answers to calculate F1 on answers. #7394 (review)
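For reference, "tokenized" here points at SQuAD-style token-level F1, which compares the token multisets of a predicted and a gold answer rather than the whole strings. A minimal sketch, assuming simple whitespace tokenization; this is not the code from this PR or #7394:

```python
from collections import Counter


def token_f1(prediction: str, gold: str) -> float:
    # SQuAD-style token-level F1: overlap of token multisets,
    # not exact string equality. Whitespace tokenization is an
    # assumption made for this sketch.
    pred_tokens = prediction.split()
    gold_tokens = gold.split()
    common = Counter(pred_tokens) & Counter(gold_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```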

@silvanocerza
Contributor Author

Closing this, as we're going in a different direction for calculating F1 scores.

We'll probably have a component that takes as input not the answers themselves, but the results of other components that calculate answer precision and recall.
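In that design, the downstream component would only need the harmonic mean of the two upstream scores. A purely illustrative sketch; the function name is hypothetical:

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    # F1 is the harmonic mean of precision and recall,
    # conventionally defined as 0.0 when both are 0.
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```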

@silvanocerza silvanocerza deleted the f1-evaluator branch March 21, 2024 16:08
Labels
2.x (Related to Haystack v2.0) · topic:tests · type:documentation (Improvements on the docs)
Development

Successfully merging this pull request may close these issues.

Implement function to calculate F1 metric
4 participants