Add AnswerExactMatchEvaluator #7381

Merged
merged 7 commits into from
Mar 19, 2024

silvanocerza
Contributor

Related Issues

Proposed Changes:

Add AnswerExactMatchEvaluator. This component calculates the Exact Match metric given a list of questions, a list of expected answers for each question, and a list of predicted answers for each question.

How did you test it?

I added unit tests.

Notes for the reviewer

N/A

Checklist

@silvanocerza silvanocerza self-assigned this Mar 19, 2024
@silvanocerza silvanocerza requested review from a team as code owners March 19, 2024 14:33
@silvanocerza silvanocerza requested review from dfokina and julian-risch and removed request for a team March 19, 2024 14:33
@github-actions github-actions bot added topic:tests 2.x Related to Haystack v2.0 type:documentation Improvements on the docs labels Mar 19, 2024
Member

@julian-risch julian-risch left a comment

I believe we can leave out the from_dict and to_dict implementations, as they have the same effect as the default implementation. Otherwise looks good to me.

```
"""

def to_dict(self) -> Dict[str, Any]:
```
Member

We can leave out the to_dict and from_dict implementations here, since they just do what the default does, right?
For example, when component_to_dict is used, it automatically falls back to default_to_dict:

def component_to_dict(obj: Any) -> Dict[str, Any]:
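
For illustration, the fallback described above can be sketched roughly as follows. This is a simplified, self-contained stand-in and not Haystack's actual code: `component_to_dict_sketch`, `default_to_dict_sketch`, `PlainEvaluator`, and `CustomEvaluator` are hypothetical names, and the sketch assumes every init parameter is stored on the instance under the same name.

```python
import inspect
from typing import Any, Dict


def default_to_dict_sketch(obj: Any, **init_parameters: Any) -> Dict[str, Any]:
    """Hypothetical default serializer: the type path plus the init parameters."""
    cls = type(obj)
    return {
        "type": f"{cls.__module__}.{cls.__name__}",
        "init_parameters": init_parameters,
    }


def component_to_dict_sketch(obj: Any) -> Dict[str, Any]:
    """Use the object's own to_dict if it defines one, else fall back to the default."""
    if hasattr(obj, "to_dict"):
        return obj.to_dict()

    # Assumption: each __init__ parameter is stored as a same-named attribute.
    init_parameters = {}
    for name in inspect.signature(type(obj).__init__).parameters:
        if name == "self":
            continue
        init_parameters[name] = getattr(obj, name)
    return default_to_dict_sketch(obj, **init_parameters)


class PlainEvaluator:
    """Hypothetical component without a custom to_dict."""

    def __init__(self, threshold: float = 0.5):
        self.threshold = threshold


class CustomEvaluator(PlainEvaluator):
    """Hypothetical component that overrides serialization."""

    def to_dict(self) -> Dict[str, Any]:
        return {"custom": True, "threshold": self.threshold}
```

Under these assumptions, `PlainEvaluator` is serialized through the default path, so a hand-written `to_dict` that only repeats the default adds nothing, while `CustomEvaluator` keeps control of its own output.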

Contributor Author

Right, I always forget that. Will remove them right away.

Member

@julian-risch julian-risch left a comment

LGTM! 👍
We should add more test cases later, which would also keep this consistent with the test cases for other metrics. For example, we could test more than one prediction per query, something like:

evaluator.run(
    questions=["What is the capital of Germany?", "What is the capital of France?"],
    ground_truth_answers=[["Berlin"], ["London"]],
    predicted_answers=[["Berlin", "wrong_second_answer_candidate"], ["wrong_first_answer_candidate", "London"]],
)

should result in result["result"] == 1.0
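
To make the expected value concrete, here is a minimal standalone sketch of the metric. It is not the component's actual implementation: `exact_match_sketch` is a hypothetical name, and the sketch assumes a question counts as matched when any predicted answer is identical to any ground truth answer, which is what the example above implies. Returning 0.0 for empty input is also an assumption.

```python
from typing import Dict, List


def exact_match_sketch(
    questions: List[str],
    ground_truth_answers: List[List[str]],
    predicted_answers: List[List[str]],
) -> Dict[str, float]:
    """Proportion of questions with at least one exactly matching prediction."""
    if not len(questions) == len(ground_truth_answers) == len(predicted_answers):
        raise ValueError("All three input lists must have the same length.")
    if not questions:
        # Assumption: an empty input yields 0.0 instead of dividing by zero.
        return {"result": 0.0}

    # Assumption: a question is a match if any prediction equals any ground truth.
    matches = sum(
        1
        for truths, predictions in zip(ground_truth_answers, predicted_answers)
        if set(truths) & set(predictions)
    )
    return {"result": matches / len(questions)}


result = exact_match_sketch(
    questions=["What is the capital of Germany?", "What is the capital of France?"],
    ground_truth_answers=[["Berlin"], ["London"]],
    predicted_answers=[
        ["Berlin", "wrong_second_answer_candidate"],
        ["wrong_first_answer_candidate", "London"],
    ],
)
```

Both questions have one candidate that equals a ground truth answer, so the sketch returns 2 matches out of 2 questions, i.e. `result["result"] == 1.0`.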

@coveralls
Collaborator

coveralls commented Mar 19, 2024

Pull Request Test Coverage Report for Build 8345123752

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.03%) to 89.238%

Totals Coverage Status
Change from base Build 8339327256: 0.03%
Covered Lines: 5390
Relevant Lines: 6040

💛 - Coveralls

@silvanocerza silvanocerza merged commit 610ad6f into main Mar 19, 2024
23 checks passed
@silvanocerza silvanocerza deleted the exact-match-evaluator branch March 19, 2024 15:58
Development

Successfully merging this pull request may close these issues.

Implement function to calculate Exact Match metric