-
Hi, I'm looking for some help to clear up my understanding of the evaluation. The Metrics section of A Replication Study of Dense Passage Retriever says that "we measured effectiveness in terms of top-k retrieval accuracy, defined as the fraction of questions that have a correct answer span in the top-k retrieved contexts at least once." What is the definition of a "correct answer"? Or to put it another way: for the NQ dataset, how can we determine whether the passages retrieved from the Wikipedia corpus are correct, given that it's highly improbable a passage will precisely match the annotated short or long answer? Thanks!
-
Yes, the passage does not have to precisely match the annotated short/long answer. We process passages to find those that contain some variant of one of the gold answers, up to normalization, as an exact match. This set is by no means exhaustive or perfect, but it is the best one can do in a compute-friendly way. I hope that clarifies things!
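To make the matching concrete, here is a minimal sketch of this kind of normalized exact-match check and the resulting top-k accuracy. The helper names (`normalize`, `has_answer`, `top_k_accuracy`) and the SQuAD-style normalization (lowercasing, stripping punctuation and articles) are illustrative assumptions; the actual evaluation scripts may normalize slightly differently.

```python
import re
import string

def normalize(text):
    """SQuAD-style normalization: lowercase, drop punctuation and
    articles (a/an/the), collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def has_answer(passage, gold_answers):
    """True if any gold answer, after normalization, appears as a
    contiguous token sequence in the normalized passage."""
    p_tokens = normalize(passage).split()
    for ans in gold_answers:
        a_tokens = normalize(ans).split()
        if not a_tokens:
            continue
        for i in range(len(p_tokens) - len(a_tokens) + 1):
            if p_tokens[i:i + len(a_tokens)] == a_tokens:
                return True
    return False

def top_k_accuracy(retrieved_per_question, gold_per_question, k):
    """Fraction of questions with at least one answer-bearing
    passage among the top k retrieved contexts."""
    hits = sum(
        any(has_answer(p, golds) for p in passages[:k])
        for passages, golds in zip(retrieved_per_question, gold_per_question)
    )
    return hits / len(gold_per_question)
```

So a retrieved passage counts as a hit whenever some normalized variant of a gold answer string occurs verbatim inside it, which is why no exact match against the annotated span itself is needed.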