What are the meanings of 'hypothesis' and 'reference' ? #99

Open
liuende501 opened this issue Jan 25, 2025 · 1 comment

Comments

@liuende501

I understand the parameter "hypothesis" to be the speech recognition output, and "reference" to be the actual sentence.
Therefore, the minimum edit distance should be computed from "hypothesis" to "reference". However, when I tested the existing code, the edit distance was computed from "reference" to "hypothesis". Could this be a bug?

@nikvaessen
Collaborator

You are right: the reference is the gold standard, while the hypothesis is the system's prediction. You want to measure how many deletions, insertions, or substitutions are needed to get FROM the prediction TO the gold standard, in order to know how many mistakes the system made. The other way around does not make sense: why would you want to modify the gold standard?

The most important aspect of the calculation is that N in the WER formula (S+D+I)/N is the length of the reference, not the hypothesis, which is accomplished by calculating the distance from the hypothesis to the reference.
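To make the role of N concrete, here is a minimal sketch of a word-level WER computation. It is not this project's implementation: the function names `edit_distance` and `wer` are hypothetical, the example sentences are made up, and tokenization is assumed to be a plain whitespace split.

```python
def edit_distance(source, target):
    """Minimum number of word-level substitutions, deletions, and
    insertions needed to turn `source` into `target`."""
    # prev[j] holds the distance between the words of `source` seen so far
    # and the first j words of `target`.
    prev = list(range(len(target) + 1))
    for i, src_word in enumerate(source, start=1):
        curr = [i]
        for j, tgt_word in enumerate(target, start=1):
            cost = 0 if src_word == tgt_word else 1
            curr.append(min(
                prev[j] + 1,         # drop a word from source
                curr[j - 1] + 1,     # add a word from target
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """(S + D + I) / N, where N is the number of words in the reference."""
    ref_words = reference.split()  # assumption: whitespace tokenization
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    # Distance from the hypothesis to the reference, divided by the
    # reference length.
    return edit_distance(hyp_words, ref_words) / len(ref_words)


reference = "the cat sat on the mat"  # gold standard, 6 words
hypothesis = "the cat sat mat"        # system output, 4 words
print(wer(reference, hypothesis))     # 2 edits / 6 reference words
```

In this sketch, swapping the two arguments leaves the edit count at 2 but changes N from 6 to 4, so the result goes from about 0.33 to 0.5.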
