OOV words with small LM #115

Open
davidavdav opened this issue Nov 6, 2023 · 0 comments

@davidavdav
Hello,

We have an application with a very small vocabulary (~100 words). Using an almost trivial bigram model (KenLM does not seem to be able to build a unigram model), we see that decoder.decode() produces words that are not in the language model.
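
To make the symptom concrete, here is a minimal sketch of how such out-of-vocabulary words can be spotted in the decoder output. It assumes the vocabulary is available as a plain list of words and that the decoder returns a whitespace-separated string; `find_oov_words` and the sample strings are hypothetical and not part of any library.

```python
def find_oov_words(decoded_text, vocabulary):
    """Return the words in decoded_text that are not in vocabulary."""
    # Case-insensitive comparison; drop .lower() if the LM is case-sensitive.
    vocab = {word.lower() for word in vocabulary}
    return [word for word in decoded_text.split() if word.lower() not in vocab]


# Hypothetical stand-ins: in the real application, decoded_text would be
# the string returned by decoder.decode(), and vocabulary the ~100 words
# used to train the language model.
vocabulary = ["turn", "on", "off", "the", "light"]
decoded_text = "turn on the lihgt"

print(find_oov_words(decoded_text, vocabulary))  # ['lihgt']
```

A check like this only reports the problem after decoding; it does not change how the decoder scores hypotheses.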

Is there some kind of fallback to letter decoding? Is there a way to turn this off?

Thanks!
