Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Loading JSON files #155

Open
wants to merge 8 commits into
base: master
Choose a base branch
from
Open

Loading JSON files #155

wants to merge 8 commits into from

Conversation

SaarthShah
Copy link

Allowing users to load JSON instead of having to load dictionaries using txt files.

Here's an example of the JSON file:

{
"funko":300,
"funk": 150,
...
}

Copy link
Owner

@mammothb mammothb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the PR. I have briefly tested your implementation and it was indeed a speedup compared to loading from text file.

However, it does not account for count_threshold and will fail this test case

def test_load_dictionary_below_threshold(self, symspell_short):
symspell_short.load_dictionary(BELOW_THRESHOLD_DICT_PATH, 0, 1)
assert 1 == len(symspell_short.below_threshold_words)
assert 8 == symspell_short.below_threshold_words["below"]
assert 2 == symspell_short.word_count
Could you please implement fix the inconsistent behavior? Thank you!

Args:
corpus: A dictionary where keys are words and values are their frequencies.
"""
self._words = corpus
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This bypasses create_dictionary_entry which contains logic to handle only adding to _words if it reaches above the count_threshold.

create_dictionary_entry should be used so that the behavior of loading a dictionary from .txt or .json file is consistent

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants