Add nysiis python 3 support #103

Merged
merged 4 commits into from
Apr 10, 2019

michamos
Contributor

Several algorithms can be used for phonetic blocking. The fuzzy library provides all of them, but v1.2.* has a bug (see yougov/fuzzy#14) that prevents soundex from working correctly, and v1.2.* is the only version compatible with Python 3. Previously, version 1.1 was used on Python 2 and an alternative implementation of double metaphone was used on Python 3, so soundex and NYSIIS were not available on Python 3. Now we install version 1.1 on Python 2 and 1.2.* on Python 3, so NYSIIS, which happens to be the algorithm giving the best results, is available on both.
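
As a rough illustration of the version split described above, the requirement could be expressed with environment markers or chosen at install time. This is only a sketch: the exact pins (`==1.1.*`, `==1.2.*`) and the helper name `fuzzy_requirement` are assumptions for illustration, not taken from the merged commits.

```python
import sys

# Assumed pins: the description only says "1.1" on Python 2 and "1.2.*" on Python 3.
# Requirement strings with environment markers (PEP 508 syntax), suitable for
# an install_requires list.
FUZZY_REQUIREMENTS = [
    'fuzzy==1.1.*; python_version < "3"',
    'fuzzy==1.2.*; python_version >= "3"',
]


def fuzzy_requirement(version_info=sys.version_info):
    """Hypothetical helper: return the fuzzy pin for a given interpreter version."""
    if version_info[0] < 3:
        return "fuzzy==1.1.*"
    return "fuzzy==1.2.*"
```

With markers, pip evaluates the condition on the installing interpreter, so a single requirements list can serve both Python versions.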

To summarize

Before:

| Algorithm | Python 2 | Python 3 |
| --- | --- | --- |
| soundex | ✔️ | |
| NYSIIS | ✔️ | |
| double metaphone | ✔️ | ✔️ |

After:

| Algorithm | Python 2 | Python 3 |
| --- | --- | --- |
| soundex | ✔️ | |
| NYSIIS | ✔️ | ✔️ |
| double metaphone | ✔️ | ✔️ |
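
The "After" availability can also be written down as data, which makes it easy to check whether a configured algorithm is usable on the running interpreter. This is a minimal sketch: the names `AVAILABLE_AFTER`, `PREFERENCE`, `is_available` and `pick_algorithm` are hypothetical, and the preference order beyond NYSIIS first is an assumption.

```python
# Availability per Python major version, transcribed from the "After" table.
AVAILABLE_AFTER = {
    2: ("soundex", "NYSIIS", "double metaphone"),
    3: ("NYSIIS", "double metaphone"),
}

# NYSIIS first because the description says it gives the best results;
# the order of the remaining two is an assumption.
PREFERENCE = ("NYSIIS", "double metaphone", "soundex")


def is_available(algorithm, major_version):
    """Return True if the algorithm can be used on the given Python major version."""
    return algorithm in AVAILABLE_AFTER.get(major_version, ())


def pick_algorithm(major_version):
    """Return the most preferred algorithm available on the given major version."""
    for name in PREFERENCE:
        if is_available(name, major_version):
            return name
    raise ValueError("no phonetic algorithm available for Python %s" % major_version)
```

Since NYSIIS is now available on both versions, `pick_algorithm` returns it in either case; the fallback order only matters if availability changes again.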

@michamos michamos merged commit 8d21e67 into inspirehep:master Apr 10, 2019
@michamos michamos deleted the add-nysiis-py3 branch April 10, 2019 11:45