I have been experimenting with PolyFuzz for a while and have observed some odd scoring behaviour. In the case below I get a very high similarity of 0.9 even though the strings are hardly equal. I would not expect such a high score just because both strings share the substring "america"; a plain edit-distance comparison of "america" against the strings in list1 should give a much lower similarity.
from polyfuzz import PolyFuzz

list1 = ["american Futures and Options Exchange", "America First Credit Union"]
list2 = ["america"]
model = PolyFuzz("EditDistance").match(list1, list2)
data = model.get_matches()
print(data)
                                    From       To  Similarity
0  american Futures and Options Exchange  america         0.9
1             America First Credit Union  america         0.9
Any workaround would be appreciated... Thanks!
The default edit-distance scorer comes from RapidFuzz; more specifically, it uses the WRatio method to calculate the similarity. The output is expected given the scoring function that is being used. You can check it with the following:
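As a rough illustration of why a WRatio-style scorer lands on 0.9 for these pairs, here is a minimal pure-Python sketch. It is not RapidFuzz's actual code: the function names (`lcs_length`, `ratio`, `partial_ratio`, `weighted_score`) are hypothetical, the token-based parts of WRatio are left out, and the lowercasing step is an assumption made to match the reported output. The idea is that when one string is much longer than the other, the short string is compared against its best-matching window in the long one, and that partial score is scaled by 0.9.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def ratio(a: str, b: str) -> float:
    """Indel-style similarity in [0, 100]: 2 * LCS / (len(a) + len(b))."""
    total = len(a) + len(b)
    return 100.0 if total == 0 else 100.0 * 2 * lcs_length(a, b) / total


def partial_ratio(a: str, b: str) -> float:
    """Best ratio of the shorter string against any same-length window
    of the longer string (a simplification of real partial matching)."""
    short, long_ = (a, b) if len(a) <= len(b) else (b, a)
    n = len(short)
    return max(ratio(short, long_[i:i + n]) for i in range(len(long_) - n + 1))


def weighted_score(a: str, b: str) -> float:
    """Simplified stand-in for a WRatio-style score (token ratios omitted)."""
    a, b = a.lower(), b.lower()  # assumption: case-insensitive comparison
    if not a or not b:
        return 0.0
    len_ratio = max(len(a), len(b)) / min(len(a), len(b))
    base = ratio(a, b)
    if len_ratio < 1.5:
        # Similar lengths: the plain ratio is used as-is in this sketch.
        return base
    # Very different lengths: the partial match counts, but is scaled down.
    partial_scale = 0.9 if len_ratio <= 8 else 0.6
    return max(base, partial_ratio(a, b) * partial_scale)


for name in ["american Futures and Options Exchange", "America First Credit Union"]:
    # Both lines print 90.0: "america" matches a window exactly (100 * 0.9).
    print(f"{weighted_score(name, 'america'):.1f}")
```

In this sketch the plain `ratio` for the first pair is only about 31.8, so the high score comes entirely from the scaled partial match. If that boost is not wanted, one possible workaround is to pass a different scorer, for example RapidFuzz's plain `fuzz.ratio`, to PolyFuzz's `EditDistance` model, assuming the installed version exposes a `scorer` argument.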