Incomplete Results for Strings that Include Numbers #122
Comments
Are you using fuzzy search? Try playing with the tntsearch settings a bit to see if one of them makes a difference. That would at least help pinpoint the part of the search algorithm that is failing.
Tried the "739" search documented above, with index rebuild + cache clear between tests, with the following:
Here's my settings yaml:
I can confirm that something strange is going on with searches that include numbers. Let's say I have these 2 data points. If I search for 777, I get only the first result, and it is not highlighted in the context. This is most probably a bug in the TNTSearch library itself, maybe something in the BM25 implementation.
Hmmm. Sounds like I may have to implement simple search until this gets resolved. Model numbers are the bread and butter for this client site.
I just did a quick debug session and it is definitely a library bug.
Great, thanks for all your help on this. I'll open an issue with https://github.com/teamtnt/tntsearch.
I have fixed some highlighter issues found during my tests (teamtnt/tntsearch#256), but search with numbers is still broken. I will try to look at the library code in a couple of days if I find some free time, and will reply on the tntsearch repo. This issue can be closed here.
@thekenshow I've spent some time on this and uncovered another layer of bugs. Try these two patches: In order to activate fuzzy search in the library itself for your case, you will also need: Other than that, there is nothing we can do at the moment. Proper partial search needs to be implemented by someone in the library.
@ViliusS Thanks for diving into this, I'm back to it today and will let you know what happens.
A client site uses part numbers in page titles (e.g., SPK1000) and TNTSearch isn't returning all matches when the first three characters are used.
Test case 1 is a search for "spk", which should return "spk1000" and "spk7457", but only the first appears:
A search for "spk7" returns "spk7457", which should also appear in the previous search:
Test 2 is a search for "739", which should return three results (two instances of "7393 Horn Driver" and one with "739" in the body of the text), but instead only returns the latter:
A search for "7393" turns up the first two expected above (two instances of "7393 Horn Driver"):
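To make the expectation in the two test cases concrete, here is a minimal Python sketch of the difference between exact-term lookup and prefix lookup in an inverted index. It is not TNTSearch's code and does not reproduce the bug itself; the function names (`tokenize`, `build_index`, `exact_search`, `prefix_search`) and the body text of document 3 are hypothetical, chosen only to mirror the titles mentioned above.

```python
import re
from collections import defaultdict

def tokenize(text):
    # Lowercase and split into alphanumeric terms, so "SPK1000" becomes "spk1000".
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(docs):
    # Map each term to the set of document ids that contain it.
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index

def exact_search(index, query):
    # Only documents containing the query as a whole term match.
    return set(index.get(query.lower(), set()))

def prefix_search(index, query):
    # Documents containing any term that starts with the query match.
    q = query.lower()
    hits = set()
    for term, doc_ids in index.items():
        if term.startswith(q):
            hits |= doc_ids
    return hits

# Hypothetical documents modelled on the test cases in this report.
docs = {
    1: "7393 Horn Driver",
    2: "7393 Horn Driver",
    3: "Replacement part 739 for older units",  # made-up body text
    4: "SPK1000",
    5: "SPK7457",
}
index = build_index(docs)

print(exact_search(index, "739"))   # only the document with the whole term "739"
print(prefix_search(index, "739"))  # also the two "7393 Horn Driver" documents
print(prefix_search(index, "spk"))  # both SPK part numbers
```

Under these assumptions, the results expected in tests 1 and 2 correspond to the prefix lookup: "739" should match the "7393" titles, and "spk" should match both part numbers.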
I thought this might be related to the stemming issue described here, but @ViliusS set me straight.