Add possibility to boost per analyzer #4
Hello,

It would be nice to be able to give a boost per analyzer.
I mean, if I index the word "description" with edgengrams(3,7) + stemming + default, I would like to be able to give each sub-analyzer its own boost (see the hypothetical sketch below), because matches with "des" may be less relevant than matches with "descript", which in turn may be less relevant than matches with "description", so matches with "description" should be the first to come.
I don't know if this is possible to do, just a suggestion :)

Also, it would be nice to have some information about the effects of using a combo analyzer on scoring. The first thing that came to me was, for example: "is the order of sub-analyzers important?" I think it isn't, since you mention some stuff about duplicate tokens.
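Purely to illustrate the request (this is not an existing feature of the plugin): if each `sub_analyzers` entry could carry its own boost, the index settings might look like the sketch below. The analyzer names `my_edgengram` and `my_stemming` and the per-entry `boost` key are invented here.

```json
{
  "settings": {
    "analysis": {
      "analyzer": {
        "description_combo": {
          "type": "combo",
          "sub_analyzers": [
            { "analyzer": "my_edgengram", "boost": 0.3 },
            { "analyzer": "my_stemming",  "boost": 0.7 },
            { "analyzer": "standard",     "boost": 1.0 }
          ]
        }
      }
    }
  }
}
```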
Comments

An analyzer in Lucene cannot boost its tokens. What you can do is split the query into multiple queries, each using one of the sub-analyzers and setting the corresponding boost. Note that if you do not deduplicate, and the same term is generated N times, it will therefore be boosted by N. The order of the sub-analyzers does indeed change the order of the generated tokens, but this should have no impact at all, except if you use some filter that limits the number of tokens to the first N.
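A minimal sketch of that workaround at the Elasticsearch query level, assuming the sub-analyzers are also registered as standalone analyzers; the names `my_edgengram` and `my_stemming` are invented for illustration. Each `match` clause analyzes the query text with one sub-analyzer and carries its own boost:

```json
{
  "query": {
    "bool": {
      "should": [
        { "match": { "description": { "query": "description", "analyzer": "my_edgengram", "boost": 0.3 } } },
        { "match": { "description": { "query": "description", "analyzer": "my_stemming",  "boost": 0.7 } } },
        { "match": { "description": { "query": "description", "analyzer": "standard",     "boost": 1.0 } } }
      ]
    }
  }
}
```

Because the clauses are scored independently, a term produced by several sub-analyzers contributes once per clause, mirroring the boosted-by-N effect described above.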
Sorry, I'm not a Lucene expert :) Thanks for the explanation. Before using your plugin, I had a multi_field on which I had the possibility to boost per field, a bit like you suggested. By the way, shouldn't deduplication be the default? And when using deduplication, does this mean the first duplicate token is kept, while the others are dropped? Check that problem I found, which I haven't solved yet.
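For comparison, a sketch of that multi_field approach (classic multi_field mapping; field and analyzer names are again invented): the same value is indexed into several sub-fields, each with its own analyzer.

```json
{
  "description": {
    "type": "multi_field",
    "fields": {
      "description": { "type": "string", "analyzer": "standard" },
      "stem":        { "type": "string", "analyzer": "my_stemming" },
      "ngram":       { "type": "string", "analyzer": "my_edgengram" }
    }
  }
}
```

The boost is then attached per field at query time, for example:

```json
{
  "query": {
    "multi_match": {
      "query": "description",
      "fields": ["description^1.0", "description.stem^0.7", "description.ngram^0.3"]
    }
  }
}
```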