-
-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix search facet from mirepresenting the amount of ebook results #10160
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This won't quite work at this layer, since it'll filter out printdisabled books all the time, even if the mode isn't specified. We want the way the facets are computed to be modified, so that it correctly handles the has_fulltext
facet.
We currently have a rewrite rule for the has_fulltext
facet, which is what's causing the discrepancy. The facet is computed using the has_fulltext
solr field, but once it's selected, it's translated by this code to use the ebook_access
field:
openlibrary/openlibrary/plugins/worksearch/schemes/works.py
Lines 203 to 210 in 6a117fa
): lambda: f'ebook_access:[{get_fulltext_min()} TO *]', | |
( | |
'has_fulltext', | |
'false', | |
): lambda: f'ebook_access:[* TO {get_fulltext_min()}}}', | |
} | |
def is_search_field(self, field: str): |
This is the discrepancy.
Solr does have options to specify more custom facets, I believe they're called facet queries (the docs here are useful ; sorry couldn't find the specific section! https://solr.apache.org/guide/solr/latest/query-guide/faceting.html ).
If we update the the Work search scheme's facet_fields to instead of having the plain has_fulltext
field, instead have a dict
with options to specify a facet query, and then update the code that actually builds the solr query to understand/serialize this dict, that might do the trick?
Work Search Scheme facet fields:
openlibrary/openlibrary/plugins/worksearch/schemes/works.py
Lines 102 to 113 in 6a117fa
facet_fields = { | |
"has_fulltext", | |
"author_facet", | |
"language", | |
"first_publish_year", | |
"publisher_facet", | |
"subject_facet", | |
"person_facet", | |
"place_facet", | |
"time_facet", | |
"public_scan_b", | |
} |
Code that does the actual query:
openlibrary/openlibrary/plugins/worksearch/code.py
Lines 191 to 205 in 9a4e45e
facet_fields = scheme.facet_fields if isinstance(facet, bool) else facet | |
if facet and facet_fields: | |
params.append(('facet', 'true')) | |
for facet in facet_fields: # noqa: PLR1704 | |
if isinstance(facet, str): | |
params.append(('facet.field', facet)) | |
elif isinstance(facet, dict): | |
params.append(('facet.field', facet['name'])) | |
if 'sort' in facet: | |
params.append((f'f.{facet["name"]}.facet.sort', facet['sort'])) | |
if 'limit' in facet: | |
params.append((f'f.{facet["name"]}.facet.limit', facet['limit'])) | |
else: | |
# Should never get here | |
raise ValueError(f'Invalid facet type: {facet}') |
And then there might be some awkward stitching logic. Apologies this a bit more of a meatier change than expected! Let us know if you have any questions!
@cdrini Thanks for taking a look. The current change should only exclude printdisabled books when the mode is not specified and the user does not have the pd cookie, but it does remove those books from the results entirely. Is the issue that we want to avoid hiding printdisabled books to users without the pd cookie? |
@hornc Thank you for letting me know! |
Closes #9575
Hides printdisabled results for users not explicitly searching for them who do not have printdisabled settings stored in cookies.
Technical
Testing
Ensure that ebook facet counts match the amount of actual ebook results, with or without 'pd' set to true in cookies.
Screenshot
Stakeholders
@cdrini @RayBB