Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: Augment search methods #79

Merged
merged 11 commits into from
Sep 19, 2024
Merged

Conversation

shilorigins
Copy link
Collaborator

@shilorigins shilorigins commented Sep 9, 2024

Description

  • Change search term format from keyword arg to namedtuple of (<attribute>, <operator>, <value>)
  • Define SearchTerm namedtuple
  • Modify search page, views, and tests to use new format

Closes #61

Motivation and Context

SearchTerm Format

A three-term format provides more flexibility than a keyword arg. For example, we'd run into trouble using keyword args if we wanted to support both exact match and fuzzy match on entry titles, but with the tuple format we can apply different operators to the same attribute / value pairs.

Using a namedtuple for the format provides the option of setting and accessing args by name without taking more memory than bare tuples.

Operators

I included the operators:

  • eq (equals)
  • lt (less than or equal to)
  • gt (greater than or equal to)
  • in (item in container)
  • like (partial match)
  • isclose (number within absolute and relative tolerances)

lt and gt are both inclusive because I felt that that was more intuitive, and that we didn't need to also support exclusive versions. like currently supports regex string and UUID matches.

isclose takes an arg with format (<value>, <rel_tolerance>, <abs_tolerance>). The client converts this SearchTerm to lt and gt; the intention is that tolerances can be taken from the user or from the Entry directly, which the client turns into upper and lower bounds.

entry_type requires special handling because type comparisons need subclass information.

Tags

FilestoreBackend implicitly supports tags using the gt operator, but a dedicated operator may be required when other backends are added. To find entries whose tag set contains all tags in a filter, use the conditional ('tags', 'gt', {<tags>...}), which calls set.__ge__. Supporting more complex tag filters, like inversion or disjunction, will also require a tag operator or using Flag functionality.

How Has This Been Tested?

Existing tests in test_page, test_window, and test_backend.
New tests for lt, gt, fuzzy match, and tag filters.

Where Has This Been Documented?

This PR

Pre-merge checklist

  • Code works interactively
  • Code follows the style guide
  • Code contains descriptive docstrings, including context and API
  • New/changed functions and methods are covered in the test suite where possible
  • Test suite passes locally
  • Test suite passes on GitHub Actions
  • Ran docs/pre-release-notes.sh and created a pre-release documentation page
  • Pre-release docs include context, functional descriptions, and contributors as appropriate

@shilorigins shilorigins self-assigned this Sep 9, 2024
@shilorigins shilorigins changed the title Augment search methods ENH: Augment search methods Sep 9, 2024
@shilorigins shilorigins force-pushed the devagr/search branch 2 times, most recently from 7eae440 to 2f1bc91 Compare September 13, 2024 16:30
@shilorigins shilorigins marked this pull request as ready for review September 13, 2024 16:43
Copy link
Contributor

@tangkong tangkong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this a lot. I think it gives us a lot of opportunity to make some robust search functionality.

I do have some concerns about user friendliness at the Client-level, but I think everything behind that (backend, GUI) is fine as written.

superscore/client.py Outdated Show resolved Hide resolved
superscore/backends/core.py Show resolved Hide resolved
superscore/tests/test_backend.py Show resolved Hide resolved
superscore/client.py Outdated Show resolved Hide resolved
Copy link
Member

@ZLLentz ZLLentz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like the flexibility this gives us in the search

SearchTerm is a namedtuple with format (<attr>, <operator>, <value>).
<attr> should be an attribute of Entry, or one of the specially handled
keywords.  <operator> is a string telling the backend which function to
use to filter <attr>. <value> is the target value of <attr>; can be a
single value or a tuple depending on <operator>.
This operator is converted in Client.search, but is provided as
a way for the UI to pass user-provided tolerances into the search
function
ZLLentz
ZLLentz previously approved these changes Sep 19, 2024
Copy link
Member

@ZLLentz ZLLentz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks clean to me. I had one side thought but do not take it as a call to action.

results = backends.search(
SearchTerm('data', 'lt', 3)
)
assert len(list(results)) == 3
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something I hadn't considered in the previous PRs: would it add value to check that the results match the searches in the test suite? Or is this not worth the effort? I'm not sure but I thought I'd mention it.

for res in results:
    assert res.data <= 3

tangkong
tangkong previously approved these changes Sep 19, 2024
Copy link
Contributor

@tangkong tangkong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm onboard, I like this implementation a lot.

A few nitpicky comments from me, none of which should hold up merging of this PR


API Breaks
----------
- Client.search takes SearchTerms as *args rather than key-value pairs as **kwargs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The github diff rendered part of this as italics... strange

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It behaves weird with my vim highlighting too. Maybe the double ** is a parsing edge case?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image

It renders funny in sphinx too, the * ends up being a hyperlink to an id anchor? This is something we can fix up when we actually publish release notes though

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Later we can add backticks around these to force them to be literals

superscore/backends/core.py Show resolved Hide resolved
superscore/backends/filestore.py Show resolved Hide resolved
superscore/backends/filestore.py Show resolved Hide resolved
@shilorigins
Copy link
Collaborator Author

I had to edit the collection builder page merged in earlier today, so I appreciate your re-approval 🙏

@shilorigins shilorigins merged commit cdda3a5 into pcdshub:master Sep 19, 2024
11 checks passed
@shilorigins shilorigins deleted the devagr/search branch September 19, 2024 21:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ENH: Augment search methods / terms
3 participants