Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Maint: parc.find_regions as module method. clarify find_regions #517

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

AhmetNSimsek
Copy link
Collaborator

@AhmetNSimsek AhmetNSimsek commented Nov 29, 2023

This PR addresses the following issues/points (see):

  • parcellation.find_regions() was a static method of Parcellation but it searches through all parcellations in the registry. For the user, when they have a parcellation instance parc, there is no clear difference between parc.find_regions() and parc.find() while they should be using find to search within the parcellation. (find is a method of Region).
  • There is also atlas.find_regions() as an instance method. It behaves differently than parcellation.find_regions(). Needs to be clarified.
  • Finally, instead of using caching with a dictionary with keys of kwarg tuples, why not use cache property?
    • lru_cache might be necessary for long-running systems like siibra-api

(Please see the discussions on the code changes for further details)

@codecov-commenter
Copy link

codecov-commenter commented Nov 29, 2023

Codecov Report

Attention: Patch coverage is 94.11765% with 1 line in your changes missing coverage. Please review.

Project coverage is 45.97%. Comparing base (dc63457) to head (650ffb9).
Report is 10 commits behind head on main.

Files with missing lines Patch % Lines
siibra/features/anchor.py 50.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #517      +/-   ##
==========================================
- Coverage   46.00%   45.97%   -0.04%     
==========================================
  Files          75       75              
  Lines        7232     7224       -8     
==========================================
- Hits         3327     3321       -6     
+ Misses       3905     3903       -2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Member

@xgui3783 xgui3783 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly fine.

I have questions where find_region should still be in parcellation.py

and the aggressive caching strategy we are using everywhere.

siibra/core/parcellation.py Outdated Show resolved Hide resolved
@@ -364,3 +326,45 @@ def __lt__(self, other):
)
return self.name < other.name
return self.version.__lt__(other.version)


def find_regions(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it feasible that user would try to find_regions without a parcellation?

(correct me if I am wrong, but this is what this function is doing?)

edit: I think I see why this is needed (it's needed to decode region spec in anatomic anchor)

since it's not used anywhere else, I almost wonder if module level function should go to anatomical anchor (where it is used exclusively)

or at top level under region (where it makes more semantic sense)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than internal use, it is also exposed as siibra.find_regions(). We can potentially use region.find() in anchor. I think the original implementation also had species as a keyword argument, which made more sense to use there.

It needs access to parcellation.registry so I think it is better in parcellation module.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fact that it needs Parcellation.registry is not relevant to users.

If I am a user, and expect to find a method called find_regions (which, in fact, does not depend on any parcellations), I would expect it to find it under region.py module.

I do like the idea of a class/static method under Region class, which is proxy to this method (Indeed, I think rather than it being a standalone method, I think it makes more sense to be a static/class method of Region class)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm then it would still be accessible from region and parcellation instances. We can carry it to the region module but if we are going through all parcellations, why shouldn't it be under the parcellation module? And this also creates less confusion given that Region class has find method.

Since there is a also a similar function in atlas, which goes through all parcellations in an atlas, I think the design is reasonable. But we might want to discuss this design with @dickscheid , who initially implemented it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can carry it to the region module but if we are going through all parcellations, why shouldn't it be under the parcellation module?

But the user doesn't know that. (nor should they)

The promise of find_region, at least to me, is: you provide a string, and I will find you all regions which fits this string.

Whether, internally, we go through all parcellations, and iteratively, go through all regions for that parcellation, OR, we some how have a region instance table, that is not relevant to the user.

Which is why I say, either keep it private (prefix with _) or it should go in regions module.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see what you mean now. I thought you were advocating for it to be a class method or static method of Region not region module. I'll also get the opinion of Timo and make a change accordingly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that this is better placed here at the module level to avoid the confusion with Region.find()

siibra/VERSION Outdated Show resolved Hide resolved
siibra/core/parcellation.py Outdated Show resolved Hide resolved
@AhmetNSimsek AhmetNSimsek added the maintenance Not a bug or breaking issue. Code maintenance related. label Dec 7, 2023
@@ -364,3 +326,44 @@ def __lt__(self, other):
)
return self.name < other.name
return self.version.__lt__(other.version)


@lru_cache(maxsize=128)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TD: investigate how lru_cache hashes the inputs to compare with dictionary caching

Copy link
Collaborator Author

@AhmetNSimsek AhmetNSimsek Feb 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://docs.python.org/3/library/functools.html#functools.lru_cache
"Since a dictionary is used to cache results, the positional and keyword arguments to the function must be hashable."

Since the kwargs of find_regions are either a string a boolean, they are hashable. So essentially, the key of the old implementation (key = (regionspec, filter_children, find_topmost)) is used but hashed.

@dickscheid
Copy link
Contributor

dickscheid commented Feb 21, 2024

Just to explain the original logic - there were different entry points for different scopes:

  • search any region known to siibra was dessigned as a class method Parcellation.find_regions(), searching through all instances of parcellations at runtime (maybe it was implemented as static which I agree was not proper)
  • search only the regions linked to a species was implemented as an instance method of Atlas, which holds the parcellations of that species
  • search regions in one particular parcellation only was implemented as an instance method of the parcellation, actually implemented as find in Region, since region is the parent class which holds the subtree and allows for a recursive implementation.

I think the logic is clear, but I understand that the automatic exposure of the class method Parcellation.find_regions for Parcellation instances leads to confusion.

@AhmetNSimsek
Copy link
Collaborator Author

Just to explain the original logic - there were different entry points for different scopes:

* search any region known to siibra was dessigned as a class method `Parcellation.find_regions()`, searching through all instances of parcellations at runtime (maybe it was implemented as static which I agree was not proper)

* search only the regions linked to a species was implemented as an instance method of Atlas, which holds the parcellations of that species

* search regions in one particular parcellation only was implemented as an instance method of the parcellation, actually implemented as  `find` in Region, since region is the parent class which holds the subtree and allows for a recursive implementation.

I think the logic is clear, but I understand that the automatic exposure of the class method Parcellation.find_regions for Parcellation instances leads to confusion.

Thank you. Then, IMO, we should make find_regions a module level method in parcellation.py or if it is static or class method, it should be hidden and only be forwarded with siibra.find_regions. I'd prefer the first as I do not see any reason for such a method to be class or instance method.

siibra/core/region.py Outdated Show resolved Hide resolved
siibra/core/region.py Outdated Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
maintenance Not a bug or breaking issue. Code maintenance related.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants