Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lexical_index_to_sssom() EPM param #697

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

joeflack4
Copy link
Contributor

@joeflack4 joeflack4 commented Jan 19, 2024

Changes

  • Update: lexical_index_to_sssom(): Added param for extended prefix map

Context

I was working on this monarch-initiative/mondo-ingest#394 and was getting errors in both sssom-py and ontology-access-kit. For sssom-py, that resulted in some necessary changes (mapping-commons/sssom-py#485). And at first, I also thought that OAK needed some changes / bugfixes, but it turns out that is not the case. What's here are not bugfixes, but some extended prefix map (EPM) support that might be nice to have.

Why no GH issue?
The reason this does not have an associated issue is because this is the result of me working locally trying to fix a bug, and this is the resulting code at the end of that process. The only change of possible value left over is EPM support being added to lexical_index_to_sssom(). Thus, if I were to make an issue for this, it'd be something like, general "EPM support".

Why no tests?
As this PR does not include any bugfix as I'd originally thought, it is now a low priority. Thus I'm submitting it in the state I left it. If people think this is worth merging and want a test, I can try to find the time to add.

@joeflack4 joeflack4 force-pushed the epm-in-lexical_index_to_sssom branch from af6205a to c8f8bd3 Compare January 20, 2024 00:18
@@ -247,6 +247,7 @@ def lexical_index_to_sssom(
ruleset: MappingRuleCollection = None,
meta: Optional[Dict[str, t.Any]] = None,
prefix_map: Union[None, Converter, t.Mapping[str, str]] = None,
extended_prefix_map: t.Iterable[Dict[str, t.Any]] = None,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

extended_prefix_map param added

I also wanted it to have the same type signature as in curies, which is LocationOr[Iterable[Union[Record, Dict[str, Any]]]] = None. However, when I did from curies.mapping_service.utils import Records, I got ModuleNotFoundError: No module named 'defusedxml'. I definitely don't think it's worth installing an additional package in OAK simply for better / more consistent typing.

I suppose an alternative to adding a new param would simply to overload the prefix_map param even more, but I opted not because itis already overloaded. I actually don't like this param because it supports either a prefix map Dict or a Converter object, and I think that makes it a little unclear / poorly named.

@@ -265,6 +266,9 @@ def lexical_index_to_sssom(
:param ensure_strict_prefixes: If true, prefixes & mappings in SSSOM MappingSetDataFrame will be filtred.
:return: SSSOM MappingSetDataFrame object.
"""
if prefix_map and extended_prefix_map:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exception handling

Throws error if both prefix_map and extended_prefix_map passed, as that creates ambiguity. The user should pick one.

@@ -313,7 +317,11 @@ def lexical_index_to_sssom(

converter = curies.chain(
[
Converter.from_prefix_map((meta or {}).pop(CURIE_MAP, {})),
Converter.from_extended_prefix_map(extended_prefix_map)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How the param is used

This is handles the various scenarios in which a user might pass a prefix map. Before my changes there were already 2 ways in which a user might pass it, either within meta or the prefix_map args. There may be some redundancy here with ensure_converter() that I may not have fully considered.

@joeflack4
Copy link
Contributor Author

joeflack4 commented Jan 20, 2024

@cmungall I don't mind if you don't merge this. If you don't mind, please start by reading the "context" in the OP. Just thought I might as well share what I'd done here and bring up the topic of EPM support to attention.

@cthoyt OAK could use more EPM support; you probably already knew that.

@twhetzel CC

@codecov-commenter
Copy link

codecov-commenter commented Jan 20, 2024

Codecov Report

Attention: 1 lines in your changes are missing coverage. Please review.

Comparison is base (66998cd) 76.39% compared to head (c8f8bd3) 76.35%.

Files Patch % Lines
src/oaklib/utilities/lexical/lexical_indexer.py 50.00% 1 Missing ⚠️

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #697      +/-   ##
==========================================
- Coverage   76.39%   76.35%   -0.04%     
==========================================
  Files         256      256              
  Lines       29720    29736      +16     
==========================================
+ Hits        22705    22706       +1     
- Misses       7015     7030      +15     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Collaborator

@cmungall cmungall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @joeflack4!

I am not a fan of nested ternaries, they can make code harder to read (see for example https://medium.com/@benlmsc/stop-using-nested-ternary-operators-heres-why-53e7e078e65a). Not everyone agrees and I would happily defer to @cthoyt as my style guru here.

(I have been wrong before and have come to embrace guard expressions)

There is an additional issue here in that you appear to be mutating a dictionary the user passes in. This is usually considered somewhat rude. I think writing out your list comprehension with the logic more explicit is worth some verbosity here.

@joeflack4
Copy link
Contributor Author

RE: Nested ternaries, I was actually going to comment about that but decided not worth the breath! lol

Good point about the mutation .pop(CURIE_MAP). You're right. This is how it was before my changes.

@joeflack4
Copy link
Contributor Author

@cmungall I'm gonna close this. As I mentioned in OP this is not too important, and not a comprehensive solution to EPMs.

I've opened this issue instead:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants