Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: update validation for obs['in_tissue'] to include descendants of Visiium #1124

Merged
merged 8 commits into from
Nov 25, 2024

Conversation

ejmolinelli
Copy link
Contributor

Reason for Change

Changes

  • update _validate_spatial_tissue_position to support descendants of Visium
  • refactor _is_visium_including_descendants to work around non-idempotent dependence on ONTOLOGY_PARSER COG#246

Testing

  • test_schema_compliance.test_column_presence_in_tissue tests descendants and non descendants
pip install git+https://github.com/chanzuckerberg/single-cell-curation/@main#subdirectory=cellxgene_schema_cli

Notes for Reviewer

  • the implementation for checking descendants relies on a new utility method. This was just a way to have a single method implementation that can be used as an apply() function and be cached to avoid unnecessary graph traversals in ontology

@ejmolinelli ejmolinelli requested a review from joyceyan November 21, 2024 21:11
from scipy import sparse
from xxhash import xxh3_64_intdigest

logger = logging.getLogger(__name__)

SPARSE_MATRIX_TYPES = {"csc", "csr", "coo"}

ONTOLOGY_PARSER = OntologyParser(schema_version="v5.3.0")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i would prefer passing in the ONTOLOGY_PARSER that we create an instance of in validate.py into is_ontological_descendant_of, that way we only have one place to update when we bump the schema version

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, I see your point. I definitely agree we should minimize (to 1) the number of places a bump needs to be reflected. I'll do that.

One question I have is, why is the version in the code at all and not as a configuration option?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this was previously set to something like get_current_schema_version(). however, the current schema version is 5.2.0, because we bump the version number as part of the automated release process. and in the past, we would usually bump the ontology versions in COG completing all the feature changes to the validator. there's some more context in this slack thread but basically i decided hard-coding this was easier than needing to change some of our automated release process to make sure we're not double-bumping the version number before creating the new 5.3.0 release.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've followed your suggestion. Good looking out @joyceyan

@ejmolinelli ejmolinelli force-pushed the ejmolinelli/visium-vldt-obs-tissue branch from 5515ff6 to f049b42 Compare November 25, 2024 14:43
@ejmolinelli ejmolinelli merged commit 0c9f9af into main Nov 25, 2024
7 of 8 checks passed
@ejmolinelli ejmolinelli deleted the ejmolinelli/visium-vldt-obs-tissue branch November 25, 2024 14:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants