"Interactive" Dotplot version: check for available var_names #3387

Open
FrancescaDr opened this issue Nov 25, 2024 · 1 comment

Comments

@FrancescaDr

What kind of feature would you like to request?

Additional function parameters / changed functionality / changed defaults?

Please describe your wishes

Feature

Make dotplot friendlier for interactive use: var_names that are not in the AnnData object should be ignored, and the returned dotplot should include only the var_names (i.e. genes) that are present.

This would be useful for interactive plotting in Jupyter notebooks, because canonical marker gene lists are often run on different AnnData objects that do not all share the same gene panel (especially for spatial transcriptomics data).

Plan

Check for available vars in the AnnData before plotting:

available_vars = adata.var_names
missing_vars = [name for name in var_names if name not in available_vars]
if missing_vars:
    logg.warning(
        f"The following variables were not found in the dataset and will be ignored: {', '.join(missing_vars)}"
    )
    var_names = [name for name in var_names if name in available_vars]
    if len(var_names) == 0:
        raise ValueError("No valid variable names found in the dataset")
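
For trying the behavior outside scanpy, the same check can be sketched as a standalone helper. The name filter_var_names is hypothetical, the standard logging module stands in for scanpy's logg, and var_names is assumed to be a single string or a flat sequence of strings (grouped inputs such as a dict of gene lists would need extra handling):

```python
import logging

logger = logging.getLogger(__name__)


def filter_var_names(var_names, available_vars):
    """Keep only the names present in `available_vars`, preserving order.

    Hypothetical helper, not part of scanpy.
    """
    if isinstance(var_names, str):
        var_names = [var_names]
    # A set gives O(1) membership tests; works for lists and pandas Index alike.
    available = set(available_vars)
    present = [name for name in var_names if name in available]
    missing = [name for name in var_names if name not in available]
    if missing:
        logger.warning(
            "The following variables were not found in the dataset "
            "and will be ignored: %s",
            ", ".join(missing),
        )
    if not present:
        raise ValueError("No valid variable names found in the dataset")
    return present


# Toy gene panel; in real use this would be adata.var_names.
panel = ["MS4A1", "CD3E", "LYZ"]
kept = filter_var_names(["CD3E", "MS4A1", "NOT_A_GENE"], panel)
print(kept)
```

The filtered list keeps the order of the requested markers, which matters for the column order of the plot.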

I am unsure whether this check should live in the DotPlot class, before it calls into BasePlot, or whether it transfers to other plots and can be added to the BasePlot class before the dataframe is prepared. @flying-sheep what is your take on this?

@FrancescaDr FrancescaDr added Enhancement ✨ Triage 🩺 This issue needs to be triaged by a maintainer labels Nov 25, 2024
@flying-sheep flying-sheep added Area - Plotting 🌺 and removed Enhancement ✨ Triage 🩺 This issue needs to be triaged by a maintainer labels Dec 16, 2024
@flying-sheep
Member

Hi! Since var_names is passed to BasePlot.__init__, this behavior is probably best handled there.

Happy to review your PR! Some tips:

  1. Check the other subclasses of BasePlot to see whether self.var_names is used anywhere in a way where ignoring missing names wouldn’t make sense.
  2. If it makes sense everywhere, you could do the check even before self.var_names is set in BasePlot.
  3. Use set operations when the order doesn’t matter, e.g. missing_vars = pd.Index(var_names).difference(adata.var_names)
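
As a small illustration of the set-operation tip, assuming pandas is available and using toy gene names in place of a real AnnData: Index.difference returns the elements of the calling index that are absent from the argument, so which side it is called on decides whether you get the missing names or the unused ones.

```python
import pandas as pd

# Toy stand-ins: `requested` plays the role of var_names,
# `available` the role of adata.var_names.
requested = pd.Index(["MS4A1", "NOT_A_GENE", "CD3E"])
available = pd.Index(["CD3E", "LYZ", "MS4A1"])

# Requested names that the dataset lacks.
missing = requested.difference(available)

# The reverse call answers a different question:
# dataset genes that were not requested.
unused = available.difference(requested)

# When order matters (e.g. for plotting), filter with a boolean mask instead.
present = requested[requested.isin(available)]

print(list(missing), list(unused), list(present))
```

The mask-based filter keeps the requested order, while difference is best reserved for reporting, such as the warning message.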
