concatenate tables or not #635

Open
wangjiawen2013 opened this issue Jul 12, 2024 · 1 comment


wangjiawen2013 commented Jul 12, 2024

Hi,
This is a Visium HD object from the spatialdata documentation; it contains three tables:

SpatialData object with:
├── Images
│     ├── 'Visium_HD_Mouse_Small_Intestine_cytassist_image': SpatialImage[cyx] (3, 3000, 3200)
│     ├── 'Visium_HD_Mouse_Small_Intestine_full_image': MultiscaleSpatialImage[cyx] (3, 21943, 23618), (3, 10971, 11809), (3, 5485, 5904), (3, 2742, 2952), (3, 1371, 1476)
│     ├── 'Visium_HD_Mouse_Small_Intestine_hires_image': SpatialImage[cyx] (3, 5575, 6000)
│     └── 'Visium_HD_Mouse_Small_Intestine_lowres_image': SpatialImage[cyx] (3, 558, 600)
├── Shapes
│     ├── 'Visium_HD_Mouse_Small_Intestine_square_002um': GeoDataFrame shape: (5479660, 2) (2D shapes)
│     ├── 'Visium_HD_Mouse_Small_Intestine_square_008um': GeoDataFrame shape: (351817, 2) (2D shapes)
│     └── 'Visium_HD_Mouse_Small_Intestine_square_016um': GeoDataFrame shape: (91033, 2) (2D shapes)
└── Tables
      ├── 'square_002um': AnnData (5479660, 19059)
      ├── 'square_008um': AnnData (351817, 19059)
      └── 'square_016um': AnnData (91033, 19059)
with coordinate systems:
▸ 'downscaled_hires', with elements:
        Visium_HD_Mouse_Small_Intestine_hires_image (Images), Visium_HD_Mouse_Small_Intestine_square_002um (Shapes), Visium_HD_Mouse_Small_Intestine_square_008um (Shapes), Visium_HD_Mouse_Small_Intestine_square_016um (Shapes)
▸ 'downscaled_lowres', with elements:
        Visium_HD_Mouse_Small_Intestine_lowres_image (Images), Visium_HD_Mouse_Small_Intestine_square_002um (Shapes), Visium_HD_Mouse_Small_Intestine_square_008um (Shapes), Visium_HD_Mouse_Small_Intestine_square_016um (Shapes)
▸ 'global', with elements:
        Visium_HD_Mouse_Small_Intestine_cytassist_image (Images), Visium_HD_Mouse_Small_Intestine_full_image (Images), Visium_HD_Mouse_Small_Intestine_square_002um (Shapes), Visium_HD_Mouse_Small_Intestine_square_008um (Shapes), Visium_HD_Mouse_Small_Intestine_square_016um (Shapes)

And this is a Xenium object from the spatialdata documentation; it contains only one table:

SpatialData object with:
├── Images
│     ├── 'he_image': MultiscaleSpatialImage[cyx] (3, 45087, 11580), (3, 22543, 5790), (3, 11271, 2895), (3, 5635, 1447), (3, 2817, 723)
│     └── 'morphology_focus': MultiscaleSpatialImage[cyx] (5, 17098, 51187), (5, 8549, 25593), (5, 4274, 12796), (5, 2137, 6398), (5, 1068, 3199)
├── Labels
│     ├── 'cell_labels': MultiscaleSpatialImage[yx] (17098, 51187), (8549, 25593), (4274, 12796), (2137, 6398), (1068, 3199)
│     └── 'nucleus_labels': MultiscaleSpatialImage[yx] (17098, 51187), (8549, 25593), (4274, 12796), (2137, 6398), (1068, 3199)
├── Points
│     └── 'transcripts': DataFrame with shape: (12165021, 11) (3D points)
├── Shapes
│     ├── 'cell_boundaries': GeoDataFrame shape: (162254, 1) (2D shapes)
│     ├── 'cell_circles': GeoDataFrame shape: (162254, 2) (2D shapes)
│     └── 'nucleus_boundaries': GeoDataFrame shape: (156628, 1) (2D shapes)
└── Tables
      └── 'table': AnnData (162254, 377)
with coordinate systems:
▸ 'global', with elements:
        he_image (Images), morphology_focus (Images), cell_labels (Labels), nucleus_labels (Labels), transcripts (Points), cell_boundaries (Shapes), cell_circles (Shapes), nucleus_boundaries (Shapes)

My question is: when I have multiple spatial transcriptomics datasets, should I concatenate the tables or not? If the tables are concatenated, downstream analyses such as PCA and UMAP will be affected, because the neighbours of each cell will differ from those computed on the separate tables.
It's common to concatenate multiple single-cell datasets, but I don't know whether it makes sense to concatenate spatial transcriptomics tables.

Member

LucaMarconato commented Jul 12, 2024

In these cases I would not concatenate the tables for the following reasons:

  • The Xenium table is at the single-cell level and the Visium HD tables are at the bin level, so concatenating them, even for a hypothetical dataset constructed from the same tissue slide, would introduce technical artifacts.
  • The Visium HD tables refer to different bin sizes, so the tables have different meanings.

My advice is that in general table concatenation should be considered when:

  1. there are multiple samples from the same technology (e.g. multiple Xenium samples);
  2. there are two technologies applied to the same tissue. Here it is crucial to first identify the same entities in both datasets (e.g. by overlaying the cell boundaries from one dataset onto the other); one could then combine the two matrices along axis 1 (merging the var). In this case muon could be of help.
