You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Is your feature request related to a problem? Please describe.
spatialdata.aggregate does not support 3D,
spatialdata.aggregate can be slow on large images.
spatialdata.aggregate does not allow aggregation of a labels layer with a points layer.
Describe the solution you'd like
There is currently no 3D support in spatialdata.aggregate, e.g. aggregate of two raster elements:
from spatialdata.datasets import blobs
from spatialdata import read_zarr
sdata = blobs( length = 2000 )
sdata.write( os.path.join( path, "sdata_blobs.zarr" ) )
sdata = read_zarr( sdata.path )
import spatialdata
import dask.array as da
arr = sdata["blobs_image"].data
arr = da.stack([arr, arr]).transpose(1, 0, 2, 3)
se=spatialdata.models.Image3DModel.parse(
arr
)
sdata["blobs_image_3d"] = se
if sdata.is_backed():
sdata.write_element("blobs_image_3d")
sdata = read_zarr(sdata.path)
arr = sdata["blobs_labels"].data
arr = da.stack([arr, arr])
se=spatialdata.models.Labels3DModel.parse(
arr,
)
sdata["blobs_labels_3d"] = se
if sdata.is_backed():
sdata.write_element("blobs_labels_3d")
sdata = read_zarr(sdata.path)
from spatialdata import aggregate
adata = aggregate(
values=sdata["blobs_image_3d"],
by=sdata["blobs_labels_3d"],
)
adata["table"].to_df() #-> raises NotImplementedError
Aggregate can also be slow on large images:
import spatialdata
import dask.array as da
se=spatialdata.models.Image2DModel.parse(
da.tile( sdata[ "blobs_image" ].data , (1, 20, 20)).rechunk(4000),
)
sdata["blobs_image_large"] = se
if sdata.is_backed():
sdata.write_element("blobs_image_large")
sdata = read_zarr(sdata.path)
se=spatialdata.models.Labels2DModel.parse(
da.tile( sdata[ "blobs_labels" ].data , ( 20, 20)).rechunk(4000),
)
sdata["blobs_labels_large"] = se
if sdata.is_backed():
sdata.write_element("blobs_labels_large")
sdata = read_zarr(sdata.path)
from spatialdata import aggregate
adata = aggregate(
values=sdata["blobs_image_large"],
by=sdata["blobs_labels_large"],
)
adata["table"].to_df() # -->takes around 6m on a mac M2, 32Gb ram
While providing support for 3D with the current implementation using xrspatial.zonal_stats is quite straightforward (i.e. just iterate over the z-stacks), I think it makes sense to think about a custom implementation that does not rely on xrspatial.zonal_stats for aggregation of raster data, and that is faster. E.g. the solution here to do aggregation of an image layer with a labels layer ('sum') https://github.com/saeyslab/harpy/blob/4f557433168f5b4cf8b776c884214b3254e9e6d2/src/sparrow/table/_allocation_intensity.py#L251 , is both much faster (30s on same hardware), and supports 3D.
Aggregate does not support aggregation of points layer by a labels layer:
This means we have to convert a labels layer first to a shapes layer before we can do aggregation. Also, because shapes layer has limited support in 3D, supporting aggregation using a labels layer makes sense.
I understand that full 3D support is probably not the main priority, but to future prove some features of spatialdata I think it can be worthwhile to already think about it.
The text was updated successfully, but these errors were encountered:
Is your feature request related to a problem? Please describe.
spatialdata.aggregate
does not support 3D,spatialdata.aggregate
can be slow on large images.spatialdata.aggregate
does not allow aggregation of a labels layer with a points layer.Describe the solution you'd like
There is currently no 3D support in
spatialdata.aggregate
, e.g. aggregate of two raster elements:Aggregate can also be slow on large images:
While providing support for 3D with the current implementation using
xrspatial.zonal_stats
is quite straightforward (i.e. just iterate over the z-stacks), I think it makes sense to think about a custom implementation that does not rely onxrspatial.zonal_stats
for aggregation of raster data, and that is faster. E.g. the solution here to do aggregation of an image layer with a labels layer ('sum') https://github.com/saeyslab/harpy/blob/4f557433168f5b4cf8b776c884214b3254e9e6d2/src/sparrow/table/_allocation_intensity.py#L251 , is both much faster (30s on same hardware), and supports 3D.Aggregate does not support aggregation of points layer by a labels layer:
This means we have to convert a labels layer first to a shapes layer before we can do aggregation. Also, because shapes layer has limited support in 3D, supporting aggregation using a labels layer makes sense.
I understand that full 3D support is probably not the main priority, but to future prove some features of
spatialdata
I think it can be worthwhile to already think about it.The text was updated successfully, but these errors were encountered: