Commit

Anndata integration (#60)
* removed plotting

* cunnData deprecated

* added anndata and 64 bit support

* switch from scanpy_gpu to tools

* start to updating docs

* update test to work with anndata

* swtiched harmony to pp

* switched harmony to pp

* updated docs

* fix docs

* added utils

* updated documentation

* updated docs

* move to anndata

* updated doc strings

* use next anndata

* adata version

* updated docs

* try to fix utils  docs

* update for dense hvg

* added dense normalization test

* update and test utils

* update test

* Update docs/Usage_Principles.md

Co-authored-by: Lukas Heumos <[email protected]>

* Update docs/Usage_Principles.md

Co-authored-by: Lukas Heumos <[email protected]>

* Update docs/Usage_Principles.md

Co-authored-by: Lukas Heumos <[email protected]>

* Update docs/Usage_Principles.md

Co-authored-by: Lukas Heumos <[email protected]>

* Update docs/Usage_Principles.md

Co-authored-by: Lukas Heumos <[email protected]>

* Update docs/Usage_Principles.md

Co-authored-by: Lukas Heumos <[email protected]>

* update api docs

* updated docs

* test docs

* fix docs

* added release-notes

* updated 0.8.1

* fix for realease-notedocs

* fixed links

* try to fix release-notes

* bumped scanpydoc

* test css

* make c array safe

* enables cunndata to anndata_GPU

* update notebooks

* updated docs

* maybe fix docs

* update utils docstring

* reset docs

* added extlinks

* fixes AnnData

* update AnnData

* added AnnData

* updated docs

* fixes broken link

* test releasenotes

* updated conf.py

* creates utils in docs

* renamed Params Parameters

* switched to boolean index

* added batching

* prepare for transformer API

* enhance harmony reproducability

* harmony 64 bit

* work with dense matrices

* added release node

* updated ruff

* update notebooks

* remove cpx raw import

* remove cugraph default import

* fixed import

* make doc-strings pretty

* added leiden disclaimer to notebooks

* added test for harmonypy

* added import test

* added profimp for testing

* updated anndata-dep

---------

Co-authored-by: Lukas Heumos <[email protected]>
Co-authored-by: Philipp A <[email protected]>
3 people authored Sep 9, 2023
1 parent b65e766 commit ba36e67
Showing 77 changed files with 3,916 additions and 3,663 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -12,6 +12,7 @@ __pycache__/
# Sphinx documentation
/docs/_build/
/docs/generated/
/docs/api/generated/

# Venvs
*venv/
2 changes: 1 addition & 1 deletion .pre-commit-config.yaml
@@ -14,7 +14,7 @@ repos:
hooks:
- id: black
- repo: https://github.com/astral-sh/ruff-pre-commit
rev: v0.0.286
rev: v0.0.287
hooks:
- id: ruff
args: [--fix, --exit-non-zero-on-fix]
48 changes: 33 additions & 15 deletions docs/Usage_Principles.md
@@ -6,36 +6,46 @@
import rapids_singlecell as rsc
```

## cunnData
## Workflow

```{image} _static/cunndata.svg
:width: 500px
```
The workflow of *rapids-singlecell* is essentially the same as *scanpy's*. The main difference is the speed at which *rsc* can analyze the data. For more information, please check out the notebooks and API documentation.

The {class}`~rapids_singlecell.cunnData.cunnData` object replaces {class}`~anndata.AnnData` for preprocessing. All {mod}`~.pp` and {mod}`~.pl` functions are aimed towards cunnData. {attr}`~.X` and {attr}`~.layers` are stored on the GPU. The other components are stored in the host memory.
### AnnData setup

## Workflow
With the release of version 0.10.0, {class}`~anndata.AnnData` supports GPU arrays and sparse matrices.

The workflow of *rapids-singlecell* is essentially the same as *scanpy's*. The main difference is the speed at which *rsc* can analyze the data. For more information, please check out the notebooks and API documentation.
Rapids-singlecell leverages this capability to perform analyses directly on GPU-enabled {class}`~anndata.AnnData` objects. This also leads to the deprecation of {class}`~rapids_singlecell.cunnData.cunnData` and its removal in early 2024.

To move your {class}`~anndata.AnnData` object onto the GPU, set {attr}`~anndata.AnnData.X` or any entry of {attr}`~anndata.AnnData.layers` to a GPU-based matrix.

```
adata.X = cpx.scipy.sparse.csr_matrix(adata.X) # moves `.X` to the GPU
adata.X = adata.X.get() # moves `.X` back to the CPU
```

You can also use {mod}`rapids_singlecell.utils` to move arrays and matrices.

```
rsc.utils.anndata_to_GPU(adata) # moves `.X` to the GPU
rsc.utils.anndata_to_CPU(adata) # moves `.X` to the CPU
```
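
To confirm where a matrix currently lives, one option is to inspect the package that defines its type. The following is a minimal sketch and not part of *rapids-singlecell*: `is_gpu_backed` is a hypothetical helper, and it assumes that GPU containers are defined by the `cupy` or `cupyx` packages.

```python
def is_gpu_backed(matrix) -> bool:
    """Return True if `matrix` is defined by CuPy (dense) or cupyx (sparse).

    Hypothetical helper for illustration only. It looks at the module name
    of the object's type, so it runs without importing CuPy.
    """
    root_package = type(matrix).__module__.split(".")[0]
    return root_package in {"cupy", "cupyx"}


# A plain Python list is defined in `builtins`, so it counts as host memory.
print(is_gpu_backed([1, 2, 3]))  # prints False
```

On a machine with a GPU, `is_gpu_backed(adata.X)` would be expected to return `True` after `rsc.utils.anndata_to_GPU(adata)` and `False` after `rsc.utils.anndata_to_CPU(adata)`.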

### Preprocessing

The preprocessing is handled by {class}`~rapids_singlecell.cunnData.cunnData` and `cunnData_funcs`. The latter is imported as {mod}`~.pp` and {mod}`~.pl` to mimic the behavior of scanpy.
Preprocessing works with both {class}`~anndata.AnnData` and {class}`~rapids_singlecell.cunnData.cunnData`. The {mod}`~.pp` module offers accelerated versions of functions within {mod}`scanpy.pp`.

Example:
```
rsc.pp.highly_variable_genes(cudata,n_top_genes=5000,flavor="seurat_v3",batch_key= "PatientNumber",layer = "counts")
cudata = cudata[:,cudata.var["highly_variable"]==True]
rsc.pp.regress_out(cudata,keys=["n_counts", "percent_MT"])
rsc.pp.scale(cudata,max_value=10)
rsc.pp.highly_variable_genes(adata, n_top_genes=5000, flavor="seurat_v3", batch_key="PatientNumber", layer="counts")
adata = adata[:, adata.var["highly_variable"]]
rsc.pp.regress_out(adata, keys=["n_counts", "percent_MT"])
rsc.pp.scale(adata, max_value=10)
```
After preprocessing is done just transform the {class}`~rapids_singlecell.cunnData.cunnData` into {class}`~anndata.AnnData` and continue the analysis.

### Tools

The functions provided in {mod}`~.tl` are designed to manipulate the {class}`~anndata.AnnData` object. They serve as near drop-in replacements for the functions in *scanpy*, but offer significantly improved performance. Consequently, you can continue to use scanpy's plotting API with ease.

All {mod}`~.tl` functions operate on {class}`~anndata.AnnData`, which is why {func}`~.harmony_integrate` has transitioned from the `.pp` module to `.tl`. This also explains the existence of two distinct functions for calculating principal components: one for {class}`~rapids_singlecell.cunnData.cunnData` (within the `.pp` module) and another for {class}`~anndata.AnnData` (within the `.tl` module).
The functions provided in {mod}`~.tl` are designed as near drop-in replacements for the functions in {mod}`scanpy.tl`, but offer significantly improved performance. Consequently, you can continue to use scanpy's plotting API.

Example:
```
@@ -55,3 +65,11 @@ rsc.dcg.run_mlm(mat=adata, net=net, source='source', target='target', weight='we
acts_mlm = dc.get_acts(adata, obsm_key='mlm_estimate')
sc.pl.umap(acts_mlm, color=['KLF5',"FOXA1", 'CellType'], cmap='coolwarm', vcenter=0)
```

### cunnData (deprecated)

```{image} _static/cunndata.svg
:width: 500px
```

The {class}`~rapids_singlecell.cunnData.cunnData` object can replace {class}`~anndata.AnnData` for preprocessing. All {mod}`~.pp` functions (except {func}`~.pp.neighbors`) are aimed towards cunnData. {attr}`~.X` and {attr}`~.layers` are stored on the GPU. The other components are stored in the host memory.
20 changes: 20 additions & 0 deletions docs/_static/css/override.css
@@ -0,0 +1,20 @@
/* for the sphinx design cards */
body {
--sd-color-shadow: dimgrey;
}

dl.citation>dt {
float: left;
margin-right: 15px;
font-weight: bold;
}

/* for custom small role */
.small {
font-size: 40% !important
}

.smaller,
.pr {
font-size: 70% !important
}
152 changes: 0 additions & 152 deletions docs/api.md

This file was deleted.

13 changes: 13 additions & 0 deletions docs/api/cunndata.md
@@ -0,0 +1,13 @@
# cunnData

{class}`~rapids_singlecell.cunnData.cunnData` is deprecated and will be removed in 2024. Please start switching to {class}`~anndata.AnnData`.

```{eval-rst}
.. module:: rapids_singlecell.cunnData
.. currentmodule:: rapids_singlecell
.. autosummary::
:toctree: generated
cunnData.cunnData
```
14 changes: 14 additions & 0 deletions docs/api/decoupler_gpu.md
@@ -0,0 +1,14 @@
# decoupler-GPU: `dcg`

{mod}`decoupler` contains different statistical methods to extract biological activities. {mod}`rapids_singlecell.dcg` accelerates some of these methods.

```{eval-rst}
.. module:: rapids_singlecell.dcg
.. currentmodule:: rapids_singlecell
.. autosummary::
:toctree: generated
dcg.run_mlm
dcg.run_wsum
```
18 changes: 18 additions & 0 deletions docs/api/index.md
@@ -0,0 +1,18 @@
# API

Import rapids-singlecell as:

```
import rapids_singlecell as rsc
```


```{toctree}
:maxdepth: 2
scanpy_gpu
squidpy_gpu
decoupler_gpu
utils
cunndata
```