Commit
Update readme files for example data sets; update figure about data formats; update gitignore; update main readme.
Showing 12 changed files with 120 additions and 83 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +1,5 @@ | ||
.DS_Store | ||
Docker/log.stdout | ||
source/node_modules | ||
source/Cerebro | ||
source/Rplots.pdf | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,12 +1,12 @@ | ||
# scanpy workflow for `GSE108041` data set | ||
|
||
Here, we analyze the `GSE108041` data set using [scanpy](https://scanpy.readthedocs.io), following the [basics workflow](https://scanpy-tutorials.readthedocs.io/en/latest/pbmc3k.html) described on their website which includes similar steps as those performed in Seurat. | ||
Then, import the [AnnData](https://anndata.readthedocs.io/en/stable) object produced by scanpy, import it into Seurat, and from there export it to Cerebro. | ||
Here, we analyze the `GSE108041` data set ("Extreme heterogeneity of influenza virus infection in single cells", Russell *et al.*, eLIFE (2018), [DOI](https://doi.org/10.7554/eLife.32303), [GEO submission](https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE108041)) using [scanpy](https://scanpy.readthedocs.io), following the [basics workflow described on the scanpy website](https://scanpy-tutorials.readthedocs.io/en/latest/pbmc3k.html) which includes similar steps as those performed in Seurat. | ||
Then, import the [AnnData](https://anndata.readthedocs.io/en/stable) object produced by scanpy into Seurat, and from there export it to Cerebro. | ||
|
||
## Preparation | ||
|
||
Before starting, we clone the Cerebro repository (or manually download it) because it contains the raw data of our example data set. | ||
One (optional) step of our analysis will require us to provide some gene sets in a GMT file. | ||
One (optional) step of our analysis will require us to provide some gene sets in a `GMT` file. | ||
We manually download the `c2.all.v7.0.symbols.gmt` file from [MSigDB](http://software.broadinstitute.org/gsea/downloads.jsp#msigdb) and put it in our current working directory. | ||
Then, we pull the Docker image from the Docker Hub, convert it to Singularity, and start an R session inside. | ||
|
||
|
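For orientation, a `GMT` file stores one gene set per line as tab-separated fields: the set name, a description, and then the member genes. The following is a minimal sketch of reading such a file with the Python standard library. The function name `read_gmt` and the two example gene sets are hypothetical and only illustrate the format; this is not how the workflow itself consumes the file.

```python
import io


def read_gmt(handle):
    """Parse a GMT stream into {set_name: (description, [genes])}.

    Hypothetical helper for illustration only.
    """
    gene_sets = {}
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank lines and sets without any genes
        name, description, *genes = fields
        # Drop empty strings left behind by trailing tabs.
        gene_sets[name] = (description, [gene for gene in genes if gene])
    return gene_sets


# Made-up example content; a real file would be opened with open(path).
example = io.StringIO(
    "SET_A\thttp://example.org/a\tGENE1\tGENE2\tGENE3\n"
    "SET_B\tna\tGENE2\tGENE4\n"
)
gene_sets = read_gmt(example)
```

To read a downloaded file instead, pass an open file handle, for example `with open('c2.all.v7.0.symbols.gmt') as handle: gene_sets = read_gmt(handle)`.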
@@ -32,7 +32,7 @@ import scanpy as sc | |
|
||
## Load data | ||
|
||
For each of the three samples we load the transcript count matrix and then merge them together. | ||
For each of the four samples we load the transcript count matrix (`.h5` format), make feature names unique (some gene IDs share the same gene name), and then merge the transcript counts together. | ||
|
||
```python | ||
adata_uninfected = sc.read_10x_h5('raw_data/GSM2888370_Uninfected.h5') | ||
|
@@ -71,7 +71,7 @@ adata.obs['sample'].cat.reorder_categories( | |
|
||
Now, we... | ||
|
||
* remove cells with less than `100` transcripts or fewer than `50` expressed genes, | ||
* remove cells with fewer than `100` transcripts or `50` expressed genes, | ||
* calculate the number of transcripts per cell, and | ||
* remove genes expressed in fewer than `10` cells. | ||
|
||
|
@@ -91,7 +91,7 @@ np.savetxt('scanpy/raw_counts_genes.tsv', adata.var.index, fmt = '%s', delimiter | |
np.savetxt('scanpy/raw_counts_cells.tsv', adata.obs.index, fmt = '%s', delimiter = '\t') | ||
``` | ||
|
||
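The filtering thresholds listed above (at least `100` transcripts and `50` expressed genes per cell, genes expressed in at least `10` cells) can be illustrated on a plain NumPy matrix. This is a sketch under assumptions, not the code used in this workflow: the function `filter_counts` and the toy matrix are hypothetical, and the matrix is assumed to be dense with cells in rows and genes in columns. In scanpy, `sc.pp.filter_cells` and `sc.pp.filter_genes` provide this kind of filtering on an `AnnData` object.

```python
import numpy as np


def filter_counts(counts, min_counts=100, min_genes=50, min_cells=10):
    """Filter a cells x genes count matrix (hypothetical helper).

    Returns the filtered matrix plus the boolean masks of kept cells and genes.
    """
    counts = np.asarray(counts)

    # Per-cell totals: number of transcripts and number of expressed genes.
    transcripts_per_cell = counts.sum(axis=1)
    genes_per_cell = (counts > 0).sum(axis=1)
    keep_cells = (transcripts_per_cell >= min_counts) & (genes_per_cell >= min_genes)
    counts = counts[keep_cells]

    # Gene filter is applied after the cell filter, on the remaining cells.
    cells_per_gene = (counts > 0).sum(axis=0)
    keep_genes = cells_per_gene >= min_cells
    return counts[:, keep_genes], keep_cells, keep_genes


# Toy data with small thresholds so the effect is visible.
toy = np.array([
    [5, 0, 3],
    [0, 0, 1],
    [4, 2, 0],
    [6, 0, 2],
])
filtered, keep_cells, keep_genes = filter_counts(
    toy, min_counts=3, min_genes=2, min_cells=2
)
```

In the toy example, the second cell is dropped for having too few transcripts, and the second gene is dropped because it is expressed in only one of the remaining cells.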
What follows is the standard pre-processing procedure of... | ||
What follows is the standard pre-processing procedure, including the following steps... | ||
|
||
* normalizing transcript counts per cell, | ||
* bringing transcript counts to log-scale, | ||
|
@@ -151,7 +151,7 @@ sc.logging.print_versions() | |
Next,... | ||
|
||
* we hop into R, | ||
* set up some parameters, | ||
* set some parameters, | ||
* load packages, and | ||
* import the `.h5ad` file we just wrote to disk using the `ReadH5AD()` function from the Seurat package. | ||
|
||
|
@@ -181,7 +181,7 @@ levels([email protected]$phase) <- c('G1','G2M','S') | |
|
||
## Optional (but recommended) steps | ||
|
||
We could already export this object and visualize the contained in Cerebro. | ||
We could already export this object and visualize the contained data in Cerebro. | ||
However, data exploration in Cerebro would greatly benefit from additional data generated by the functions of cerebroApp. | ||
What follows is a set of (mostly) optional steps. | ||
|
||
|
@@ -234,7 +234,7 @@ [email protected]$tree.ident <- NULL | |
|
||
### Add 3D projections | ||
|
||
Let's also add 3D dimensional reductions for tSNE and UMAP. | ||
We also add 3D dimensional reductions made with tSNE and UMAP. | ||
|
||
```r | ||
seurat <- RunTSNE( | ||
|