add scClassify vignette
ycao6928 committed Nov 29, 2024

1 parent fe4693d commit d7b85a8
Showing 7 changed files with 2,089 additions and 190 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -3,6 +3,6 @@

Data can be accessed from dropbox: https://www.dropbox.com/scl/fi/6icd5vix870uoffv9p3zb/data.zip?rlkey=hu1tvpbdg0msykrud05hbclj6&st=2qbsk235&dl=0

Website at https://sydneybiox.github.io/HKU_SCDNEY_2024/
Website at https://sydneybiox.github.io/HKUST_workshop/

Slide at https://www.dropbox.com/scl/fi/3f9wxsd4rnq3a5mf44nwc/HKU_Workshop2024_v1_morning.pptx?rlkey=gq8xjeuktayokkie0q1ogeq8s&dl=0
3 changes: 2 additions & 1 deletion vignettes/VisiumVersion3.Rmd
@@ -1,6 +1,7 @@
---
title: "Unlocking single cell spatial omics analyses with scdney - Visium"
author: Yue Cao,
Daniel Kim,
Andy Tran,
Dario Strbenac,
Nicholas Robertson
@@ -10,7 +11,7 @@ affiliation:
- School of Mathematics and Statistics, University of Sydney, Australia;
- Charles Perkins Centre, University of Sydney, Australia;
- Laboratory of Data Discovery for Health Limited (D24H), Science Park, Hong Kong SAR, China.
date: 15 October, 2023
date: 29 November, 2024

params:
evalc: TRUE ## EDIT to TRUE when generating output, otherwise 'FALSE'
398 changes: 219 additions & 179 deletions vignettes/VisiumVersion3.html

Large diffs are not rendered by default.

4 changes: 2 additions & 2 deletions vignettes/breastCancerIMC.Rmd
@@ -1,8 +1,8 @@
---
title: "Unlocking single cell spatial omics analyses with scdney"
author: Yue Cao,
Andy Tran,
Daniel Kim,
Andy Tran,
Dario Strbenac,
Nicholas Robertson,
Helen Fu,
@@ -12,7 +12,7 @@ affiliation:
- School of Mathematics and Statistics, University of Sydney, Australia;
- Faculty of Medicine and Health, University of Sydney, Australia;
- Charles Perkins Centre, University of Sydney, Australia;
date: 24 July, 2024
date: 29 November, 2024
params:
evalc: TRUE ## EDIT to TRUE when generating output, otherwise 'FALSE'
show: 'hide' ## EDIT to 'as.is' when generating Suggestions, otherwise 'hide'
14 changes: 7 additions & 7 deletions vignettes/breastCancerIMC.html

Large diffs are not rendered by default.

127 changes: 127 additions & 0 deletions vignettes/scClassify.Rmd
@@ -0,0 +1,127 @@
---
title: "Cell type classification with scClassify"
author: Yue Cao,
  Daniel Kim,
  Andy Tran,
  Dario Strbenac,
  Nicholas Robertson,
  Helen Fu,
  Jean Yang
affiliation:
- Sydney Precision Data Science Centre, University of Sydney, Australia;
- School of Mathematics and Statistics, University of Sydney, Australia;
- Faculty of Medicine and Health, University of Sydney, Australia;
- Charles Perkins Centre, University of Sydney, Australia;
date: 29 November, 2024
params:
  evalc: TRUE ## EDIT to TRUE when generating output, otherwise 'FALSE'
  show: 'hide' ## EDIT to 'as.is' when generating Suggestions, otherwise 'hide'
output:
  html_document:
    css: https://use.fontawesome.com/releases/v5.0.6/css/all.css
    code_folding: hide
    fig_height: 12
    fig_width: 12
    toc: yes
    number_sections: false
    toc_depth: 3
    toc_float: yes
    self_contained: true
editor_options:
  markdown:
    wrap: 72
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE, message = FALSE, warning = FALSE)
```



```{r}
library(scClassify)
library(ggplot2)
library(reshape2)
```


## Overview

scClassify classifies cells in single-cell RNA-sequencing data using one or more reference datasets. It takes as input a normalised (i.e., log-transformed) training (reference) dataset with known cell types and a normalised testing (query) dataset whose cell types are to be predicted.
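
If the data start as raw counts, they need to be normalised first. The following is a minimal sketch of one common approach, assuming library-size scaling followed by log2(x + 1). The helper `log_normalise`, its `scale_factor` argument, and the toy matrix are illustrative assumptions and are not part of scClassify.

```r
# Toy raw count matrix: genes in rows, cells in columns (hypothetical data).
counts <- matrix(c( 0,  4,  6,
                   10,  0,  4,
                   10, 16, 10),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(paste0("gene", 1:3), paste0("cell", 1:3)))

# Hypothetical helper: scale every cell to the same total count,
# then apply log2(x + 1) so that zero counts stay at zero.
log_normalise <- function(counts, scale_factor = 1e4) {
  lib_size <- colSums(counts)
  scaled <- sweep(counts, 2, lib_size, FUN = "/") * scale_factor
  log2(scaled + 1)
}

logcounts <- log_normalise(counts)
logcounts
```

Other normalisation schemes may be equally appropriate; the point is only that both the training and the testing matrices should be on a comparable log scale before classification.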

For demonstration purposes, we use subsets of single-cell pancreas datasets from two independent studies (Wang et al. and Xin et al.): the Xin data serve as the training set and the Wang data as the testing set.


## Loading the data

```{r}
data("scClassify_example")
# training data
training_celltype <- scClassify_example$xin_cellTypes
training_data <- scClassify_example$exprsMat_xin_subset
# testing data
# here we also extract the annotated cell types of the testing data
# so that we can compare the predicted labels with the annotated ones
testing_celltype <- scClassify_example$wang_cellTypes
testing_data <- scClassify_example$exprsMat_wang_subset
```
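
Before running the classifier it can help to confirm that the inputs line up. The sketch below assumes that genes are rows and cells are columns, and that gene identifiers are stored as row names. `check_inputs` is a hypothetical helper written for this illustration, and the toy matrices merely stand in for `training_data` and `testing_data`.

```r
# Hypothetical helper (not part of scClassify): basic consistency checks.
check_inputs <- function(train, train_labels, test) {
  # One label per training cell (cells are assumed to be columns).
  stopifnot(length(train_labels) == ncol(train))
  # Genes are assumed to be row names; count how many are shared.
  shared <- intersect(rownames(train), rownames(test))
  if (length(shared) == 0) {
    stop("training and testing data share no genes")
  }
  list(n_shared_genes = length(shared),
       cells_per_type = table(train_labels))
}

# Toy matrices standing in for the real training and testing data.
train <- matrix(1:12, nrow = 4,
                dimnames = list(paste0("g", 1:4), paste0("a", 1:3)))
test <- matrix(1:6, nrow = 3,
               dimnames = list(paste0("g", 2:4), paste0("b", 1:2)))
labels <- c("alpha", "beta", "alpha")

info <- check_inputs(train, labels, test)
info$n_shared_genes
info$cells_per_type
```

With the objects loaded above, the equivalent call would be `check_inputs(training_data, training_celltype, testing_data)`.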


## Running scClassify

```{r fig.height=6, fig.width=6, warning=FALSE}
scClassify_res <- scClassify(exprsMat_train = training_data,
                             cellTypes_train = training_celltype,
                             exprsMat_test = testing_data,
                             cellTypes_test = testing_celltype, # or leave out if testing cell type unknown
                             tree = "HOPACH",
                             algorithm = "WKNN",
                             selectFeatures = c("limma"),
                             similarity = c("pearson"),
                             returnList = FALSE,
                             verbose = FALSE)
```

## Checking the results


We can inspect the cell type tree built from the reference data:

```{r fig.height=4, fig.width=4}
plotCellTypeTree(cellTypeTree(scClassify_res$trainRes))
```

Next, cross-tabulate the predicted cell types against the annotated cell types of the testing data:

```{r}
confusion_matrix <- table(scClassify_res$testRes$test$pearson_WKNN_limma$predRes, testing_celltype)
confusion_matrix
```
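
A confusion matrix can be summarised with a few numbers. The sketch below uses toy label vectors (hypothetical, not real scClassify output) to compute overall accuracy and per-type recall. It assumes the predictions may contain labels that are absent from the annotation, such as an unassigned category, so the table is not required to be square.

```r
# Toy predicted and annotated labels (hypothetical data).
predicted <- c("alpha", "alpha", "beta", "unassigned", "delta", "beta")
actual    <- c("alpha", "alpha", "beta", "beta",       "delta", "alpha")

toy_confusion <- table(Predicted = predicted, Actual = actual)

# Overall accuracy: fraction of cells whose prediction matches the annotation.
overall_accuracy <- mean(predicted == actual)

# Per-type recall: for each annotated type, the fraction predicted correctly.
per_type_recall <- sapply(colnames(toy_confusion), function(ct) {
  if (ct %in% rownames(toy_confusion)) {
    toy_confusion[ct, ct] / sum(toy_confusion[, ct])
  } else {
    0  # this type was never predicted
  }
})

overall_accuracy
per_type_recall
```

On the real results, the same idea would apply by comparing the predicted labels used to build `confusion_matrix` with `testing_celltype`, assuming both are in the same cell order.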


Finally, visualise the confusion matrix as a heatmap.



```{r fig.height=4, fig.width=6}
# Convert the table into a data frame for ggplot
conf_matrix_df <- as.data.frame(confusion_matrix)
colnames(conf_matrix_df) <- c("Predicted", "Actual", "Count")
# Create the heatmap
ggplot(conf_matrix_df, aes(x = Actual, y = Predicted, fill = Count)) +
  geom_tile(color = "white") +
  scale_fill_gradient(low = "white", high = "steelblue") +
  geom_text(aes(label = Count), color = "black")
```
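
When cell types differ greatly in size, raw counts can make rare types hard to read in the heatmap. One option, sketched here on a toy table (hypothetical data), is to convert each column to proportions so that every annotated cell type sums to 1. The names `prop_matrix` and `prop_df` are illustrative.

```r
# Toy confusion table (hypothetical data).
toy_confusion <- table(
  Predicted = c("alpha", "alpha", "beta", "beta", "beta", "delta"),
  Actual    = c("alpha", "alpha", "alpha", "beta", "beta", "delta")
)

# Divide each column by its total so every annotated type sums to 1.
prop_matrix <- prop.table(toy_confusion, margin = 2)

# Long format, ready to be used as the data for a heatmap.
prop_df <- as.data.frame(prop_matrix)
colnames(prop_df) <- c("Predicted", "Actual", "Proportion")
prop_df
```

The resulting data frame can be passed to the same `ggplot` call as the heatmap above, with `fill = Proportion` in place of `fill = Count`.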

1,731 changes: 1,731 additions & 0 deletions vignettes/scClassify.html

Large diffs are not rendered by default.
