feat(pecan): gene expression added and methods and data updated
1 parent 48f8cc8, commit d7bab19
Showing 4 changed files with 33,916 additions and 465 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -2,65 +2,106 @@ | |
title: Expression | ||
--- | ||
|
||
![Expression](./../expression.svg) | ||
## Overview | ||
|
||
**Overview:** The expression landscape of 3432 RNA-Seq fresh frozen tumor samples (1389 blood tumors, 888 solid tumors, and 1155 brain tumors) in St. Jude Cloud is displayed via a t-SNE plot (**Figure 1**) generated using the [St. Jude Cloud RNA-Seq Expression Analysis workflow](https://platform.stjude.cloud/workflows/rnaseq-expression-classification). | ||
This facet comprises three tabs, allowing users to explore the expression landscape of 3,432 RNA-Seq fresh frozen tumor samples (1,389 blood tumors, 888 solid tumors, and 1,155 brain tumors) using a t-SNE plot (**Figure 1**), gene expression violin plots organized by subtype for a gene of interest (**Figure 2**), gene expression overlaid on the t-SNE, or collectively within a data matrix. | |
|
||
![](./[email protected]) | ||
[NEED NEW IMAGE] | ||
|
||
**Figure 1: tSNE for Blood, Brain, and Solid Samples.** Metadata details for each sample can be accessed by mousing over the data points. This visualization is supported by D3. | ||
**Figure 1: t-SNE for Blood, Brain, and Solid Samples.** Mouse over data points to access metadata details for each sample. Visualization powered by D3. | ||
|
||
!!!note | ||
- All samples use the hg38 reference genome. | ||
- All metadata can be found by accessing our [manifest](https://platform.stjude.cloud/api/v1/manifest). | |
!!! | ||
[NEED NEW IMAGE] | ||
|
||
**Features:** | ||
A user can explore across the 3 tSNE plots: Blood, Solid, or Brain tumor tabs and employ the features listed below: | ||
**Figure 2: Gene Expression for TP53.** Gene expression violin plots for each sample, filtered by the gene of interest. Visualization powered by Plotly. | ||
|
||
*Subtype categorization*- Subtypes are denoted by a specific color and a subset have been labeled on the plot. | ||
> **Note** | ||
> - All samples use the hg38 reference genome. | ||
> - Full metadata can be accessed through our [manifest](https://platform.stjude.cloud/api/v1/manifest). | ||
*Sample Summary*- A user can select a data point on the plot that opens a sample summary drawer annotating relevant metadata and information. | |
--- | ||
|
||
## Features for the t-SNE Plot | ||
|
||
| Feature | Description | | ||
| -------------------------- | --------------------------------------------------------------------------------------------------------- | | ||
| **Subtype Categorization** | Subtypes are color-coded, and a subset is labeled on the plot. These can be turned off in the 3 dot menu. | | ||
| **Sample Summary** | Clicking a data point opens a drawer with metadata and sample details. | | ||
| **Filters** | Filters are categorized by Tumor Sample, Patient Phenotype, and Sample Preparation. | | ||
| **Sample Search** | Search by individual or bulk (comma-separated) sample IDs. CompBio IDs must be exact. | | ||
| **Lasso Tool** | Select a region on the plot to retrieve a list of samples for further investigation. | | ||
| **Pan/Zoom** | Zoom in or pan to examine specific regions of the plot. This will disable subtype labels. | | ||
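The exact-match, comma-separated behavior described for Sample Search can be sketched as follows. This is a minimal illustration, not the site's implementation: the function names and the sample IDs in the usage example are hypothetical, and the only behavior taken from the text is that IDs are comma-separated and must match exactly.

```python
def parse_sample_ids(query):
    """Split a comma-separated query into trimmed, non-empty sample IDs."""
    return [token.strip() for token in query.split(",") if token.strip()]


def search_samples(query, known_ids):
    """Return (found, missing) IDs using exact, case-sensitive matching.

    `known_ids` stands in for the IDs available on the plot (an assumption).
    No fuzzy matching is attempted, mirroring the "must be exact" rule.
    """
    known = set(known_ids)
    wanted = parse_sample_ids(query)
    found = [sample_id for sample_id in wanted if sample_id in known]
    missing = [sample_id for sample_id in wanted if sample_id not in known]
    return found, missing


if __name__ == "__main__":
    # Hypothetical IDs, for illustration only.
    found, missing = search_samples("SJ_A, SJ_B", ["SJ_A", "SJ_C"])
    print(found, missing)
```

Reporting the unmatched IDs separately is a design choice of this sketch; it makes a mistyped ID easy to spot when searching in bulk.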
|
||
[NEED NEW GIF] | ||
|
||
> **Warning** | ||
> Filtering by the sunburst will auto-populate the Root and Subtype filters. These can be manually edited but will not update the sunburst. | ||
*Filters* - Filters are organized by Tumor Sample, Patient Phenotype, and Sample Preparation. Once a filter is selected, the subtype labels will be disabled. Functionality of each is further described below. | ||
--- | ||
|
||
## Features for Gene Expression | ||
|
||
*Sample Search* - A user can search individual sampleIDs or bulk IDs that are comma separated. The sampleIDs must be exact and cannot be fuzzy searched. | ||
| Feature | Description | | ||
| -------------------- | -------------------------------------------------------------------------------------------- | | ||
| **Gene Sandbox** | Violin plots for the gene of interest, filtered by root and subtypes. | | ||
| **Plotly Functions** | Pan and zoom features on the right side of the gene sandbox do not affect filter components. | | ||
| **Median Sort** | Sort the gene expression sandboxes by median expression across or within individual groups. | | ||
| **Outlier Toggle** | Toggle off data points to keep outliers intact for the cohort currently being filtered. | | ||
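As a rough illustration of the Median Sort feature, the sketch below orders subtype groups by their median expression. The function name, the group labels, and the values are assumptions made for the example; the page does not state how the sort is implemented.

```python
from statistics import median


def sort_groups_by_median(groups, descending=True):
    """Return group names ordered by the median of their expression values.

    `groups` maps a group label (e.g. a subtype) to a non-empty list of
    expression values for the gene of interest.
    """
    return sorted(groups, key=lambda name: median(groups[name]), reverse=descending)


if __name__ == "__main__":
    # Made-up values, for illustration only.
    example = {"subtype_a": [1, 2, 3], "subtype_b": [10, 20, 30], "subtype_c": [5, 5, 5]}
    print(sort_groups_by_median(example))
```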
|
||
*Lasso* - Allows a user to select a specific region on the plot to retrieve a list of samples to enable further investigation. To view the sample summary of the lassoed samples, click the "Data" icon in the top right of the subnavbar. See GIF below. | ||
For data normalization details, refer to our [Methods and Data](https://university.stjude.cloud/docs/pecan/methods-data/) page. | ||
|
||
*Pan/Zoom* - Allows a user to examine regions of the plot in more detail; this disables any labels. | |
[NEED NEW GIF] | ||
--- | ||
|
||
![](./lasso.gif) | ||
## Gene Expression Overlay on t-SNE | ||
|
||
!!!warning | ||
Filtering by the sunburst will auto-populate the diagnosis and subtype filter. A user can edit this modal, but it will not update the sunburst. | ||
!!! | ||
Users can overlay gene expression on the t-SNE plot by selecting genes of interest. Count data is normalized using Median of Ratios (MoR). More details can be found on the [Methods and Data](https://university.stjude.cloud/docs/pecan/methods-data/) page. | ||
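For readers unfamiliar with Median of Ratios, the sketch below shows the general technique as popularized by DESeq2: each sample's size factor is the median, across genes, of the ratio of its count to that gene's geometric mean across samples. It is a plain-Python illustration under that assumption, not the pipeline used for this facet; the function name and input layout are hypothetical, and the Methods and Data page remains the authoritative description.

```python
import math
from statistics import median


def median_of_ratios(counts):
    """Normalize raw counts with the median-of-ratios technique.

    `counts` maps gene name -> list of raw counts, one per sample.
    Returns (size_factors, normalized_counts).
    """
    n_samples = len(next(iter(counts.values())))
    log_ratios = [[] for _ in range(n_samples)]
    for row in counts.values():
        # Genes with a zero count have no finite geometric mean; skip them.
        if any(c <= 0 for c in row):
            continue
        log_geo_mean = sum(math.log(c) for c in row) / n_samples
        for j, c in enumerate(row):
            log_ratios[j].append(math.log(c) - log_geo_mean)
    if not log_ratios[0]:
        raise ValueError("no gene has positive counts in every sample")
    # Size factor per sample: median ratio to the per-gene geometric mean.
    size_factors = [math.exp(median(r)) for r in log_ratios]
    normalized = {
        gene: [c / s for c, s in zip(row, size_factors)]
        for gene, row in counts.items()
    }
    return size_factors, normalized


if __name__ == "__main__":
    # Made-up counts: the second sample is sequenced twice as deeply.
    factors, norm = median_of_ratios({"g1": [10, 20], "g2": [100, 200], "g3": [5, 10]})
    print(factors, norm)
```

Using the median rather than the mean of the ratios keeps a handful of highly expressed or differentially expressed genes from dominating the size factor.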
|
||
**Filters Explained** | ||
[NEED NEW GIF] | ||
|
||
--- | ||
|
||
## Features for the Data Matrix | ||
|
||
The data matrix displays all filtered data with sortable headers for easier exploration. | ||
|
||
[NEED NEW GIF] | ||
|
||
--- | ||
|
||
## Filters Explained | ||
|
||
### Tumor Sample | ||
|
||
1. Sample ID - A user can search individual St. Jude CompBio IDs or bulk search IDs that are comma separated. This field allows for a multi-select. | ||
2. Subtype - This is a modal whereby a user can custom select which subtypes to view in the plot. Child nodes will automatically become enabled or disabled if a parent node is (de)selected. The number of samples and the subtype color are designated in the modal for reference. | |
3. Subtype Biomarker - This field allows a multi-select of subtype biomarkers to be applied to the plot. *Note: the user cannot apply a general gene like "CTNNB1" to be applied across the plot. The user must select all biomarkers they are interested in seeing from the dropdown* | ||
4. Sample Type - This field is a multi-select dropdown. | ||
| Filter | Description | | ||
| --------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------- | | ||
| **Sample ID** | Search by individual or bulk St. Jude CompBio IDs (comma-separated). Allows multi-select. | | ||
| **Subtype Root** | Custom-select a root to prompt applicable subtypes. Heme is defaulted upon loading the facet unless the sunburst is employed. | | ||
| **Subtype** | Custom-select subtypes to view on the plot. Parent node selection enables or disables child nodes. | | ||
| **Subtype Biomarker** | Multi-select subtype biomarkers to apply on the plot. General genes like "CTNNB1" are not accepted; users must select biomarkers from dropdown. | | ||
| **Sample Type** | Multi-select dropdown for sample types. | | ||
|
||
### Patient Phenotype | ||
|
||
1. Sex - A multi-select dropdown. | ||
2. Age at Diagnosis - A scale whereby a user can manually type in the age parameters or use the scale (in years). A user can type in any age, even past our "35+" parameter. | |
3. Race - This is a multi-select dropdown. | ||
4. Ethnicity - This is a multi-select dropdown. | |
| Filter | Description | | ||
| -------------------- | -------------------------------------------------- | | ||
| **Sex** | Multi-select dropdown for biological sex. | | ||
| **Age at Diagnosis** | Adjustable scale or manual input for age in years. | | ||
| **Race** | Multi-select dropdown for race. | | ||
| **Ethnicity** | Multi-select dropdown for ethnicity. | | ||
|
||
### Sample Preparation | ||
|
||
1. Library Selection Protocol - This is a multi-select dropdown. | ||
2. Preservative - This is a multi-select dropdown. | |
| Filter | Description | | ||
| ------------------------------ | ---------------------------------------------------- | | ||
| **Library Selection Protocol** | Multi-select dropdown for library protocol types. | | ||
| **Preservative** | Multi-select dropdown for sample preservative types. | | ||
|
||
> **Warning** | ||
> Some fields may have a "Not Available" option for samples where the data wasn't recorded (e.g., Race, Ethnicity, Sex). | ||
!!!warning | ||
There can be fields with a "Not Available" option for samples that did not have this value recorded (e.g., Race, Ethnicity, Sex). | ||
!!! | ||
> **Tip** | ||
> For a subset of this data, refer to [Figure 4f of McLeod et al.](https://cancerdiscovery.aacrjournals.org/content/11/5/1082.long) | ||
--- | ||
|
||
!!!tip | ||
An example with a subset of this data can be found in [Figure 4f of McLeod et al](https://cancerdiscovery.aacrjournals.org/content/11/5/1082.long). | ||
!!! | ||
To see how the data was calculated and normalized, visit our [Methods and Data](https://university.stjude.cloud/docs/pecan/methods-data/) page. |