Using Graph Neural Network to decrypt Personalized Medicare in Alzheimer's Disease

u-brite/TeamADGuy

Background

Our goal is to use the Graph Layer Relevance Propagation method, which explains the decisions of a Graph Convolutional Neural Network, to decode the pathological, node-level differences between Alzheimer's disease subjects and control subjects.

Introduction

Alzheimer’s disease (AD) is the most common form of dementia (60-70% of cases). It mainly affects the elderly (age >65) and has an estimated annual cost of about $300 billion USD (“2020 Alzheimer’s Disease Facts and Figures,” 2020; Dementia, n.d.; Winston Wong, 2020).

There is no cure for AD, and in the past twenty years only two drugs (Aducanumab and Gantenerumab) have shown potential for clinically meaningful results (Commissioner, 2021; Gantenerumab | ALZFORUM, n.d.; How Is Alzheimer’s Disease Treated?, n.d.; Ostrowitzki et al., 2017; Tolar et al., 2020). Exploring additional biomarkers for this complex disease is therefore warranted and could aid early detection or therapeutic intervention in AD patients.

Methods

We wish to develop a multiplex machine learning (ML) approach to identify [gene]omics biomarkers in AD and mild cognitive impairment (MCI) compared to healthy controls (HC).

  1. Identify the best ML model for predicting AD or MCI versus HC
  2. Apply this model to a validation set to confirm its performance
  3. Combine multiple datasets to see whether model performance improves

Data

For the deep learning model and the relevance propagation method, we follow the GCN Paper, which applied this method in the field of cancer biology, with slight changes:

  1. Expression Dataset from ROSMAP
  2. PPI network from HPRD, or test other suitable network
  3. Hyperparameter tuning
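
Swapping in a different PPI network mostly means restricting it to the genes present in the expression dataset. The following is a minimal sketch of that step, not the project's actual preprocessing: it assumes the network is available as a list of gene-pair edges, and `build_adjacency` is a hypothetical helper. The gene names and edges are made up for illustration and are not claims about real interactions.

```python
def build_adjacency(edges, genes):
    """Build a symmetric 0/1 adjacency matrix over `genes`.

    Edges that mention a gene missing from `genes`, and self-loops,
    are skipped. The edge-list input format is an assumption.
    """
    index = {gene: i for i, gene in enumerate(genes)}
    n = len(genes)
    adjacency = [[0] * n for _ in range(n)]
    for a, b in edges:
        if a in index and b in index and a != b:
            i, j = index[a], index[b]
            adjacency[i][j] = 1
            adjacency[j][i] = 1  # PPI edges are treated as undirected
    return adjacency


# Illustrative input only: three expression genes and four candidate edges.
genes = ["APOE", "APP", "MAPT"]
edges = [("APOE", "APP"), ("APP", "MAPT"), ("APP", "TP53"), ("APP", "APP")]
adjacency = build_adjacency(edges, genes)
for row in adjacency:
    print(row)
```

Keeping the gene order of the adjacency matrix identical to the gene order of the expression matrix is the point of the `index` lookup; a mismatch there would silently attach the wrong neighbours to each node.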

Datasets used:
The ROSMAP data was obtained from the ROSMAP project, preprocessed, and uploaded to the data folder. The other datasets are https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE150693 (MCI-to-AD converters and non-converters, about 100 samples each) and https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE63063 (AD, HC, MCI).

Usage

We plan to build the whole pipeline into a single Python script driven by a config file, for easy installation and running. Using it should be as simple as providing the expression set file, the PPI network file, and the hyperparameters in the config file.

Installation

Installation only requires fetching the source code. The following is required:

  • Git

To fetch the source code, change into a directory of your choice and run:

git clone -b main \
    https://github.com/u-brite/TeamADGuy

Requirements

OS:

Works on all available operating systems.

Tools:

  • Anaconda3

    • Tested with version: 2020.02
  • Docker

    • Alternatively, build the image from the provided Dockerfile:
    docker build -t gnc src/Docker/
    # See running containers
    docker container ls
    
    # See all containers
    docker ps -a
    
    # Stop the container
    docker stop <container name>
    # e.g.
    docker stop gnn
    
    # Start the container
    docker start <container name>
    # e.g.
    docker start gnn
    
    # Open an interactive terminal session inside the container
    docker exec -it <container name> /bin/bash
    cd ~
    # e.g.
    docker exec -it gnn /bin/bash
    cd /root/

Activate conda environment

Change into the repository root directory and run the commands below to set up the environment for the deep learning model:

# create conda environment. Needed only the first time.
conda env create --file configs/environment.yml

# if you need to update existing environment
conda env update --file configs/environment.yml

# activate conda environment
conda activate gcn

Steps to run

Step 1

To run the deep learning model, first prepare three input files. The expression dataset must have genes as rows and subjects as columns, plus a column named 'Probe' that holds the gene names for later reference. The labels file holds each subject's disease condition, which is the target of the prediction. The network file holds the PPI network; we used the HPRD PPI, but you are free to use whichever PPI network suits your data best.

Rough test files are available in the Test folder for reference.
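
To check that an expression file matches the layout described in Step 1 (genes as rows, subjects as columns, gene names in a 'Probe' column), something like the sketch below may help. It is not part of the repository: `load_expression` is a hypothetical helper, and the subject IDs, gene names, and values in the example are made up.

```python
import csv
import io


def load_expression(handle, gene_column="Probe"):
    """Read a genes-by-subjects CSV and return (genes, subjects, matrix).

    The table is transposed on the way in, so matrix[i][j] is the
    expression of gene j in subject i (one row per subject).
    """
    reader = csv.DictReader(handle)
    if reader.fieldnames is None or gene_column not in reader.fieldnames:
        raise ValueError(f"expected a '{gene_column}' column with gene names")
    subjects = [name for name in reader.fieldnames if name != gene_column]
    genes = []
    per_gene = []  # one list of subject values per gene
    for row in reader:
        genes.append(row[gene_column])
        per_gene.append([float(row[s]) for s in subjects])
    # Transpose genes x subjects into subjects x genes.
    matrix = [list(values) for values in zip(*per_gene)]
    return genes, subjects, matrix


# Tiny made-up example standing in for a real expression file.
example = io.StringIO(
    "Probe,S1,S2,S3\n"
    "APOE,1.0,2.0,3.0\n"
    "APP,4.0,5.0,6.0\n"
)
genes, subjects, matrix = load_expression(example)
print(genes)
print(subjects)
print(matrix[0])
```

For a real file, pass an open file handle instead of the `io.StringIO` object. A missing 'Probe' column raises a `ValueError` early, which is easier to diagnose than a shape mismatch inside the model.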

Step 2

Running the model requires filling in the config file, whose headers are self-explanatory. The config file is the only input to the Python script.

python src/DeepLearningModel.py 

Every value in the config file must be set. Default values for the model hyperparameters are provided:

input_files:
    path_to_feature_val: "x_rosmap_whole_gene_expression_downsampled.csv"
    path_to_feature_graph: "hprd_rosmap_whole_ppi.csv"
    path_to_labels: "y_rosmap_whole_gene_expression_downsampled.csv"

dl_params:
  epochs: 200
  batch_size: 100
  test_ratio: 0.20
  eval_freq: 40
  filter: chebyshev5
  brelu: b1relu
  pool: mpool1
  graph_cnn_filters: 16
  polynomial_ord: 8
  pooling_size: 2
  regularization: 0.0001
  dropout: 0.95
  learning_rate: 0.00095
  decay_rate: 0.9625
  momentum: 0.99

output_loc:
  res_dir: "output_directory/"
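
Because every value must be present, a quick sanity check before a long training run can save time. The sketch below is not part of the repository. It mirrors the `dl_params` defaults above in a plain Python dict and applies range checks that are assumptions on our part (for example, that `test_ratio`, `dropout`, `decay_rate`, and `momentum` are fractions strictly between 0 and 1). `validate_dl_params` is a hypothetical helper.

```python
POSITIVE_INTS = ("epochs", "batch_size", "eval_freq",
                 "graph_cnn_filters", "polynomial_ord", "pooling_size")
FRACTIONS = ("test_ratio", "dropout", "decay_rate", "momentum")


def is_number(value):
    """True for int/float values, excluding booleans."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def validate_dl_params(params):
    """Return a list of problems; an empty list means the checks passed.

    The accepted ranges are assumptions, not rules taken from the model code.
    """
    problems = []
    for key in POSITIVE_INTS:
        value = params.get(key)
        if not isinstance(value, int) or isinstance(value, bool) or value <= 0:
            problems.append(f"{key} should be a positive integer, got {value!r}")
    for key in FRACTIONS:
        value = params.get(key)
        if not is_number(value) or not 0 < value < 1:
            problems.append(f"{key} should be between 0 and 1, got {value!r}")
    learning_rate = params.get("learning_rate")
    if not is_number(learning_rate) or learning_rate <= 0:
        problems.append(f"learning_rate should be positive, got {learning_rate!r}")
    regularization = params.get("regularization")
    if not is_number(regularization) or regularization < 0:
        problems.append(f"regularization should be >= 0, got {regularization!r}")
    return problems


# Defaults copied from the dl_params section of the config file.
defaults = {
    "epochs": 200, "batch_size": 100, "test_ratio": 0.20, "eval_freq": 40,
    "filter": "chebyshev5", "brelu": "b1relu", "pool": "mpool1",
    "graph_cnn_filters": 16, "polynomial_ord": 8, "pooling_size": 2,
    "regularization": 0.0001, "dropout": 0.95, "learning_rate": 0.00095,
    "decay_rate": 0.9625, "momentum": 0.99,
}
print(validate_dl_params(defaults))
print(validate_dl_params({**defaults, "test_ratio": 1.5}))
```

The string-valued options (`filter`, `brelu`, `pool`) are deliberately left unchecked, since the set of accepted names depends on the model code.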

Output from this step includes:

output_directory/
├── prediction.csv              
└── Relevences.csv - the weights and relevances of each gene for each subject
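
One common use of per-subject relevances is to list the most relevant genes for each subject. The sketch below shows that ranking under a loudly stated assumption: the file has one row per subject, a subject ID column, and one relevance column per gene. The real layout of Relevences.csv may differ, so the column names may need adapting. `top_relevant_genes`, the `subject` column name, and the example values are all hypothetical.

```python
import csv
import io


def top_relevant_genes(handle, id_column="subject", k=2):
    """Map each subject ID to its k genes with the highest relevance.

    Assumes one row per subject and one numeric column per gene; this
    layout is an assumption, not the documented output format.
    """
    ranking = {}
    for row in csv.DictReader(handle):
        subject = row.pop(id_column)
        scores = {gene: float(value) for gene, value in row.items()}
        ranking[subject] = sorted(scores, key=scores.get, reverse=True)[:k]
    return ranking


# Made-up relevance table for two subjects and three genes.
example = io.StringIO(
    "subject,APOE,APP,MAPT\n"
    "S1,0.10,0.70,0.20\n"
    "S2,0.50,0.05,0.45\n"
)
ranking = top_relevant_genes(example)
for subject, genes in ranking.items():
    print(subject, genes)
```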

Step 3

To run the machine learning models, add both the X and Y datasets from the corresponding data folder on GitHub to the content folder in Google Colab, then run the respective code blocks within the file. The ML models used here were Lasso and Random Forest for all three datasets, and the top features obtained were used for exploratory analysis.

Results

ROSMAP

ROSMAP Chart Results

ROSMAP Histogram

GSE63063

GSE63063 Chart Results

GSE63063 Histogram

miRNA

miRNA Chart Results

miRNA Histogram

Team Members

Pradeep Varathan | [email protected] | Team Leader.
Karen Bonilla | [email protected] | Member.
Mehmet Enes Inam | [email protected] | Member.
Karolina Willicott | [email protected] | Member.
