This repository contains the R scripts used for the data analysis in Kostyrko et al. The scripts cover Methylation EPIC analysis and RNAseq analysis (correlation and survival) as performed in the original paper. The repository is a resource for researchers looking to replicate or build upon the findings of Kostyrko et al.
- PMID: 37407562 PMCID: PMC10322837 DOI: 10.1038/s41467-023-39591-2
- https://pubmed.ncbi.nlm.nih.gov/37407562/
- This repo is tailored to handle Methylation EPIC data and RNAseq correlation analysis as performed in the Kostyrko et al. study.
- Survival Analysis & Miscellaneous RNAseq Analysis:
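As a rough illustration of the kind of survival analysis mentioned above, here is a minimal Kaplan-Meier sketch using the survival package. It is not the code used in the paper: the `lung` dataset bundled with survival stands in for clinical data, and the median split on `age` is only a placeholder for a split on the expression of a gene of interest.

```r
library(survival)

# Stand-in data: the `lung` dataset shipped with the survival package.
# In a real analysis this would be clinical data joined with expression values.
dat <- survival::lung

# Placeholder grouping: median split on `age`, standing in for a median split
# on the expression of a gene of interest (an assumption, not the paper's method).
dat$group <- ifelse(dat$age > median(dat$age), "high", "low")

# Kaplan-Meier curves per group.
fit <- survfit(Surv(time, status) ~ group, data = dat)

# Log-rank test for a difference between the two curves.
lr <- survdiff(Surv(time, status) ~ group, data = dat)

print(fit)
print(lr)
```

With survminer installed, `ggsurvplot(fit, data = dat)` draws the corresponding curves.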
The dataset used for the Kostyrko et al. study can be accessed from the GEO SuperSeries under the accession number GSE198289.
- GSE198289 UHRF1 is a mediator of KRAS driven oncogenesis in lung adenocarcinoma [RNA-seq]
- GSE198446 UHRF1 is a mediator of KRAS driven oncogenesis in lung adenocarcinoma [epic_methyl]
- GSE209923 UHRF1 is a mediator of KRAS driven oncogenesis in lung adenocarcinoma [shRNA]
- GSE233401 UHRF1 is a mediator of KRAS driven oncogenesis in lung adenocarcinoma [CRISPR_screen]
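The accessions listed above can be turned into GEO URLs programmatically. The sketch below uses base R only; the helper names `geo_page_url` and `geo_suppl_url` are hypothetical, and the supplementary-file path follows the usual GEO layout, so it is worth checking each URL in a browser before relying on it.

```r
# Accessions as listed above.
accessions <- c("GSE198289", "GSE198446", "GSE209923", "GSE233401")

# Hypothetical helper: build the GEO landing-page URL for an accession.
geo_page_url <- function(acc) {
  paste0("https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=", acc)
}

# Hypothetical helper: build the supplementary-file directory URL.
# GEO groups series into folders named by replacing the last three digits with "nnn".
geo_suppl_url <- function(acc) {
  stub <- sub("[0-9]{3}$", "nnn", acc)
  paste0("https://ftp.ncbi.nlm.nih.gov/geo/series/", stub, "/", acc, "/suppl/")
}

urls <- data.frame(
  accession = accessions,
  page = geo_page_url(accessions),
  suppl = geo_suppl_url(accessions),
  stringsAsFactors = FALSE
)
print(urls)
```

Individual files can then be fetched with `download.file()`. The Bioconductor package GEOquery (not listed in this README) offers `getGEO()` as an alternative.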
- Download and install R (3.6.2).
- Download the dataset from GEO SuperSeries GSE198289.
- R packages
- edgeR (3.28.1)
- minfi (1.32)
- EpiDISH (2.2.2)
- limma (3.42.2)
- missMethyl (1.20.4)
- DMRcate (2.0.7)
- survminer (0.4.6.999) and survival (3.1.11)
- DGCA (1.0.2)
- methylGSA (1.4.9)
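To compare a local setup against the versions listed above, something like the following base-R sketch may help. The helper name `check_versions` is hypothetical, and the version strings are copied from this README as written; the sketch only reports what is installed and does not install or change anything.

```r
# Versions as listed in this README.
expected <- c(
  edgeR = "3.28.1", minfi = "1.32", EpiDISH = "2.2.2", limma = "3.42.2",
  missMethyl = "1.20.4", DMRcate = "2.0.7", survminer = "0.4.6.999",
  survival = "3.1.11", DGCA = "1.0.2", methylGSA = "1.4.9"
)

# Hypothetical helper: report the installed version of each package,
# or NA if the package is not installed.
check_versions <- function(expected) {
  installed <- vapply(names(expected), function(pkg) {
    if (nzchar(system.file(package = pkg))) {
      as.character(utils::packageVersion(pkg))
    } else {
      NA_character_
    }
  }, character(1))
  data.frame(
    package = names(expected),
    expected = unname(expected),
    installed = unname(installed),
    stringsAsFactors = FALSE
  )
}

print(check_versions(expected))
```

Rows where `installed` is NA point to packages that still need to be installed; differing versions may or may not affect results.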
Bioinformatics and Genomics packages:
- library(biomaRt): Tools for BioMart databases (like Ensembl).
- library(BSgenome): Infrastructure for Bioconductor packages using large-scale genomic or other data.
- library(org.Hs.eg.db): Mapping information for human genes.
- library(GenomicFeatures): Tools for making and manipulating transcript centric annotations.
- library(IlluminaHumanMethylation450kanno.ilmn12.hg19): Annotation data for the Illumina Human Methylation 450k array.
- library(IlluminaHumanMethylationEPICanno.ilm10b4.hg19): Annotation data for the Illumina Human Methylation EPIC array.
- library(IlluminaHumanMethylationEPICmanifest): Manifest file for Illumina's EPIC methylation arrays.
- library(Homo.sapiens): Annotation data for the human genome.
- library(rtracklayer): An interface to genome annotation files and the UCSC genome browser.
- library(DESeq2): Differential gene expression analysis based on the negative binomial distribution.
- library(edgeR): Empirical analysis of digital gene expression data in R.
- library(GenomicRanges): Representations and manipulations of genomic intervals and variables defined along a genome.
- library(GSVA): Gene set variation analysis for microarray and RNA-seq data.
- library(Gviz): Plotting data and annotation information along genomic coordinates.
- library(minfi): Tools to analyze Illumina's methylation arrays.
- library(missMethyl): Analysis of Illumina methylation array data, including gene set testing that accounts for the number of probes per gene.
- library(methylGSA): Gene set testing for Illumina's methylation arrays.
- library(pathview): Plots pathway maps and overlays experimental data.
- library(sva): Surrogate Variable Analysis: identification and adjustment for hidden confounding factors.
- library(biovizBase): Basic graphic utilities for visualization of genomic data.
- library(ggbio): Visualization tools for genomic data.
- library(limma): Linear models for microarray data.
- library(pathfindR): An R package for comprehensive identification of enriched pathways in omics data through active subnetworks.
- library(DGCA): Differential Gene Correlation Analysis.
- library(EpiDISH): Epigenetic Dissection of Intra-Sample-Heterogeneity.
- library(DMRcate): Detecting differentially methylated regions in CpG methylation data.
- library(clusterProfiler): Statistical analysis and visualization of functional profiles for genes and gene clusters.
- library(ComplexHeatmap): Making complex heatmaps.
- library(d3heatmap): Interactive heatmaps.
- library(dendextend): Extending R's dendrogram functionality.
- library(dendroextras): Extra functions to cut, label and colour dendrogram clusters.
- library(parallelDist): Parallel distance matrix computation.
- library(corrplot): Visualization of a correlation matrix.
- library(factoextra): Extract and visualize the results of multivariate data analyses.
- library(ggdendro): Create dendrograms using ggplot.
- library(ggplot2): An implementation of the Grammar of Graphics.
- library(ggplotify): Convert plot function call to 'ggplot' objects.
- library(ggpubr): 'ggplot2' based publication ready plots.
- library(ggpval): Annotate statistical significance onto 'ggplot' objects.
- library(ggrepel): Automatically position non-overlapping text labels with 'ggplot2'.
- library(gplots): Various R programming tools for plotting data.
- library(gridExtra): Miscellaneous functions for "grid" graphics.
- library(kableExtra): Build complex HTML or 'LaTeX' tables using 'kable()' and pipe syntax.
- library(patchwork): The composer of ggplots.
- library(RColorBrewer): ColorBrewer palettes.
- library(VennDiagram): Generate high-resolution Venn and Euler plots.
- library(Vennerable): Venn and Euler area-proportional diagrams.
- library(wesanderson): Wes Anderson color palettes.
- library(igraph): Network analysis and visualization.
- library(ggbeeswarm): Beeswarm plots helper.
- library(forestplot): Forest plot helper, mostly used in meta-analysis.
- library(ggridges): Ridgeline plots.
- library(cowplot): Functions to align plots and arrange them into complex compound figures.
- library(FactoMineR): An R package for multivariate analysis.
- library(fgsea): Fast gene set enrichment analysis.
- library(MASS): Functions and datasets to support Venables and Ripley's MASS.
- library(matrixStats): Functions that apply to rows and columns of matrices (and to vectors).
- library(PerformanceAnalytics): Econometric tools for performance and risk analysis.
- library(psych): Procedures for psychological, psychometric, and personality research.
- library(survival): Survival analysis.
- library(survminer): Drawing survival curves using 'ggplot2'.
- library(vegan): Community Ecology Package.
- library(scales): Scale functions for visualization.
- library(Rtsne): T-distributed stochastic neighbor embedding using a Barnes-Hut implementation.
- library(umap): Uniform Manifold Approximation and Projection.
- library(data.table): Extension of data.frame.
- library(dplyr): A grammar of data manipulation.
- library(DT): A wrapper of the JavaScript library 'DataTables'.
- library(forcats): Tools for working with categorical variables (factors).
- library(plyr): Tools for splitting, applying and combining data.
- library(reshape): Flexibly reshape data.
- library(stringr): Simple, consistent wrappers for common string operations.
- library(tidyr): Easily tidy data with 'spread()' and 'gather()' functions.
- library(knitr): A general-purpose tool for dynamic report generation in R.
- library(pander): An R Pandoc writer.
- library(stargazer): LaTeX, HTML and ASCII tables from R statistical output.
- library(openxlsx): Read, write and edit XLSX files.
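Since the scripts attach a long list of libraries, it can be convenient to attach them in one pass and see which ones are missing. The following is a minimal sketch in base R; `attach_quietly` is a hypothetical helper, not a function from this repository, and the three package names in the example are just a sample of the list above.

```r
# Hypothetical helper: attach each package quietly and report which ones failed.
attach_quietly <- function(pkgs) {
  ok <- vapply(pkgs, function(pkg) {
    suppressWarnings(suppressPackageStartupMessages(
      require(pkg, character.only = TRUE, quietly = TRUE)
    ))
  }, logical(1))
  if (any(!ok)) {
    message("Not installed or failed to load: ", paste(pkgs[!ok], collapse = ", "))
  }
  invisible(ok)
}

# Example with a few of the packages listed above.
status <- attach_quietly(c("ggplot2", "dplyr", "survival"))
print(status)
```

The returned named logical vector can be used to decide which packages still need to be installed, for example from CRAN or Bioconductor.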