R package for enrichment/depletion analysis of ACSN maps and gmt files

sysbio-curie/ACSNMineR

ACSNEnrichment is a freely available R package.

This package is designed for easy analysis of gene maps, either imported by the user from gmt files or taken from ACSN maps. It tests which pathways are statistically enriched or depleted in a user-supplied gene list and provides graphical representations of the results.

This readme contains:

1. This description

2. Usage section

2.1. Pathway analysis

2.2. Data visualization

2.2.1. Heatmaps

2.2.2. Barplots

2. Usage

2.1. Pathway analysis

The gene set that was used for tests is the following:

genes_test <- c("ATM", "ATR", "CHEK2", "CREBBP", "TFDP1", "E2F1", "EP300", "HDAC1", "KAT2B", "GTF2H1", "GTF2H2", "GTF2H2B")

Gene set enrichment for a single set can be performed by calling:

enrichment(genes_test,
    min_module_size = 10, 
    threshold = 0.05,
    maps = list(cellcycle = ACSNEnrichment::ACSN_maps$CellCycle))

Where:

  • genes_test is a character vector of gene names to test
  • min_module_size is the minimal size a module must have to be taken into account
  • threshold is the maximal p-value displayed in the results (all modules with a p-value higher than threshold are removed)
  • maps is a list of maps, here the cell cycle map from ACSN; maps from gmt files are imported through the format_from_gmt() function of the package
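
For readers unfamiliar with the gmt format: each line holds a module name, a description, and the module's genes, separated by tabs. The base R sketch below parses such lines and applies a size filter in the spirit of min_module_size. It is an illustration only: parse_gmt_sketch is a hypothetical helper and not the package's format_from_gmt(), the gmt lines are made up, and treating the size cutoff as inclusive is an assumption.

```r
# Hypothetical helper (not part of the package): turn gmt lines into a
# named list with one character vector of genes per module.
parse_gmt_sketch <- function(lines) {
  fields <- strsplit(lines, "\t", fixed = TRUE)
  modules <- lapply(fields, function(x) x[-(1:2)])  # drop name and description
  names(modules) <- vapply(fields, function(x) x[1], character(1))
  modules
}

# Made-up gmt content: module name, description, then genes.
gmt_lines <- c(
  "MODULE_A\tna\tATM\tATR\tCHEK2",
  "MODULE_B\tna\tE2F1\tTFDP1"
)
modules <- parse_gmt_sketch(gmt_lines)

# Keep only modules with at least min_module_size genes
# (inclusive cutoff is an assumption).
min_module_size <- 3
kept <- modules[lengths(modules) >= min_module_size]
names(kept)
```

With a real file, the lines would typically come from readLines() on the gmt path.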

Output looks like:

module | module_size | nb_genes_in_module | genes_in_module | universe_size | nb_genes_in_universe | p.value | p.value.corrected | test
E2F1 | 19 | 12 | ATM ATR CHEK2 CREBBP TFDP1 E2F1 EP300 HDAC1 KAT2B GTF2H1 GTF2H2 GTF2H2B | 104 | 12 | 4.5e-06 | 6.68e-05 | greater
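
How columns such as p.value, p.value.corrected and test relate to one another can be sketched in base R. The sketch assumes a one-sided Fisher exact test per module and a Benjamini-Hochberg correction; both are assumptions about the method, not a description of the package's internals. The helper module_p_value, the universe and the modules are all made up for illustration.

```r
# Hypothetical helper: one-sided Fisher exact test for a single module.
module_p_value <- function(genes, module, universe, alternative = "greater") {
  genes  <- intersect(genes, universe)   # ignore genes outside the universe
  module <- intersect(module, universe)
  in_both     <- length(intersect(genes, module))
  genes_only  <- length(genes) - in_both
  module_only <- length(module) - in_both
  neither     <- length(universe) - in_both - genes_only - module_only
  counts <- matrix(c(in_both, genes_only, module_only, neither), nrow = 2)
  fisher.test(counts, alternative = alternative)$p.value
}

# Toy data: a universe of 100 genes and three modules.
universe <- paste0("G", 1:100)
modules  <- list(M1 = universe[1:10],
                 M2 = universe[40:60],
                 M3 = universe[80:100])
query    <- universe[c(1:5, 50)]   # 5 of the 6 query genes are in M1

results <- data.frame(
  module  = names(modules),
  p.value = vapply(modules, module_p_value, numeric(1),
                   genes = query, universe = universe)
)
# Multiple-testing correction (BH is an assumption).
results$p.value.corrected <- p.adjust(results$p.value, method = "BH")
results$test <- "greater"

# Filtering on the corrected p-value is also an assumption.
threshold <- 0.05
significant <- results[results$p.value.corrected <= threshold, ]
significant
```

Passing alternative = "less" to the helper would test for depletion instead of enrichment.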

Gene set enrichment for multiple sets/cohorts can be performed by calling:

multisample_enrichment(Genes_by_sample = list(set1 = genes_test[-1], set2 = genes_test[-2]),
                       maps = ACSNEnrichment::ACSN_maps$CellCycle,
                       min_module_size = 15)

Where:

  • Genes_by_sample is a named list of character vectors of gene names to test, one per sample
  • min_module_size is the minimal size a module must have to be taken into account
  • maps is a list of maps, here the cell cycle map from ACSN; maps from gmt files are imported through the format_from_gmt() function of the package
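
In the call above, genes_test[-1] and genes_test[-2] are the test gene set with its first or second element dropped, so the two samples differ by exactly one gene each. A short base R sketch of what that named list contains; it only inspects the inputs and does not reproduce what multisample_enrichment computes.

```r
genes_test <- c("ATM", "ATR", "CHEK2", "CREBBP", "TFDP1", "E2F1",
                "EP300", "HDAC1", "KAT2B", "GTF2H1", "GTF2H2", "GTF2H2B")

# Negative indices drop elements: set1 lacks the first gene, set2 the second.
genes_by_sample <- list(set1 = genes_test[-1], set2 = genes_test[-2])

# Number of genes per sample.
sizes <- vapply(genes_by_sample, length, integer(1))

# Genes present in one sample but not in the other.
only_set1 <- setdiff(genes_by_sample$set1, genes_by_sample$set2)
only_set2 <- setdiff(genes_by_sample$set2, genes_by_sample$set1)
```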

2.2. Data visualization

Results from the enrichment analysis functions can be turned into images with the represent_enrichment function. Two plot types are available: heatmap and barplot.

2.2.1. Heatmaps

Heatmaps of p-values for one or several samples can be generated with the represent_enrichment function.

represent_enrichment(enrichment = list(SampleA = enrichment_test, 
                                        SampleB = enrichment_test[1:3,]), 
                        plot = "heatmap", scale = "log")

Where:

  • enrichment is the result of the enrichment or multisample_enrichment function
  • scale can be set to either identity or log and affects the color gradient
  • low is the color for low (more significant) p-values
  • high is the color for high (less significant) p-values
  • na.value is the color of tiles whose value is NA
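
Why the scale matters for the gradient: on a linear scale, small p-values are squeezed together at one end of the gradient, while a log transform spreads them out. The base R sketch below shows this with made-up p-values. The colors and the helpers rescale01 and to_hex are illustrative assumptions, not the package's defaults or its implementation.

```r
# Made-up p-values spanning several orders of magnitude.
p_values <- c(0.05, 1e-3, 4.5e-06)

# Position of each p-value on a 0..1 gradient (hypothetical helper).
rescale01 <- function(x) (x - min(x)) / (max(x) - min(x))
pos_identity <- rescale01(p_values)          # linear scale
pos_log      <- rescale01(log10(p_values))   # log scale

# Map positions to colors between a "low" and a "high" color
# (red and white are arbitrary choices for this sketch).
ramp   <- colorRamp(c("red", "white"))
to_hex <- function(pos) rgb(ramp(pos), maxColorValue = 255)
colors_identity <- to_hex(pos_identity)
colors_log      <- to_hex(pos_log)

# Compare where the middle p-value lands on each scale.
round(c(identity = pos_identity[2], log = pos_log[2]), 3)
```

On the linear scale the middle p-value sits almost on top of the smallest one, whereas on the log scale it lands near the middle of the gradient.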

The result of this is: [Figure: resulting heatmap]

2.2.2. Barplots

A barplot can be generated with the following call:

represent_enrichment(enrichment = enrichment_test,
                     scale = "reverselog",
                     sample_name = "test",
                     plot = "bar")

Where:

  • enrichment is the result of the enrichment or multisample_enrichment function
  • scale affects the color gradient; it can be set to identity or log, or to reverselog as in the call above
  • nrow is the number of rows used to lay out all barplots (default is 1)

[Figure: resulting barplot]