Hierarchical Markov Random Field model captures spatial dependency in gene expression, demonstrating regulation via the 3D genome
Naihui Zhou, Iddo Friedberg, Mark S. Kaiser
bioRxiv 2019.12.16.878371; doi: https://doi.org/10.1101/2019.12.16.878371
- scripts to organize and clean HiC and RNAseq data
- scripts to run PhiMRF on RNAseq and HiC data
The R package PhiMRF implements the statistical model we developed to detect spatial dependency in count data observed on spatial structures. The package's source code and documentation are available in a separate repository.
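To make "count data observed on spatial structures" concrete, here is a minimal Python sketch of one way a neighborhood structure could be derived from a contact matrix: two genes are treated as neighbors when their contact value reaches a cutoff. The function name `neighbors_from_contacts`, the threshold rule, and the toy matrix are all assumptions for illustration; they are not taken from PhiMRF or from the scripts in this repository.

```python
from typing import Dict, List, Set


def neighbors_from_contacts(
    contacts: List[List[float]], threshold: float
) -> Dict[int, Set[int]]:
    """Build a symmetric neighbor map from a square contact matrix.

    Hypothetical rule: genes i and j are neighbors when their contact
    value is at least `threshold`.
    """
    n = len(contacts)
    adjacency: Dict[int, Set[int]] = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            # Take the larger of the two entries so that a slightly
            # asymmetric matrix still yields a symmetric graph.
            if max(contacts[i][j], contacts[j][i]) >= threshold:
                adjacency[i].add(j)
                adjacency[j].add(i)
    return adjacency


# Toy 4-gene contact matrix with made-up values.
toy_contacts = [
    [0.0, 5.0, 0.5, 0.0],
    [5.0, 0.0, 3.0, 0.2],
    [0.5, 3.0, 0.0, 0.1],
    [0.0, 0.2, 0.1, 0.0],
]
toy_neighbors = neighbors_from_contacts(toy_contacts, threshold=1.0)
print(toy_neighbors)
```

In this toy example the fourth gene has no contact above the cutoff, so it ends up with an empty neighbor set. A spatial model would then relate each gene's read count to the counts of its neighbors.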
- PhiMRF
- python3
- R
- bash
Each of the four parts is explained in its own folder, and the parts should be run sequentially.
- TAD genes: genes located within Topologically Associating Domains (TADs), as well as neighbors isolated by TADs.
- functional genes: genes annotated with a given Gene Ontology term.
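As an illustration of the TAD gene definition above, the following Python sketch assigns each gene to the TAD that fully contains it. The containment rule, the half-open coordinate convention, the function name `assign_genes_to_tads`, and the toy coordinates are assumptions made for this example; the scripts in this repository may use a different rule (for instance partial overlap).

```python
from typing import Dict, List, Optional, Tuple

# (chromosome, start, end), using half-open coordinates [start, end).
Interval = Tuple[str, int, int]


def assign_genes_to_tads(
    genes: Dict[str, Interval], tads: List[Interval]
) -> Dict[str, Optional[int]]:
    """Map each gene name to the index of the TAD that fully contains it.

    Genes that straddle a TAD boundary, or that lie on a chromosome with
    no listed TAD, are mapped to None.
    """
    assignment: Dict[str, Optional[int]] = {}
    for name, (chrom, start, end) in genes.items():
        assignment[name] = None
        for idx, (t_chrom, t_start, t_end) in enumerate(tads):
            if chrom == t_chrom and t_start <= start and end <= t_end:
                assignment[name] = idx
                break
    return assignment


# Made-up coordinates, for illustration only.
toy_tads = [("chr1", 0, 1000), ("chr1", 1000, 2500)]
toy_genes = {
    "geneA": ("chr1", 100, 400),
    "geneB": ("chr1", 900, 1200),  # straddles the boundary at 1000
    "geneC": ("chr1", 1500, 2000),
    "geneD": ("chr2", 100, 400),  # no TAD listed on chr2
}
toy_assignment = assign_genes_to_tads(toy_genes, toy_tads)
print(toy_assignment)
```

The linear scan over TADs is enough for a small example; for genome-scale inputs, sorting the intervals per chromosome and using binary search would be a natural refinement.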
- RNAseq data comes from ENCODE
ENCODE Project Consortium. "An integrated encyclopedia of DNA elements in the human genome." Nature 489.7414 (2012): 57.
- HiC data and the normalization method come from Rao et al., 2014
Rao, Suhas SP, et al. "A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping." Cell 159.7 (2014): 1665-1680.