What does POSTGAP do?
If given one or more disease descriptions, it attempts to map them to the Experimental Factor Ontology (EFO) using EMBL-EBI's Zooma service.
It then uses whatever disease descriptions and ontology terms are available to query a number of public GWAS databases.
All SNPs with an association p-value below 1e-5 are retained as GWAS SNPs.
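
As a minimal sketch of this filtering step, assuming associations arrive as simple (rsID, p-value) pairs: the record layout, the function name `filter_gwas_snps` and the sample values are hypothetical and are not taken from the POSTGAP code base.

```python
# Cutoff stated above: only associations with p-value below 1e-5 are kept.
GWAS_PVALUE_CUTOFF = 1e-5


def filter_gwas_snps(associations, cutoff=GWAS_PVALUE_CUTOFF):
    """Keep (snp, p_value) pairs whose p-value is strictly below the cutoff.

    `associations` is a hypothetical list of (rsID, p-value) tuples.
    """
    return [(snp, p) for snp, p in associations if p < cutoff]


# Made-up example records, for illustration only.
associations = [("rs0001", 3e-8), ("rs0002", 1e-5), ("rs0003", 2e-4)]
retained = filter_gwas_snps(associations)
```

Note that the comparison is strict, so an association with a p-value of exactly 1e-5 is dropped in this sketch.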
Using the 1000 Genomes genotypes, it creates a cluster around each GWAS SNP by selecting all SNPs in LD with it at r^2 > 0.7.
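
The clustering rule can be illustrated roughly as follows. This sketch assumes the pairwise r^2 values against the GWAS SNP have already been computed from the genotypes and are available as a dictionary. The function name `ld_cluster`, the input layout, and the choice to include the GWAS SNP itself (with r^2 = 1.0) are assumptions made for illustration.

```python
# Threshold stated above: cluster members must have LD r^2 > 0.7.
LD_R2_CUTOFF = 0.7


def ld_cluster(gwas_snp, ld_r2, cutoff=LD_R2_CUTOFF):
    """Build the LD cluster around one GWAS SNP.

    `ld_r2` is a hypothetical dict mapping neighbouring SNP IDs to their
    precomputed r^2 with `gwas_snp`. Returns a dict of SNP ID -> r^2.
    """
    # Assumption: the GWAS SNP belongs to its own cluster (r^2 = 1 with itself).
    cluster = {gwas_snp: 1.0}
    for snp, r2 in ld_r2.items():
        if r2 > cutoff:
            cluster[snp] = r2
    return cluster


# Made-up r^2 values, for illustration only.
example_cluster = ld_cluster("rs1", {"rs2": 0.9, "rs3": 0.7, "rs4": 0.2})
```

Keeping the r^2 value alongside each cluster member is convenient because the same value is needed later when computing the gene_score.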
It queries a few databases for any evidence of regulatory activity on all of the SNPs:
- Ensembl for ExAC variant frequencies
- RegulomeDB for evidence of epigenomic activity
It queries a few databases for any evidence of cis-regulatory interactions on all of the SNPs:
- GTEx eQTLs
- VEP for transcript overlaps
- FANTOM5 for CAGE-tag activity correlation
- ENCODE for DNase hypersensitivity correlation
- CHiCAGO for Promoter Capture Hi-C links
For each relevant (Gene, SNP) pair:
- The v2g_score for that (Gene, SNP) pair is the sum of the above scores.
- The gene_score for that (Gene, SNP) pair is the v2g_score multiplied by the LD r^2 between that SNP and the most significant nearby GWAS SNP.
- The PICS score for that SNP is (TODO)
- The total score for that (Gene, SNP) pair is:
  ((f(gene_score) + f(PICS)) / 2)^3
  where f is an ad hoc weighting function:
  f(X) = X * X^(1/3)
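
The scoring arithmetic described here can be sketched as below. This is an illustration under stated assumptions, not the project's implementation: the function names and the per-source evidence values are hypothetical, the PICS score is taken as an input because its definition is not given, and all scores are assumed to be non-negative (a negative value raised to the power 1/3 yields a complex number in Python).

```python
def f(x):
    """Ad hoc weighting function: f(X) = X * X^(1/3). Assumes x >= 0."""
    return x * x ** (1.0 / 3.0)


def v2g_score(evidence_scores):
    """Sum the per-source evidence scores for one (Gene, SNP) pair."""
    return sum(evidence_scores.values())


def gene_score(v2g, r2):
    """Scale the v2g_score by the LD r^2 to the most significant nearby GWAS SNP."""
    return v2g * r2


def total_score(gene, pics):
    """Combine gene_score and PICS: ((f(gene_score) + f(PICS)) / 2)^3."""
    return ((f(gene) + f(pics)) / 2) ** 3


# Made-up evidence values for one (Gene, SNP) pair, for illustration only.
evidence = {"GTEx": 0.5, "VEP": 1.0}
v2g = v2g_score(evidence)          # sum of the evidence scores
gene = gene_score(v2g, r2=0.8)     # weighted by LD r^2
total = total_score(gene, pics=0.3)  # hypothetical PICS value
```

Since f(X) = X * X^(1/3) is the same as X^(4/3), the total score grows with the fourth power of the inputs when gene_score and PICS are equal: total_score(x, x) is x^4.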