REGEM

REGEM (RE-analysis of GEM summary statistics) is a software program for re-analyzing large-scale gene-environment interaction testing results, including multi-exposure interaction, joint, and marginal tests. It uses results directly from GEM output.


Current version: 1.1


Dependencies

  • C++ compiler with C++11 support
  • Boost C++ Libraries (Versions 1.70.0 - 1.79.0)
  • Intel Math Kernel Library (MKL)

Quick Installation

To install REGEM, run the following lines of code:

git clone https://github.com/large-scale-gxe-methods/REGEM
cd REGEM
cd src
make

Usage

Running REGEM


Command Line Options

Once REGEM is compiled, the executable ./REGEM can be used to run the program. For a list of options, use ./REGEM --help.

List of Options
General Options:
 
   --help                
   		Prints available options and exits.  
    
   --version             
   		Prints the version of REGEM and exits.


File Options:
   --input-file        
   		Path to the input file containing GEM results.  
    
   --out  
   		Full path, including the file extension, of the file where REGEM writes its results.  
   		Default: regem.out  
    
   --output-style  
   		Modifies the output of REGEM. Must be one of the following:
   			minimum: Output the summary statistics for only the GxE and marginal G terms.
   			   meta: 'minimum' output plus additional fields for the main G and any GxCovariate terms.
   				 For a robust analysis, additional columns for the model-based summary statistics will be included.
   			   full: 'meta' output plus additional fields needed for re-analyses of a subset of interactions.
   			Default: full  


Input File Options:  

   --exposure-names      
   		One or more column names in the input file naming the exposure(s) to be included in interaction tests.  
        
   --int-covar-names     
   		Any column names in the input file naming the covariate(s) for which interactions should be included for adjustment (mutually exclusive with --exposure-names).


Input File

REGEM takes as input the output file from a GEM run in which the --output-style flag was set to "full". The --output-style flag is available in GEM v1.4.1 and later.


Output File

REGEM will write results to the output file specified with the --out parameter (or 'regem.out' if no output file is specified). Below are details of the possible column headers in the output file.

SNPID              - The SNP identifier as retrieved from the genotype file.
CHR                - The chromosome of the SNP.
POS                - The physical position of the SNP. 
Non_Effect_Allele  - The allele not counted in association testing.  
Effect_Allele      - The allele that is counted in association testing.  
N_Samples          - The number of samples without missing genotypes.
AF                 - The allele frequency of the effect allele.
N_catE_*           - The number of non-missing samples in each combination of strata for all of the categorical exposures and interaction covariates.
AF_catE_*          - The allele frequency of the effect allele in each combination of strata for all of the categorical exposures and interaction covariates.

Beta_Marginal           - The coefficient estimate for the marginal genetic effect (i.e., from a model with no interaction terms).
SE_Beta_Marginal        - The model-based SE associated with the marginal genetic effect estimate.  
robust_SE_Beta_Marginal - The robust SE associated with the marginal genetic effect estimate.

Beta_G             - The coefficient estimate for the genetic main effect (G).
Beta_G-*           - The coefficient estimate for any GxE or interaction covariate terms.
SE_Beta_G          - Model-based SE associated with the genetic main effect (G).  
SE_Beta_G-*        - Model-based SE associated with any GxE or interaction covariate terms.
Cov_Beta_G_G-*     - Model-based covariance between the genetic main effect (G) and any GxE or interaction covariate terms.  
Cov_Beta_G-*_G-*   - Model-based covariance between any GxE or interaction covariate terms.
robust_SE_Beta_G   - Robust SE associated with the genetic main effect (G).  
robust_SE_Beta_G-* - Robust SE associated with any GxE or interaction covariate terms.
robust_Cov_Beta_G_G-*   - Robust covariance between the genetic main effect (G) and any GxE or interaction covariate terms.
robust_Cov_Beta_G-*_G-* - Robust covariance between any GxE or interaction covariate terms. 

P_Value_Marginal           - Marginal genetic effect p-value from model-based SE.
P_Value_Interaction        - Interaction effect p-value (K degrees of freedom test of interaction effect) from model-based SE, where K is the number of major exposures.
P_Value_Joint              - Joint test p-value (K+1 degrees of freedom test of genetic and interaction effect) from model-based SE.
robust_P_Value_Marginal    - Marginal genetic effect p-value from robust SE.
robust_P_Value_Interaction - Interaction effect p-value from robust SE.
robust_P_Value_Joint       - Joint test p-value (K+1 degrees of freedom test of genetic and interaction effect) from robust SE.
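
The interaction and joint p-values above are described as K and K+1 degrees of freedom tests. As a rough illustration of how such p-values relate to the Beta, SE, and Cov columns, the sketch below computes Wald-type tests for the single-exposure case (K = 1). It assumes the tests are Wald chi-square tests on the reported estimates; it is not taken from the REGEM source code, and the function and argument names are hypothetical.

```python
import math


def wald_p_values(beta_g, beta_gxe, se_g, se_gxe, cov_g_gxe):
    """Return (interaction_p, joint_p) for one exposure (K = 1).

    Hypothetical helper: assumes Wald chi-square tests built from the
    Beta_G, Beta_G-*, SE_Beta_G, SE_Beta_G-* and Cov_Beta_G_G-* columns.
    """
    # 1 degree of freedom interaction test: z^2 follows chi-square(1),
    # whose upper tail at z^2 equals erfc(|z| / sqrt(2)).
    z = beta_gxe / se_gxe
    p_interaction = math.erfc(abs(z) / math.sqrt(2.0))

    # 2 degrees of freedom joint test: b' V^{-1} b with
    # V = [[se_g^2, cov], [cov, se_gxe^2]], inverted in closed form.
    v11 = se_g ** 2
    v22 = se_gxe ** 2
    v12 = cov_g_gxe
    det = v11 * v22 - v12 ** 2
    if det <= 0:
        raise ValueError("covariance matrix is not positive definite")
    stat = (v22 * beta_g ** 2
            - 2.0 * v12 * beta_g * beta_gxe
            + v11 * beta_gxe ** 2) / det

    # The chi-square(2) upper tail has the closed form exp(-x / 2).
    p_joint = math.exp(-stat / 2.0)
    return p_interaction, p_joint


if __name__ == "__main__":
    # Made-up numbers purely for illustration.
    print(wald_p_values(0.10, 0.05, 0.03, 0.02, -0.0002))
```

Passing the robust SE and covariance columns instead of the model-based ones would give the robust counterparts under the same assumption. For K > 1 the same quadratic form applies with a larger covariance matrix and a chi-square distribution with K or K+1 degrees of freedom, which needs a numerical library.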

Examples

./REGEM --input-file gem.out --exposure-names cov1 --out regem_cov1.out
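
After a run like the one above, the output file can be post-processed with a short script. The sketch below lists variants whose P_Value_Interaction falls below a threshold. It assumes the output is tab-delimited with a single header line, which should be checked against your own file; the function name and the 5e-8 default threshold (a common genome-wide convention) are illustrative choices, not part of REGEM.

```python
import csv


def significant_interactions(handle, p_column="P_Value_Interaction",
                             threshold=5e-8):
    """Return [(SNPID, p)] for rows with p below threshold.

    Hypothetical helper: assumes a tab-delimited file with a header row
    containing SNPID and the requested p-value column.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    fields = reader.fieldnames or []
    for required in ("SNPID", p_column):
        if required not in fields:
            raise KeyError("column not found in header: " + required)

    hits = []
    for row in reader:
        try:
            p = float(row[p_column])
        except (TypeError, ValueError):
            # Skip rows with missing or non-numeric p-values (e.g. "NA").
            continue
        if p < threshold:  # NaN never passes this comparison
            hits.append((row["SNPID"], p))
    return hits


if __name__ == "__main__":
    # File name taken from the example command above.
    with open("regem_cov1.out", newline="") as handle:
        for snp, p in significant_interactions(handle):
            print(snp, p)
```

Setting p_column to "robust_P_Value_Interaction" or "P_Value_Joint" selects a different test, provided that column is present for the chosen --output-style.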


Contact

For comments, suggestions, bug reports and questions, please contact Han Chen ([email protected]), Alisa Manning ([email protected]), or Kenny Westerman ([email protected]). For bug reports, please include an example to reproduce the problem without having to access your confidential data.



References

If you use REGEM, please cite

  • Pham DT, Westerman KE, Pan C, Chen L, Srinivasan S, Isganaitis E, Vajravelu ME, Bacha F, Chernausek S, Gubitosi-Klug R, Divers J, Pihoker C, Marcovina SM, Manning AK, Chen H. (2023) Re-analysis and meta-analysis of summary statistics from gene-environment interaction studies. Bioinformatics 39(12):btad730. PubMed PMID: 38039147. PMCID: PMC10724851. DOI: 10.1093/bioinformatics/btad730.


License

REGEM: RE-analysis of GEM summary statistics
Copyright (C) 2021-2024 Duy T. Pham and Han Chen

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program.  If not, see <http://www.gnu.org/licenses/>.