METAGEM (META-analysis of GEM summary statistics) is a software program for meta-analysis of large-scale gene-environment interaction testing results, including multi-exposure interactions, joint (main effect and interactions) tests, and marginal tests. It uses results directly from GEM output.
Current version: 1.0
- Compiler with C++11 support
- Boost C++ Libraries (Versions 1.70.0 - 1.79.0)
- Intel Math Kernel Library (MKL)
To install METAGEM, run the following lines of code:
git clone https://github.com/large-scale-gxe-methods/METAGEM
cd METAGEM
cd src
make
Once METAGEM is installed, the executable ./METAGEM can be used to run the program.
For a list of options, use ./METAGEM --help.
List of Options
General Options:
--help
Prints available options and exits.
Input/Output File Options:
--input-files
GEM output files produced with the 'meta' or 'full' output style, separated by spaces. At least two files are required.
--input-file-list
A text file without a header that contains a single file name per line. This file should contain at least two file names.
--exposure-names
The names of the exposure(s) to be included in the meta-analysis.
--out
Full path, including extension, of the file to which METAGEM writes its results.
Default: metagem.out
--meta-option
Integer value indicating which summary statistics should be used for meta-analysis.
0: Both model-based and robust summary statistics.
1: Model-based summary statistics only.
2: Robust summary statistics only.
Default: 0
METAGEM accepts output files from GEM (v1.4.1 or later) with '--output-style' set to 'meta' or 'full'. Multiple GEM output files can be specified using the '--input-files' flag, separated by spaces. Alternatively, the '--input-file-list' option can be used to specify a text file without headers, where each line contains a single input file name.
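As an illustration of the '--input-file-list' route, the sketch below writes a header-less list file and assembles the matching command line. The study file names, the list file name `gem_files.txt`, and the helper names `write_input_file_list` and `build_metagem_command` are hypothetical; only the flags come from the option list above. Passing several exposure names as separate space-delimited arguments is an assumption.

```python
from pathlib import Path


def write_input_file_list(gem_outputs, list_path):
    """Write one GEM output file name per line, with no header row."""
    if len(gem_outputs) < 2:
        raise ValueError("METAGEM requires at least two input files")
    Path(list_path).write_text("\n".join(gem_outputs) + "\n")
    return list_path


def build_metagem_command(list_path, exposure_names, out="metagem.out"):
    """Assemble the argument vector using flags from the option list."""
    return [
        "./METAGEM",
        "--input-file-list", str(list_path),
        # Assumption: multiple exposure names are passed space-separated.
        "--exposure-names", *exposure_names,
        "--out", out,
    ]


# Hypothetical study files; replace with real GEM output paths.
studies = ["study1.out", "study2.out", "study3.out"]
list_file = write_input_file_list(studies, "gem_files.txt")
command = build_metagem_command(list_file, ["cov1"])
print(" ".join(command))
```

The resulting list can be handed to `subprocess.run(command, check=True)` once the executable has been built.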
METAGEM will write results to the output file specified with the --out parameter (or 'metagem.out' if no output file is specified). Below are details of the possible column headers in the output file.
SNPID - The SNP identifier as retrieved from the input files.
CHR - The chromosome of the SNP.
POS - The physical position of the SNP.
Non_Effect_Allele - The reference allele in association testing.
Effect_Allele - The coding allele in association testing.
N_Samples - The combined sample size from all studies in the meta-analysis.
AF - The summary effect allele frequency in all studies combined in the meta-analysis.
Beta_Marginal - The summary marginal genetic effect estimate (i.e., from a model with no interaction terms) from univariate meta-analysis using model-based results from each study.
SE_Beta_Marginal - SE for the summary marginal genetic effect estimate from univariate meta-analysis using model-based results from each study.
robust_Beta_Marginal - The summary marginal genetic effect estimate (i.e., from a model with no interaction terms) from univariate meta-analysis using robust results from each study.
robust_SE_Beta_Marginal - SE for the summary marginal genetic effect estimate from univariate meta-analysis using robust results from each study.
Beta_G - The summary genetic main effect (G) estimate from joint (main effect and interactions) meta-analysis using model-based results from each study.
Beta_G-* - The summary GxE interaction effect estimate(s) from joint (main effect and interactions) meta-analysis using model-based results from each study.
SE_Beta_G - SE for the summary genetic main effect (G) estimate from joint meta-analysis using model-based results from each study.
SE_Beta_G-* - SE for the summary GxE interaction effect estimate(s) from joint meta-analysis using model-based results from each study.
Cov_Beta_G_G-* - Covariance(s) between the summary genetic main effect (G) estimate and the summary GxE interaction effect estimate(s) from joint meta-analysis using model-based results from each study.
Cov_Beta_G-*_G-* - Covariance(s) between the summary GxE interaction effect estimate(s) from joint meta-analysis using model-based results from each study.
robust_Beta_G - The summary genetic main effect (G) estimate from joint (main effect and interactions) meta-analysis using robust results from each study.
robust_Beta_G-* - The summary GxE interaction effect estimate(s) from joint (main effect and interactions) meta-analysis using robust results from each study.
robust_SE_Beta_G - SE for the summary genetic main effect (G) estimate from joint meta-analysis using robust results from each study.
robust_SE_Beta_G-* - SE for the summary GxE interaction effect estimate(s) from joint meta-analysis using robust results from each study.
robust_Cov_Beta_G_G-* - Covariance(s) between the summary genetic main effect (G) estimate and the summary GxE interaction effect estimate(s) from joint meta-analysis using robust results from each study.
robust_Cov_Beta_G-*_G-* - Covariance(s) between the summary GxE interaction effect estimate(s) from joint meta-analysis using robust results from each study.
P_Value_Marginal - The summary marginal genetic effect test p-value from univariate meta-analysis using model-based results from each study.
P_Value_Interaction - The summary GxE interaction effect test p-value (K degrees of freedom test) from joint (main effect and interactions) meta-analysis using model-based results from each study. (K is the number of GxE interaction terms)
P_Value_Joint - Joint (main effect and interactions) test p-value (K+1 degrees of freedom test) from joint meta-analysis using model-based results from each study.
robust_P_Value_Marginal - The summary marginal genetic effect test p-value from univariate meta-analysis using robust results from each study.
robust_P_Value_Interaction - The summary GxE interaction effect test p-value (K degrees of freedom test) from joint (main effect and interactions) meta-analysis using robust results from each study. (K is the number of GxE interaction terms)
robust_P_Value_Joint - Joint (main effect and interactions) test p-value (K+1 degrees of freedom test) from joint meta-analysis using robust results from each study.
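The K and K+1 degrees-of-freedom tests above can be read as Wald chi-square tests built from the summary estimates and their covariance. The sketch below assumes that form (it is not taken from the METAGEM source) and covers only the single-exposure case K = 1, where the chi-square tail probabilities have closed forms available in the standard library. The inputs correspond to the Beta_G, Beta_G-*, SE_Beta_G, SE_Beta_G-*, and Cov_Beta_G_G-* columns; the function names and the example numbers are illustrative.

```python
import math


def interaction_p_value(beta_gxe, se_gxe):
    """1-df Wald test for a single GxE term (K = 1)."""
    stat = (beta_gxe / se_gxe) ** 2
    # Chi-square survival function with 1 df: erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(stat / 2.0))


def joint_p_value(beta_g, beta_gxe, se_g, se_gxe, cov_g_gxe):
    """2-df Wald test for the main effect plus one GxE term (K + 1 = 2)."""
    var_g, var_gxe = se_g ** 2, se_gxe ** 2
    det = var_g * var_gxe - cov_g_gxe ** 2
    if det <= 0:
        raise ValueError("covariance matrix is not positive definite")
    # Quadratic form b' V^{-1} b with the 2x2 inverse written out.
    stat = (
        var_gxe * beta_g ** 2
        - 2.0 * cov_g_gxe * beta_g * beta_gxe
        + var_g * beta_gxe ** 2
    ) / det
    # Chi-square survival function with 2 df: exp(-x / 2).
    return math.exp(-stat / 2.0)


# Made-up summary values for one SNP, for illustration only.
p_int = interaction_p_value(beta_gxe=0.10, se_gxe=0.05)
p_joint = joint_p_value(beta_g=0.20, beta_gxe=0.10,
                        se_g=0.10, se_gxe=0.05, cov_g_gxe=0.0)
print(p_int, p_joint)
```

With more than one exposure the same quadratic form applies to the full (K+1)-dimensional estimate and covariance matrix, but the tail probability then needs a general chi-square routine.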
The '--meta-option' flag can be used to specify which columns should be included in the output file:
- 0 - Meta-analyses will be performed on both model-based and robust results from each study, and all column headers listed above will be available in the output file.
- 1 - Only model-based results from each study will be used in the meta-analysis, and columns above containing the 'robust_' prefix will be excluded from the output file.
- 2 - Only robust results from each study will be used in the meta-analysis, and summary statistics columns above without the 'robust_' prefix will be excluded from the output file.
- Default: 0
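For downstream filtering, the link between '--meta-option' and the 'robust_' prefix can be used to pick whichever joint p-value column a given output file contains. The sketch below assumes the output is whitespace-delimited with a single header row; the helper names and the 5e-8 threshold are illustrative and not part of METAGEM.

```python
def joint_p_column(header, prefer_robust=False):
    """Return the joint p-value column present in this output file."""
    model, robust = "P_Value_Joint", "robust_P_Value_Joint"
    candidates = [robust, model] if prefer_robust else [model, robust]
    for name in candidates:
        if name in header:
            return name
    raise KeyError("no joint p-value column found in header")


def significant_snps(path, threshold=5e-8, prefer_robust=False):
    """Collect (SNPID, p-value) pairs whose joint p-value is below threshold."""
    hits = []
    with open(path) as handle:
        header = handle.readline().split()
        p_col = header.index(joint_p_column(header, prefer_robust))
        id_col = header.index("SNPID")
        for line in handle:
            fields = line.split()
            if not fields:
                continue  # skip blank lines
            try:
                p_value = float(fields[p_col])
            except ValueError:
                continue  # skip non-numeric entries such as a missing-value marker
            if p_value < threshold:
                hits.append((fields[id_col], p_value))
    return hits
```

A file written with '--meta-option 2' carries only the 'robust_' column, so `significant_snps("metagem.out")` falls back to it automatically; with option 0, `prefer_robust=True` selects the robust test explicitly.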
For example, to meta-analyze three GEM output files for the exposure cov1:
./METAGEM --input-files file1.out file2.out file3.out --exposure-names cov1 --out metagem.out
For comments, suggestions, bug reports and questions, please contact Han Chen ([email protected]), Alisa Manning ([email protected]), or Kenneth Westerman ([email protected]). For bug reports, please include an example to reproduce the problem without having to access your confidential data.
If you use METAGEM, please cite:
- Pham DT, Westerman KE, Pan C, Chen L, Srinivasan S, Isganaitis E, Vajravelu ME, Bacha F, Chernausek S, Gubitosi-Klug R, Divers J, Pihoker C, Marcovina SM, Manning AK, Chen H. (2023) Re-analysis and meta-analysis of summary statistics from gene-environment interaction studies. Bioinformatics 39(12):btad730. PubMed PMID: 38039147. PMCID: PMC10724851. DOI: 10.1093/bioinformatics/btad730.
METAGEM: META-analysis of GEM summary statistics
Copyright (C) 2021-2023 Duy T. Pham and Han Chen
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.