Commit

updated docs
Matt Lyon committed Nov 24, 2021
1 parent f7786bf commit 7017ab8
Showing 11 changed files with 10,148 additions and 24 deletions.
4 changes: 4 additions & 0 deletions .gitignore
@@ -11,5 +11,9 @@ test/data/*
!test/data/test.bgen.bgi
!test/data/data-outlier.csv
!test/data/data.csv
!test/data/example.csv
!test/data/example.bgen
!test/data/example.bgen.bgi
!test/data/example.R
sim/data
.Rapp.history
11 changes: 8 additions & 3 deletions README.md
@@ -1,11 +1,16 @@
# GWAS of trait variance
# varGWAS: GWAS of SNP variance effects

<!-- badges: start -->
[![Build Status](https://github.com/MRCIEU/vargwas/actions/workflows/test.yml/badge.svg)](https://github.com/MRCIEU/vargwas/actions)
[![Build Status](https://github.com/MRCIEU/varGWAS/actions/workflows/test.yml/badge.svg)](https://github.com/MRCIEU/vargwas/actions)
<!-- badges: end -->

Software to perform GWAS of SNP variance effects for prioritising GxG/GxE testing
Software to perform genome-wide association study of SNP effects on trait variance

## Documentation

Full documentation available from <https://mrcieu.github.io/varGWAS>

## Citation

- Brown M and Forsythe A. Robust tests for the equality of variances. J. Am. Stat. Assoc., 1974. <https://doi.org/10.1080/01621459.1974.10482955>
- Pietrosanu M, Gao J, Kong L, Jiang B and Niu D. Advanced algorithms for penalized quantile and composite quantile regression. Comput. Stat. 36, 333–346, 2020. <https://link.springer.com/article/10.1007/s00180-020-01010-1>
1 change: 1 addition & 0 deletions mkdocs.yml
@@ -6,4 +6,5 @@ nav:
- Home: index.md
- Install: install.md
- Usage: usage.md
- Tutorial: tutorial.md
theme: readthedocs
7 changes: 3 additions & 4 deletions mkdocs/index.md
@@ -1,13 +1,12 @@
# varGWAS: GWAS of SNP variance effects

<!-- badges: start -->
![Build Status](https://github.com/MRCIEU/varGWAS/actions/workflows/test.yml/badge.svg)
[![Build Status](https://github.com/MRCIEU/varGWAS/actions/workflows/test.yml/badge.svg)](https://github.com/MRCIEU/vargwas/actions)
<!-- badges: end -->

Software to perform GWAS of SNP variance effects for prioritising GxG/GxE testing
Software to perform genome-wide association study of SNP effects on trait variance

## Citation

- Breusch T and Pagan A. A Simple Test for Heteroscedasticity and Random Coefficient Variation. Econometrica, vol. 47, no. 5, p. 1287, Sep. 1979. <https://doi.org/10.2307/1911963>
- Brown M and Forsythe A. Robust tests for the equality of variances. J. Am. Stat. Assoc., 1974. <https://doi.org/10.1080/01621459.1974.10482955>
- Pietrosanu M, Gao J, Kong L, Jiang B and Niu D. Advanced algorithms for penalized quantile and composite quantile regression. Comput. Stat. 2020 361 36, 333–346. <>
- Pietrosanu M, Gao J, Kong L, Jiang B and Niu D. Advanced algorithms for penalized quantile and composite quantile regression. Comput. Stat. 36, 333–346, 2020. <https://link.springer.com/article/10.1007/s00180-020-01010-1>
44 changes: 35 additions & 9 deletions mkdocs/install.md
@@ -1,29 +1,33 @@
# Install

Requires UNIX environment
## Precompiled binary

SRC
The precompiled binary for Linux systems can be downloaded from [GitHub](https://github.com/MRCIEU/varGWAS/releases). This is the simplest method and will work for most users.

## Build from source

Obtain source

```shell
git clone git@github.com:MRCIEU/varGWAS.git
cd varGWAS
```

Load compiler (optional). Tested with GCC v7 & v9.
Load compiler (may be necessary on HPC systems). Tested with GCC v7 & v9.

```shell
# BC4
# BlueCrystal Phase 4
module load build/gcc-7.2.0
module load tools/cmake/3.20.0
```

Libraries
Build dependencies

```shell
bash lib.sh
```

Build
Configure cmake

```shell
mkdir -p build
@@ -38,15 +42,37 @@ cmake .. -DCMAKE_BUILD_TYPE=Release
CC=/mnt/storage/software/languages/gcc-7.2.0/bin/gcc \
CXX=/mnt/storage/software/languages/gcc-7.2.0/bin/g++ \
cmake .. -DCMAKE_BUILD_TYPE=Release
```

# build
Build

```shell
make
```

# Docker
Run

Build image
```shell
./bin/varGWAS
```

## Docker

Image

```shell
# pull image from Dockerhub
docker pull mrcieu/vargwas
### OR ###
# Build image from source
docker build -t vargwas .
```

Run

```shell
docker run \
-it \
-v $PWD:/home \
mrcieu/vargwas
```
46 changes: 46 additions & 0 deletions mkdocs/tutorial.md
@@ -0,0 +1,46 @@
# Tutorial

## Model

The outcome is modelled as ```Y = X + U + X*U + E```, where ```X``` is the genotype, ```U``` is a continuous modifier, ```X*U``` is the interaction effect and ```E``` is the error term.
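
To see why an interaction of this kind appears as a SNP effect on variance, the model can be simulated directly. The block below is a minimal sketch and is not part of the varGWAS repository: it assumes `X` follows Binomial(2, 0.4), that `U` and `E` are independent standard normals, and the function name `simulate_var_by_genotype`, the sample size and the seed are all illustrative choices.

```shell
# Sketch only: simulate Y = X + U + X*U + E and report var(Y) per genotype.
# Assumptions (not from the varGWAS source): X ~ Binomial(2, 0.4),
# U and E are independent standard normals, n = 20000.
simulate_var_by_genotype() {
  awk -v n=20000 -v af=0.4 -v seed=12345 '
    # standard normal draw via the Box-Muller transform
    function rnorm() { return sqrt(-2 * log(1 - rand())) * cos(6.283185307179586 * rand()) }
    BEGIN {
      srand(seed)
      for (i = 0; i < n; i++) {
        x = (rand() < af) + (rand() < af)   # genotype coded 0/1/2
        u = rnorm(); e = rnorm()
        y = x + u + x * u + e
        cnt[x]++; sum[x] += y; sq[x] += y * y
      }
      # print genotype and the variance of Y within that genotype
      for (x = 0; x <= 2; x++) {
        m = sum[x] / cnt[x]
        printf "%d\t%.3f\n", x, sq[x] / cnt[x] - m * m
      }
    }'
}

simulate_var_by_genotype
```

Under these assumptions `Y` given `X = x` equals `x + (1 + x)*U + E`, so the printed variances should be close to 2, 5 and 10 for genotypes 0, 1 and 2, even though the model has no explicit variance term.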

## Simulate

The script below will simulate the data and requires [qctool]() and [bgenix]() on the PATH.

```shell
# run from test/data so the relative paths used inside the script resolve
(cd test/data && Rscript example.R)
```

Alternatively the data are provided in ```test/data```.

## GWAS

Test for the effect of the SNP on the variance of the outcome

```shell
./varGWAS \
-v test/data/example.csv \
-s , \
-o test/data/example.txt \
-b test/data/example.bgen \
-p Y \
-i S
```

## Output

The effect of the SNP on outcome variance is non-linear, so the genotype is treated as a dummy variable in the second-stage regression. This gives two estimates of the SNP-var(Y) relationship: one for ```SNP=1``` and one for ```SNP=2```, each relative to ```SNP=0```.

| chr | pos | rsid | oa | ea | n | eaf | beta | se | t | p | theta | phi_x1 | se_x1 | phi_x2 | se_x2 | phi_f | phi_p |
|-----|-----|--------|----|----|-------|---------|--------------|-----------|-------------|----------|-------------|----------|-----------|---------|----------|---------|--------------|
| 01 | 1 | RSID_1 | G | A | 10000 | 0.39485 | -0.000127464 | 0.0144545 | -0.00881832 | 0.992964 | -0.00143247 | 0.489362 | 0.0267757 | 1.85565 | 0.095883 | 667.129 | 1.09461e-272 |

- ```chr```, ```pos```, ```rsid```, ```oa``` (non-effect allele) and ```ea``` (effect allele) describe the variant
- ```n``` and ```eaf``` are the total sample size and effect allele frequency included in the model
- ```beta```, ```se```, ```t``` and ```p``` describe the effect of the SNP on the mean of the outcome
- ```theta``` is the effect of the SNP on the median of the outcome
- ```phi_x1``` and ```phi_x2``` are the average changes in variance from ```SNP=0``` to ```SNP=1``` and from ```SNP=0``` to ```SNP=2```, respectively. ```se_x1``` and ```se_x2``` are the standard errors of these estimates.
- ```phi_f``` and ```phi_p``` are the F-statistic and P-value for the effect of the SNP on outcome variance
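
Once the columns are understood, a common next step is to keep only variants with a small `phi_p`. The helper below is a hedged sketch rather than part of varGWAS: it assumes the output file is tab-delimited with the header row shown in the table above, and the function name `filter_by_phi_p` is hypothetical. The `phi_p` column is located by name, so the column order does not matter.

```shell
# Sketch only: keep the header and every variant whose phi_p is below a threshold.
# Assumes a tab-delimited file with a header row containing "phi_p";
# change -F if your output uses a different delimiter.
filter_by_phi_p() {
  awk -F'\t' -v thr="$2" '
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == "phi_p") col = i; print; next }
    col && ($col + 0) < (thr + 0)
  ' "$1"
}

# usage (file name taken from the GWAS command above):
# filter_by_phi_p test/data/example.txt 5e-8
```

The threshold of `5e-8` in the usage line is the conventional genome-wide significance level and is only an example; choose a value suited to your analysis.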

The trait was standardised (see ```test/data/example.R```) so the units are ```sigma^2```. Relative to ```SNP=0```, ```var(Y)``` was higher by 0.489 when ```SNP=1``` and by 1.856 when ```SNP=2```.
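
As a rough sanity check on these estimates, the expected variance differences can be worked out from the simulation in ```test/data/example.R```, where ```Y = X*U + E``` before scaling. The sketch below rests on assumptions that are not confirmed by the source: that `get_simulated_genotypes` draws `X` from Binomial(2, 0.4), and that `U` and `E` are independent standard normals. Under those assumptions `var(Y | X = x) = x^2 + 1` and `var(Y) = E[X^2] + 1`. The function name `expected_phi` is hypothetical.

```shell
# Sketch only: expected variance differences after dividing Y by sd(Y).
# Assumes X ~ Binomial(2, af) and independent standard normal U and E.
expected_phi() {
  awk -v af="${1:-0.4}" 'BEGIN {
    ex2 = 2 * af * (1 - af) + (2 * af) * (2 * af)  # E[X^2] = var(X) + mean(X)^2
    vy  = ex2 + 1                                  # var(Y) before standardising
    printf "phi_x1\t%.3f\n", (1 * 1 + 1 - 1) / vy  # var(Y|X=1) - var(Y|X=0), scaled
    printf "phi_x2\t%.3f\n", (2 * 2 + 1 - 1) / vy  # var(Y|X=2) - var(Y|X=0), scaled
  }'
}

expected_phi 0.4
```

With an allele frequency of 0.4 this prints 0.472 for `phi_x1` and 1.887 for `phi_x2`, which sit within about one standard error of the estimates in the table (0.489 and 1.856).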
26 changes: 18 additions & 8 deletions mkdocs/usage.md
@@ -1,5 +1,7 @@
# Usage

Requires a UNIX environment. On Windows, use [Docker](install.md#docker).

```shell
./varGWAS

@@ -19,16 +21,16 @@ Usage:
-t, --threads arg Number of threads
```

- Unordered categorical variables should be one-hot encoded.
- Do not provide null values in the phenotype file - these should be filtered out.

# Covariates
## Phenotypes

In addition to standard covariates, also include the square of continuous/ordinal phenotypes to adjust the variance effect.
- Do not provide null values in the phenotype file - these should be filtered out.
- Unordered categorical variables should be one-hot encoded (dummy variables).
- Include the square of continuous/ordinal phenotypes to adjust the variance effect.
- The variance effect size is a unitless measure; standardise the outcome beforehand by dividing the trait by its SD.
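
Several of the steps above can be sketched with awk. This is an illustration under stated assumptions, not a script shipped with varGWAS: the input is assumed to be a comma-separated file with a header row and the columns id, outcome and one continuous covariate, in that order. The function name `prepare_pheno` and the new column name `covariate_sq` are hypothetical. Rows with an empty or `NA` field are dropped, the outcome is divided by its sample SD, and the squared covariate is appended. One-hot encoding is not covered here.

```shell
# Sketch only: prepare a phenotype CSV before running varGWAS.
# Hypothetical layout: header row, then columns id, outcome, covariate.
prepare_pheno() {
  awk -F, -v OFS=, '
    NR == 1 { header = $0; next }
    {
      for (i = 1; i <= NF; i++) if ($i == "" || $i == "NA") next   # skip rows with nulls
      n++; id[n] = $1; y[n] = $2; c[n] = $3
      sum += $2; sumsq += $2 * $2
    }
    END {
      sd = sqrt((sumsq - sum * sum / n) / (n - 1))                 # sample SD of the outcome
      print header, "covariate_sq"
      for (j = 1; j <= n; j++) print id[j], y[j] / sd, c[j], c[j] * c[j]
    }
  ' "$1"
}

# usage: prepare_pheno raw.csv > clean.csv
```

The cleaned file can then be supplied as the phenotype file, with the squared column included among the covariates.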

# Simulations
## Output

See [README](https://github.com/MRCIEU/varGWAS/blob/master/sim/README.md)
See the description of the GWAS summary statistics [here](tutorial.md#output).

# Logging

@@ -52,4 +54,12 @@ cmake .. -DCMAKE_BUILD_TYPE=Debug
make
# run tests
./bin/varGWAS_test
```
```

# Simulations

See [README](https://github.com/MRCIEU/varGWAS/blob/master/sim/README.md) for simulations of test power, type 1 error, accuracy and coverage etc.

# Issues

Report issues [here](https://github.com/MRCIEU/varGWAS/issues)
32 changes: 32 additions & 0 deletions test/data/example.R
@@ -0,0 +1,32 @@
source("../../sim/funs.R")
set.seed(12345)

# requires qctool and bgenix on the PATH

# sample size
n_obs <- 10000

# MAF
af <- 0.4

# covariates
data <- data.frame(
S = paste0("S", seq(1, n_obs)),
X = get_simulated_genotypes(af, n_obs),
U = rnorm(n_obs),
stringsAsFactors = FALSE
)

# outcome
data$Y <- data$X * data$U + rnorm(n_obs)
data$Y <- data$Y / sd(data$Y)

# write out GEN file
write_gen("example.gen", "01", "SNPID_1", "RSID_1", "1", "A", "G", data$X)

# write phenotype & sample file
write.table(file = "example.csv", sep = ",", quote = FALSE, row.names = FALSE, data)

# convert to BGEN file & build the bgenix index
system("qctool -g example.gen -og example.bgen")
system("bgenix -g example.bgen -clobber -index")
Binary file added test/data/example.bgen
Binary file not shown.
Binary file added test/data/example.bgen.bgi
Binary file not shown.