stephenslab/eb-linreg-dsc

Large-scale multiple linear regression using DSC

Numerical experiments for evaluating the performance of Mr.ASH and other linear regression methods that are well suited for large-scale (possibly sparse) data sets. This is a reproduction, using DSC, of the workflow for the Mr.ASH manuscript by Kim, Wang, Carbonetto and Stephens (2020). See also dsc-linreg.

The prediction errors of the different methods can be seen here: Comparison of prediction errors

Dependencies

Methods in the pipeline require external R packages, which can be installed using:

install.packages(c("devtools", "ggplot2", "glmnet", "L0Learn", "BGLR", "ncvreg"))
devtools::install_github("stephenslab/susieR")
devtools::install_github("pcarbo/varbvs",subdir = "varbvs-R")
devtools::install_github("stephenslab/mr.ash.alpha")
devtools::install_github("stephenslab/ebmr.alpha")

The external Python packages can be installed using:

conda install --copy nose numpy scipy matplotlib pywavelets scikit-learn
git clone git@github.com:GAMPTeam/vampyre.git
cd vampyre
pip install -e .
pip install git+https://github.com/stephenslab/ebmrPy

Note: vampyre is installed from a local clone because installing it directly from GitHub runs into module dependency issues.
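After installing, it can help to confirm that each Python package imports before launching a long run. The following is a minimal sketch, not part of the repository: `check_py_modules` is a hypothetical helper, and it assumes the interpreter is reachable as `python3` (override with the `PYTHON` variable).

```shell
# Hypothetical helper (not part of this repository): report which Python
# modules can be imported. Assumes the interpreter is "python3" unless the
# PYTHON environment variable names another one.
check_py_modules() {
    cpm_missing=0
    for cpm_mod in "$@"; do
        if "${PYTHON:-python3}" -c "import $cpm_mod" >/dev/null 2>&1; then
            echo "ok       $cpm_mod"
        else
            echo "MISSING  $cpm_mod"
            cpm_missing=1
        fi
    done
    # Non-zero exit status if at least one module failed to import.
    return "$cpm_missing"
}
```

A call such as `check_py_modules nose numpy scipy matplotlib pywt sklearn vampyre` covers the packages listed above. Note that pywavelets is imported as `pywt` and scikit-learn as `sklearn`. The import name of ebmrPy is not stated here, so add it only once you have confirmed it.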

How to run

cd dsc
dsc linreg.dsc --target [TARGET] --host [HOSTFILE]

[TARGET] can be linreg or trendfilter.

Two [HOSTFILE] options are provided:

  • gwdg.yml for running on MPIBPC GWDG cluster
  • midway2.yml for running on UChicago RCC cluster
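The target and host file choices above can be checked before a job is submitted. This is a small sketch under stated assumptions: `build_dsc_cmd` is a hypothetical wrapper, not something shipped with this repository or with DSC. It only prints the command and accepts only the values named in this README.

```shell
# Hypothetical wrapper (not part of this repository or of DSC): validate the
# target and host file against the values listed in this README, then print
# the resulting dsc command without running it.
build_dsc_cmd() {
    bdc_target="$1"
    bdc_hostfile="$2"
    case "$bdc_target" in
        linreg|trendfilter) ;;
        *) echo "unknown target: $bdc_target" >&2; return 1 ;;
    esac
    case "$bdc_hostfile" in
        gwdg.yml|midway2.yml) ;;
        *) echo "unknown host file: $bdc_hostfile" >&2; return 1 ;;
    esac
    echo "dsc linreg.dsc --target $bdc_target --host $bdc_hostfile"
}

# Example: print the command for the trendfilter target on the RCC cluster.
build_dsc_cmd trendfilter midway2.yml
```

Printing rather than executing keeps the sketch safe to try outside the cluster. From inside the dsc directory, the printed line can be pasted into the shell once it looks right.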

To check and debug the pipeline with a truncated run:

dsc linreg.dsc --truncate -o ../dsc_result_trial --host gwdg.yml
