├── README.md
├── data # for plots and real data analysis
├── simulation
│ ├── eval.R # simulation
│ ├── gereate_data.R # data function
│ ├── analysis4combo.R # real data analysis
├── R
│ ├── eic.R # main function
│ ├── ADMM_proj.R # ADMM projection
│ ├── cv_covariance_matrices.R # calibrated CV
│ ├── cv_covariance_noproj.R # CV
│ ├── lasso_covariance.R # lasso solution
│ ├── lasso_covariance_constrained.R # lasso solution with constraint
├── results
│ ├── results_table.csv # main simulation
│ ├── results_sum.csv # sums in simulation
│ ├── RDAresults.csv # real data results
├── plot.Rmd # plots
├── action.sh # bash script for running simulation
This is the code for the paper "High-dimensional regression analysis of compositional covariates with measurement errors". It implements four methods for high-dimensional regression with compositional covariates subject to measurement error: Eric, CoCo, Coda, and vanilla lasso. Tuning parameters for Eric and CoCo are selected by calibrated cross-validation; those for Coda and vanilla lasso are selected by ordinary cross-validation. The code is built on the BDcocolasso package.
To reproduce the simulation, only eval.R (in the simulation directory) needs to be run.
Required R packages:
install.packages(c("MASS", "boot", "rlist", "emdbook", "dirmult"))
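If some of the packages listed above are already installed, it can be convenient to install only the missing ones. The following is a minimal sketch using base R; the helper name missing_packages is hypothetical and not part of this repository, and the install call is left commented out so nothing is installed by accident.

```r
# Packages named in this README
required <- c("MASS", "boot", "rlist", "emdbook", "dirmult")

# Hypothetical helper: return the subset of `pkgs` that is not in `installed`.
# By default, `installed` is the set of packages found in the current library paths.
missing_packages <- function(pkgs, installed = rownames(installed.packages())) {
  setdiff(pkgs, installed)
}

to_install <- missing_packages(required)
if (length(to_install) > 0) {
  message("Missing packages: ", paste(to_install, collapse = ", "))
  # install.packages(to_install)  # uncomment to install the missing packages
} else {
  message("All required packages are installed.")
}
```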
- Model parameters:
  - constrain=T, proj=T: Eric
  - constrain=F, proj=T: CoCo
  - constrain=T, proj=F: Coda
  - constrain=F, proj=F: vanilla lasso
- Data parameters:
  - data_type: choose from lognormal, dirichlet, dirmult
  - n: sample size
  - p: dimension
  - rho: correlation
  - tau: standard deviation of measurement error
  - sigma: standard deviation of true covariance matrix
- Simulation parameters:
  - N_sim: number of simulations
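The parameter lists above can be summarised in a short sketch: one helper that maps the two model flags to the method names, and one that applies basic sanity checks to the data and simulation parameters. Both helpers (method_name, check_settings) are hypothetical and written only for illustration; how eval.R actually reads these parameters is not shown here, and the range checks are assumptions that follow from the parameter descriptions (a correlation lies in [-1, 1], standard deviations are non-negative).

```r
# Hypothetical helpers for illustration; not part of this repository.

# Map the two model flags to the method names listed above.
method_name <- function(constrain, proj) {
  stopifnot(is.logical(constrain), is.logical(proj))
  if (constrain && proj) {
    "Eric"
  } else if (proj) {
    "CoCo"
  } else if (constrain) {
    "Coda"
  } else {
    "vanilla lasso"
  }
}

# Basic sanity checks on the data and simulation parameters.
check_settings <- function(data_type, n, p, rho, tau, sigma, N_sim) {
  stopifnot(
    data_type %in% c("lognormal", "dirichlet", "dirmult"),
    n >= 1, p >= 1, N_sim >= 1,
    abs(rho) <= 1,          # rho is a correlation
    tau >= 0, sigma >= 0    # standard deviations are non-negative
  )
  invisible(TRUE)
}

# List all four flag combinations with their method names.
grid <- expand.grid(constrain = c(TRUE, FALSE), proj = c(TRUE, FALSE))
grid$method <- mapply(method_name, grid$constrain, grid$proj)
print(grid)
```

Calling check_settings with a data_type outside the three listed choices stops with an error, which catches typos before a long simulation run starts.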
Results are saved automatically to results/results_table.csv.
If you are running on Linux (optional), create a new directory named log, then run the bash script:
bash action.sh
The output log is saved automatically to log/month_day_hour.log.
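For readers who run the simulation from an R session instead of the bash script, a log file name following the same month_day_hour pattern could be built as sketched below. This is an assumption-laden illustration: the helper log_path is hypothetical, the contents of action.sh are not shown here, and the use of two-digit numeric month, day, and hour is a guess at the exact format.

```r
# Hypothetical helper: build a log path of the form log/month_day_hour.log.
# Assumes two-digit numeric month, day, and hour (e.g. 03_07_14).
log_path <- function(time = Sys.time(), dir = "log") {
  file.path(dir, paste0(format(time, "%m_%d_%H"), ".log"))
}

# Example with a fixed timestamp so the result does not depend on the clock.
example_time <- as.POSIXct("2024-03-07 14:30:00", tz = "UTC")
print(log_path(example_time))

# The log directory must exist before writing to it, for example:
# dir.create("log", showWarnings = FALSE)
```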