The nextflow config file

Scott.Hazelhurst edited this page May 26, 2017 · 4 revisions

Introduction

Nextflow uses the parameters passed to it and the contents of a configuration file to guide its behaviour. By default, the configuration file used is nextflow.config. The configuration includes specifying

  • where the inputs come from and outputs go to;
  • what the parameters of the various programs/steps are. For example, in QC you can specify which missingness cut-offs you want;
  • the mode of operation -- for example, are you running it on a cluster? Using Docker?
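As a rough illustration of how these three kinds of settings might look together, the sketch below writes a small example file, example.config. The parameter names inside params (input_dir, output_dir, max_missingness) and the paths are hypothetical and chosen only for illustration; the real names are those in the template supplied with the workflow. process.executor and docker.enabled are standard Nextflow configuration settings.

```shell
# Write a small illustrative config file (hypothetical parameter names).
# The quoted 'EOF' stops the shell from expanding anything inside the file.
cat > example.config <<'EOF'
params {
    // where the inputs come from and the outputs go to (hypothetical names)
    input_dir       = "/data/study1/plink"
    output_dir      = "/data/study1/qc-output"

    // a parameter of one of the steps, e.g. a QC missingness cut-off
    // (hypothetical name and value)
    max_missingness = 0.02
}

// mode of operation: run on a cluster scheduler, without Docker
process.executor = 'slurm'
docker.enabled   = false
EOF
```

A file like this could then be inspected, edited, and passed to Nextflow in place of the default nextflow.config.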

To run your workflow, modify the nextflow.config file and then run nextflow. Remember that to make your workflow truly reproducible you need to save a copy of the config file. For this reason, although you can specify many parameters from the command line, we recommend using the config file, since this makes your runs reproducible. It may be useful to use git or a similar tool to archive your config files.
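One simple way to keep a copy of each config file is sketched below. It uses a plain timestamped copy rather than git; archive_config and the config-archive directory are hypothetical names introduced here, not part of the workflow.

```shell
# archive_config FILE [DIR]
# Copy FILE into DIR (default: config-archive) with a timestamp appended
# to its name, and print the path of the copy.
# Hypothetical helper, shown only as a sketch.
archive_config() {
    src="$1"
    dest_dir="${2:-config-archive}"
    stamp=$(date +%Y%m%d-%H%M%S)
    dest="$dest_dir/$(basename "$src").$stamp"

    mkdir -p "$dest_dir" || return 1
    cp "$src" "$dest" || return 1
    echo "$dest"
}
```

You might call archive_config nextflow.config just before each run. If you prefer git, committing the config file before each run serves the same purpose.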

Specifying an alternative configuration file

You can use the -c option to specify another configuration file:

nextflow run -c data1.config plink-qc.nf

Creating a nextflow.config file

There is a template of a nextflow.config file called nextflow.config.template. This is a read-only file. Make a copy of it and call it nextflow.config (or some other suitable name).

Then fill in the details in the config that are required for your run. These are explained in more detail below.
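The copy step can be sketched as a small shell helper. Because the template is read-only, a plain cp normally produces a copy that is read-only as well, so the sketch also makes the copy writable. make_config is a hypothetical name used here for illustration; it refuses to overwrite an existing config file.

```shell
# make_config [TEMPLATE] [TARGET]
# Copy the read-only template to a new, writable config file.
# Defaults: nextflow.config.template -> nextflow.config
# Hypothetical helper, shown only as a sketch.
make_config() {
    template="${1:-nextflow.config.template}"
    target="${2:-nextflow.config}"

    # Do not clobber a config file that has already been filled in.
    if [ -e "$target" ]; then
        echo "refusing to overwrite $target" >&2
        return 1
    fi

    # cp carries over the template's read-only mode, so add write permission.
    cp "$template" "$target" && chmod u+w "$target"
}
```

Running make_config with no arguments in the workflow directory would create nextflow.config; a call such as make_config nextflow.config.template data1.config would create a separately named file for use with the -c option.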

Using the Excel spreadsheet template

For many users it may be convenient to use the Excel spreadsheet (config.xlsx, which comes with a read-only template file, config.xlsx.template). The spreadsheet can be used just as an aide-memoire, but we also have an auxiliary program that converts it into a config file: the program config-gen/dist/config-gen.jar takes the spreadsheet and produces a config file.

The spreadsheet has the following columns

  • A. a brief one-line description of the parameter;
  • B. the name of the parameter as found in the config file;
  • C. the default value that will be used by the config-gen program if no value is specified in column E;
  • D. possible alternative values the user might consider;
  • E. the value that the user wants to use

If you are using this semi-automated way of producing the config file, remember that to be fully reproducible the config file must be saved too. We suggest making a copy of the spreadsheet template file and giving it an appropriate name.

To run the config-gen program, run

java -jar ./config-gen/dist/config-gen.jar nameofspreadsheet > newconfig.config