From da382c7acba6af75856d7cfcbcb079528c8f6a02 Mon Sep 17 00:00:00 2001 From: nvnieuwk Date: Tue, 20 Aug 2024 12:25:01 +0000 Subject: [PATCH] Deployed 928b416 to 2.1 with MkDocs 1.6.0 and mike 2.1.3 --- 2.1/configuration/configuration/index.html | 97 ++++++++++++++++++++++ 2.1/search/search_index.json | 2 +- 2 files changed, 98 insertions(+), 1 deletion(-) diff --git a/2.1/configuration/configuration/index.html b/2.1/configuration/configuration/index.html index 8871c68..f5e9381 100644 --- a/2.1/configuration/configuration/index.html +++ b/2.1/configuration/configuration/index.html @@ -1030,6 +1030,39 @@ + + +
  • + + + Summary + + + + +
  • @@ -1290,6 +1323,39 @@ + + +
  • + + + Summary + + + + +
  • @@ -1395,6 +1461,10 @@

    beforeText

    Any string provided to this option will be printed before the help message.

    validation.help.beforeText = "Running pipeline version 1.0" // default: ""
     
    +
    +

    Info

    +

    All color values (like \033[0;31m, which means the color red) will be filtered out when validation.monochromeLogs is set to true

    +

    command

    This option does not affect the help message created by the paramsHelp() function

    @@ -1407,6 +1477,10 @@

    command

    nextflow run main.nf --input samplesheet.csv --outdir output
    +
    +

    Info

    +

    All color values (like \033[0;31m, which means the color red) will be filtered out when validation.monochromeLogs is set to true

    +

    afterText

    This option does not affect the help message created by the paramsHelp() function

    @@ -1414,6 +1488,29 @@

    afterText

    Any string provided to this option will be printed after the help message.

    validation.help.afterText = "Please cite the pipeline owners when using this pipeline" // default: ""
     
    +
    +

    Info

    +

    All color values (like \033[0;31m, which means the color red) will be filtered out when validation.monochromeLogs is set to true

    +
    +

    Summary

    +

    The validation.summary config scope can be used to configure the output of the paramsSummaryLog() function.

    +

    This scope contains the following options:

    +

    beforeText

    +

    Any string provided to this option will be printed before the parameters log message.

    +
    validation.summary.beforeText = "Running pipeline version 1.0" // default: ""
    +
    +
    +

    Info

    +

    All color values (like \033[0;31m, which means the color red) will be filtered out when validation.monochromeLogs is set to true

    +
    +

    afterText

    +

    Any string provided to this option will be printed after the parameters log message.

    +
    validation.summary.afterText = "Please cite the pipeline owners when using this pipeline" // default: ""
    +
    +
    +

    Info

    +

    All color values (like \033[0;31m, which means the color red) will be filtered out when validation.monochromeLogs is set to true

    +
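
The summary options added in this patch can be combined in a single config block; a minimal sketch using the default-style values documented above:

```groovy
// nextflow.config -- sketch combining the new validation.summary options
validation {
    summary {
        beforeText = "Running pipeline version 1.0"
        afterText  = "Please cite the pipeline owners when using this pipeline"
    }
}
```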
    diff --git a/2.1/search/search_index.json b/2.1/search/search_index.json index f719fd1..2205be3 100644 --- a/2.1/search/search_index.json +++ b/2.1/search/search_index.json @@ -1 +1 @@ -{"config":{"lang":["en"],"separator":"[\\s\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"","title":"nf-schema","text":"

    A Nextflow plugin to work with validation of pipeline parameters and sample sheets.

    Info

    nf-schema is the new version of the now deprecated nf-validation. Please follow the migration guide to migrate your code to this new version.

    "},{"location":"#introduction","title":"Introduction","text":"

    This Nextflow plugin provides functionality that can be used in a Nextflow pipeline to work with parameter and sample sheet schema. The added functionality is:

    Supported sample sheet formats are CSV, TSV, JSON and YAML.

    "},{"location":"#quick-start","title":"Quick Start","text":"

    Declare the plugin in your Nextflow pipeline configuration file:

    nextflow.config
    plugins {\n  id 'nf-schema@2.1.0'\n}\n

    This is all that is needed - Nextflow will automatically fetch the plugin code at run time.

    [!NOTE] The snippet above will always try to install the specified version. We encourage always pinning the plugin version to make sure the pipeline keeps working when a new version of nf-schema with breaking changes is released.

    You can now include the plugin helper functions into your Nextflow pipeline:

    main.nf
    include { validateParameters; paramsSummaryLog; samplesheetToList } from 'plugin/nf-schema'\n\n// Validate input parameters\nvalidateParameters()\n\n// Print summary of supplied parameters\nlog.info paramsSummaryLog(workflow)\n\n// Create a new channel of metadata from a sample sheet passed to the pipeline through the --input parameter\nch_input = Channel.fromList(samplesheetToList(params.input, \"assets/schema_input.json\"))\n

    Or enable the creation of the help message (using --help) in the configuration file:

    nextflow.config
    validation {\n  help {\n    enabled = true\n  }\n}\n
    "},{"location":"#dependencies","title":"Dependencies","text":""},{"location":"#slack-channel","title":"Slack channel","text":"

    There is a dedicated nf-schema Slack channel in the Nextflow Slack workspace.

    "},{"location":"#credits","title":"Credits","text":"

    This plugin was written based on code initially written within the nf-core community, as part of the nf-core pipeline template.

    We would like to thank the key contributors who include (but are not limited to):

    "},{"location":"background/","title":"Background","text":"

    The Nextflow workflow manager is a powerful tool for scientific workflows. In order for end users to launch a given workflow with different input data and varying settings, pipelines are developed using a special variable type called parameters (params). Defaults are hardcoded into scripts and config files but can be overwritten by user config files and command-line flags (see the Nextflow docs).
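
As a minimal illustration of this mechanism (the parameter name is hypothetical), a default declared in the script is overridden by a command-line flag:

```groovy
// main.nf -- hypothetical example of a hardcoded params default
params.greeting = 'hello' // default, overridable by config files or CLI flags

workflow {
    // `nextflow run main.nf --greeting hi` would print 'hi' instead of 'hello'
    println params.greeting
}
```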

    In addition to config params, a common best-practice for pipelines is to use a \"sample sheet\" file containing required input information. For example: a sample identifier, filenames and other sample-level metadata.

    Nextflow itself does not provide functionality to validate config parameters or parsed sample sheets. To bridge this gap, we developed code within the nf-core community to allow pipelines to work with a standard nextflow_schema.json file, written using the JSON Schema format. The file allows strict typing of parameter variables and inclusion of validation rules.

    The nf-schema plugin moves this code out of the nf-core template into a stand-alone package, to make it easier to use for the wider Nextflow community. It also incorporates a number of new features, such as native Groovy sample sheet validation.

    Earlier versions of the plugin can be found in the nf-validation repository and can still be used in pipelines. However, the nf-validation plugin is no longer supported and all development has moved to nf-schema.

    "},{"location":"migration_guide/","title":"Migration guide","text":"

    Warning

    nf-schema is currently not supported by the nf-core tooling. Using this plugin will break the linting and schema builder. See these issues for the progress on the nf-core migration to nf-schema:

    1. https://github.com/nf-core/tools/issues/2932
    2. https://github.com/nf-core/tools/issues/2784
    3. https://github.com/nf-core/tools/issues/2429

    This guide is intended to help you migrate your pipeline from nf-validation to nf-schema.

    "},{"location":"migration_guide/#major-changes-in-the-plugin","title":"Major changes in the plugin","text":"

    The following list shows the major breaking changes introduced in nf-schema:

    1. The JSON schema draft has been updated from draft-07 to draft-2020-12. See JSON Schema draft 2020-12 release notes and JSON schema draft 2019-09 release notes for more information.
    2. The fromSamplesheet channel factory has been converted to a function called samplesheetToList. See updating fromSamplesheet for more information.
    3. The unique keyword for samplesheet schemas has been removed. Please use uniqueItems or uniqueEntries now instead.
    4. The dependentRequired keyword now works as it's supposed to work in JSON schema. See dependentRequired for more information.
    5. All configuration parameters have been converted to Nextflow configuration options. See Updating configuration for more information.
    6. Help messages are now created automatically instead of using the paramsHelp() function. (v2.1.0 feature)

    A full list of changes can be found in the changelog.

    "},{"location":"migration_guide/#updating-your-pipeline","title":"Updating your pipeline","text":"

    Updating your pipeline can be done in a couple of simple steps.

    "},{"location":"migration_guide/#updating-the-name-and-version-of-the-plugin","title":"Updating the name and version of the plugin","text":"

    The name and the version of the plugin should be updated from nf-validation to nf-schema@2.0.0:

    nf-validationnf-schema
    plugins {\n    id 'nf-validation@1.1.3'\n}\n
    plugins {\n    id 'nf-schema@2.0.0'\n}\n

    Additionally, all includes from nf-validation should be updated to nf-schema. This can easily be done with the following command:

    find . -type f -name \"*.nf\" -exec sed -i -e \"s/from 'plugin\\/nf-validation'/from 'plugin\\/nf-schema'/g\" -\ne 's/from \"plugin\\/nf-validation\"/from \"plugin\\/nf-schema\"/g' {} +\n
    "},{"location":"migration_guide/#updating-the-json-schema-files","title":"Updating the JSON schema files","text":"

    If you aren't using any special features in your schemas, you can simply update your nextflow_schema.json file using the following command:

    sed -i -e 's/http:\\/\\/json-schema.org\\/draft-07\\/schema/https:\\/\\/json-schema.org\\/draft\\/2020-12\\/schema/g' -e 's/definitions/$defs/g' nextflow_schema.json\n

    This will replace the old schema draft specification (draft-07) with the new one (2020-12), and the old keyword definitions with the new notation $defs.
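
As a hypothetical before/after check (the file path and schema content are illustrative), running the command on a minimal draft-07 schema rewrites both the $schema URL and the definitions keyword:

```shell
# Create a minimal draft-07 schema (illustrative content only)
cat > /tmp/nextflow_schema.json <<'EOF'
{
    "$schema": "http://json-schema.org/draft-07/schema",
    "definitions": {}
}
EOF

# Apply the migration command from above
sed -i -e 's/http:\/\/json-schema.org\/draft-07\/schema/https:\/\/json-schema.org\/draft\/2020-12\/schema/g' -e 's/definitions/$defs/g' /tmp/nextflow_schema.json

# The file now declares the 2020-12 draft and uses "$defs"
cat /tmp/nextflow_schema.json
```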

    Note

    Repeat this command for every JSON schema used in your pipeline. e.g. for the default samplesheet schema in nf-core pipelines: bash sed -i -e 's/http:\\/\\/json-schema.org\\/draft-07\\/schema/https:\\/\\/json-schema.org\\/draft\\/2020-12\\/schema/g' -e 's/definitions/$defs/g' assets/schema_input.json

    Warning

    This will not update changes to special fields in the schema, see the guide for special JSON schema keywords on how to update these

    "},{"location":"migration_guide/#update-the-samplesheet-conversion","title":"Update the samplesheet conversion","text":"

    The .fromSamplesheet channel factory should be converted to the samplesheetToList function. The following tabs show how to use the function to get the same effect as the channel factory:

    nf-validationnf-schema
    include { fromSamplesheet } from 'plugin/nf-validation'\nChannel.fromSamplesheet(\"input\")\n
    include { samplesheetToList } from 'plugin/nf-schema'\nChannel.fromList(samplesheetToList(params.input, \"path/to/samplesheet/schema\"))\n

    Note

    This change was necessary to make it possible for pipelines to be used as pluggable workflows. This also enables the validation and conversion of files generated by the pipeline.

    "},{"location":"migration_guide/#updating-configuration","title":"Updating configuration","text":"

    The configuration parameters have been converted to Nextflow configuration options. You can now access these options using the validation config scope:

    validation.<option> = <value>\n

    OR

    validation {\n    <option1> = <value1>\n    <option2> = <value2>\n}\n

    See this table for an overview of what the new configuration options are for the old parameters:

    Old parameter -> New config option(s):

    - params.validationMonochromeLogs = <boolean> -> validation.monochromeLogs = <boolean>
    - params.validationLenientMode = <boolean> -> validation.lenientMode = <boolean>
    - params.validationFailUnrecognisedParams = <boolean> -> validation.failUnrecognisedParams = <boolean>
    - params.validationShowHiddenParams = <boolean> -> validation.showHiddenParams = <boolean>
    - params.validationIgnoreParams = <string> -> validation.defaultIgnoreParams = <list> and validation.ignoreParams = <list>

    Note

    defaultIgnoreParams is meant to be used by pipeline developers to set the parameters which should always be ignored. ignoreParams is meant for the pipeline user to ignore certain parameters.
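
Putting the table and this note together, a hypothetical migration might look as follows (the ignored parameter names are illustrative):

```groovy
// nextflow.config -- sketch migrating old params to the new validation scope
// Old (nf-validation):
//   params.validationMonochromeLogs = true
//   params.validationIgnoreParams   = 'igenomes_base'
// New (nf-schema):
validation {
    monochromeLogs      = true
    defaultIgnoreParams = ['igenomes_base']   // set by the pipeline developer
    ignoreParams        = ['my_local_param']  // added by the pipeline user
}
```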

    "},{"location":"migration_guide/#updating-special-keywords-in-json-schemas","title":"Updating special keywords in JSON schemas","text":"

    If you are using any special features in your schemas, you will need to update your schemas manually. Please refer to the JSON Schema draft 2020-12 release notes and JSON schema draft 2019-09 release notes for more information.

    However here are some guides to the more common migration patterns:

    "},{"location":"migration_guide/#updating-unique-keyword","title":"Updating unique keyword","text":"

    When you use unique in your schemas, you should update it to use uniqueItems or uniqueEntries instead.

    If you used the unique:true field, you should update it to use uniqueItems like this:

    nf-validationnf-schema
    {\n    \"$schema\": \"http://json-schema.org/draft-07/schema\",\n    \"type\": \"array\",\n    \"items\": {\n        \"type\": \"object\",\n        \"properties\": {\n            \"sample\": {\n                \"type\": \"string\",\n                \"unique\": true\n            }\n        }\n    }\n}\n
    {\n    \"$schema\": \"https://json-schema.org/draft/2020-12/schema\",\n    \"type\": \"array\",\n    \"items\": {\n        \"type\": \"object\",\n        \"properties\": {\n            \"sample\": {\n                \"type\": \"string\"\n            }\n        }\n    },\n    \"uniqueItems\": true\n}\n

    If you used the unique: [\"field1\", \"field2\"] field, you should update it to use uniqueEntries like this:

    nf-validationnf-schema
    {\n    \"$schema\": \"http://json-schema.org/draft-07/schema\",\n    \"type\": \"array\",\n    \"items\": {\n        \"type\": \"object\",\n        \"properties\": {\n            \"sample\": {\n                \"type\": \"string\",\n                \"unique\": [\"sample\"]\n            }\n        }\n    }\n}\n
    {\n    \"$schema\": \"https://json-schema.org/draft/2020-12/schema\",\n    \"type\": \"array\",\n    \"items\": {\n        \"type\": \"object\",\n        \"properties\": {\n            \"sample\": {\n                \"type\": \"string\"\n            }\n        }\n    },\n    \"uniqueEntries\": [\"sample\"]\n}\n
    "},{"location":"migration_guide/#updating-dependentrequired-keyword","title":"Updating dependentRequired keyword","text":"

    When you use dependentRequired in your schemas, you should update it like this:

    nf-validationnf-schema
    {\n    \"$schema\": \"http://json-schema.org/draft-07/schema\",\n    \"type\": \"object\",\n    \"properties\": {\n        \"fastq_1\": {\n            \"type\": \"string\",\n            \"format\": \"file-path\"\n        },\n        \"fastq_2\": {\n            \"type\": \"string\",\n            \"format\": \"file-path\",\n            \"dependentRequired\": [\"fastq_1\"]\n        }\n    }\n}\n
    {\n    \"$schema\": \"https://json-schema.org/draft/2020-12/schema\",\n    \"type\": \"object\",\n    \"properties\": {\n        \"fastq_1\": {\n            \"type\": \"string\",\n            \"format\": \"file-path\"\n        },\n        \"fastq_2\": {\n            \"type\": \"string\",\n            \"format\": \"file-path\"\n        }\n    },\n    \"dependentRequired\": {\n        \"fastq_2\": [\"fastq_1\"]\n    }\n}\n
    "},{"location":"migration_guide/#updating-the-help-message","title":"Updating the help message","text":"

    v2.1.0 feature

    The creation of the help message now needs to be enabled in the configuration file. Using --help or --helpFull will automatically print the help message and stop the pipeline execution. paramsHelp() is still available in nf-schema and can still be used as before, which can be helpful for printing the help message in specific cases. Note that this function now automatically emits a deprecation warning; this warning can be disabled using the hideWarning:true option of the function.
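
A sketch of calling paramsHelp() directly while silencing the deprecation warning (the trigger parameter is hypothetical, and the named-argument style follows the Groovy convention; check the plugin reference if the exact signature differs):

```groovy
include { paramsHelp } from 'plugin/nf-schema'

// Print the help message in a specific case, without the deprecation warning
if (params.special_help) { // hypothetical trigger parameter
    log.info paramsHelp("nextflow run my_pipeline --input input_file.csv", hideWarning: true)
    exit 0
}
```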

    nf-validationnf-schema main.nf
    if (params.help) {\n    log.info paramsHelp(\"nextflow run my_pipeline --input input_file.csv\")\n    exit 0\n}\n
    nextflow.config
    validation {\n    help {\n        enabled = true\n        command = \"nextflow run my_pipeline --input input_file.csv\"\n    }\n}\n
    "},{"location":"configuration/configuration/","title":"Configuration","text":"

    The plugin can be configured using several configuration options. These options have to be in the validation scope which means you can write them in two ways:

    validation.<option> = <value>\n

    OR

    validation {\n    <option1> = <value1>\n    <option2> = <value2>\n}\n
    "},{"location":"configuration/configuration/#parametersschema","title":"parametersSchema","text":"

    This option can be used to set the parameters JSON schema to be used by the plugin. This will affect parameter validation (validateParameters()), the summary logs (paramsSummaryLog() and paramsSummaryMap()) and the creation of the help messages.

    validation.parametersSchema = \"path/to/schema.json\" // default \"nextflow_schema.json\"\n

    This option can either be a path relative to the root of the pipeline directory or a full path to the JSON schema. (Avoid hardcoded local paths to ensure your pipeline keeps working on other systems.)

    "},{"location":"configuration/configuration/#monochromelogs","title":"monochromeLogs","text":"

    This option can be used to turn off the colored logs from nf-schema. This can be useful if you run a Nextflow pipeline in an environment that doesn't support colored logging.

    validation.monochromeLogs = <true|false> // default: false\n
    "},{"location":"configuration/configuration/#lenientmode","title":"lenientMode","text":"

    This option can be used to make the type validation more lenient. In normal cases a value of \"12\" will fail if the type is an integer. This will succeed in lenient mode since that string can be cast to an integer.

    validation.lenientMode = <true|false> // default: false\n
    "},{"location":"configuration/configuration/#failunrecognisedparams","title":"failUnrecognisedParams","text":"

    By default the validateParameters() function will only give a warning if an unrecognised parameter has been given. This usually indicates that a typo has been made and can be easily overlooked when the plugin only emits a warning. You can turn this warning into an error with the failUnrecognisedParams option.

    validation.failUnrecognisedParams = <true|false> // default: false\n
    "},{"location":"configuration/configuration/#showhiddenparams","title":"showHiddenParams","text":"

    Deprecated

    This configuration option has been deprecated since v2.1.0. Please use validation.help.showHidden instead.

    By default, parameters that have the \"hidden\": true annotation in the JSON schema are not shown in the help message. Turning on this option makes sure the hidden parameters are also shown.

    validation.showHiddenParams = <true|false> // default: false\n
    "},{"location":"configuration/configuration/#ignoreparams","title":"ignoreParams","text":"

    This option can be used to turn off the validation for certain parameters. It takes a list of parameter names as input.

    validation.ignoreParams = [\"param1\", \"param2\"] // default: []\n
    "},{"location":"configuration/configuration/#defaultignoreparams","title":"defaultIgnoreParams","text":"

    Warning

    This option should only be used by pipeline developers

    This option does exactly the same as validation.ignoreParams, but provides pipeline developers with a way to set the default parameters that should be ignored. This way the pipeline users don't have to re-specify the default ignored parameters when using the validation.ignoreParams option.

    validation.defaultIgnoreParams = [\"param1\", \"param2\"] // default: []\n
    "},{"location":"configuration/configuration/#help","title":"help","text":"

    The validation.help config scope can be used to configure the creation of the help message.

    This scope contains the following options:

    "},{"location":"configuration/configuration/#enabled","title":"enabled","text":"

    This option is used to enable the creation of the help message when the help parameters are used in the nextflow run command.

    validation.help.enabled = true // default: false\n
    "},{"location":"configuration/configuration/#shortparameter","title":"shortParameter","text":"

    This option can be used to change the --help parameter to another parameter. This parameter will print out the help message with all top level parameters.

    validation.help.shortParameter = \"giveMeHelp\" // default: \"help\"\n

    --giveMeHelp will now display the help message instead of --help in this example.

    "},{"location":"configuration/configuration/#fullparameter","title":"fullParameter","text":"

    This option can be used to change the --helpFull parameter to another parameter.

    validation.help.fullParameter = \"giveMeHelpFull\" // default: \"helpFull\"\n

    --giveMeHelpFull will now display the expanded help message instead of --helpFull for this example.

    "},{"location":"configuration/configuration/#showhiddenparameter","title":"showHiddenParameter","text":"

    This option can be used to change the --showHidden parameter to another parameter. This parameter tells the plugin to also include the hidden parameters in the help message.

    validation.help.showHiddenParameter = \"showMeThoseHiddenParams\" // default: \"showHidden\"\n

    --showMeThoseHiddenParams will now make sure hidden parameters are shown, instead of --showHidden, in this example.

    "},{"location":"configuration/configuration/#showhidden","title":"showHidden","text":"

    By default, parameters that have the \"hidden\": true annotation in the JSON schema are not shown in the help message. Turning on this option makes sure the hidden parameters are also shown.

    validation.help.showHidden = <true|false> // default: false\n
    "},{"location":"configuration/configuration/#beforetext","title":"beforeText","text":"

    This option does not affect the help message created by the paramsHelp() function

    Any string provided to this option will be printed before the help message.

    validation.help.beforeText = \"Running pipeline version 1.0\" // default: \"\"\n
    "},{"location":"configuration/configuration/#command","title":"command","text":"

    This option does not affect the help message created by the paramsHelp() function

    This option can be used to add an example command to the help message. This will be printed after the beforeText and before the help message.

    validation.help.command = \"nextflow run main.nf --input samplesheet.csv --outdir output\" // default: \"\"\n

    This example will print the following message:

    Typical pipeline command:\n\n  nextflow run main.nf --input samplesheet.csv --outdir output\n
    "},{"location":"configuration/configuration/#aftertext","title":"afterText","text":"

    This option does not affect the help message created by the paramsHelp() function

    Any string provided to this option will be printed after the help message.

    validation.help.afterText = \"Please cite the pipeline owners when using this pipeline\" // default: \"\"\n
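
The beforeText, command, and afterText options compose in the order described above; a sketch enabling all three at once:

```groovy
// nextflow.config -- sketch combining the help text options
validation {
    help {
        enabled    = true
        beforeText = "Running pipeline version 1.0"
        command    = "nextflow run main.nf --input samplesheet.csv --outdir output"
        afterText  = "Please cite the pipeline owners when using this pipeline"
    }
}
```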
    "},{"location":"contributing/setup/","title":"Getting started with plugin development","text":""},{"location":"contributing/setup/#compiling","title":"Compiling","text":"

    To compile and run the tests use the following command:

    ./gradlew check\n
    "},{"location":"contributing/setup/#launch-it-with-installed-nextflow","title":"Launch it with installed Nextflow","text":"

    Warning

    This method will add the development version of the plugin to your Nextflow plugins directory. Take care when using this method and make sure that you never use a development version to run real pipelines. You can delete all nf-schema versions using this command:

    rm -rf ~/.nextflow/plugins/nf-schema*\n

    make install\n
    nextflow.config
    plugins {\n    id 'nf-schema@x.y.z'\n}\n
    "},{"location":"contributing/setup/#launch-it-with-a-local-version-of-nextflow","title":"Launch it with a local version of Nextflow","text":"
    cd .. && git clone https://github.com/nextflow-io/nextflow\ncd nextflow && ./gradlew exportClasspath\n
    includeBuild('../nextflow')\n
    ./gradlew compileGroovy\n
    ./launch.sh run -plugins nf-schema <script/pipeline name> [pipeline params]\n
    "},{"location":"contributing/setup/#change-and-preview-the-docs","title":"Change and preview the docs","text":"

    The docs are generated using Material for MkDocs. You can install the required packages as follows:

    pip install mkdocs-material pymdown-extensions pillow cairosvg\n

    To change the docs, edit the files in the docs/ folder and run the following command to generate the docs:

    mkdocs serve\n

    To preview the docs, open the URL provided by mkdocs in your browser.

    "},{"location":"nextflow_schema/","title":"Nextflow schema for parameters","text":"

    The functionality of the nf-schema plugin centres on a pipeline schema file. By convention, this file is stored in the workflow root directory and called nextflow_schema.json.

    "},{"location":"nextflow_schema/#what-it-does","title":"What it does","text":"

    The schema file provides a place to describe the pipeline configuration. It is based on the JSON Schema format standard.

    In brief, it includes information for each parameter about:

    ..and more. See the full specification for details.

    Warning

    Although it's based on JSON Schema, there are some differences. We use a few non-standard keys and impose one or two limitations that are not present in the standard specification.

    Tip

    It's highly recommended that you don't try to write the schema JSON file manually. Instead, use the provided tooling - see Creating schema for details.

    "},{"location":"nextflow_schema/#how-its-used","title":"How it's used","text":"

    The nextflow_schema.json file and format have been in use for a few years now and are widely used in the community. Some specific examples of usage are:

    "},{"location":"nextflow_schema/#looking-to-the-future","title":"Looking to the future","text":"

    The pipeline schema has been developed to provide additional functionality not present in core Nextflow. It's our hope that at some point this functionality will be added to core Nextflow, making schema files redundant.

    See the GitHub issue Evolution of Nextflow configuration file (nextflow-io/nextflow#2723) on the Nextflow repo for discussion about potential new configuration file formats, which could potentially include the kind of information that we have within schema.

    "},{"location":"nextflow_schema/create_schema/","title":"Creating schema files","text":"

    Warning

    It's highly recommended that you don't try to write the schema JSON file manually!

    The schema files can get large and complex and are difficult to debug. Don't be tempted to open them in your code editor - instead use the provided tools!

    "},{"location":"nextflow_schema/create_schema/#requirements","title":"Requirements","text":"

    To work with Nextflow schema files, you need the nf-core command-line tools package. You can find full installation instructions in the nf-core documentation, but in brief, you can install it as you would any other Python package:

    pip install nf-core\n# -- OR -- #\nconda install nf-core # (1)!\n
    1. Note: Needs bioconda channels to be configured! See the Bioconda usage docs.

    Info

    Although these tools are currently within the nf-core tooling ecosystem, they should work with any Nextflow pipeline: you don't have to be using the nf-core template for this.

    Note

    We aim to extract this functionality into stand-alone tools at a future date, as we have done with the pipeline validation code in this plugin.

    "},{"location":"nextflow_schema/create_schema/#build-a-pipeline-schema","title":"Build a pipeline schema","text":"

    Once you have nf-core/tools installed and have written your pipeline configuration, go to the pipeline root and run the following:

    nf-core schema build\n

    Warning

    The current version of nf-core tools (v2.13.1) does not support the new schema draft used in nf-schema. Running this command after building the schema will convert the schema to the right draft:

    sed -i -e 's/http:\\/\\/json-schema.org\\/draft-07\\/schema/https:\\/\\/json-schema.org\\/draft\\/2020-12\\/schema/g' -e 's/definitions/$defs/g' nextflow_schema.json\n
    A new version of the nf-core schema builder will be available soon. Keep an eye out!

    The tool will run the nextflow config command to extract your pipeline's configuration and compare the output to your nextflow_schema.json file (if it exists). It will prompt you to update the schema file with any changes, then it will ask if you wish to edit the schema using the web interface.

    This web interface is where you should add detail to your schema, customising the various fields for each parameter.

    Tip

    You can run the nf-core schema build command again and again, as many times as you like. It's designed both for initial creation but also future updates of the schema file.

    It's a good idea to \"save little and often\" by clicking Finished and saving your work locally, then running the command again to continue.

    "},{"location":"nextflow_schema/create_schema/#build-a-sample-sheet-schema","title":"Build a sample sheet schema","text":"

    Danger

    There is currently no tooling to help you write sample sheet schema

    You can find an example in Example sample sheet schema

    Watch this space...

    "},{"location":"nextflow_schema/nextflow_schema_examples/","title":"Example Nextflow schema","text":"

    You can see an example JSON Schema for a Nextflow pipeline nextflow_schema.json file below.

    This file was generated from the nf-core pipeline template, using nf-core create. It is used as a test fixture in the nf-schema package here.

    Note

    More examples can be found in the plugin testResources directory.

    {\n    \"$schema\": \"https://json-schema.org/draft/2020-12/schema\",\n    \"$id\": \"https://raw.githubusercontent.com/nf-core/testpipeline/master/nextflow_schema.json\",\n    \"title\": \"nf-core/testpipeline pipeline parameters\",\n    \"description\": \"this is a test\",\n    \"type\": \"object\",\n    \"$defs\": {\n        \"input_output_options\": {\n            \"title\": \"Input/output options\",\n            \"type\": \"object\",\n            \"fa_icon\": \"fas fa-terminal\",\n            \"description\": \"Define where the pipeline should find input data and save output data.\",\n            \"required\": [\"input\", \"outdir\"],\n            \"properties\": {\n                \"input\": {\n                    \"type\": \"string\",\n                    \"format\": \"file-path\",\n                    \"mimetype\": \"text/csv\",\n                    \"pattern\": \"^\\\\S+\\\\.(csv|tsv|yaml|json)$\",\n                    \"description\": \"Path to comma-separated file containing information about the samples in the experiment.\",\n                    \"help_text\": \"You will need to create a design file with information about the samples in your experiment before running the pipeline. Use this parameter to specify its location. It has to be a comma-separated file with 3 columns, and a header row. See [usage docs](https://nf-co.re/testpipeline/usage#samplesheet-input).\",\n                    \"fa_icon\": \"fas fa-file-csv\"\n                },\n                \"outdir\": {\n                    \"type\": \"string\",\n                    \"format\": \"directory-path\",\n                    \"description\": \"The output directory where the results will be saved. 
You have to use absolute paths to storage on Cloud infrastructure.\",\n                    \"fa_icon\": \"fas fa-folder-open\"\n                },\n                \"email\": {\n                    \"type\": \"string\",\n                    \"description\": \"Email address for completion summary.\",\n                    \"fa_icon\": \"fas fa-envelope\",\n                    \"help_text\": \"Set this parameter to your e-mail address to get a summary e-mail with details of the run sent to you when the workflow exits. If set in your user config file (`~/.nextflow/config`) then you don't need to specify this on the command line for every run.\",\n                    \"pattern\": \"^([a-zA-Z0-9_\\\\-\\\\.]+)@([a-zA-Z0-9_\\\\-\\\\.]+)\\\\.([a-zA-Z]{2,5})$\"\n                },\n                \"multiqc_title\": {\n                    \"type\": \"string\",\n                    \"description\": \"MultiQC report title. Printed as page header, used for filename if not otherwise specified.\",\n                    \"fa_icon\": \"fas fa-file-signature\"\n                }\n            }\n        },\n        \"reference_genome_options\": {\n            \"title\": \"Reference genome options\",\n            \"type\": \"object\",\n            \"fa_icon\": \"fas fa-dna\",\n            \"description\": \"Reference genome related files and options required for the workflow.\",\n            \"properties\": {\n                \"genome\": {\n                    \"type\": \"string\",\n                    \"description\": \"Name of iGenomes reference.\",\n                    \"fa_icon\": \"fas fa-book\",\n                    \"help_text\": \"If using a reference genome configured in the pipeline using iGenomes, use this parameter to give the ID for the reference. This is then used to build the full paths for all required reference genome files e.g. `--genome GRCh38`. 
\\n\\nSee the [nf-core website docs](https://nf-co.re/usage/reference_genomes) for more details.\"\n                },\n                \"fasta\": {\n                    \"type\": \"string\",\n                    \"format\": \"file-path\",\n                    \"mimetype\": \"text/plain\",\n                    \"pattern\": \"^\\\\S+\\\\.fn?a(sta)?(\\\\.gz)?$\",\n                    \"description\": \"Path to FASTA genome file.\",\n                    \"help_text\": \"This parameter is *mandatory* if `--genome` is not specified. If you don't have a BWA index available this will be generated for you automatically. Combine with `--save_reference` to save BWA index for future runs.\",\n                    \"fa_icon\": \"far fa-file-code\"\n                },\n                \"igenomes_base\": {\n                    \"type\": \"string\",\n                    \"format\": \"directory-path\",\n                    \"description\": \"Directory / URL base for iGenomes references.\",\n                    \"default\": \"s3://ngi-igenomes/igenomes\",\n                    \"fa_icon\": \"fas fa-cloud-download-alt\",\n                    \"hidden\": true\n                },\n                \"igenomes_ignore\": {\n                    \"type\": \"boolean\",\n                    \"description\": \"Do not load the iGenomes reference config.\",\n                    \"fa_icon\": \"fas fa-ban\",\n                    \"hidden\": true,\n                    \"help_text\": \"Do not load `igenomes.config` when running the pipeline. You may choose this option if you observe clashes between custom parameters and those supplied in `igenomes.config`.\"\n                }\n            }\n        },\n        \"institutional_config_options\": {\n            \"title\": \"Institutional config options\",\n            \"type\": \"object\",\n            \"fa_icon\": \"fas fa-university\",\n            \"description\": \"Parameters used to describe centralised config profiles. 
These should not be edited.\",\n            \"help_text\": \"The centralised nf-core configuration profiles use a handful of pipeline parameters to describe themselves. This information is then printed to the Nextflow log when you run a pipeline. You should not need to change these values when you run a pipeline.\",\n            \"properties\": {\n                \"custom_config_version\": {\n                    \"type\": \"string\",\n                    \"description\": \"Git commit id for Institutional configs.\",\n                    \"default\": \"master\",\n                    \"hidden\": true,\n                    \"fa_icon\": \"fas fa-users-cog\"\n                },\n                \"custom_config_base\": {\n                    \"type\": \"string\",\n                    \"description\": \"Base directory for Institutional configs.\",\n                    \"default\": \"https://raw.githubusercontent.com/nf-core/configs/master\",\n                    \"hidden\": true,\n                    \"help_text\": \"If you're running offline, Nextflow will not be able to fetch the institutional config files from the internet. If you don't need them, then this is not a problem. 
If you do need them, you should download the files from the repo and tell Nextflow where to find them with this parameter.\",\n                    \"fa_icon\": \"fas fa-users-cog\"\n                },\n                \"config_profile_name\": {\n                    \"type\": \"string\",\n                    \"description\": \"Institutional config name.\",\n                    \"hidden\": true,\n                    \"fa_icon\": \"fas fa-users-cog\"\n                },\n                \"config_profile_description\": {\n                    \"type\": \"string\",\n                    \"description\": \"Institutional config description.\",\n                    \"hidden\": true,\n                    \"fa_icon\": \"fas fa-users-cog\"\n                },\n                \"config_profile_contact\": {\n                    \"type\": \"string\",\n                    \"description\": \"Institutional config contact information.\",\n                    \"hidden\": true,\n                    \"fa_icon\": \"fas fa-users-cog\"\n                },\n                \"config_profile_url\": {\n                    \"type\": \"string\",\n                    \"description\": \"Institutional config URL link.\",\n                    \"hidden\": true,\n                    \"fa_icon\": \"fas fa-users-cog\"\n                }\n            }\n        },\n        \"max_job_request_options\": {\n            \"title\": \"Max job request options\",\n            \"type\": \"object\",\n            \"fa_icon\": \"fab fa-acquisitions-incorporated\",\n            \"description\": \"Set the top limit for requested resources for any single job.\",\n            \"help_text\": \"If you are running on a smaller system, a pipeline step requesting more resources than are available may cause the Nextflow to stop the run with an error. 
These options allow you to cap the maximum resources requested by any single job so that the pipeline will run on your system.\\n\\nNote that you can not _increase_ the resources requested by any job using these options. For that you will need your own configuration file. See [the nf-core website](https://nf-co.re/usage/configuration) for details.\",\n            \"properties\": {\n                \"max_cpus\": {\n                    \"type\": \"integer\",\n                    \"description\": \"Maximum number of CPUs that can be requested for any single job.\",\n                    \"default\": 16,\n                    \"fa_icon\": \"fas fa-microchip\",\n                    \"hidden\": true,\n                    \"help_text\": \"Use to set an upper-limit for the CPU requirement for each process. Should be an integer e.g. `--max_cpus 1`\"\n                },\n                \"max_memory\": {\n                    \"type\": \"string\",\n                    \"description\": \"Maximum amount of memory that can be requested for any single job.\",\n                    \"default\": \"128.GB\",\n                    \"fa_icon\": \"fas fa-memory\",\n                    \"pattern\": \"^\\\\d+(\\\\.\\\\d+)?\\\\.?\\\\s*(K|M|G|T)?B$\",\n                    \"hidden\": true,\n                    \"help_text\": \"Use to set an upper-limit for the memory requirement for each process. Should be a string in the format integer-unit e.g. `--max_memory '8.GB'`\"\n                },\n                \"max_time\": {\n                    \"type\": \"string\",\n                    \"description\": \"Maximum amount of time that can be requested for any single job.\",\n                    \"default\": \"240.h\",\n                    \"fa_icon\": \"far fa-clock\",\n                    \"pattern\": \"^(\\\\d+\\\\.?\\\\s*(s|m|h|day)\\\\s*)+$\",\n                    \"hidden\": true,\n                    \"help_text\": \"Use to set an upper-limit for the time requirement for each process. 
Should be a string in the format integer-unit e.g. `--max_time '2.h'`\"\n                }\n            }\n        },\n        \"generic_options\": {\n            \"title\": \"Generic options\",\n            \"type\": \"object\",\n            \"fa_icon\": \"fas fa-file-import\",\n            \"description\": \"Less common options for the pipeline, typically set in a config file.\",\n            \"help_text\": \"These options are common to all nf-core pipelines and allow you to customise some of the core preferences for how the pipeline runs.\\n\\nTypically these options would be set in a Nextflow config file loaded for all pipeline runs, such as `~/.nextflow/config`.\",\n            \"properties\": {\n                \"help\": {\n                    \"type\": [\"string\", \"boolean\"],\n                    \"description\": \"Display help text.\",\n                    \"fa_icon\": \"fas fa-question-circle\",\n                    \"hidden\": true\n                },\n                \"publish_dir_mode\": {\n                    \"type\": \"string\",\n                    \"default\": \"copy\",\n                    \"description\": \"Method used to save pipeline results to output directory.\",\n                    \"help_text\": \"The Nextflow `publishDir` option specifies which intermediate files should be saved to the output directory. This option tells the pipeline what method should be used to move these files. 
See [Nextflow docs](https://www.nextflow.io/docs/latest/process.html#publishdir) for details.\",\n                    \"fa_icon\": \"fas fa-copy\",\n                    \"enum\": [\"symlink\", \"rellink\", \"link\", \"copy\", \"copyNoFollow\", \"move\"],\n                    \"hidden\": true\n                },\n                \"email_on_fail\": {\n                    \"type\": \"string\",\n                    \"description\": \"Email address for completion summary, only when pipeline fails.\",\n                    \"fa_icon\": \"fas fa-exclamation-triangle\",\n                    \"pattern\": \"^([a-zA-Z0-9_\\\\-\\\\.]+)@([a-zA-Z0-9_\\\\-\\\\.]+)\\\\.([a-zA-Z]{2,5})$\",\n                    \"help_text\": \"An email address to send a summary email to when the pipeline is completed - ONLY sent if the pipeline does not exit successfully.\",\n                    \"hidden\": true\n                },\n                \"plaintext_email\": {\n                    \"type\": \"boolean\",\n                    \"description\": \"Send plain-text email instead of HTML.\",\n                    \"fa_icon\": \"fas fa-remove-format\",\n                    \"hidden\": true\n                },\n                \"max_multiqc_email_size\": {\n                    \"type\": \"string\",\n                    \"description\": \"File size limit when attaching MultiQC reports to summary emails.\",\n                    \"pattern\": \"^\\\\d+(\\\\.\\\\d+)?\\\\.?\\\\s*(K|M|G|T)?B$\",\n                    \"default\": \"25.MB\",\n                    \"fa_icon\": \"fas fa-file-upload\",\n                    \"hidden\": true\n                },\n                \"monochrome_logs\": {\n                    \"type\": \"boolean\",\n                    \"description\": \"Do not use coloured log outputs.\",\n                    \"fa_icon\": \"fas fa-palette\",\n                    \"hidden\": true\n                },\n                \"multiqc_config\": {\n                    \"type\": \"string\",\n     
               \"description\": \"Custom config file to supply to MultiQC.\",\n                    \"fa_icon\": \"fas fa-cog\",\n                    \"hidden\": true\n                },\n                \"tracedir\": {\n                    \"type\": \"string\",\n                    \"description\": \"Directory to keep pipeline Nextflow logs and reports.\",\n                    \"default\": \"${params.outdir}/pipeline_info\",\n                    \"fa_icon\": \"fas fa-cogs\",\n                    \"hidden\": true\n                },\n                \"validate_params\": {\n                    \"type\": \"boolean\",\n                    \"description\": \"Boolean whether to validate parameters against the schema at runtime\",\n                    \"default\": true,\n                    \"fa_icon\": \"fas fa-check-square\",\n                    \"hidden\": true\n                },\n                \"validationShowHiddenParams\": {\n                    \"type\": \"boolean\",\n                    \"fa_icon\": \"far fa-eye-slash\",\n                    \"description\": \"Show all params when using `--help`\",\n                    \"hidden\": true,\n                    \"help_text\": \"By default, parameters set as _hidden_ in the schema are not shown on the command line when a user runs with `--help`. Specifying this option will tell the pipeline to show all parameters.\"\n                },\n                \"enable_conda\": {\n                    \"type\": \"boolean\",\n                    \"description\": \"Run this workflow with Conda. 
You can also use '-profile conda' instead of providing this parameter.\",\n                    \"hidden\": true,\n                    \"fa_icon\": \"fas fa-bacon\"\n                }\n            }\n        }\n    },\n    \"allOf\": [\n        {\n            \"$ref\": \"#/$defs/input_output_options\"\n        },\n        {\n            \"$ref\": \"#/$defs/reference_genome_options\"\n        },\n        {\n            \"$ref\": \"#/$defs/institutional_config_options\"\n        },\n        {\n            \"$ref\": \"#/$defs/max_job_request_options\"\n        },\n        {\n            \"$ref\": \"#/$defs/generic_options\"\n        }\n    ]\n}\n
    "},{"location":"nextflow_schema/nextflow_schema_specification/","title":"Nextflow schema specification","text":"

    The Nextflow schema file contains information about pipeline configuration parameters. The file is typically saved in the workflow root directory and called nextflow_schema.json.

    The Nextflow schema syntax is based on the JSON schema standard, with some key differences. You can find more information about JSON Schema here:

    Warning

    This file is a reference specification, not documentation about how to write a schema manually.

    Please see Creating schema files for instructions on how to create these files (and don't be tempted to do it manually in a code editor!)

    Note

    The nf-schema plugin, as well as several other interfaces using Nextflow schema, uses a stock JSON schema library for parameter validation. As such, any valid JSON schema should work for validation.

    However, please note that graphical UIs (docs, launch interfaces) are largely hand-written and may not expect JSON schema usage that is not described here. As such, it's safest to stick to the specification described here rather than the core JSON schema spec.

    "},{"location":"nextflow_schema/nextflow_schema_specification/#definitions","title":"Definitions","text":"

    One slightly unusual feature of the JSON schema standard as used for Nextflow schema is $defs.

    JSON schema can group variables together in an object, but validation then expects that structure to exist in the data being validated. In reality, we have a very long \"flat\" list of parameters, all at the top level, such as params.foo.

    In order to give some structure to log outputs, documentation and so on, we group parameters into $defs. Each definition group is an object with a title, description and so on. However, as they are under the $defs scope, they are effectively ignored by the validation, so their nested nature is not a problem. We then bring the contents of each definition group back to the \"flat\" top level for validation using a series of allOf statements at the end of the schema, which reference the specific definition keys.

    {\n  \"$schema\": \"https://json-schema.org/draft/2020-12/schema\",\n  \"type\": \"object\",\n  // Definition groups\n  \"$defs\": { // (1)!\n    \"my_group_of_params\": { // (2)!\n      \"title\": \"A virtual grouping used for docs and pretty-printing\",\n      \"type\": \"object\",\n      \"required\": [\"foo\", \"bar\"], // (3)!\n      \"properties\": { // (4)!\n        \"foo\": { // (5)!\n          \"type\": \"string\"\n        },\n        \"bar\": {\n          \"type\": \"string\"\n        }\n      }\n    }\n  },\n  // Contents of each definition group brought into main schema for validation\n  \"allOf\": [\n    { \"$ref\": \"#/$defs/my_group_of_params\" } // (6)!\n  ]\n}\n
    1. An arbitrary number of definition groups can go in here - these are ignored by main schema validation.
    2. This ID is used later in the allOf block to reference the definition.
    3. Note that any required properties need to be listed within this object scope.
    4. Actual parameter specifications go in here.
    5. Shortened here for the example, see below for full parameter specification.
    6. A $ref line like this needs to be added for every definition group.

    Parameters can be described outside of the $defs scope, in the regular JSON Schema top-level properties scope. However, they will be displayed as ungrouped in tools working off the schema.
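    As a minimal sketch of such an ungrouped parameter (the parameter name my_param is illustrative), a property can sit directly at the top level:

    ```json
    {
      "$schema": "https://json-schema.org/draft/2020-12/schema",
      "type": "object",
      "properties": {
        "my_param": {
          "type": "string",
          "description": "An ungrouped top-level parameter, outside of any $defs group"
        }
      }
    }
    ```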

    "},{"location":"nextflow_schema/nextflow_schema_specification/#nested-parameters","title":"Nested parameters","text":"

    New feature in v2.1.0

    Nextflow config allows parameters to be nested as objects, for example:

    params {\n    foo {\n        bar = \"baz\"\n    }\n}\n

    or on the CLI:

    nextflow run <pipeline> --foo.bar \"baz\"\n

    Nested parameters can be specified in the schema by adding a properties keyword to the root parameters:

    {\n  \"type\": \"object\",\n  \"properties\": {\n    \"thisIsNested\": {\n      // Annotation for the --thisIsNested parameter\n      \"type\": \"object\", // Parameters that contain subparameters need to have the \"object\" type\n      \"properties\": {\n        // Add other parameters in here\n        \"deep\": {\n          // Annotation for the --thisIsNested.deep parameter\n          \"type\": \"string\"\n        }\n      }\n    }\n  }\n}\n

    There is no limit to how deeply parameters can be nested. Bear in mind, however, that deeply nested parameters are not very user friendly and will produce some very ugly help messages. It's advisable not to go deeper than two levels of nesting.

    "},{"location":"nextflow_schema/nextflow_schema_specification/#required-parameters","title":"Required parameters","text":"

    Any parameters that must be specified should be set as required in the schema.

    Tip

    Make sure you set null as the default value for the parameter; otherwise it will have a value even if not supplied by the pipeline user, and the required property will have no effect.

    This is not done with a property key like other things described below, but rather by naming the parameter in the required array in the definition object / top-level object.

    For more information, see the JSON schema documentation.

    {\n  \"type\": \"object\",\n  \"properties\": {\n    \"name\": { \"type\": \"string\" },\n    \"email\": { \"type\": \"string\" },\n    \"address\": { \"type\": \"string\" },\n    \"telephone\": { \"type\": \"string\" }\n  },\n  \"required\": [\"name\", \"email\"]\n}\n
    "},{"location":"nextflow_schema/nextflow_schema_specification/#parameter-name","title":"Parameter name","text":"

    The properties object key must correspond to the parameter variable name in the Nextflow config.

    For example, for params.foo, the schema should look like this:

    // ..\n\"type\": \"object\",\n\"properties\": {\n    \"foo\": {\n        \"type\": \"string\",\n        // ..\n    }\n}\n// ..\n
    "},{"location":"nextflow_schema/nextflow_schema_specification/#keys-for-all-parameters","title":"Keys for all parameters","text":""},{"location":"nextflow_schema/nextflow_schema_specification/#type","title":"type","text":"

    Variable type, taken from the JSON schema keyword vocabulary:

    Validation checks that the supplied parameter matches the expected type, and will fail with an error if not.
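    For example, a sketch of an integer-typed parameter (the name max_cpus is illustrative); a numeric value such as --max_cpus 4 would pass this check, while a non-numeric string would fail validation:

    ```json
    {
      "type": "object",
      "properties": {
        "max_cpus": {
          "type": "integer",
          "description": "Maximum number of CPUs that can be requested"
        }
      }
    }
    ```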

    This JSON schema type is not supported:

    "},{"location":"nextflow_schema/nextflow_schema_specification/#default","title":"default","text":"

    Default value for the parameter.

    Should match the type and validation patterns set for the parameter in other fields.

    Tip

    If no default should be set, omit this key from the schema entirely. Do not set it to an empty string or null.

    However, parameters with no defaults should be set to null within your Nextflow config file.

    Note

    When creating a schema using nf-core schema build, this field will be automatically created based on the default value defined in the pipeline config files.

    Generally speaking, the two should always be kept in sync to avoid unexpected problems and usage errors. In some rare cases this may not be possible (for example, a dynamic Groovy expression cannot be encoded in JSON), in which case try to specify as \"sensible\" a default within the schema as possible.
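    A sketch of a parameter whose default matches its declared type and format (the name outdir and its value are illustrative):

    ```json
    {
      "outdir": {
        "type": "string",
        "format": "directory-path",
        "description": "The output directory where the results will be saved.",
        "default": "./results"
      }
    }
    ```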

    "},{"location":"nextflow_schema/nextflow_schema_specification/#description","title":"description","text":"

    A short description of what the parameter does, written in markdown. Printed in docs and terminal help text. Should be at most one short sentence.

    "},{"location":"nextflow_schema/nextflow_schema_specification/#help_text","title":"help_text","text":"

    Non-standard key

    A longer text with usage help for the parameter, written in markdown. Can include newlines with multiple paragraphs and more complex markdown structures.

    Typically hidden by default in documentation and interfaces, unless explicitly clicked / requested.
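    A short description paired with a longer, markdown-formatted help_text might look like this (all values are illustrative):

    ```json
    {
      "input": {
        "type": "string",
        "description": "Path to comma-separated file containing sample information.",
        "help_text": "You will need to create a samplesheet before running the pipeline.\n\nIt must be a comma-separated file with a header row."
      }
    }
    ```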

    "},{"location":"nextflow_schema/nextflow_schema_specification/#errormessage","title":"errorMessage","text":"

    Non-standard key

    If validation fails, an error message is printed to the terminal, so that the end user knows what to fix. However, these messages are not always very clear - especially to newcomers.

    To improve this experience, pipeline developers can set a custom errorMessage for a given parameter in the schema. If validation fails, this errorMessage is printed instead, and the raw JSON schema validation message goes to the Nextflow debug log output.

    For example, instead of printing:

    * --input (samples.yml): \"samples.yml\" does not match regular expression [^\\S+\\.csv$]\n

    We can set

    \"input\": {\n  \"type\": \"string\",\n  \"pattern\": \"^\\\\S+\\\\.csv$\",\n  \"errorMessage\": \"File name must end in '.csv' and cannot contain spaces\"\n}\n

    and get:

    * --input (samples.yml): File name must end in '.csv' and cannot contain spaces\n
    "},{"location":"nextflow_schema/nextflow_schema_specification/#deprecated","title":"deprecated","text":"

    Extended key

    A boolean JSON flag that instructs anything using the schema that this parameter/field is deprecated and should not be used. This can be useful to generate messages telling the user that a parameter has changed between versions.

    JSON schema states that this is an informative key only, but in nf-schema this will cause a validation error if the parameter/field is used.

    Tip

    Using the errorMessage keyword can be useful to provide more information about the deprecation and what to use instead.
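    For example, a deprecated parameter paired with a custom errorMessage could be sketched as follows (the parameter names and message are illustrative):

    ```json
    {
      "fasta": {
        "type": "string",
        "deprecated": true,
        "errorMessage": "The --fasta parameter is deprecated, please use --genome_fasta instead."
      }
    }
    ```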

    "},{"location":"nextflow_schema/nextflow_schema_specification/#enum","title":"enum","text":"

    An array of enumerated values: the parameter must match one of these values exactly to pass validation.

    {\n  \"enum\": [\"red\", \"amber\", \"green\"]\n}\n
    "},{"location":"nextflow_schema/nextflow_schema_specification/#fa_icon","title":"fa_icon","text":"

    Non-standard key

    A text identifier corresponding to an icon from Font Awesome. Used for easier visual navigation of documentation and pipeline interfaces.

    Should be the font-awesome class names, for example:

    \"fa_icon\": \"fas fa-file-csv\"\n
    "},{"location":"nextflow_schema/nextflow_schema_specification/#hidden","title":"hidden","text":"

    Non-standard key

    A boolean JSON flag that instructs anything using the schema that this is an unimportant parameter.

    Typically used to keep the pipeline docs / UIs uncluttered with common parameters which are not used by the majority of users. For example, --plaintext_email and --monochrome_logs.

    \"hidden\": true\n
    "},{"location":"nextflow_schema/nextflow_schema_specification/#string-specific-keys","title":"String-specific keys","text":""},{"location":"nextflow_schema/nextflow_schema_specification/#pattern","title":"pattern","text":"

    Regular expression which the string must match in order to pass validation.

    For example, this pattern only validates if the supplied string ends in .fastq, .fq, .fastq.gz or .fq.gz:

    {\n  \"type\": \"string\",\n  \"pattern\": \".*\\\\.f(ast)?q(\\\\.gz)?$\"\n}\n
    "},{"location":"nextflow_schema/nextflow_schema_specification/#minlength-maxlength","title":"minLength, maxLength","text":"

    Specify a minimum / maximum string length with minLength and maxLength.

    {\n  \"type\": \"string\",\n  \"minLength\": 2,\n  \"maxLength\": 3\n}\n
    "},{"location":"nextflow_schema/nextflow_schema_specification/#format","title":"format","text":"

    Formats can be used to give additional validation checks against string values for certain properties.

    Non-standard key (values)

    The format key is a standard JSON schema key, however we primarily use it for validating file / directory path operations with non-standard schema values.

    Example usage is as follows:

    {\n  \"type\": \"string\",\n  \"format\": \"file-path\"\n}\n

    The available format types are below:

    file-path: States that the provided value is a file. Does not check its existence, but it does check that the path is not a directory.
    directory-path: States that the provided value is a directory. Does not check its existence, but if it exists, it does check that the path is not a file.
    path: States that the provided value is a path (file or directory). Does not check its existence.
    file-path-pattern: States that the provided value is a glob pattern that will be used to fetch files. Checks that the pattern is valid and that at least one file is found.
    "},{"location":"nextflow_schema/nextflow_schema_specification/#exists","title":"exists","text":"

    When a format is specified for a value, you can provide the key exists set to true in order to validate that the provided path exists. Set this to false to validate that the path does not exist.

    Example usage is as follows:

    {\n  \"type\": \"string\",\n  \"format\": \"file-path\",\n  \"exists\": true\n}\n

    Note

    If the parameter is an S3 URL path, this validation is ignored.

    "},{"location":"nextflow_schema/nextflow_schema_specification/#mimetype","title":"mimetype","text":"

    MIME type for a file path. Setting this value informs downstream tools about what kind of file is expected.

    Should only be set when format is file-path.

    {\n  \"type\": \"string\",\n  \"format\": \"file-path\",\n  \"mimetype\": \"text/csv\"\n}\n
    "},{"location":"nextflow_schema/nextflow_schema_specification/#schema","title":"schema","text":"

    Path to a JSON schema file used to validate the supplied file.

    Should only be set when format is file-path.

    Tip

    Setting this field is key to working with sample sheet validation and channel generation, as described in the next section of the nf-schema docs.

    These schema files are typically stored in the pipeline assets directory, but can be anywhere.

    {\n  \"type\": \"string\",\n  \"format\": \"file-path\",\n  \"schema\": \"assets/foo_schema.json\"\n}\n

    Note

    If the parameter is set to null, false or an empty string, this validation is ignored. The file won't be validated.

    "},{"location":"nextflow_schema/nextflow_schema_specification/#numeric-specific-keys","title":"Numeric-specific keys","text":""},{"location":"nextflow_schema/nextflow_schema_specification/#minimum-maximum","title":"minimum, maximum","text":"

    Specify a minimum / maximum value for an integer or number with the minimum and maximum keys.

    If x is the value being validated, the following must hold true: x >= minimum and x <= maximum.

    {\n  \"type\": \"number\",\n  \"minimum\": 0,\n  \"maximum\": 100\n}\n

    Note

    The JSON schema docs also mention the exclusiveMinimum, exclusiveMaximum and multipleOf keys. Because nf-schema uses stock JSON schema validation libraries, these should work for validation. However, they are not officially supported within the Nextflow schema ecosystem, so some interfaces may not recognise them.
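    For instance, a standard JSON schema fragment using these keys, which a stock validator should enforce even though Nextflow schema interfaces may not display them, could look like:

    ```json
    {
      "type": "integer",
      "exclusiveMinimum": 0,
      "multipleOf": 2
    }
    ```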

    "},{"location":"nextflow_schema/nextflow_schema_specification/#array-specific-keys","title":"Array-specific keys","text":""},{"location":"nextflow_schema/nextflow_schema_specification/#uniqueitems","title":"uniqueItems","text":"

    All items in the array should be unique.

    {\n  \"type\": \"array\",\n  \"uniqueItems\": true\n}\n
    "},{"location":"nextflow_schema/nextflow_schema_specification/#uniqueentries","title":"uniqueEntries","text":"

    Non-standard key

    The combination of all values in the given keys should be unique. For this key to work, you need to make sure the array items are of type object and contain all the keys in the uniqueEntries list.

    {\n  \"type\": \"array\",\n  \"items\": {\n    \"type\": \"object\",\n    \"uniqueEntries\": [\"foo\", \"bar\"],\n    \"properties\": {\n      \"foo\": { \"type\": \"string\" },\n      \"bar\": { \"type\": \"string\" }\n    }\n  }\n}\n

    This schema tells nf-schema that the combination of foo and bar should be unique across all objects in the array.

    "},{"location":"nextflow_schema/sample_sheet_schema_examples/","title":"Example sample sheet schema","text":""},{"location":"nextflow_schema/sample_sheet_schema_examples/#nf-corernaseq-example","title":"nf-core/rnaseq example","text":"

    The nf-core/rnaseq pipeline was one of the first to have a sample sheet schema. You can see this, used for validating sample sheets with --input here: assets/schema_input.json.

    Tip

    Note the approach used for validating filenames in the fastq_2 column. The column is optional, so if a pattern alone were supplied, validation would fail whenever the field is left empty.

    Instead, we say that the string must either match that pattern or it must have a maxLength of 0 (an empty string).
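    The "pattern or empty string" approach described above can be expressed with a standard anyOf construct, sketched here as an illustration rather than a verbatim copy of the rnaseq schema:

    ```json
    {
      "fastq_2": {
        "type": "string",
        "anyOf": [
          { "pattern": "^\\S+\\.f(ast)?q\\.gz$" },
          { "maxLength": 0 }
        ]
      }
    }
    ```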

    {\n  \"$schema\": \"https://json-schema.org/draft/2020-12/schema\",\n  \"$id\": \"https://raw.githubusercontent.com/nf-core/rnaseq/master/assets/schema_input.json\",\n  \"title\": \"nf-core/rnaseq pipeline - params.input schema\",\n  \"description\": \"Schema for the file provided with params.input\",\n  \"type\": \"array\",\n  \"items\": {\n    \"type\": \"object\",\n    \"properties\": {\n      \"sample\": {\n        \"type\": \"string\",\n        \"pattern\": \"^\\\\S+$\",\n        \"errorMessage\": \"Sample name must be provided and cannot contain spaces\",\n        \"meta\": [\"my_sample\"]\n      },\n      \"fastq_1\": {\n        \"type\": \"string\",\n        \"pattern\": \"^\\\\S+\\\\.f(ast)?q\\\\.gz$\",\n        \"format\": \"file-path\",\n        \"errorMessage\": \"FastQ file for reads 1 must be provided, cannot contain spaces and must have extension '.fq.gz' or '.fastq.gz'\"\n      },\n      \"fastq_2\": {\n        \"errorMessage\": \"FastQ file for reads 2 cannot contain spaces and must have extension '.fq.gz' or '.fastq.gz'\",\n        \"type\": \"string\",\n        \"pattern\": \"^\\\\S+\\\\.f(ast)?q\\\\.gz$\",\n        \"format\": \"file-path\"\n      },\n      \"strandedness\": {\n        \"type\": \"string\",\n        \"errorMessage\": \"Strandedness must be provided and be one of 'forward', 'reverse' or 'unstranded'\",\n        \"enum\": [\"forward\", \"reverse\", \"unstranded\"],\n        \"meta\": [\"my_strandedness\"]\n      }\n    },\n    \"required\": [\"sample\", \"fastq_1\", \"strandedness\"]\n  }\n}\n
    "},{"location":"nextflow_schema/sample_sheet_schema_examples/#nf-schema-test-case","title":"nf-schema test case","text":"

    You can see a feature-complete example of a sample sheet schema file below.

    It is used as a test fixture in the nf-schema package here.

    Note

    More examples can be found in the plugin testResources directory.

    {\n    \"$schema\": \"https://json-schema.org/draft/2020-12/schema\",\n    \"$id\": \"https://raw.githubusercontent.com/nextflow-io/nf-schema/master/plugins/nf-schema/src/testResources/schema_input.json\",\n    \"title\": \"Samplesheet validation schema\",\n    \"description\": \"Schema for the samplesheet used in this pipeline\",\n    \"type\": \"array\",\n    \"items\": {\n        \"type\": \"object\",\n        \"properties\": {\n            \"field_1\": {\n                \"type\": \"string\",\n                \"meta\": [\"string1\",\"string2\"],\n                \"default\": \"value\"\n            },\n            \"field_2\": {\n                \"type\": \"integer\",\n                \"meta\": [\"integer1\",\"integer2\"],\n                \"default\": 0\n            },\n            \"field_3\": {\n                \"type\": \"boolean\",\n                \"meta\": [\"boolean1\",\"boolean2\"],\n                \"default\": true\n            },\n            \"field_4\": {\n                \"type\": \"string\"\n            },\n            \"field_5\": {\n                \"type\": \"number\"\n            },\n            \"field_6\": {\n                \"type\": \"boolean\"\n            },\n            \"field_7\": {\n                \"type\": \"string\",\n                \"format\": \"file-path\",\n                \"exists\": true,\n                \"pattern\": \"^.*\\\\.txt$\"\n            },\n            \"field_8\": {\n                \"type\": \"string\",\n                \"format\": \"directory-path\",\n                \"exists\": true\n            },\n            \"field_9\": {\n                \"type\": \"string\",\n                \"format\": \"path\",\n                \"exists\": true\n            },\n            \"field_10\": {\n                \"type\": \"string\"\n            },\n            \"field_11\": {\n                \"type\": \"integer\"\n            },\n            \"field_12\": {\n                \"type\": \"string\",\n                
\"default\": \"itDoesExist\"\n            }\n        },\n        \"required\": [\"field_4\", \"field_6\"],\n        \"dependentRequired\": {\n            \"field_1\": [\"field_2\", \"field_3\"]\n        }\n    },\n    \"allOf\": [\n        {\"uniqueEntries\": [\"field_11\", \"field_10\"]},\n        {\"uniqueEntries\": [\"field_10\"]}\n    ]\n}\n
    "},{"location":"nextflow_schema/sample_sheet_schema_specification/","title":"Sample sheet schema specification","text":"

    Sample sheet schema files are used by the nf-schema plugin for validation of sample sheet contents and type conversion / channel generation.

    The Nextflow schema syntax is based on the JSON schema standard. You can find more information about JSON Schema here:

    "},{"location":"nextflow_schema/sample_sheet_schema_specification/#schema-structure","title":"Schema structure","text":"

    Validation by the plugin works by parsing the supplied file contents into a Groovy object, then passing this to the JSON schema validation library. As such, the structure of the schema must match the structure of the parsed file.

    Typically, samplesheets are CSV files, with fields represented as columns and samples as rows. TSV, JSON and YAML samplesheets are also supported by this plugin.

    In this case, the parsed object will be an array (see JSON schema docs). The array type is associated with an items key which in our case contains a single object. The object has properties, where the keys must match the headers of the CSV file.

    So, for CSV samplesheets, the top-level schema should look something like this:

    {\n  \"$schema\": \"https://json-schema.org/draft/2020-12/schema\",\n  \"type\": \"array\",\n  \"items\": {\n    \"type\": \"object\",\n    \"properties\": {\n      \"field_1\": { \"type\": \"string\" },\n      \"field_2\": { \"type\": \"string\" }\n    }\n  }\n}\n

    If your sample sheet has a different format (for example, a nested YAML file), you will need to build your schema to match the parsed structure.
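    For illustration, a CSV sample sheet matching the minimal schema above could look like this (the values are placeholders):

```csv
field_1,field_2
valueA,valueB
```

    Each row becomes one object in the parsed array, with the header text providing the property keys.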

    "},{"location":"nextflow_schema/sample_sheet_schema_specification/#properties","title":"Properties","text":"

    Every array object will contain keys for each field. Each field should be described as an element in the object properties section.

    The keys of each property must match the header text used in the sample sheet.

    Fields that are present in the sample sheet but not in the schema will be ignored, and a warning will be produced.

    Tip

    The order of columns in the sample sheet is not relevant, as long as the header text matches.

    Warning

    The order of properties in the schema is important. This order defines the order of output channel properties when using the samplesheetToList() function.

    "},{"location":"nextflow_schema/sample_sheet_schema_specification/#common-keys","title":"Common keys","text":"

    The majority of schema keys for sample sheet schema validation are identical to the Nextflow schema. For example: type, pattern, format, errorMessage, exists and so on.

    Please refer to the Nextflow schema specification docs for details.

    "},{"location":"nextflow_schema/sample_sheet_schema_specification/#sample-sheet-keys","title":"Sample sheet keys","text":"

    Below are the properties that are specific to sample sheet schema. These exist in addition to those described in the Nextflow schema specification.

    "},{"location":"nextflow_schema/sample_sheet_schema_specification/#meta","title":"meta","text":"

    Type: List or String

    The field will be considered a meta value when this key is present. It should contain a list of meta field names, or a string naming a single meta field, to which the value will be assigned. By default, fields have no meta.

    For example:

    {\n  \"meta\": \"id\"\n}\n

    will convert the field value to a meta value, resulting in the channel [[id:value]...]. See here for an example in the sample sheet.
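    Since meta also accepts a list, one column can populate several meta fields at once. A hypothetical sketch (the field names are placeholders):

```json
{ "meta": ["id", "sample"] }
```

    would assign the field value to both keys, producing [[id:value, sample:value]...].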

    "},{"location":"parameters/help_text/","title":"Help text","text":""},{"location":"parameters/help_text/#configure-help-message","title":"Configure help message","text":"

    Add the following configuration to your configuration files to enable the creation of help messages:

    nextflow.config
    validation {\n    help {\n        enabled = true\n    }\n}\n

    That's it! Every time a pipeline user passes the --help or --helpFull parameter to the pipeline, a help message will be created!

    The help message can be customized with a series of different options. See help configuration docs for a list of all options.
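    As a sketch, several of the options from the help configuration scope can be combined in one validation block (the beforeText and afterText values here are the examples used elsewhere in these docs):

```groovy
// nextflow.config
validation {
    help {
        enabled = true
        beforeText = "Running pipeline version 1.0"
        afterText = "Please cite the pipeline owners when using this pipeline"
    }
}
```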

    "},{"location":"parameters/help_text/#help-message","title":"Help message","text":"

    The following example shows a snippet of a JSON schema which can be used to visualize the differences between the different help messages. This schema contains one group of parameters called Input parameters that contains two parameters: --input and --outdir. There are also two ungrouped parameters in this schema: --reference and --type. --reference is a nested parameter that contains the .fasta, .fai and .aligners subparameters. .aligners also contains two subparameters: .bwa and .bowtie.

    There are three different help messages:

    1. Using --help will only show the top-level parameters (--input, --outdir, --reference and --type in the example). The type, description, possible options and defaults of these parameters will also be added to the message if they are present in the JSON schema.
    2. Using --helpFull will print all parameters, no matter how deeply nested they are (--input, --outdir, --reference.fasta, --reference.fai, --reference.aligners.bwa, --reference.aligners.bowtie and --type in the example)
    3. --help can also be given a parameter name as its value. This prints a detailed help message for that parameter, including any subparameters it contains.
    JSON schema--help--helpFull--help input--help reference.aligners
    ...\n\"$defs\": { // A section to define several definition in the JSON schema\n    \"Input parameters\": { // A group called \"Input parameters\"\n        \"properties\": { // All properties (=parameters) in this group\n            \"input\": {\n                \"type\": \"string\",\n                \"description\": \"The input samplesheet\",\n                \"format\": \"file-path\",\n                \"pattern\": \"^.$\\.csv$\",\n                \"help_text\": \"This file needs to contain all input samples\",\n                \"exists\": true\n            },\n            \"outdir\": {\n                \"type\": \"string\",\n                \"description\": \"The output directory\",\n                \"format\": \"directory-path\",\n                \"default\": \"results\"\n            }\n        }\n    }\n},\n\"properties\": { // Ungrouped parameters go here\n    \"reference\": {\n        \"type\": \"object\", // A parameter that contains nested parameters is always an \"object\"\n        \"description\": \"A group of parameters to configure the reference sets\",\n        \"properties\": { // All parameters nested in the --reference parameter\n            \"fasta\": {\n                \"type\": \"string\",\n                \"description\": \"The FASTA file\"\n            },\n            \"fai\": {\n                \"type\": \"string\",\n                \"description\": \"The FAI file\"\n            },\n            \"aligners\": {\n                \"type\": \"object\",\n                \"description\": \"A group of parameters specifying the aligner indices\",\n                \"properties\": { // All parameters nested in the --reference.aligners parameter\n                    \"bwa\": {\n                        \"type\": \"string\",\n                        \"description\": \"The BWA index\"\n                    },\n                    \"bowtie\": {\n                        \"type\": \"string\",\n                        \"description\": \"The BOWTIE index\"\n  
                  }\n                }\n            }\n        }\n    },\n    \"type\": {\n        \"type\": \"string\",\n        \"description\": \"The analysis type\",\n        \"enum\": [\"WES\",\"WGS\"]\n    }\n}\n...\n
    --reference  [object]          A group of parameters to configure the reference sets\n--type       [string]          The analysis type (accepted: WES, WGS)\n--help       [boolean, string] Show the help message for all top level parameters. When a parameter is given to `--help`, the full help message of that parameter will be printed.\n--helpFull   [boolean]         Show the help message for all non-hidden parameters.\n--showHidden [boolean]         Show all hidden parameters in the help message. This needs to be used in combination with `--help` or `--helpFull`.\n\nInput parameters\n    --input  [string] The input samplesheet\n    --outdir [string] The output directory [default: results]\n
    --reference.fasta           [string]          The FASTA file\n--reference.fai             [string]          The FAI file\n--reference.aligners.bwa    [string]          The BWA index\n--reference.aligners.bowtie [string]          The BOWTIE index\n--type                      [string]          The analysis type (accepted: WES, WGS)\n--help                      [boolean, string] Show the help message for all top level parameters. When a parameter is given to `--help`, the full help message of that parameter will be printed.\n--helpFull                  [boolean]         Show the help message for all non-hidden parameters.\n--showHidden                [boolean]         Show all hidden parameters in the help message. This needs to be used in combination with `--help` or `--helpFull`.\n\nInput parameters\n    --input                 [string] The input samplesheet\n    --outdir                [string] The output directory [default: results]\n
    --input\n    type       : string\n    description: The input samplesheet\n    format     : file-path\n    pattern    : ^\\S+\\.csv$\n    help_text  : This file needs to contain all input samples\n    exists     : true\n
    --reference.aligners\n    type       : object\n    description: A group of parameters specifying the aligner indices\n    options    :\n        --reference.aligners.bwa    [string] The BWA index\n        --reference.aligners.bowtie [string] The BOWTIE index\n

    The help message will always show the ungrouped parameters first. --help, --helpFull and --showHidden will always be automatically added to the help message. These defaults can be overwritten by adding them as ungrouped parameters to the JSON schema.

    After the ungrouped parameters, the grouped parameters will be printed.

    "},{"location":"parameters/help_text/#hidden-parameters","title":"Hidden parameters","text":"

    Params that are set as hidden in the JSON Schema are not shown in the help message. To show these parameters, pass the --showHidden parameter to the nextflow command.

    "},{"location":"parameters/help_text/#coloured-logs","title":"Coloured logs","text":"

    By default, the help output is coloured using ANSI escape codes.

    If you prefer, you can disable these by setting the validation.monochromeLogs configuration option to true.

    Default (coloured)Monochrome logs
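    Disabling the colours amounts to a one-line configuration change:

```groovy
// nextflow.config
validation.monochromeLogs = true
```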

    "},{"location":"parameters/help_text/#paramshelp","title":"paramsHelp()","text":"

    Deprecated

    This function has been deprecated in v2.1.0. Use the help configuration instead.

    This function returns a help message with the command to run a pipeline and the available parameters. Pass it to log.info to print it in the terminal.

    It accepts three arguments:

    1. An example command, typically used to run the pipeline, to be included in the help string
    2. An option to set the file name of a Nextflow Schema file: parameters_schema: <schema.json> (Default: nextflow_schema.json)
    3. An option to hide the deprecation warning: hideWarning: <true/false> (Default: false)

    Note

    paramsHelp() doesn't stop pipeline execution after running. You must add an explicit exit to your pipeline code yourself if that is the desired behaviour.

    Typical usage:

    main.nfnextflow.confignextflow_schema.json
    include { paramsHelp } from 'plugin/nf-schema'\n\nif (params.help) {\n    log.info paramsHelp(\"nextflow run my_pipeline --input input_file.csv\")\n    exit 0\n}\n
    plugins {\n  id 'nf-schema@2.0.0'\n}\n\nparams {\n  input = \"samplesheet.csv\"\n  outdir = \"results\"\n}\n
    {\n    \"$schema\": \"https://json-schema.org/draft/2020-12/schema\",\n    \"$id\": \"https://raw.githubusercontent.com/nf-core/testpipeline/master/nextflow_schema.json\",\n    \"title\": \"nf-core/testpipeline pipeline parameters\",\n    \"description\": \"this is a test\",\n    \"type\": \"object\",\n    \"$defs\": {\n        \"input_output_options\": {\n            \"title\": \"Input/output options\",\n            \"type\": \"object\",\n            \"fa_icon\": \"fas fa-terminal\",\n            \"description\": \"Define where the pipeline should find input data and save output data.\",\n            \"required\": [\"input\", \"outdir\"],\n            \"properties\": {\n                \"input\": {\n                    \"type\": \"string\",\n                    \"format\": \"file-path\",\n                    \"mimetype\": \"text/csv\",\n                    \"schema\": \"assets/schema_input.json\",\n                    \"pattern\": \"^\\\\S+\\\\.(csv|tsv|yaml|json)$\",\n                    \"description\": \"Path to comma-separated file containing information about the samples in the experiment.\",\n                    \"help_text\": \"You will need to create a design file with information about the samples in your experiment before running the pipeline. Use this parameter to specify its location. It has to be a comma-separated file with 3 columns, and a header row. See [usage docs](https://nf-co.re/testpipeline/usage#samplesheet-input).\",\n                    \"fa_icon\": \"fas fa-file-csv\"\n                },\n                \"outdir\": {\n                    \"type\": \"string\",\n                    \"format\": \"directory-path\",\n                    \"description\": \"The output directory where the results will be saved. 
You have to use absolute paths to storage on Cloud infrastructure.\",\n                    \"fa_icon\": \"fas fa-folder-open\"\n                }\n            }\n        }\n    },\n    \"allOf\": [\n        {\n            \"$ref\": \"#/$defs/input_output_options\"\n        }\n    ]\n}\n

    Output:

    N E X T F L O W  ~  version 23.04.1\nLaunching `pipeline/main.nf` [infallible_turing] DSL2 - revision: 8bf4c8d053\n\nTypical pipeline command:\n\n  nextflow run my_pipeline --input input_file.csv\n\nInput/output options\n  --input  [string]  Path to comma-separated file containing information about the samples in the experiment.\n  --outdir [string]  The output directory where the results will be saved. You have to use absolute paths to storage on Cloud infrastructure.\n\n------------------------------------------------------\n

    Warning

    We shouldn't be using exit as it kills the Nextflow head job in a way that is difficult to handle by systems that may be running it externally, but at the time of writing there is no good alternative. See nextflow-io/nextflow#3984.

    "},{"location":"parameters/summary_log/","title":"Summary log","text":""},{"location":"parameters/summary_log/#paramssummarylog","title":"paramsSummaryLog()","text":"

    This function returns a string that can be logged to the terminal, summarizing the parameters provided to the pipeline.

    Note

    The summary prioritizes displaying only the parameters that are different from the default schema values. Parameters which don't have a default in the JSON Schema and which have a value of null, \"\", false or 'false' won't be returned in the map. This is to streamline the extensive parameter lists often associated with pipelines and to highlight the customized elements. This feature is essential for users to verify their configurations, like checking for typos or confirming proper resolution, without wading through an array of default settings.

    The function takes two arguments: the workflow object and, optionally, the file name of a Nextflow Schema file (default: nextflow_schema.json).

    Typical usage:

    main.nfnextflow.confignextflow_schema.json
    include { paramsSummaryLog } from 'plugin/nf-schema'\n\nlog.info paramsSummaryLog(workflow)\n
    plugins {\n  id 'nf-schema@2.0.0'\n}\n\nparams {\n  input = \"samplesheet.csv\"\n  outdir = \"results\"\n}\n
    {\n    \"$schema\": \"https://json-schema.org/draft/2020-12/schema\",\n    \"$id\": \"https://raw.githubusercontent.com/nf-core/testpipeline/master/nextflow_schema.json\",\n    \"title\": \"nf-core/testpipeline pipeline parameters\",\n    \"description\": \"this is a test\",\n    \"type\": \"object\",\n    \"$defs\": {\n        \"input_output_options\": {\n            \"title\": \"Input/output options\",\n            \"type\": \"object\",\n            \"fa_icon\": \"fas fa-terminal\",\n            \"description\": \"Define where the pipeline should find input data and save output data.\",\n            \"required\": [\"input\", \"outdir\"],\n            \"properties\": {\n                \"input\": {\n                    \"type\": \"string\",\n                    \"format\": \"file-path\",\n                    \"mimetype\": \"text/csv\",\n                    \"schema\": \"assets/schema_input.json\",\n                    \"pattern\": \"^\\\\S+\\\\.(csv|tsv|yaml|json)$\",\n                    \"description\": \"Path to comma-separated file containing information about the samples in the experiment.\",\n                    \"help_text\": \"You will need to create a design file with information about the samples in your experiment before running the pipeline. Use this parameter to specify its location. It has to be a comma-separated file with 3 columns, and a header row. See [usage docs](https://nf-co.re/testpipeline/usage#samplesheet-input).\",\n                    \"fa_icon\": \"fas fa-file-csv\"\n                },\n                \"outdir\": {\n                    \"type\": \"string\",\n                    \"format\": \"directory-path\",\n                    \"description\": \"The output directory where the results will be saved. 
You have to use absolute paths to storage on Cloud infrastructure.\",\n                    \"fa_icon\": \"fas fa-folder-open\"\n                }\n            }\n        }\n    },\n    \"allOf\": [\n        {\n            \"$ref\": \"#/$defs/input_output_options\"\n        }\n    ]\n}\n

    Output:

    N E X T F L O W  ~  version 23.04.1\nLaunching `pipeline/main.nf` [sleepy_goldberg] DSL2 - revision: 7a280216f3\n\nCore Nextflow options\n  runName    : sleepy_goldberg\n  launchDir  : /Users/demo/GitHub/nextflow-io/nf-schema/examples/paramsSummaryLog\n  workDir    : /Users/demo/GitHub/nextflow-io/nf-schema/examples/paramsSummaryLog/work\n  projectDir : /Users/demo/GitHub/nextflow-io/nf-schema/examples/paramsSummaryLog/pipeline\n  userName   : demo\n  profile    : standard\n  configFiles:\n\nInput/output options\n  input      : samplesheet.csv\n  outdir     : results\n\n!! Only displaying parameters that differ from the pipeline defaults !!\n------------------------------------------------------\n
    "},{"location":"parameters/summary_log/#coloured-logs","title":"Coloured logs","text":"

    By default, the summary output is coloured using ANSI escape codes.

    If you prefer, you can disable these by using the argument monochrome_logs, e.g. paramsSummaryLog(workflow, monochrome_logs: true). Alternatively this can be set at a global level via the parameter --monochrome_logs or by adding params.monochrome_logs = true to a configuration file. The camelCase forms --monochromeLogs and params.monochromeLogs are also supported.

    Default (coloured)Monochrome logs

    "},{"location":"parameters/summary_log/#paramssummarymap","title":"paramsSummaryMap()","text":"

    This function returns a Groovy Map summarizing parameters/workflow options used by the pipeline. As above, it only returns the provided parameters that are different from the default values.

    This function takes the same arguments as paramsSummaryLog(): the workflow object and an optional schema file path.

    Note

    Parameters which don't have a default in the JSON Schema and which have a value of null, \"\", false or 'false' won't be returned in the map.

    Typical usage:

    main.nfnextflow.confignextflow_schema.json
    include { paramsSummaryMap } from 'plugin/nf-schema'\n\nprintln paramsSummaryMap(workflow)\n
    plugins {\n  id 'nf-schema@2.0.0'\n}\n\nparams {\n  input = \"samplesheet.csv\"\n  outdir = \"results\"\n}\n
    {\n    \"$schema\": \"https://json-schema.org/draft/2020-12/schema\",\n    \"$id\": \"https://raw.githubusercontent.com/nf-core/testpipeline/master/nextflow_schema.json\",\n    \"title\": \"nf-core/testpipeline pipeline parameters\",\n    \"description\": \"this is a test\",\n    \"type\": \"object\",\n    \"$defs\": {\n        \"input_output_options\": {\n            \"title\": \"Input/output options\",\n            \"type\": \"object\",\n            \"fa_icon\": \"fas fa-terminal\",\n            \"description\": \"Define where the pipeline should find input data and save output data.\",\n            \"required\": [\"input\", \"outdir\"],\n            \"properties\": {\n                \"input\": {\n                    \"type\": \"string\",\n                    \"format\": \"file-path\",\n                    \"mimetype\": \"text/csv\",\n                    \"schema\": \"assets/schema_input.json\",\n                    \"pattern\": \"^\\\\S+\\\\.(csv|tsv|yaml|json)$\",\n                    \"description\": \"Path to comma-separated file containing information about the samples in the experiment.\",\n                    \"help_text\": \"You will need to create a design file with information about the samples in your experiment before running the pipeline. Use this parameter to specify its location. It has to be a comma-separated file with 3 columns, and a header row. See [usage docs](https://nf-co.re/testpipeline/usage#samplesheet-input).\",\n                    \"fa_icon\": \"fas fa-file-csv\"\n                },\n                \"outdir\": {\n                    \"type\": \"string\",\n                    \"format\": \"directory-path\",\n                    \"description\": \"The output directory where the results will be saved. 
You have to use absolute paths to storage on Cloud infrastructure.\",\n                    \"fa_icon\": \"fas fa-folder-open\"\n                }\n            }\n        }\n    },\n    \"allOf\": [\n        {\n            \"$ref\": \"#/$defs/input_output_options\"\n        }\n    ]\n}\n

    Output:

    N E X T F L O W  ~  version 23.04.1\nLaunching `pipeline/main.nf` [happy_lamport] DSL2 - revision: c45338cd96\n\n[Core Nextflow options:[runName:happy_lamport, launchDir:/Users/ewels/GitHub/nextflow-io/nf-schema/examples/paramsSummaryMap, workDir:/Users/ewels/GitHub/nextflow-io/nf-schema/examples/paramsSummaryMap/work, projectDir:/Users/ewels/GitHub/nextflow-io/nf-schema/examples/paramsSummaryMap/pipeline, userName:ewels, profile:standard, configFiles:], Input/output options:[input:samplesheet.csv, outdir:results]]\n
    "},{"location":"parameters/validation/","title":"Validation of pipeline parameters","text":""},{"location":"parameters/validation/#validateparameters","title":"validateParameters()","text":"

    This function takes all pipeline parameters and checks that they adhere to the specifications defined in the JSON Schema.

    The function takes two optional arguments: the file name of a Nextflow Schema file (parameters_schema, default: nextflow_schema.json) and a flag to disable coloured output (monochrome_logs).

    You can provide the parameters as follows:

    validateParameters(parameters_schema: 'custom_nextflow_parameters.json', monochrome_logs: true)\n

    Monochrome logs can also be set globally providing the parameter --monochrome_logs or adding params.monochrome_logs = true to a configuration file. The form --monochromeLogs is also supported.

    Tip

    As much of the Nextflow ecosystem assumes the nextflow_schema.json filename, it's recommended to stick with the default, if possible.

    See the Schema specification for information about what validation data you can encode within the schema for each parameter.

    "},{"location":"parameters/validation/#example","title":"Example","text":"

    The example below has a deliberate typo in params.input (.txt instead of .csv). The validation function catches this for two reasons: the value does not match the expected pattern, and the file does not exist.

    The function causes Nextflow to exit immediately with an error.

    Outputmain.nfnextflow.confignextflow_schema.json
    N E X T F L O W  ~  version 23.04.1\nLaunching `pipeline/main.nf` [amazing_crick] DSL2 - revision: 53bd9eac20\n\nERROR ~ Validation of pipeline parameters failed!\n\n -- Check '.nextflow.log' file for details\nThe following invalid input values have been detected:\n\n* --input (samplesheet.txt): \"samplesheet.txt\" does not match regular expression [^\\S+\\.(csv|tsv|yml|yaml)$]\n* --input (samplesheet.txt): the file or directory 'samplesheet.txt' does not exist\n
    include { validateParameters } from 'plugin/nf-schema'\n\nvalidateParameters()\n
    plugins {\n  id 'nf-schema@2.0.0'\n}\n\nparams {\n  input = \"samplesheet.txt\"\n  outdir = \"results\"\n}\n
    {\n    \"$schema\": \"https://json-schema.org/draft/2020-12/schema\",\n    \"$id\": \"https://raw.githubusercontent.com/nf-core/testpipeline/master/nextflow_schema.json\",\n    \"title\": \"nf-core/testpipeline pipeline parameters\",\n    \"description\": \"this is a test\",\n    \"type\": \"object\",\n    \"$defs\": {\n        \"input_output_options\": {\n            \"title\": \"Input/output options\",\n            \"type\": \"object\",\n            \"fa_icon\": \"fas fa-terminal\",\n            \"description\": \"Define where the pipeline should find input data and save output data.\",\n            \"required\": [\"input\", \"outdir\"],\n            \"properties\": {\n                \"input\": {\n                    \"type\": \"string\",\n                    \"format\": \"file-path\",\n                    \"mimetype\": \"text/csv\",\n                    \"schema\": \"assets/schema_input.json\",\n                    \"pattern\": \"^\\\\S+\\\\.(csv|tsv|yaml|json)$\",\n                    \"exists\": true,\n                    \"description\": \"Path to comma-separated file containing information about the samples in the experiment.\",\n                    \"help_text\": \"You will need to create a design file with information about the samples in your experiment before running the pipeline. Use this parameter to specify its location. It has to be a comma-separated file with 3 columns, and a header row. See [usage docs](https://nf-co.re/testpipeline/usage#samplesheet-input).\",\n                    \"fa_icon\": \"fas fa-file-csv\"\n                },\n                \"outdir\": {\n                    \"type\": \"string\",\n                    \"format\": \"directory-path\",\n                    \"description\": \"The output directory where the results will be saved. 
You have to use absolute paths to storage on Cloud infrastructure.\",\n                    \"fa_icon\": \"fas fa-folder-open\"\n                }\n            }\n        }\n    },\n    \"allOf\": [\n        {\n            \"$ref\": \"#/$defs/input_output_options\"\n        }\n    ]\n}\n
    "},{"location":"parameters/validation/#failing-for-unrecognized-parameters","title":"Failing for unrecognized parameters","text":"

    When parameters which are not specified in the JSON Schema are provided, the parameter validation function returns a WARNING. This is because user-specific institutional configuration profiles may make use of params that are unknown to the pipeline.

    The downside of this is that warnings about typos in parameters can go unnoticed.

    To force the pipeline execution to fail with an error instead, you can provide the validation.failUnrecognisedParams = true configuration option:

    Default Fail unrecognised params Outputnextflow.configmain.nf
    N E X T F L O W  ~  version 23.04.1\nLaunching `pipeline/main.nf` [jovial_linnaeus] DSL2 - revision: 53bd9eac20\n\nWARN: The following invalid input values have been detected:\n\n* --foo: bar\n\nHello World!\n
    plugins {\n  id 'nf-schema@2.0.0'\n}\n\nparams {\n  input = \"samplesheet.csv\"\n  outdir = \"results\"\n  foo = \"bar\"\n}\n
    include { validateParameters } from 'plugin/nf-schema'\n\nvalidateParameters()\n\nprintln \"Hello World!\"\n
    Outputnextflow.configmain.nf
    N E X T F L O W  ~  version 23.04.1\nLaunching `pipeline/main.nf` [pedantic_descartes] DSL2 - revision: 53bd9eac20\n\nERROR ~ ERROR: Validation of pipeline parameters failed!\n\n -- Check '.nextflow.log' file for details\nThe following invalid input values have been detected:\n\n* --foo: bar\n
    plugins {\n  id 'nf-schema@2.0.0'\n}\n\nvalidation.failUnrecognisedParams = true\n\nparams {\n  input = \"samplesheet.csv\"\n  outdir = \"results\"\n  foo = \"bar\"\n}\n
    include { validateParameters } from 'plugin/nf-schema'\n\nvalidateParameters()\n\nprintln \"Hello World!\"\n
    "},{"location":"parameters/validation/#ignoring-unrecognized-parameters","title":"Ignoring unrecognized parameters","text":"

    Sometimes, a parameter that you want to set may not be described in the pipeline schema for a good reason. Maybe it's something you're using in your Nextflow configuration setup for your compute environment, or it's a complex parameter that cannot be handled in the schema, such as nested parameters.

    In these cases, to avoid getting warnings when an unrecognised parameter is set, you can use the validation.ignoreParams configuration option (pipeline developers can use validation.defaultIgnoreParams for parameters that should always be ignored).

    These options take a list of strings that correspond to parameter names.
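    A minimal configuration sketch, assuming a placeholder parameter name; validation.ignoreParams and validation.defaultIgnoreParams are the config options listed in the migration guide table:

    ```groovy
    // nextflow.config
    validation {
        // parameters the pipeline user wants to set without triggering warnings;
        // "my_institutional_param" is a hypothetical example
        ignoreParams = ["my_institutional_param"]
    }
    ```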

    "},{"location":"parameters/validation/#variable-type-checking","title":"Variable type checking","text":"

    By default, validateParameters() is strict about expecting parameters to adhere to their expected type. If the schema says that params.foo should be an integer and the user sets params.foo = \"12\" (a string with a number), it will fail.

    If this causes problems, the user can run validation in \"lenient mode\", whereby the JSON Schema validation tries to cast parameters to their correct type. For example, providing an integer as a string will no longer fail validation.

    Note

    The validation does not affect the parameter variable types in your pipeline. It attempts to cast a temporary copy of the params only, during the validation step.

    To enable lenient validation mode, set validation.lenientMode = true in your configuration file.

    "},{"location":"samplesheets/examples/","title":"Sample sheet channel manipulation examples","text":""},{"location":"samplesheets/examples/#introduction","title":"Introduction","text":"

    Understanding channel structure and manipulation is critical for getting the most out of Nextflow. nf-schema helps initialise your channels from the text inputs to get you started, but further work might be required to fit your exact use case. In this page we run through some common cases for transforming the output of samplesheetToList().

    "},{"location":"samplesheets/examples/#glossary","title":"Glossary","text":""},{"location":"samplesheets/examples/#default-mode","title":"Default mode","text":"

    Each item in the list emitted by samplesheetToList() is a tuple corresponding to a row of the sample sheet. Each item is composed of a meta value (if present) followed by any additional elements from the columns of the sample sheet, e.g.:

    sample,fastq_1,fastq_2,bed\nsample1,fastq1.R1.fq.gz,fastq1.R2.fq.gz,sample1.bed\nsample2,fastq2.R1.fq.gz,fastq2.R2.fq.gz,\n

    Might create a list where each element consists of 4 items, a map value followed by three files:

    // Columns:\n[ val([ id: sample ]), file(fastq_1), file(fastq_2), file(bed) ]\n\n// Resulting in:\n[ [ id: \"sample1\" ], fastq1.R1.fq.gz, fastq1.R2.fq.gz, sample1.bed ]\n[ [ id: \"sample2\" ], fastq2.R1.fq.gz, fastq2.R2.fq.gz, [] ] // A missing value from the sample sheet is an empty list\n

    This list can be converted to a channel that can be used as input of a process where the input declaration is:

    tuple val(meta), path(fastq_1), path(fastq_2), path(bed)\n

    It may be necessary to manipulate this channel to fit your process inputs. For more documentation, check out the Nextflow operator docs; however, here are some common use cases for samplesheetToList().

    "},{"location":"samplesheets/examples/#using-a-sample-sheet-with-no-headers","title":"Using a sample sheet with no headers","text":"

    Sometimes you only have one possible input in the pipeline sample sheet. In this case it doesn't make sense to have a header in the sample sheet. This can be supported by removing the properties section from the schema and changing the type of the items from object to the desired type:

    {\n  \"$schema\": \"https://json-schema.org/draft/2020-12/schema\",\n  \"description\": \"Schema for the file provided with params.input\",\n  \"type\": \"array\",\n  \"items\": {\n    \"type\": \"string\"\n  }\n}\n

    When using samplesheets like this CSV file:

    test_1\ntest_2\n

    or this YAML file:

    - test_1\n- test_2\n

    The output of samplesheetToList() will look like this:

    test_1\ntest_2\n
    "},{"location":"samplesheets/examples/#changing-the-structure-of-channel-items","title":"Changing the structure of channel items","text":"

    Each item in the list will be a tuple, but some processes expect multiple files as a list within a single input channel element; this is common in nf-core modules. For example, consider the following input declaration in a process, where fastq could be > 1 file:

    process ZCAT_FASTQS {\n    input:\n        tuple val(meta), path(fastq)\n\n    \"\"\"\n    zcat $fastq\n    \"\"\"\n}\n

    The output of samplesheetToList() (converted to a channel) can be used by default with a process with the following input declaration:

    tuple val(meta), path(fastq_1), path(fastq_2)\n

    To manipulate each item within a channel, you should use the Nextflow .map() operator. This will apply a function to each element of the channel in turn. Here, we convert the flat tuple into a tuple composed of a meta and a list of FASTQ files:

    Channel.fromList(samplesheetToList(params.input, \"path/to/json/schema\"))\n    .map { meta, fastq_1, fastq_2 -> tuple(meta, [ fastq_1, fastq_2 ]) }\n    .set { input }\n\ninput.view() // Channel has 2 elements: meta, fastqs\n

    This is now compatible with the process defined above and will not raise a warning about input cardinality:

    ZCAT_FASTQS(input)\n
    "},{"location":"samplesheets/examples/#removing-elements-in-channel-items","title":"Removing elements in channel items","text":"

    For example, to remove the BED file from the channel created above, we simply do not return it from the map. Note the absence of the bed item in the return value of the closure below:

    Channel.fromList(samplesheetToList(params.input, \"path/to/json/schema\"))\n    .map { meta, fastq_1, fastq_2, bed -> tuple(meta, fastq_1, fastq_2) }\n    .set { input }\n\ninput.view() // Channel has 3 elements: meta, fastq_1, fastq_2\n

    In this way you can drop fields from the items of a channel.

    "},{"location":"samplesheets/examples/#separating-channel-items","title":"Separating channel items","text":"

    We could apply .map() twice to create one channel containing the FASTQs and another containing the BED files, but Nextflow has a native operator to separate channels: .multiMap(). Here, we separate the FASTQs and BEDs into two channels in a single operation. Note that both channels are contained in input and accessed as attributes using dot notation:

    Channel.fromList(samplesheetToList(params.input, \"path/to/json/schema\"))\n    .multiMap { meta, fastq_1, fastq_2, bed ->\n        fastq: tuple(meta, fastq_1, fastq_2)\n        bed:   tuple(meta, bed)\n    }\n    .set { input }\n

    The channel has two attributes, fastq and bed, which can be accessed separately.

    input.fastq.view() // Channel has 3 elements: meta, fastq_1, fastq_2\ninput.bed.view()   // Channel has 2 elements: meta, bed\n

    Importantly, multiMap applies to every item in the channel and emits an item to every output channel for each input, i.e. input.fastq and input.bed always contain the same number of items, although the items themselves differ.

    "},{"location":"samplesheets/examples/#separate-items-based-on-a-condition","title":"Separate items based on a condition","text":"

    You can use the .branch() operator to separate the channel entries based on a condition. This is especially useful when you can get multiple types of input data.

    This example shows a channel which can have entries for WES or WGS data. WES data includes a BED file denoting the target regions, but WGS data does not. These analyses are different, so we want to separate the WES and WGS entries from each other. We can separate the two using .branch() based on the presence of the BED file:

    // Channel with four elements - see docs for examples\nparams.input = \"samplesheet.csv\"\n\nChannel.fromList(samplesheetToList(params.input, \"path/to/json/schema\"))\n    .branch { meta, fastq_1, fastq_2, bed ->\n        // If BED does not exist\n        WGS: !bed\n            return [meta, fastq_1, fastq_2]\n        // If BED exists\n        WES: bed\n            // The original channel structure will be used when no return statement is used.\n    }\n    .set { input }\n\ninput.WGS.view() // Channel has 3 elements: meta, fastq_1, fastq_2\ninput.WES.view() // Channel has 4 elements: meta, fastq_1, fastq_2, bed\n

    Unlike .multiMap(), the output channels of .branch() can contain different numbers of items.

    "},{"location":"samplesheets/examples/#combining-a-channel","title":"Combining a channel","text":"

    After splitting the channel, it may be necessary to rejoin it. There are many ways to combine channels, but here we demonstrate the simplest, using the Nextflow join operator to rejoin any of the channels from above based on the first element of each item, the meta value.

    input.fastq.view() // Channel has 3 elements: meta, fastq_1, fastq_2\ninput.bed.view()   // Channel has 2 elements: meta, bed\n\ninput.fastq\n    .join( input.bed )\n    .set { input_joined }\n\ninput_joined.view() // Channel has 4 elements: meta, fastq_1, fastq_2, bed\n
    "},{"location":"samplesheets/examples/#count-items-with-a-common-value","title":"Count items with a common value","text":"

    This example is based on this code from Marcel Ribeiro-Dantas.

    It's useful to determine the number of channel entries that share a common value when you want to merge them later on, e.g. so that .groupTuple() can be given a group size and does not become a pipeline bottleneck.

    This example contains a channel where multiple samples can be in the same family. Later on in the pipeline we want to merge the analyzed files so one file gets created for each family. The result will be a channel with an extra meta field containing the count of channel entries with the same family name.

    // channel created with samplesheetToList() prior to modification:\n// [[id:example1, family:family1], example1.txt]\n// [[id:example2, family:family1], example2.txt]\n// [[id:example3, family:family2], example3.txt]\n\nparams.input = \"samplesheet.csv\"\n\nChannel.fromList(samplesheetToList(params.input, \"path/to/json/schema\"))\n    .tap { ch_raw }                       // Create a copy of the original channel\n    .map { meta, txt -> [ meta.family ] } // Isolate the value to count on\n    .reduce([:]) { counts, family ->      // Creates a map like this: [family1:2, family2:1]\n        counts[family] = (counts[family] ?: 0) + 1\n        counts\n    }\n    .combine(ch_raw)                     // Add the count map to the original channel\n    .map { counts, meta, txt ->          // Add the counts of the current family to the meta\n        def new_meta = meta + [count:counts[meta.family]]\n        [ new_meta, txt ]\n    }\n    .set { input }\n\ninput.view()\n// [[id:example1, family:family1, count:2], example1.txt]\n// [[id:example2, family:family1, count:2], example2.txt]\n// [[id:example3, family:family2, count:1], example3.txt]\n
    "},{"location":"samplesheets/samplesheetToList/","title":"Create a list from a sample sheet","text":""},{"location":"samplesheets/samplesheetToList/#samplesheettolist","title":"samplesheetToList()","text":"

    This function validates and converts a sample sheet to a Groovy list. This is done using information encoded within a sample sheet schema (see the docs).

    The function has two required arguments:

    1. The path to the samplesheet
    2. The path to the JSON schema file corresponding to the samplesheet.

    These can be either strings with the relative path (from the root of the pipeline directory) or file objects.

    samplesheetToList(\"path/to/samplesheet\", \"path/to/json/schema\")\n

    Note

    All data points in CSV and TSV sample sheets will be converted to their inferred type (e.g. \"true\" will be converted to the Boolean true and \"2\" will be converted to the Integer 2). If this is not the expected behaviour, you can convert these values back to a String, e.g. with .map { val -> val.toString() }.
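    As a sketch of the .map conversion mentioned in the note above (the column names sample and count are hypothetical, as is the schema path):

    ```groovy
    include { samplesheetToList } from 'plugin/nf-schema'

    Channel.fromList(samplesheetToList(params.input, "assets/schema_input.json"))
        // count was read from the CSV as an Integer; keep it as a String instead
        .map { sample, count -> tuple(sample, count.toString()) }
        .view()
    ```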

    This function can be used together with existing channel factories/operators to create one channel entry per samplesheet entry.

    "},{"location":"samplesheets/samplesheetToList/#use-as-a-channel-factory","title":"Use as a channel factory","text":"

    The function can be used with the .fromList channel factory to generate a queue channel:

    Channel.fromList(samplesheetToList(\"path/to/samplesheet\", \"path/to/json/schema\"))\n

    Note

    This mimics the fromSamplesheet channel factory found in the previous nf-validation plugin.

    "},{"location":"samplesheets/samplesheetToList/#use-as-a-channel-operator","title":"Use as a channel operator","text":"

    Alternatively, the function can be used with the .flatMap channel operator to create a channel from samplesheet paths that are already in a channel:

    Channel.of(\"path/to/samplesheet\").flatMap { samplesheetToList(it, \"path/to/json/schema\") }\n
    "},{"location":"samplesheets/samplesheetToList/#basic-example","title":"Basic example","text":"

    In this example, we create a simple channel from a CSV sample sheet.

    N E X T F L O W  ~  version 23.04.0\nLaunching `pipeline/main.nf` [distraught_marconi] DSL2 - revision: 74f697a0d9\n[mysample1, input1_R1.fq.gz, input1_R2.fq.gz, forward]\n[mysample2, input2_R1.fq.gz, input2_R2.fq.gz, forward]\n
    main.nfsamplesheet.csvnextflow.configassets/schema_input.json
    include { samplesheetToList } from 'plugin/nf-schema'\n\nch_input = Channel.fromList(samplesheetToList(params.input, \"assets/schema_input.json\"))\n\nch_input.view()\n
    sample,fastq_1,fastq_2,strandedness\nmysample1,input1_R1.fq.gz,input1_R2.fq.gz,forward\nmysample2,input2_R1.fq.gz,input2_R2.fq.gz,forward\n
    plugins {\n  id 'nf-schema@2.0.0'\n}\n\nparams {\n  input = \"samplesheet.csv\"\n  output = \"results\"\n}\n
    {\n  \"$schema\": \"https://json-schema.org/draft/2020-12/schema\",\n  \"$id\": \"https://raw.githubusercontent.com/nf-schema/example/master/assets/schema_input.json\",\n  \"title\": \"nf-schema example - params.input schema\",\n  \"description\": \"Schema for the file provided with params.input\",\n  \"type\": \"array\",\n  \"items\": {\n    \"type\": \"object\",\n    \"properties\": {\n      \"sample\": {\n        \"type\": \"string\",\n        \"pattern\": \"^\\\\S+$\",\n        \"errorMessage\": \"Sample name must be provided and cannot contain spaces\"\n      },\n      \"fastq_1\": {\n        \"type\": \"string\",\n        \"pattern\": \"^\\\\S+\\\\.f(ast)?q\\\\.gz$\",\n        \"errorMessage\": \"FastQ file for reads 1 must be provided, cannot contain spaces and must have extension '.fq.gz' or '.fastq.gz'\"\n      },\n      \"fastq_2\": {\n        \"errorMessage\": \"FastQ file for reads 2 cannot contain spaces and must have extension '.fq.gz' or '.fastq.gz'\",\n        \"type\": \"string\",\n        \"pattern\": \"^\\\\S+\\\\.f(ast)?q\\\\.gz$\"\n      },\n      \"strandedness\": {\n        \"type\": \"string\",\n        \"errorMessage\": \"Strandedness must be provided and be one of 'forward', 'reverse' or 'unstranded'\",\n        \"enum\": [\"forward\", \"reverse\", \"unstranded\"]\n      }\n    },\n    \"required\": [\"sample\", \"fastq_1\", \"strandedness\"]\n  }\n}\n
    "},{"location":"samplesheets/samplesheetToList/#order-of-fields","title":"Order of fields","text":"

    This example demonstrates that the order of columns in the sample sheet file has no effect.

    Danger

    It is the order of fields in the sample sheet JSON schema which defines the order of items in the channel returned by samplesheetToList(), not the order of fields in the sample sheet file.

    N E X T F L O W  ~  version 23.04.0\nLaunching `pipeline/main.nf` [elated_kowalevski] DSL2 - revision: 74f697a0d9\n[forward, mysample1, input1_R2.fq.gz, input1_R1.fq.gz]\n[forward, mysample2, input2_R2.fq.gz, input2_R1.fq.gz]\n
    samplesheet.csvassets/schema_input.jsonmain.nfnextflow.config
    sample,fastq_1,fastq_2,strandedness\nmysample1,input1_R1.fq.gz,input1_R2.fq.gz,forward\nmysample2,input2_R1.fq.gz,input2_R2.fq.gz,forward\n
    {\n  \"$schema\": \"https://json-schema.org/draft/2020-12/schema\",\n  \"$id\": \"https://raw.githubusercontent.com/nf-schema/example/master/assets/schema_input.json\",\n  \"title\": \"nf-schema example - params.input schema\",\n  \"description\": \"Schema for the file provided with params.input\",\n  \"type\": \"array\",\n  \"items\": {\n    \"type\": \"object\",\n    \"properties\": {\n      \"strandedness\": {\n        \"type\": \"string\",\n        \"errorMessage\": \"Strandedness must be provided and be one of 'forward', 'reverse' or 'unstranded'\",\n        \"enum\": [\"forward\", \"reverse\", \"unstranded\"]\n      },\n      \"sample\": {\n        \"type\": \"string\",\n        \"pattern\": \"^\\\\S+$\",\n        \"errorMessage\": \"Sample name must be provided and cannot contain spaces\"\n      },\n      \"fastq_2\": {\n        \"errorMessage\": \"FastQ file for reads 2 cannot contain spaces and must have extension '.fq.gz' or '.fastq.gz'\",\n        \"type\": \"string\",\n        \"pattern\": \"^\\\\S+\\\\.f(ast)?q\\\\.gz$\"\n      },\n      \"fastq_1\": {\n        \"type\": \"string\",\n        \"pattern\": \"^\\\\S+\\\\.f(ast)?q\\\\.gz$\",\n        \"errorMessage\": \"FastQ file for reads 1 must be provided, cannot contain spaces and must have extension '.fq.gz' or '.fastq.gz'\"\n      }\n    },\n    \"required\": [\"sample\", \"fastq_1\", \"strandedness\"]\n  }\n}\n
    include { samplesheetToList } from 'plugin/nf-schema'\n\nch_input = Channel.fromList(samplesheetToList(params.input, \"assets/schema_input.json\"))\n\nch_input.view()\n
    plugins {\n  id 'nf-schema@2.0.0'\n}\n\nparams {\n  input = \"samplesheet.csv\"\n  output = \"results\"\n}\n
    "},{"location":"samplesheets/samplesheetToList/#channel-with-meta-map","title":"Channel with meta map","text":"

    In this example, we use the schema to mark two columns as meta fields. This returns a channel with a meta map.

    N E X T F L O W  ~  version 23.04.0\nLaunching `pipeline/main.nf` [romantic_kare] DSL2 - revision: 74f697a0d9\n[[my_sample_id:mysample1, my_strandedness:forward], input1_R1.fq.gz, input1_R2.fq.gz]\n[[my_sample_id:mysample2, my_strandedness:forward], input2_R1.fq.gz, input2_R2.fq.gz]\n
    assets/schema_input.jsonmain.nfsamplesheet.csvnextflow.config
    {\n  \"$schema\": \"https://json-schema.org/draft/2020-12/schema\",\n  \"$id\": \"https://raw.githubusercontent.com/nf-schema/example/master/assets/schema_input.json\",\n  \"title\": \"nf-schema example - params.input schema\",\n  \"description\": \"Schema for the file provided with params.input\",\n  \"type\": \"array\",\n  \"items\": {\n    \"type\": \"object\",\n    \"properties\": {\n      \"sample\": {\n        \"type\": \"string\",\n        \"pattern\": \"^\\\\S+$\",\n        \"errorMessage\": \"Sample name must be provided and cannot contain spaces\",\n        \"meta\": [\"my_sample_id\"]\n      },\n      \"fastq_1\": {\n        \"type\": \"string\",\n        \"pattern\": \"^\\\\S+\\\\.f(ast)?q\\\\.gz$\",\n        \"errorMessage\": \"FastQ file for reads 1 must be provided, cannot contain spaces and must have extension '.fq.gz' or '.fastq.gz'\"\n      },\n      \"fastq_2\": {\n        \"errorMessage\": \"FastQ file for reads 2 cannot contain spaces and must have extension '.fq.gz' or '.fastq.gz'\",\n        \"type\": \"string\",\n        \"pattern\": \"^\\\\S+\\\\.f(ast)?q\\\\.gz$\"\n      },\n      \"strandedness\": {\n        \"type\": \"string\",\n        \"errorMessage\": \"Strandedness must be provided and be one of 'forward', 'reverse' or 'unstranded'\",\n        \"enum\": [\"forward\", \"reverse\", \"unstranded\"],\n        \"meta\": [\"my_strandedness\"]\n      }\n    },\n    \"required\": [\"sample\", \"fastq_1\", \"strandedness\"]\n  }\n}\n
    include { samplesheetToList } from 'plugin/nf-schema'\n\nch_input = Channel.fromList(samplesheetToList(params.input, \"assets/schema_input.json\"))\n\nch_input.view()\n
    sample,fastq_1,fastq_2,strandedness\nmysample1,input1_R1.fq.gz,input1_R2.fq.gz,forward\nmysample2,input2_R1.fq.gz,input2_R2.fq.gz,forward\n
    plugins {\n  id 'nf-schema@2.0.0'\n}\n\nparams {\n  input = \"samplesheet.csv\"\n  output = \"results\"\n}\n
    "},{"location":"samplesheets/validate_sample_sheet/","title":"Validate a sample sheet file contents","text":"

    When a parameter provides the schema field, the validateParameters() function will automatically parse and validate the provided file contents using this JSON schema. It can validate CSV, TSV, JSON and YAML files.

    The path of the schema file must be relative to the root of the pipeline directory. See an example in the input field from the example schema.json.

    {\n  \"properties\": {\n    \"input\": {\n      \"type\": \"string\",\n      \"format\": \"file-path\",\n      \"pattern\": \"^\\\\S+\\\\.csv$\",\n      \"schema\": \"src/testResources/samplesheet_schema.json\",\n      \"description\": \"Path to comma-separated file containing information about the samples in the experiment.\"\n    }\n  }\n}\n

    Note

    The samplesheetToList() function also validates files before converting them. If you convert the sample sheet with this function, it is not necessary to add a schema to the parameter corresponding to the sample sheet.

    For more information about the sample sheet JSON schema refer to sample sheet docs.

    "}]} \ No newline at end of file +{"config":{"lang":["en"],"separator":"[\\s\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"","title":"nf-schema","text":"

    A Nextflow plugin to work with validation of pipeline parameters and sample sheets.

    Info

    nf-schema is the new version of the now deprecated nf-validation. Please follow the migration guide to migrate your code to this new version.

    "},{"location":"#introduction","title":"Introduction","text":"

    This Nextflow plugin provides functionality that can be used in a Nextflow pipeline to work with parameter and sample sheet schemas. The added functionality is:

    Supported sample sheet formats are CSV, TSV, JSON and YAML.

    "},{"location":"#quick-start","title":"Quick Start","text":"

    Declare the plugin in your Nextflow pipeline configuration file:

    nextflow.config
    plugins {\n  id 'nf-schema@2.1.0'\n}\n

    This is all that is needed - Nextflow will automatically fetch the plugin code at run time.

    [!NOTE] The snippet above will always try to install the specified version. We encourage always pinning the plugin version to make sure the pipeline keeps working when a new version of nf-schema with breaking changes is released.

    You can now include the plugin helper functions into your Nextflow pipeline:

    main.nf
    include { validateParameters; paramsSummaryLog; samplesheetToList } from 'plugin/nf-schema'\n\n// Validate input parameters\nvalidateParameters()\n\n// Print summary of supplied parameters\nlog.info paramsSummaryLog(workflow)\n\n// Create a new channel of metadata from a sample sheet passed to the pipeline through the --input parameter\nch_input = Channel.fromList(samplesheetToList(params.input, \"assets/schema_input.json\"))\n

    Or enable the creation of the help message (using --help) in the configuration file:

    nextflow.config
    validation {\n  help {\n    enabled = true\n  }\n}\n
    "},{"location":"#dependencies","title":"Dependencies","text":""},{"location":"#slack-channel","title":"Slack channel","text":"

    There is a dedicated nf-schema Slack channel in the Nextflow Slack workspace.

    "},{"location":"#credits","title":"Credits","text":"

    This plugin was written based on code initially written within the nf-core community, as part of the nf-core pipeline template.

    We would like to thank the key contributors who include (but are not limited to):

    "},{"location":"background/","title":"Background","text":"

    The Nextflow workflow manager is a powerful tool for scientific workflows. In order for end users to launch a given workflow with different input data and varying settings, pipelines are developed using a special variable type called parameters (params). Defaults are hardcoded into scripts and config files but can be overwritten by user config files and command-line flags (see the Nextflow docs).

    In addition to config params, a common best-practice for pipelines is to use a \"sample sheet\" file containing required input information. For example: a sample identifier, filenames and other sample-level metadata.

    Nextflow itself does not provide functionality to validate config parameters or parsed sample sheets. To bridge this gap, we developed code within the nf-core community to allow pipelines to work with a standard nextflow_schema.json file, written using the JSON Schema format. The file allows strict typing of parameter variables and inclusion of validation rules.

    The nf-schema plugin moves this code out of the nf-core template into a stand-alone package, to make it easier to use for the wider Nextflow community. It also incorporates a number of new features, such as native Groovy sample sheet validation.

    Earlier versions of the plugin can be found in the nf-validation repository and can still be used in pipelines. However, the nf-validation plugin is no longer supported and all development has been moved to nf-schema.

    "},{"location":"migration_guide/","title":"Migration guide","text":"

    Warning

    nf-schema is currently not supported by the nf-core tooling. Using this plugin will break the linting and the schema builder. See these issues for the progress on the nf-core migration to nf-schema:

    1. https://github.com/nf-core/tools/issues/2932
    2. https://github.com/nf-core/tools/issues/2784
    3. https://github.com/nf-core/tools/issues/2429

    This guide is intended to help you migrate your pipeline from nf-validation to nf-schema.

    "},{"location":"migration_guide/#major-changes-in-the-plugin","title":"Major changes in the plugin","text":"

    The following list shows the major breaking changes introduced in nf-schema:

    1. The JSON schema draft has been updated from draft-07 to draft-2020-12. See JSON Schema draft 2020-12 release notes and JSON schema draft 2019-09 release notes for more information.
    2. The fromSamplesheet channel factory has been converted to a function called samplesheetToList. See updating fromSamplesheet for more information.
    3. The unique keyword for sample sheet schemas has been removed. Please use uniqueItems or uniqueEntries instead.
    4. The dependentRequired keyword now works as it's supposed to work in JSON schema. See dependentRequired for more information.
    5. All configuration parameters have been converted to Nextflow configuration options. See Updating configuration for more information.
    6. Help messages are now created automatically instead of using the paramsHelp() function. (v2.1.0 feature)

    A full list of changes can be found in the changelog.

    "},{"location":"migration_guide/#updating-your-pipeline","title":"Updating your pipeline","text":"

    Updating your pipeline can be done in a couple of simple steps.

    "},{"location":"migration_guide/#updating-the-name-and-version-of-the-plugin","title":"Updating the name and version of the plugin","text":"

    The name and the version of the plugin should be updated from nf-validation to nf-schema@2.0.0:

    nf-validationnf-schema
    plugins {\n    id 'nf-validation@1.1.3'\n}\n
    plugins {\n    id 'nf-schema@2.0.0'\n}\n

    Additionally, all includes from nf-validation should be updated to nf-schema. This can easily be done with the following command:

    find . -type f -name \"*.nf\" -exec sed -i -e \"s/from 'plugin\\/nf-validation'/from 'plugin\\/nf-schema'/g\" -e 's/from \"plugin\\/nf-validation\"/from \"plugin\\/nf-schema\"/g' {} +\n
    "},{"location":"migration_guide/#updating-the-json-schema-files","title":"Updating the JSON schema files","text":"

    If you aren't using any special features in your schemas, you can simply update your nextflow_schema.json file using the following command:

    sed -i -e 's/http:\\/\\/json-schema.org\\/draft-07\\/schema/https:\\/\\/json-schema.org\\/draft\\/2020-12\\/schema/g' -e 's/definitions/$defs/g' nextflow_schema.json\n

    This will replace the old schema draft specification (draft-07) with the new one (2020-12), and the old definitions keyword with the new $defs notation.

    Note

    Repeat this command for every JSON schema used in your pipeline, e.g. for the default sample sheet schema in nf-core pipelines: sed -i -e 's/http:\\/\\/json-schema.org\\/draft-07\\/schema/https:\\/\\/json-schema.org\\/draft\\/2020-12\\/schema/g' -e 's/definitions/$defs/g' assets/schema_input.json

    Warning

    This will not update special fields in the schema; see the guide for special JSON schema keywords on how to update these.

    "},{"location":"migration_guide/#update-the-samplesheet-conversion","title":"Update the samplesheet conversion","text":"

    The .fromSamplesheet channel factory should be converted to the samplesheetToList function. The following tabs show how to use the function to get the same effect as the channel factory:

    nf-validationnf-schema
    include { fromSamplesheet } from 'plugin/nf-validation'\nChannel.fromSamplesheet(\"input\")\n
    include { samplesheetToList } from 'plugin/nf-schema'\nChannel.fromList(samplesheetToList(params.input, \"path/to/samplesheet/schema\"))\n

    Note

    This change was necessary to make it possible for pipelines to be used as pluggable workflows. This also enables the validation and conversion of files generated by the pipeline.

    "},{"location":"migration_guide/#updating-configuration","title":"Updating configuration","text":"

    The configuration parameters have been converted to Nextflow configuration options. You can now access these options using the validation config scope:

    validation.<option> = <value>\n

    OR

    validation {\n    <option1> = <value1>\n    <option2> = <value2>\n}\n

    See this table for an overview of what the new configuration options are for the old parameters:

    | Old parameter | New config option(s) |
    | --- | --- |
    | params.validationMonochromeLogs = <boolean> | validation.monochromeLogs = <boolean> |
    | params.validationLenientMode = <boolean> | validation.lenientMode = <boolean> |
    | params.validationFailUnrecognisedParams = <boolean> | validation.failUnrecognisedParams = <boolean> |
    | params.validationShowHiddenParams = <boolean> | validation.showHiddenParams = <boolean> |
    | params.validationIgnoreParams = <string> | validation.defaultIgnoreParams = <list> and validation.ignoreParams = <list> |

    Note

    defaultIgnoreParams is meant to be used by pipeline developers to set the parameters which should always be ignored. ignoreParams is meant for the pipeline user to ignore certain parameters.
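    A minimal sketch of the two scopes in use, assuming placeholder parameter names:

    ```groovy
    // In the pipeline's nextflow.config (set by the pipeline developer):
    validation.defaultIgnoreParams = ["internal_pipeline_param"]

    // In a user's custom config (set by the pipeline user):
    validation.ignoreParams = ["my_cluster_param"]
    ```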

    "},{"location":"migration_guide/#updating-special-keywords-in-json-schemas","title":"Updating special keywords in JSON schemas","text":"

    If you are using any special features in your schemas, you will need to update your schemas manually. Please refer to the JSON Schema draft 2020-12 release notes and JSON schema draft 2019-09 release notes for more information.

    However here are some guides to the more common migration patterns:

    "},{"location":"migration_guide/#updating-unique-keyword","title":"Updating unique keyword","text":"

    When you use unique in your schemas, you should update it to use uniqueItems or uniqueEntries instead.

    If you used the unique:true field, you should update it to use uniqueItems like this:

    nf-validationnf-schema
    {\n    \"$schema\": \"http://json-schema.org/draft-07/schema\",\n    \"type\": \"array\",\n    \"items\": {\n        \"type\": \"object\",\n        \"properties\": {\n            \"sample\": {\n                \"type\": \"string\",\n                \"unique\": true\n            }\n        }\n    }\n}\n
    {\n    \"$schema\": \"https://json-schema.org/draft/2020-12/schema\",\n    \"type\": \"array\",\n    \"items\": {\n        \"type\": \"object\",\n        \"properties\": {\n            \"sample\": {\n                \"type\": \"string\"\n            }\n        }\n    },\n    \"uniqueItems\": true\n}\n

    If you used the unique: [\"field1\", \"field2\"] field, you should update it to use uniqueEntries like this:

    nf-validation / nf-schema
    {\n    \"$schema\": \"http://json-schema.org/draft-07/schema\",\n    \"type\": \"array\",\n    \"items\": {\n        \"type\": \"object\",\n        \"properties\": {\n            \"sample\": {\n                \"type\": \"string\",\n                \"unique\": [\"sample\"]\n            }\n        }\n    }\n}\n
    {\n    \"$schema\": \"https://json-schema.org/draft/2020-12/schema\",\n    \"type\": \"array\",\n    \"items\": {\n        \"type\": \"object\",\n        \"properties\": {\n            \"sample\": {\n                \"type\": \"string\"\n            }\n        }\n    },\n    \"uniqueEntries\": [\"sample\"]\n}\n
    "},{"location":"migration_guide/#updating-dependentrequired-keyword","title":"Updating dependentRequired keyword","text":"

    When you use dependentRequired in your schemas, you should update it like this:

    nf-validation / nf-schema
    {\n    \"$schema\": \"http://json-schema.org/draft-07/schema\",\n    \"type\": \"object\",\n    \"properties\": {\n        \"fastq_1\": {\n            \"type\": \"string\",\n            \"format\": \"file-path\"\n        },\n        \"fastq_2\": {\n            \"type\": \"string\",\n            \"format\": \"file-path\",\n            \"dependentRequired\": [\"fastq_1\"]\n        }\n    }\n}\n
    {\n    \"$schema\": \"https://json-schema.org/draft/2020-12/schema\",\n    \"type\": \"object\",\n    \"properties\": {\n        \"fastq_1\": {\n            \"type\": \"string\",\n            \"format\": \"file-path\"\n        },\n        \"fastq_2\": {\n            \"type\": \"string\",\n            \"format\": \"file-path\"\n        }\n    },\n    \"dependentRequired\": {\n        \"fastq_2\": [\"fastq_1\"]\n    }\n}\n
    "},{"location":"migration_guide/#updating-the-help-message","title":"Updating the help message","text":"

    v2.1.0 feature

    The creation of the help message now needs to be enabled in the configuration file. Using --help or --helpFull will automatically print the help message and stop the pipeline execution. paramsHelp() is still available in nf-schema and can still be used as before, which can be helpful for printing the help message in specific cases. Note that this function now automatically emits a deprecation warning. This warning can be disabled using the hideWarning:true option of the function.

    nf-validation / nf-schema

    main.nf
    if (params.help) {\n    log.info paramsHelp(\"nextflow run my_pipeline --input input_file.csv\")\n    exit 0\n}\n
    nextflow.config
    validation {\n    help {\n        enabled = true\n        command = \"nextflow run my_pipeline --input input_file.csv\"\n    }\n}\n
    "},{"location":"configuration/configuration/","title":"Configuration","text":"

    The plugin can be configured using several configuration options. These options have to be set in the validation scope, which means you can write them in two ways:

    validation.<option> = <value>\n

    OR

    validation {\n    <option1> = <value1>\n    <option2> = <value2>\n}\n
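    For example, a pipeline could combine several of the options documented below in one scope block (the values here are purely illustrative):

    ```groovy
    // Illustrative configuration only -- pick the options your pipeline needs.
    // Each option is documented in its own section below.
    validation {
        monochromeLogs = false                  // keep colored logs
        lenientMode    = true                   // allow castable types, e.g. "12" for an integer
        ignoreParams   = ["custom_param"]       // hypothetical parameter name, skip its validation
    }
    ```
    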
    "},{"location":"configuration/configuration/#parametersschema","title":"parametersSchema","text":"

    This option can be used to set the parameters JSON schema to be used by the plugin. This will affect parameter validation (validateParameters()), the summary logs (paramsSummaryLog() and paramsSummaryMap()) and the creation of the help messages.

    validation.parametersSchema = \"path/to/schema.json\" // default: \"nextflow_schema.json\"\n

    This option can either be a path relative to the root of the pipeline directory or a full path to the JSON schema. (Avoid hardcoded local paths, to ensure your pipeline keeps working on other systems.)

    "},{"location":"configuration/configuration/#monochromelogs","title":"monochromeLogs","text":"

    This option can be used to turn off the colored logs from nf-schema. This can be useful if you run a Nextflow pipeline in an environment that doesn't support colored logging.

    validation.monochromeLogs = <true|false> // default: false\n
    "},{"location":"configuration/configuration/#lenientmode","title":"lenientMode","text":"

    This option can be used to make the type validation more lenient. Normally a value of \"12\" will fail validation if the expected type is an integer. In lenient mode it will succeed, since that string can be cast to an integer.

    validation.lenientMode = <true|false> // default: false\n
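    As a sketch (the parameter name here is hypothetical): given the schema fragment below, passing --threads \"8\" fails strict validation because the value is a string, but passes in lenient mode because the string can be cast to an integer.

    ```json
    {
        "type": "object",
        "properties": {
            "threads": {
                "type": "integer",
                "description": "Hypothetical example parameter"
            }
        }
    }
    ```
    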
    "},{"location":"configuration/configuration/#failunrecognisedparams","title":"failUnrecognisedParams","text":"

    By default the validateParameters() function will only emit a warning when an unrecognised parameter is given. This usually indicates that a typo has been made, which can easily be overlooked when the plugin only emits a warning. You can turn this warning into an error with the failUnrecognisedParams option.

    validation.failUnrecognisedParams = <true|false> // default: false\n
    "},{"location":"configuration/configuration/#showhiddenparams","title":"showHiddenParams","text":"

    Deprecated

    This configuration option has been deprecated since v2.1.0. Please use validation.help.showHidden instead.

    By default, parameters that have the \"hidden\": true annotation in the JSON schema are not shown in the help message. Turning on this option makes sure the hidden parameters are also shown.

    validation.showHiddenParams = <true|false> // default: false\n
    "},{"location":"configuration/configuration/#ignoreparams","title":"ignoreParams","text":"

    This option can be used to turn off the validation for certain parameters. It takes a list of parameter names as input.

    validation.ignoreParams = [\"param1\", \"param2\"] // default: []\n
    "},{"location":"configuration/configuration/#defaultignoreparams","title":"defaultIgnoreParams","text":"

    Warning

    This option should only be used by pipeline developers

    This option does exactly the same as validation.ignoreParams, but provides pipeline developers with a way to set the parameters that should be ignored by default. This way pipeline users don't have to re-specify the default ignored parameters when using the validation.ignoreParams option.

    validation.defaultIgnoreParams = [\"param1\", \"param2\"] // default: []\n
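    The two options complement each other: the developer ships defaults, and a user can ignore additional parameters without repeating those defaults (the parameter names below are hypothetical):

    ```groovy
    // In the pipeline's own nextflow.config (set by the developer):
    validation.defaultIgnoreParams = ["igenomes_base"]

    // In a user's config -- adds to, rather than replaces, the developer defaults:
    validation.ignoreParams = ["my_cluster_param"]
    ```
    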
    "},{"location":"configuration/configuration/#help","title":"help","text":"

    The validation.help config scope can be used to configure the creation of the help message.

    This scope contains the following options:

    "},{"location":"configuration/configuration/#enabled","title":"enabled","text":"

    This option is used to enable the creation of the help message when the help parameters are used in the nextflow run command.

    validation.help.enabled = true // default: false\n
    "},{"location":"configuration/configuration/#shortparameter","title":"shortParameter","text":"

    This option can be used to change the --help parameter to another parameter. This parameter will print out the help message with all top level parameters.

    validation.help.shortParameter = \"giveMeHelp\" // default: \"help\"\n

    --giveMeHelp will now display the help message instead of --help in this example.

    "},{"location":"configuration/configuration/#fullparameter","title":"fullParameter","text":"

    This option can be used to change the --helpFull parameter to another parameter.

    validation.help.fullParameter = \"giveMeHelpFull\" // default: \"helpFull\"\n

    --giveMeHelpFull will now display the expanded help message instead of --helpFull for this example.

    "},{"location":"configuration/configuration/#showhiddenparameter","title":"showHiddenParameter","text":"

    This option can be used to change the --showHidden parameter to another parameter. This parameter tells the plugin to also include the hidden parameters into the help message.

    validation.help.showHiddenParameter = \"showMeThoseHiddenParams\" // default: \"showHidden\"\n

    --showMeThoseHiddenParams will now make sure hidden parameters are shown instead of --showHidden in this example.

    "},{"location":"configuration/configuration/#showhidden","title":"showHidden","text":"

    By default, parameters that have the \"hidden\": true annotation in the JSON schema are not shown in the help message. Turning on this option makes sure the hidden parameters are also shown.

    validation.help.showHidden = <true|false> // default: false\n
    "},{"location":"configuration/configuration/#beforetext","title":"beforeText","text":"

    This option does not affect the help message created by the paramsHelp() function

    Any string provided to this option will be printed before the help message.

    validation.help.beforeText = \"Running pipeline version 1.0\" // default: \"\"\n

    Info

    All color values (like \\033[0;31m, the ANSI escape code for red) will be filtered out when validation.monochromeLogs is set to true.
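    Conceptually, the filtering amounts to stripping ANSI escape sequences from the text before it is printed. A minimal sketch of that idea in shell (this is an illustration, not the plugin's actual implementation):

    ```shell
    # Strip ANSI color sequences such as \033[0;31m (red) from a log line.
    ESC=$(printf '\033')                       # the ESC control character
    colored=$(printf '%s[0;31mERROR%s[0m: something went wrong' "$ESC" "$ESC")
    plain=$(printf '%s' "$colored" | sed "s/${ESC}\[[0-9;]*m//g")
    printf '%s\n' "$plain"                     # ERROR: something went wrong
    ```
    
    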

    "},{"location":"configuration/configuration/#command","title":"command","text":"

    This option does not affect the help message created by the paramsHelp() function

    This option can be used to add an example command to the help message. This will be printed after the beforeText and before the help message.

    validation.help.command = \"nextflow run main.nf --input samplesheet.csv --outdir output\" // default: \"\"\n

    This example will print the following message:

    Typical pipeline command:\n\n  nextflow run main.nf --input samplesheet.csv --outdir output\n

    Info

    All color values (like \\033[0;31m, the ANSI escape code for red) will be filtered out when validation.monochromeLogs is set to true.

    "},{"location":"configuration/configuration/#aftertext","title":"afterText","text":"

    This option does not affect the help message created by the paramsHelp() function

    Any string provided to this option will be printed after the help message.

    validation.help.afterText = \"Please cite the pipeline owners when using this pipeline\" // default: \"\"\n

    Info

    All color values (like \\033[0;31m, the ANSI escape code for red) will be filtered out when validation.monochromeLogs is set to true.
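    Putting the help options above together, a typical configuration could look like this (the texts shown are illustrative):

    ```groovy
    // Illustrative validation.help configuration combining the options above
    validation {
        help {
            enabled    = true
            command    = "nextflow run main.nf --input samplesheet.csv --outdir output"
            beforeText = "Running pipeline version 1.0"
            afterText  = "Please cite the pipeline owners when using this pipeline"
            showHidden = false
        }
    }
    ```
    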

    "},{"location":"configuration/configuration/#summary","title":"Summary","text":"

    The validation.summary config scope can be used to configure the output of the paramsSummaryLog() function.

    This scope contains the following options:

    "},{"location":"configuration/configuration/#beforetext_1","title":"beforeText","text":"

    Any string provided to this option will be printed before the parameters log message.

    validation.summary.beforeText = \"Running pipeline version 1.0\" // default: \"\"\n

    Info

    All color values (like \\033[0;31m, the ANSI escape code for red) will be filtered out when validation.monochromeLogs is set to true.

    "},{"location":"configuration/configuration/#aftertext_1","title":"afterText","text":"

    Any string provided to this option will be printed after the parameters log message.

    validation.summary.afterText = \"Please cite the pipeline owners when using this pipeline\" // default: \"\"\n

    Info

    All color values (like \\033[0;31m, the ANSI escape code for red) will be filtered out when validation.monochromeLogs is set to true.

    "},{"location":"contributing/setup/","title":"Getting started with plugin development","text":""},{"location":"contributing/setup/#compiling","title":"Compiling","text":"

    To compile and run the tests use the following command:

    ./gradlew check\n
    "},{"location":"contributing/setup/#launch-it-with-installed-nextflow","title":"Launch it with installed Nextflow","text":"

    Warning

    This method will add the development version of the plugin to your Nextflow plugins. Take care when using this method and make sure that you are never using a development version to run real pipelines. You can delete all nf-schema versions using this command:

    rm -rf ~/.nextflow/plugins/nf-schema*\n

    make install\n
    nextflow.config
    plugins {\n    id 'nf-schema@x.y.z'\n}\n
    "},{"location":"contributing/setup/#launch-it-with-a-local-version-of-nextflow","title":"Launch it with a local version of Nextflow","text":"
    cd .. && git clone https://github.com/nextflow-io/nextflow\ncd nextflow && ./gradlew exportClasspath\n
    includeBuild('../nextflow')\n
    ./gradlew compileGroovy\n
    ./launch.sh run -plugins nf-schema <script/pipeline name> [pipeline params]\n
    "},{"location":"contributing/setup/#change-and-preview-the-docs","title":"Change and preview the docs","text":"

    The docs are generated using Material for MkDocs. You can install the required packages as follows:

    pip install mkdocs-material pymdown-extensions pillow cairosvg\n

    To change the docs, edit the files in the docs/ folder and run the following command to generate the docs:

    mkdocs serve\n

    To preview the docs, open the URL provided by mkdocs in your browser.

    "},{"location":"nextflow_schema/","title":"Nextflow schema for parameters","text":"

    The functionality of the nf-schema plugin centres on a pipeline schema file. By convention, this file is stored in the workflow root directory and called nextflow_schema.json.

    "},{"location":"nextflow_schema/#what-it-does","title":"What it does","text":"

    The schema file provides a place to describe the pipeline configuration. It is based on the JSON Schema format standard.

    In brief, it includes information for each parameter about:

    ...and more. See the full specification for details.

    Warning

    Although it's based on JSON Schema, there are some differences. We use a few non-standard keys and impose one or two limitations that are not present in the standard specification.

    Tip

    It's highly recommended that you don't try to write the schema JSON file manually. Instead, use the provided tooling - see Creating schema for details.

    "},{"location":"nextflow_schema/#how-its-used","title":"How it's used","text":"

    The nextflow_schema.json file and format have been in use for a few years now and are widely used in the community. Some specific examples of usage are:

    "},{"location":"nextflow_schema/#looking-to-the-future","title":"Looking to the future","text":"

    The pipeline schema has been developed to provide additional functionality not present in core Nextflow. It's our hope that at some point this functionality will be added to core Nextflow, making schema files redundant.

    See the GitHub issue Evolution of Nextflow configuration file (nextflow-io/nextflow#2723) on the Nextflow repo for discussion about potential new configuration file formats, which could include the kind of information that we currently keep within the schema.

    "},{"location":"nextflow_schema/create_schema/","title":"Creating schema files","text":"

    Warning

    It's highly recommended that you don't try to write the schema JSON file manually!

    The schema files can get large and complex and are difficult to debug. Don't be tempted to open them in your code editor; instead, use the provided tools!

    "},{"location":"nextflow_schema/create_schema/#requirements","title":"Requirements","text":"

    To work with Nextflow schema files, you need the nf-core command-line tools package. You can find full installation instructions in the nf-core documentation, but in brief, you can install it as you would any other Python package:

    pip install nf-core\n# -- OR -- #\nconda install nf-core # (1)!\n
    1. Note: Needs bioconda channels to be configured! See the Bioconda usage docs.

    Info

    Although these tools are currently within the nf-core tooling ecosystem, they should work with any Nextflow pipeline: you don't have to be using the nf-core template for this.

    Note

    We aim to extract this functionality into stand-alone tools at a future date, as we have done with the pipeline validation code in this plugin.

    "},{"location":"nextflow_schema/create_schema/#build-a-pipeline-schema","title":"Build a pipeline schema","text":"

    Once you have nf-core/tools installed and have written your pipeline configuration, go to the pipeline root and run the following:

    nf-core schema build\n

    Warning

    The current version of nf-core tools (v2.13.1) does not support the new schema draft used in nf-schema. Running this command after building the schema will convert the schema to the right draft:

    sed -i -e 's/http:\\/\\/json-schema.org\\/draft-07\\/schema/https:\\/\\/json-schema.org\\/draft\\/2020-12\\/schema/g' -e 's/definitions/$defs/g' nextflow_schema.json\n
    A new version of the nf-core schema builder will be available soon. Keep an eye out!

    The tool will run the nextflow config command to extract your pipeline's configuration and compare the output to your nextflow_schema.json file (if it exists). It will prompt you to update the schema file with any changes, then it will ask if you wish to edit the schema using the web interface.

    This web interface is where you should add detail to your schema, customising the various fields for each parameter.

    Tip

    You can run the nf-core schema build command again and again, as many times as you like. It's designed both for initial creation and for future updates of the schema file.

    It's a good idea to \"save little and often\" by clicking Finished and saving your work locally, then running the command again to continue.

    "},{"location":"nextflow_schema/create_schema/#build-a-sample-sheet-schema","title":"Build a sample sheet schema","text":"

    Danger

    There is currently no tooling to help you write sample sheet schema

    You can find an example in Example sample sheet schema

    Watch this space...

    "},{"location":"nextflow_schema/nextflow_schema_examples/","title":"Example Nextflow schema","text":"

    You can see an example JSON Schema for a Nextflow pipeline nextflow_schema.json file below.

    This file was generated from the nf-core pipeline template, using nf-core create. It is used as a test fixture in the nf-schema package here.

    Note

    More examples can be found in the plugin testResources directory.

    {\n    \"$schema\": \"https://json-schema.org/draft/2020-12/schema\",\n    \"$id\": \"https://raw.githubusercontent.com/nf-core/testpipeline/master/nextflow_schema.json\",\n    \"title\": \"nf-core/testpipeline pipeline parameters\",\n    \"description\": \"this is a test\",\n    \"type\": \"object\",\n    \"$defs\": {\n        \"input_output_options\": {\n            \"title\": \"Input/output options\",\n            \"type\": \"object\",\n            \"fa_icon\": \"fas fa-terminal\",\n            \"description\": \"Define where the pipeline should find input data and save output data.\",\n            \"required\": [\"input\", \"outdir\"],\n            \"properties\": {\n                \"input\": {\n                    \"type\": \"string\",\n                    \"format\": \"file-path\",\n                    \"mimetype\": \"text/csv\",\n                    \"pattern\": \"^\\\\S+\\\\.(csv|tsv|yaml|json)$\",\n                    \"description\": \"Path to comma-separated file containing information about the samples in the experiment.\",\n                    \"help_text\": \"You will need to create a design file with information about the samples in your experiment before running the pipeline. Use this parameter to specify its location. It has to be a comma-separated file with 3 columns, and a header row. See [usage docs](https://nf-co.re/testpipeline/usage#samplesheet-input).\",\n                    \"fa_icon\": \"fas fa-file-csv\"\n                },\n                \"outdir\": {\n                    \"type\": \"string\",\n                    \"format\": \"directory-path\",\n                    \"description\": \"The output directory where the results will be saved. 
You have to use absolute paths to storage on Cloud infrastructure.\",\n                    \"fa_icon\": \"fas fa-folder-open\"\n                },\n                \"email\": {\n                    \"type\": \"string\",\n                    \"description\": \"Email address for completion summary.\",\n                    \"fa_icon\": \"fas fa-envelope\",\n                    \"help_text\": \"Set this parameter to your e-mail address to get a summary e-mail with details of the run sent to you when the workflow exits. If set in your user config file (`~/.nextflow/config`) then you don't need to specify this on the command line for every run.\",\n                    \"pattern\": \"^([a-zA-Z0-9_\\\\-\\\\.]+)@([a-zA-Z0-9_\\\\-\\\\.]+)\\\\.([a-zA-Z]{2,5})$\"\n                },\n                \"multiqc_title\": {\n                    \"type\": \"string\",\n                    \"description\": \"MultiQC report title. Printed as page header, used for filename if not otherwise specified.\",\n                    \"fa_icon\": \"fas fa-file-signature\"\n                }\n            }\n        },\n        \"reference_genome_options\": {\n            \"title\": \"Reference genome options\",\n            \"type\": \"object\",\n            \"fa_icon\": \"fas fa-dna\",\n            \"description\": \"Reference genome related files and options required for the workflow.\",\n            \"properties\": {\n                \"genome\": {\n                    \"type\": \"string\",\n                    \"description\": \"Name of iGenomes reference.\",\n                    \"fa_icon\": \"fas fa-book\",\n                    \"help_text\": \"If using a reference genome configured in the pipeline using iGenomes, use this parameter to give the ID for the reference. This is then used to build the full paths for all required reference genome files e.g. `--genome GRCh38`. 
\\n\\nSee the [nf-core website docs](https://nf-co.re/usage/reference_genomes) for more details.\"\n                },\n                \"fasta\": {\n                    \"type\": \"string\",\n                    \"format\": \"file-path\",\n                    \"mimetype\": \"text/plain\",\n                    \"pattern\": \"^\\\\S+\\\\.fn?a(sta)?(\\\\.gz)?$\",\n                    \"description\": \"Path to FASTA genome file.\",\n                    \"help_text\": \"This parameter is *mandatory* if `--genome` is not specified. If you don't have a BWA index available this will be generated for you automatically. Combine with `--save_reference` to save BWA index for future runs.\",\n                    \"fa_icon\": \"far fa-file-code\"\n                },\n                \"igenomes_base\": {\n                    \"type\": \"string\",\n                    \"format\": \"directory-path\",\n                    \"description\": \"Directory / URL base for iGenomes references.\",\n                    \"default\": \"s3://ngi-igenomes/igenomes\",\n                    \"fa_icon\": \"fas fa-cloud-download-alt\",\n                    \"hidden\": true\n                },\n                \"igenomes_ignore\": {\n                    \"type\": \"boolean\",\n                    \"description\": \"Do not load the iGenomes reference config.\",\n                    \"fa_icon\": \"fas fa-ban\",\n                    \"hidden\": true,\n                    \"help_text\": \"Do not load `igenomes.config` when running the pipeline. You may choose this option if you observe clashes between custom parameters and those supplied in `igenomes.config`.\"\n                }\n            }\n        },\n        \"institutional_config_options\": {\n            \"title\": \"Institutional config options\",\n            \"type\": \"object\",\n            \"fa_icon\": \"fas fa-university\",\n            \"description\": \"Parameters used to describe centralised config profiles. 
These should not be edited.\",\n            \"help_text\": \"The centralised nf-core configuration profiles use a handful of pipeline parameters to describe themselves. This information is then printed to the Nextflow log when you run a pipeline. You should not need to change these values when you run a pipeline.\",\n            \"properties\": {\n                \"custom_config_version\": {\n                    \"type\": \"string\",\n                    \"description\": \"Git commit id for Institutional configs.\",\n                    \"default\": \"master\",\n                    \"hidden\": true,\n                    \"fa_icon\": \"fas fa-users-cog\"\n                },\n                \"custom_config_base\": {\n                    \"type\": \"string\",\n                    \"description\": \"Base directory for Institutional configs.\",\n                    \"default\": \"https://raw.githubusercontent.com/nf-core/configs/master\",\n                    \"hidden\": true,\n                    \"help_text\": \"If you're running offline, Nextflow will not be able to fetch the institutional config files from the internet. If you don't need them, then this is not a problem. 
If you do need them, you should download the files from the repo and tell Nextflow where to find them with this parameter.\",\n                    \"fa_icon\": \"fas fa-users-cog\"\n                },\n                \"config_profile_name\": {\n                    \"type\": \"string\",\n                    \"description\": \"Institutional config name.\",\n                    \"hidden\": true,\n                    \"fa_icon\": \"fas fa-users-cog\"\n                },\n                \"config_profile_description\": {\n                    \"type\": \"string\",\n                    \"description\": \"Institutional config description.\",\n                    \"hidden\": true,\n                    \"fa_icon\": \"fas fa-users-cog\"\n                },\n                \"config_profile_contact\": {\n                    \"type\": \"string\",\n                    \"description\": \"Institutional config contact information.\",\n                    \"hidden\": true,\n                    \"fa_icon\": \"fas fa-users-cog\"\n                },\n                \"config_profile_url\": {\n                    \"type\": \"string\",\n                    \"description\": \"Institutional config URL link.\",\n                    \"hidden\": true,\n                    \"fa_icon\": \"fas fa-users-cog\"\n                }\n            }\n        },\n        \"max_job_request_options\": {\n            \"title\": \"Max job request options\",\n            \"type\": \"object\",\n            \"fa_icon\": \"fab fa-acquisitions-incorporated\",\n            \"description\": \"Set the top limit for requested resources for any single job.\",\n            \"help_text\": \"If you are running on a smaller system, a pipeline step requesting more resources than are available may cause the Nextflow to stop the run with an error. 
These options allow you to cap the maximum resources requested by any single job so that the pipeline will run on your system.\\n\\nNote that you can not _increase_ the resources requested by any job using these options. For that you will need your own configuration file. See [the nf-core website](https://nf-co.re/usage/configuration) for details.\",\n            \"properties\": {\n                \"max_cpus\": {\n                    \"type\": \"integer\",\n                    \"description\": \"Maximum number of CPUs that can be requested for any single job.\",\n                    \"default\": 16,\n                    \"fa_icon\": \"fas fa-microchip\",\n                    \"hidden\": true,\n                    \"help_text\": \"Use to set an upper-limit for the CPU requirement for each process. Should be an integer e.g. `--max_cpus 1`\"\n                },\n                \"max_memory\": {\n                    \"type\": \"string\",\n                    \"description\": \"Maximum amount of memory that can be requested for any single job.\",\n                    \"default\": \"128.GB\",\n                    \"fa_icon\": \"fas fa-memory\",\n                    \"pattern\": \"^\\\\d+(\\\\.\\\\d+)?\\\\.?\\\\s*(K|M|G|T)?B$\",\n                    \"hidden\": true,\n                    \"help_text\": \"Use to set an upper-limit for the memory requirement for each process. Should be a string in the format integer-unit e.g. `--max_memory '8.GB'`\"\n                },\n                \"max_time\": {\n                    \"type\": \"string\",\n                    \"description\": \"Maximum amount of time that can be requested for any single job.\",\n                    \"default\": \"240.h\",\n                    \"fa_icon\": \"far fa-clock\",\n                    \"pattern\": \"^(\\\\d+\\\\.?\\\\s*(s|m|h|day)\\\\s*)+$\",\n                    \"hidden\": true,\n                    \"help_text\": \"Use to set an upper-limit for the time requirement for each process. 
Should be a string in the format integer-unit e.g. `--max_time '2.h'`\"\n                }\n            }\n        },\n        \"generic_options\": {\n            \"title\": \"Generic options\",\n            \"type\": \"object\",\n            \"fa_icon\": \"fas fa-file-import\",\n            \"description\": \"Less common options for the pipeline, typically set in a config file.\",\n            \"help_text\": \"These options are common to all nf-core pipelines and allow you to customise some of the core preferences for how the pipeline runs.\\n\\nTypically these options would be set in a Nextflow config file loaded for all pipeline runs, such as `~/.nextflow/config`.\",\n            \"properties\": {\n                \"help\": {\n                    \"type\": [\"string\", \"boolean\"],\n                    \"description\": \"Display help text.\",\n                    \"fa_icon\": \"fas fa-question-circle\",\n                    \"hidden\": true\n                },\n                \"publish_dir_mode\": {\n                    \"type\": \"string\",\n                    \"default\": \"copy\",\n                    \"description\": \"Method used to save pipeline results to output directory.\",\n                    \"help_text\": \"The Nextflow `publishDir` option specifies which intermediate files should be saved to the output directory. This option tells the pipeline what method should be used to move these files. 
See [Nextflow docs](https://www.nextflow.io/docs/latest/process.html#publishdir) for details.\",\n                    \"fa_icon\": \"fas fa-copy\",\n                    \"enum\": [\"symlink\", \"rellink\", \"link\", \"copy\", \"copyNoFollow\", \"move\"],\n                    \"hidden\": true\n                },\n                \"email_on_fail\": {\n                    \"type\": \"string\",\n                    \"description\": \"Email address for completion summary, only when pipeline fails.\",\n                    \"fa_icon\": \"fas fa-exclamation-triangle\",\n                    \"pattern\": \"^([a-zA-Z0-9_\\\\-\\\\.]+)@([a-zA-Z0-9_\\\\-\\\\.]+)\\\\.([a-zA-Z]{2,5})$\",\n                    \"help_text\": \"An email address to send a summary email to when the pipeline is completed - ONLY sent if the pipeline does not exit successfully.\",\n                    \"hidden\": true\n                },\n                \"plaintext_email\": {\n                    \"type\": \"boolean\",\n                    \"description\": \"Send plain-text email instead of HTML.\",\n                    \"fa_icon\": \"fas fa-remove-format\",\n                    \"hidden\": true\n                },\n                \"max_multiqc_email_size\": {\n                    \"type\": \"string\",\n                    \"description\": \"File size limit when attaching MultiQC reports to summary emails.\",\n                    \"pattern\": \"^\\\\d+(\\\\.\\\\d+)?\\\\.?\\\\s*(K|M|G|T)?B$\",\n                    \"default\": \"25.MB\",\n                    \"fa_icon\": \"fas fa-file-upload\",\n                    \"hidden\": true\n                },\n                \"monochrome_logs\": {\n                    \"type\": \"boolean\",\n                    \"description\": \"Do not use coloured log outputs.\",\n                    \"fa_icon\": \"fas fa-palette\",\n                    \"hidden\": true\n                },\n                \"multiqc_config\": {\n                    \"type\": \"string\",\n     
               \"description\": \"Custom config file to supply to MultiQC.\",\n                    \"fa_icon\": \"fas fa-cog\",\n                    \"hidden\": true\n                },\n                \"tracedir\": {\n                    \"type\": \"string\",\n                    \"description\": \"Directory to keep pipeline Nextflow logs and reports.\",\n                    \"default\": \"${params.outdir}/pipeline_info\",\n                    \"fa_icon\": \"fas fa-cogs\",\n                    \"hidden\": true\n                },\n                \"validate_params\": {\n                    \"type\": \"boolean\",\n                    \"description\": \"Boolean whether to validate parameters against the schema at runtime\",\n                    \"default\": true,\n                    \"fa_icon\": \"fas fa-check-square\",\n                    \"hidden\": true\n                },\n                \"validationShowHiddenParams\": {\n                    \"type\": \"boolean\",\n                    \"fa_icon\": \"far fa-eye-slash\",\n                    \"description\": \"Show all params when using `--help`\",\n                    \"hidden\": true,\n                    \"help_text\": \"By default, parameters set as _hidden_ in the schema are not shown on the command line when a user runs with `--help`. Specifying this option will tell the pipeline to show all parameters.\"\n                },\n                \"enable_conda\": {\n                    \"type\": \"boolean\",\n                    \"description\": \"Run this workflow with Conda. 
You can also use '-profile conda' instead of providing this parameter.\",\n                    \"hidden\": true,\n                    \"fa_icon\": \"fas fa-bacon\"\n                }\n            }\n        }\n    },\n    \"allOf\": [\n        {\n            \"$ref\": \"#/$defs/input_output_options\"\n        },\n        {\n            \"$ref\": \"#/$defs/reference_genome_options\"\n        },\n        {\n            \"$ref\": \"#/$defs/institutional_config_options\"\n        },\n        {\n            \"$ref\": \"#/$defs/max_job_request_options\"\n        },\n        {\n            \"$ref\": \"#/$defs/generic_options\"\n        }\n    ]\n}\n
    "},{"location":"nextflow_schema/nextflow_schema_specification/","title":"Nextflow schema specification","text":"

    The Nextflow schema file contains information about pipeline configuration parameters. The file is typically saved in the workflow root directory and called nextflow_schema.json.

    The Nextflow schema syntax is based on the JSON schema standard, with some key differences. You can find more information about JSON Schema here:

    Warning

    This file is a reference specification, not documentation about how to write a schema manually.

    Please see Creating schema files for instructions on how to create these files (and don't be tempted to do it manually in a code editor!)

    Note

    The nf-schema plugin, as well as several other interfaces using Nextflow schema, uses a stock JSON schema library for parameter validation. As such, any valid JSON schema should work for validation.

    However, please note that graphical UIs (docs, launch interfaces) are largely hand-written and may not expect JSON schema usage that is not described here. As such, it's safest to stick to the specification described here and not the core JSON schema spec.

    "},{"location":"nextflow_schema/nextflow_schema_specification/#definitions","title":"Definitions","text":"

    A slightly unusual aspect of the JSON schema standard as used in Nextflow schema is $defs.

    JSON schema can group variables together in an object, but then the validation expects this structure to exist in the data that it is validating. In reality, we have a very long \"flat\" list of parameters, all at the top level of params.foo.

    In order to give some structure to log outputs, documentation and so on, we group parameters into $defs. Each def is an object with a title, description and so on. However, as they are under $defs scope they are effectively ignored by the validation and so their nested nature is not a problem. We then bring the contents of each definition object back to the \"flat\" top level for validation using a series of allOf statements at the end of the schema, which reference the specific definition keys.

    {\n  \"$schema\": \"https://json-schema.org/draft/2020-12/schema\",\n  \"type\": \"object\",\n  // Definition groups\n  \"$defs\": { // (1)!\n    \"my_group_of_params\": { // (2)!\n      \"title\": \"A virtual grouping used for docs and pretty-printing\",\n      \"type\": \"object\",\n      \"required\": [\"foo\", \"bar\"], // (3)!\n      \"properties\": { // (4)!\n        \"foo\": { // (5)!\n          \"type\": \"string\"\n        },\n        \"bar\": {\n          \"type\": \"string\"\n        }\n      }\n    }\n  },\n  // Contents of each definition group brought into main schema for validation\n  \"allOf\": [\n    { \"$ref\": \"#/$defs/my_group_of_params\" } // (6)!\n  ]\n}\n
    1. An arbitrary number of definition groups can go in here - these are ignored by main schema validation.
    2. This ID is used later in the allOf block to reference the definition.
    3. Note that any required properties need to be listed within this object scope.
    4. Actual parameter specifications go in here.
    5. Shortened here for the example, see below for full parameter specification.
    6. A $ref line like this needs to be added for every definition group.

    Parameters can be described outside of the $defs scope, in the regular JSON Schema top-level properties scope. However, they will be displayed as ungrouped in tools working off the schema.
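The flattening described above can be sketched in a few lines. This is an illustrative Python sketch (not plugin code): it merges every group referenced in allOf back into one flat properties/required pair, which is effectively what a validator sees.

```python
# Hypothetical sketch of the $defs / allOf flattening; function name and
# structure are illustrative, not the nf-schema implementation.

def flatten_schema(schema):
    """Merge every group referenced in allOf into one flat
    properties dict and required list."""
    props, required = {}, []
    for ref in schema.get("allOf", []):
        group_key = ref["$ref"].split("/")[-1]  # "#/$defs/my_group" -> "my_group"
        group = schema["$defs"][group_key]
        props.update(group.get("properties", {}))
        required.extend(group.get("required", []))
    return props, required

schema = {
    "$defs": {
        "my_group_of_params": {
            "title": "A virtual grouping used for docs and pretty-printing",
            "type": "object",
            "required": ["foo", "bar"],
            "properties": {
                "foo": {"type": "string"},
                "bar": {"type": "string"},
            },
        }
    },
    "allOf": [{"$ref": "#/$defs/my_group_of_params"}],
}

props, required = flatten_schema(schema)
print(sorted(props))  # ['bar', 'foo']
print(required)       # ['foo', 'bar']
```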

    "},{"location":"nextflow_schema/nextflow_schema_specification/#nested-parameters","title":"Nested parameters","text":"

    New feature in v2.1.0

    Nextflow config allows parameters to be nested as objects, for example:

    params {\n    foo {\n        bar = \"baz\"\n    }\n}\n

    or on the CLI:

    nextflow run <pipeline> --foo.bar \"baz\"\n

    Nested parameters can be specified in the schema by adding a properties keyword to the root parameters:

    {\n  \"type\": \"object\",\n  \"properties\": {\n    \"thisIsNested\": {\n      // Annotation for the --thisIsNested parameter\n      \"type\": \"object\", // Parameters that contain subparameters need to have the \"object\" type\n      \"properties\": {\n        // Add other parameters in here\n        \"deep\": {\n          // Annotation for the --thisIsNested.deep parameter\n          \"type\": \"string\"\n        }\n      }\n    }\n  }\n}\n

    There is no limit to how deeply parameters can be nested. Bear in mind, however, that deeply nested parameters are not very user friendly and produce unwieldy help messages, so it's advised not to go deeper than two levels of nesting.
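The mapping from a dotted CLI flag to the nested params structure can be sketched as follows. This is an illustrative Python sketch under assumed behaviour, not the plugin's Groovy implementation.

```python
# Illustrative sketch: how a dotted CLI flag like `--foo.bar baz`
# maps onto a nested params structure. Names are hypothetical.

def set_nested(params, dotted_key, value):
    """Insert `value` at the position named by a dotted key."""
    keys = dotted_key.split(".")
    node = params
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    node[keys[-1]] = value
    return params

params = set_nested({}, "foo.bar", "baz")
print(params)  # {'foo': {'bar': 'baz'}}
```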

    "},{"location":"nextflow_schema/nextflow_schema_specification/#required-parameters","title":"Required parameters","text":"

    Any parameters that must be specified should be set as required in the schema.

    Tip

    Make sure you set null as the default value for the parameter; otherwise it will have a value even when not supplied by the pipeline user, and the required property will have no effect.

    This is not done with a property key like other things described below, but rather by naming the parameter in the required array in the definition object / top-level object.

    For more information, see the JSON schema documentation.

    {\n  \"type\": \"object\",\n  \"properties\": {\n    \"name\": { \"type\": \"string\" },\n    \"email\": { \"type\": \"string\" },\n    \"address\": { \"type\": \"string\" },\n    \"telephone\": { \"type\": \"string\" }\n  },\n  \"required\": [\"name\", \"email\"]\n}\n
    "},{"location":"nextflow_schema/nextflow_schema_specification/#parameter-name","title":"Parameter name","text":"

    The properties object key must correspond to the parameter variable name in the Nextflow config.

    For example, for params.foo, the schema should look like this:

    // ..\n\"type\": \"object\",\n\"properties\": {\n    \"foo\": {\n        \"type\": \"string\",\n        // ..\n    }\n}\n// ..\n
    "},{"location":"nextflow_schema/nextflow_schema_specification/#keys-for-all-parameters","title":"Keys for all parameters","text":""},{"location":"nextflow_schema/nextflow_schema_specification/#type","title":"type","text":"

    Variable type, taken from the JSON schema keyword vocabulary:

    Validation checks that the supplied parameter matches the expected type, and will fail with an error if not.

    This JSON schema type is not supported:

    "},{"location":"nextflow_schema/nextflow_schema_specification/#default","title":"default","text":"

    Default value for the parameter.

    Should match the type and validation patterns set for the parameter in other fields.

    Tip

    If no default should be set, completely omit this key from the schema. Do not set it as an empty string, or null.

    However, parameters with no defaults should be set to null within your Nextflow config file.

    Note

    When creating a schema using nf-core schema build, this field will be automatically created based on the default value defined in the pipeline config files.

    Generally speaking, the two should always be kept in sync to avoid unexpected problems and usage errors. In some rare cases, this may not be possible (for example, a dynamic groovy expression cannot be encoded in JSON), in which case try to specify as \"sensible\" a default within the schema as possible.

    "},{"location":"nextflow_schema/nextflow_schema_specification/#description","title":"description","text":"

    A short description of what the parameter does, written in markdown. Printed in docs and terminal help text. Should be maximum one short sentence.

    "},{"location":"nextflow_schema/nextflow_schema_specification/#help_text","title":"help_text","text":"

    Non-standard key

    A longer text with usage help for the parameter, written in markdown. Can include newlines with multiple paragraphs and more complex markdown structures.

    Typically hidden by default in documentation and interfaces, unless explicitly clicked / requested.

    "},{"location":"nextflow_schema/nextflow_schema_specification/#errormessage","title":"errorMessage","text":"

    Non-standard key

    If validation fails, an error message is printed to the terminal, so that the end user knows what to fix. However, these messages are not always very clear - especially to newcomers.

    To improve this experience, pipeline developers can set a custom errorMessage for a given parameter in the schema. If validation fails, this errorMessage is printed instead, and the raw JSON schema validation message goes to the Nextflow debug log output.

    For example, instead of printing:

    * --input (samples.yml): \"samples.yml\" does not match regular expression [^\\S+\\.csv$]\n

    We can set

    \"input\": {\n  \"type\": \"string\",\n  \"pattern\": \"^\\S+\\.csv$\",\n  \"errorMessage\": \"File name must end in '.csv' cannot contain spaces\"\n}\n

    and get:

    * --input (samples.yml): File name must end in '.csv' and cannot contain spaces\n
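The substitution can be sketched in Python. This is a minimal illustration of the behaviour described above (swap the raw regex failure for the developer-supplied message); the function name and message format are assumptions, not the plugin's internals.

```python
# Minimal sketch of errorMessage substitution; illustrative only.
import re

def validate_param(name, value, spec):
    """Return an error line if `value` fails the pattern, else None."""
    if not re.search(spec["pattern"], value):
        raw = f'"{value}" does not match regular expression [{spec["pattern"]}]'
        # The custom message is shown to the user; the raw one would go
        # to the debug log.
        shown = spec.get("errorMessage", raw)
        return f"* --{name} ({value}): {shown}"
    return None

spec = {
    "type": "string",
    "pattern": r"^\S+\.csv$",
    "errorMessage": "File name must end in '.csv' and cannot contain spaces",
}
print(validate_param("input", "samples.yml", spec))
```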
    "},{"location":"nextflow_schema/nextflow_schema_specification/#deprecated","title":"deprecated","text":"

    Extended key

    A boolean JSON flag that instructs anything using the schema that this parameter/field is deprecated and should not be used. This can be useful to generate messages telling the user that a parameter has changed between versions.

    JSON schema states that this is an informative key only, but in nf-schema this will cause a validation error if the parameter/field is used.

    Tip

    Using the errorMessage keyword can be useful to provide more information about the deprecation and what to use instead.

    "},{"location":"nextflow_schema/nextflow_schema_specification/#enum","title":"enum","text":"

    An array of enumerated values: the parameter must match one of these values exactly to pass validation.

    {\n  \"enum\": [\"red\", \"amber\", \"green\"]\n}\n
    "},{"location":"nextflow_schema/nextflow_schema_specification/#fa_icon","title":"fa_icon","text":"

    Non-standard key

    A text identifier corresponding to an icon from Font Awesome. Used for easier visual navigation of documentation and pipeline interfaces.

    Should be the font-awesome class names, for example:

    \"fa_icon\": \"fas fa-file-csv\"\n
    "},{"location":"nextflow_schema/nextflow_schema_specification/#hidden","title":"hidden","text":"

    Non-standard key

    A boolean JSON flag that instructs anything using the schema that this is an unimportant parameter.

    Typically used to keep the pipeline docs / UIs uncluttered with common parameters which are not used by the majority of users. For example, --plaintext_email and --monochrome_logs.

    \"hidden\": true\n
    "},{"location":"nextflow_schema/nextflow_schema_specification/#string-specific-keys","title":"String-specific keys","text":""},{"location":"nextflow_schema/nextflow_schema_specification/#pattern","title":"pattern","text":"

    Regular expression which the string must match in order to pass validation.

    For example, this pattern only validates if the supplied string ends in .fastq, .fq, .fastq.gz or .fq.gz:

    {\n  \"type\": \"string\",\n  \"pattern\": \".*.f(ast)?q(.gz)?$\"\n}\n
    "},{"location":"nextflow_schema/nextflow_schema_specification/#minlength-maxlength","title":"minLength, maxLength","text":"

    Specify a minimum / maximum string length with minLength and maxLength.

    {\n  \"type\": \"string\",\n  \"minLength\": 2,\n  \"maxLength\": 3\n}\n
    "},{"location":"nextflow_schema/nextflow_schema_specification/#format","title":"format","text":"

    Formats can be used to give additional validation checks against string values for certain properties.

    Non-standard key (values)

    The format key is a standard JSON schema key, however we primarily use it for validating file / directory path operations with non-standard schema values.

    Example usage is as follows:

    {\n  \"type\": \"string\",\n  \"format\": \"file-path\"\n}\n

    The available format types are below:

    file-path: States that the provided value is a file. Does not check its existence, but it does check that the path is not a directory.
    directory-path: States that the provided value is a directory. Does not check its existence, but if it exists, it does check that the path is not a file.
    path: States that the provided value is a path (file or directory). Does not check its existence.
    file-path-pattern: States that the provided value is a glob pattern that will be used to fetch files. Checks that the pattern is valid and that at least one file is found.
    "},{"location":"nextflow_schema/nextflow_schema_specification/#exists","title":"exists","text":"

    When a format is specified for a value, you can provide the key exists set to true in order to validate that the provided path exists. Set this to false to validate that the path does not exist.

    Example usage is as follows:

    {\n  \"type\": \"string\",\n  \"format\": \"file-path\",\n  \"exists\": true\n}\n

    Note

    If the parameter is an S3 URL path, this validation is ignored.
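A rough approximation of the `file-path` + `exists` semantics for local paths can be written with pathlib. This is an illustrative sketch only (it ignores remote paths such as S3, matching the note above), not the plugin's implementation.

```python
# Rough local-only approximation of `file-path` with `exists`; the
# function name and return convention are illustrative.
from pathlib import Path

def check_file_path(value, exists=None):
    p = Path(value)
    if p.is_dir():
        return False  # file-path must not be a directory
    if exists is True and not p.exists():
        return False  # must exist when "exists": true
    if exists is False and p.exists():
        return False  # must be absent when "exists": false
    return True

print(check_file_path("definitely_missing.csv", exists=False))  # True
```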

    "},{"location":"nextflow_schema/nextflow_schema_specification/#mimetype","title":"mimetype","text":"

    MIME type for a file path. Setting this value informs downstream tools about what kind of file is expected.

    Should only be set when format is file-path.

    {\n  \"type\": \"string\",\n  \"format\": \"file-path\",\n  \"mimetype\": \"text/csv\"\n}\n
    "},{"location":"nextflow_schema/nextflow_schema_specification/#schema","title":"schema","text":"

    Path to a JSON schema file used to validate the supplied file.

    Should only be set when format is file-path.

    Tip

    Setting this field is key to working with sample sheet validation and channel generation, as described in the next section of the nf-schema docs.

    These schema files are typically stored in the pipeline assets directory, but can be anywhere.

    {\n  \"type\": \"string\",\n  \"format\": \"file-path\",\n  \"schema\": \"assets/foo_schema.json\"\n}\n

    Note

    If the parameter is set to null, false or an empty string, this validation is ignored. The file won't be validated.

    "},{"location":"nextflow_schema/nextflow_schema_specification/#numeric-specific-keys","title":"Numeric-specific keys","text":""},{"location":"nextflow_schema/nextflow_schema_specification/#minimum-maximum","title":"minimum, maximum","text":"

    Specify a minimum / maximum value for an integer or float number with minimum and maximum.

    If x is the value being validated, the following must hold true:

    {\n  \"type\": \"number\",\n  \"minimum\": 0,\n  \"maximum\": 100\n}\n

    Note

    The JSON schema docs also mention the exclusiveMinimum, exclusiveMaximum and multipleOf keys. Because nf-schema uses stock JSON schema validation libraries, these should work for validation. However, they are not officially supported within the Nextflow schema ecosystem and so some interfaces may not recognise them.

    "},{"location":"nextflow_schema/nextflow_schema_specification/#array-specific-keys","title":"Array-specific keys","text":""},{"location":"nextflow_schema/nextflow_schema_specification/#uniqueitems","title":"uniqueItems","text":"

    All items in the array should be unique.

    {\n  \"type\": \"array\",\n  \"uniqueItems\": true\n}\n
    "},{"location":"nextflow_schema/nextflow_schema_specification/#uniqueentries","title":"uniqueEntries","text":"

    Non-standard key

    The combination of all values in the given keys should be unique. For this key to work, you need to make sure the array items are of type object and contain the keys listed in uniqueEntries.

    {\n  \"type\": \"array\",\n  \"items\": {\n    \"type\": \"object\",\n    \"uniqueEntries\": [\"foo\", \"bar\"],\n    \"properties\": {\n      \"foo\": { \"type\": \"string\" },\n      \"bar\": { \"type\": \"string\" }\n    }\n  }\n}\n

    This schema tells nf-schema that the combination of foo and bar should be unique across all objects in the array.
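The uniqueness rule can be sketched as follows. This is an illustrative Python sketch of the check described above, not the plugin's code: the tuple of values for the listed keys must not repeat across the array.

```python
# Illustrative sketch of the uniqueEntries check; names are hypothetical.

def unique_entries(rows, keys):
    """True if the combination of `keys` values is unique across `rows`."""
    seen = set()
    for row in rows:
        combo = tuple(row.get(k) for k in keys)
        if combo in seen:
            return False
        seen.add(combo)
    return True

rows = [
    {"foo": "a", "bar": "x"},
    {"foo": "a", "bar": "y"},  # same foo, different bar -> still unique
]
print(unique_entries(rows, ["foo", "bar"]))  # True
print(unique_entries(rows + [{"foo": "a", "bar": "x"}], ["foo", "bar"]))  # False
```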

    "},{"location":"nextflow_schema/sample_sheet_schema_examples/","title":"Example sample sheet schema","text":""},{"location":"nextflow_schema/sample_sheet_schema_examples/#nf-corernaseq-example","title":"nf-core/rnaseq example","text":"

    The nf-core/rnaseq pipeline was one of the first to have a sample sheet schema. You can see this, used for validating sample sheets with --input here: assets/schema_input.json.

    Tip

    Note the approach used for validating filenames in the fastq_2 column. The column is optional, so if a pattern was supplied by itself then validation would fail when no string is supplied.

    Instead, we say that the string must either match that pattern or it must have a maxLength of 0 (an empty string).

    {\n  \"$schema\": \"https://json-schema.org/draft/2020-12/schema\",\n  \"$id\": \"https://raw.githubusercontent.com/nf-core/rnaseq/master/assets/schema_input.json\",\n  \"title\": \"nf-core/rnaseq pipeline - params.input schema\",\n  \"description\": \"Schema for the file provided with params.input\",\n  \"type\": \"array\",\n  \"items\": {\n    \"type\": \"object\",\n    \"properties\": {\n      \"sample\": {\n        \"type\": \"string\",\n        \"pattern\": \"^\\\\S+$\",\n        \"errorMessage\": \"Sample name must be provided and cannot contain spaces\",\n        \"meta\": [\"my_sample\"]\n      },\n      \"fastq_1\": {\n        \"type\": \"string\",\n        \"pattern\": \"^\\\\S+\\\\.f(ast)?q\\\\.gz$\",\n        \"format\": \"file-path\",\n        \"errorMessage\": \"FastQ file for reads 1 must be provided, cannot contain spaces and must have extension '.fq.gz' or '.fastq.gz'\"\n      },\n      \"fastq_2\": {\n        \"errorMessage\": \"FastQ file for reads 2 cannot contain spaces and must have extension '.fq.gz' or '.fastq.gz'\",\n        \"type\": \"string\",\n        \"pattern\": \"^\\\\S+\\\\.f(ast)?q\\\\.gz$\",\n        \"format\": \"file-path\"\n      },\n      \"strandedness\": {\n        \"type\": \"string\",\n        \"errorMessage\": \"Strandedness must be provided and be one of 'forward', 'reverse' or 'unstranded'\",\n        \"enum\": [\"forward\", \"reverse\", \"unstranded\"],\n        \"meta\": [\"my_strandedness\"]\n      }\n    },\n    \"required\": [\"sample\", \"fastq_1\", \"strandedness\"]\n  }\n}\n
    "},{"location":"nextflow_schema/sample_sheet_schema_examples/#nf-schema-test-case","title":"nf-schema test case","text":"

    You can see a very feature-complete example JSON Schema for a sample sheet schema file below.

    It is used as a test fixture in the nf-schema package here.

    Note

    More examples can be found in the plugin testResources directory.

    {\n    \"$schema\": \"https://json-schema.org/draft/2020-12/schema\",\n    \"$id\": \"https://raw.githubusercontent.com/nextflow-io/nf-schema/master/plugins/nf-schema/src/testResources/schema_input.json\",\n    \"title\": \"Samplesheet validation schema\",\n    \"description\": \"Schema for the samplesheet used in this pipeline\",\n    \"type\": \"array\",\n    \"items\": {\n        \"type\": \"object\",\n        \"properties\": {\n            \"field_1\": {\n                \"type\": \"string\",\n                \"meta\": [\"string1\",\"string2\"],\n                \"default\": \"value\"\n            },\n            \"field_2\": {\n                \"type\": \"integer\",\n                \"meta\": [\"integer1\",\"integer2\"],\n                \"default\": 0\n            },\n            \"field_3\": {\n                \"type\": \"boolean\",\n                \"meta\": [\"boolean1\",\"boolean2\"],\n                \"default\": true\n            },\n            \"field_4\": {\n                \"type\": \"string\"\n            },\n            \"field_5\": {\n                \"type\": \"number\"\n            },\n            \"field_6\": {\n                \"type\": \"boolean\"\n            },\n            \"field_7\": {\n                \"type\": \"string\",\n                \"format\": \"file-path\",\n                \"exists\": true,\n                \"pattern\": \"^.*\\\\.txt$\"\n            },\n            \"field_8\": {\n                \"type\": \"string\",\n                \"format\": \"directory-path\",\n                \"exists\": true\n            },\n            \"field_9\": {\n                \"type\": \"string\",\n                \"format\": \"path\",\n                \"exists\": true\n            },\n            \"field_10\": {\n                \"type\": \"string\"\n            },\n            \"field_11\": {\n                \"type\": \"integer\"\n            },\n            \"field_12\": {\n                \"type\": \"string\",\n                
\"default\": \"itDoesExist\"\n            }\n        },\n        \"required\": [\"field_4\", \"field_6\"],\n        \"dependentRequired\": {\n            \"field_1\": [\"field_2\", \"field_3\"]\n        }\n    },\n    \"allOf\": [\n        {\"uniqueEntries\": [\"field_11\", \"field_10\"]},\n        {\"uniqueEntries\": [\"field_10\"]}\n    ]\n}\n
    "},{"location":"nextflow_schema/sample_sheet_schema_specification/","title":"Sample sheet schema specification","text":"

    Sample sheet schema files are used by the nf-schema plugin for validation of sample sheet contents and type conversion / channel generation.

    The Nextflow schema syntax is based on the JSON schema standard. You can find more information about JSON Schema here:

    "},{"location":"nextflow_schema/sample_sheet_schema_specification/#schema-structure","title":"Schema structure","text":"

    Validation by the plugin works by parsing the supplied file contents into a groovy object, then passing this to the JSON schema validation library. As such, the structure of the schema must match the structure of the parsed file.

    Typically, samplesheets are CSV files, with fields represented as columns and samples as rows. TSV, JSON and YAML samplesheets are also supported by this plugin.

    In this case, the parsed object will be an array (see JSON schema docs). The array type is associated with an items key which in our case contains a single object. The object has properties, where the keys must match the headers of the CSV file.

    So, for CSV samplesheets, the top-level schema should look something like this:

    {\n  \"$schema\": \"https://json-schema.org/draft/2020-12/schema\",\n  \"type\": \"array\",\n  \"items\": {\n    \"type\": \"object\",\n    \"properties\": {\n      \"field_1\": { \"type\": \"string\" },\n      \"field_2\": { \"type\": \"string\" }\n    }\n  }\n}\n

    If your sample sheet has a different format (for example, a nested YAML file), you will need to build your schema to match the parsed structure.
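The CSV-to-array-of-objects parsing described above can be illustrated in Python (the plugin does this in Groovy; this sketch just shows the resulting shape that the schema validates):

```python
# Illustration of how a CSV sample sheet parses into the structure
# the top-level schema above describes: an array of objects keyed
# by the header row.
import csv
import io

samplesheet = "field_1,field_2\nvalue_a,value_b\nvalue_c,value_d\n"
rows = list(csv.DictReader(io.StringIO(samplesheet)))
print(rows)
# [{'field_1': 'value_a', 'field_2': 'value_b'},
#  {'field_1': 'value_c', 'field_2': 'value_d'}]
```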

    "},{"location":"nextflow_schema/sample_sheet_schema_specification/#properties","title":"Properties","text":"

    Every array object will contain keys for each field. Each field should be described as an element in the object properties section.

    The keys of each property must match the header text used in the sample sheet.

    Fields that are present in the sample sheet but not in the schema will be ignored and produce a warning.

    Tip

    The order of columns in the sample sheet is not relevant, as long as the header text matches.

    Warning

    The order of properties in the schema is important. This order defines the order of output channel properties when using the samplesheetToList() function.

    "},{"location":"nextflow_schema/sample_sheet_schema_specification/#common-keys","title":"Common keys","text":"

    The majority of schema keys for sample sheet schema validation are identical to the Nextflow schema. For example: type, pattern, format, errorMessage, exists and so on.

    Please refer to the Nextflow schema specification docs for details.

    "},{"location":"nextflow_schema/sample_sheet_schema_specification/#sample-sheet-keys","title":"Sample sheet keys","text":"

    Below are the properties that are specific to sample sheet schema. These exist in addition to those described in the Nextflow schema specification.

    "},{"location":"nextflow_schema/sample_sheet_schema_specification/#meta","title":"meta","text":"

    Type: List or String

    The current field will be considered a meta value when this key is present. It should contain a list of meta fields, or a string naming a single meta field, to assign this value to. By default, a field has no meta.

    For example:

    {\n  \"meta\": \"id\"\n}\n

    will convert the field value to a meta value, resulting in the channel [[id:value]...]. See here for an example in the sample sheet.
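The conversion can be sketched as follows. This is an illustrative Python sketch of the shape of a resulting channel element (a leading meta map followed by the remaining values); it simplifies the real behaviour (for example, it does not rename fields via the meta list).

```python
# Illustrative sketch of meta conversion; names are hypothetical and the
# real implementation lives in the plugin's Groovy code.

def to_channel_element(row, meta_keys):
    """Collect meta fields into a leading map, keep the rest as values."""
    meta = {k: row[k] for k in meta_keys if k in row}
    rest = [v for k, v in row.items() if k not in meta_keys]
    return [meta] + rest

row = {"sample": "sample_1", "fastq_1": "reads_1.fastq.gz"}
print(to_channel_element(row, ["sample"]))
# [{'sample': 'sample_1'}, 'reads_1.fastq.gz']
```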

    "},{"location":"parameters/help_text/","title":"Help text","text":""},{"location":"parameters/help_text/#configure-help-message","title":"Configure help message","text":"

    Add the following configuration to your configuration files to enable the creation of help messages:

    nextflow.config
    validation {\n    help {\n        enabled = true\n    }\n}\n

    That's it! Every time the pipeline user passes the --help or --helpFull parameter to the pipeline, the help message will be created!

    The help message can be customized with a series of different options. See help configuration docs for a list of all options.

    "},{"location":"parameters/help_text/#help-message","title":"Help message","text":"

    The following example shows a snippet of a JSON schema which can be used to visualize the differences between the different help messages. This schema contains one group of parameters called Input parameters that contains two parameters: --input and --outdir. There are also two ungrouped parameters in this schema: --reference and --type. --reference is a nested parameter with the .fasta, .fai and .aligners subparameters. .aligners in turn contains two subparameters: .bwa and .bowtie.

    There are three different help messages:

    1. Using --help will only show the top level parameters (--input, --outdir, --reference and --type in the example). The type, description, possible options and defaults of these parameters will also be added to the message if they are present in the JSON schema.
    2. Using --helpFull will print all parameters, no matter how deeply nested they are (--input, --outdir, --reference.fasta, --reference.fai, --reference.aligners.bwa, --reference.aligners.bowtie and --type in the example).
    3. --help can also be given a parameter name. This prints a detailed help message for that parameter, including any subparameters it contains.
    The tabs below show, in order: the JSON schema snippet, and the output of --help, --helpFull, --help input and --help reference.aligners.
    ...\n\"$defs\": { // A section to define several definition in the JSON schema\n    \"Input parameters\": { // A group called \"Input parameters\"\n        \"properties\": { // All properties (=parameters) in this group\n            \"input\": {\n                \"type\": \"string\",\n                \"description\": \"The input samplesheet\",\n                \"format\": \"file-path\",\n                \"pattern\": \"^.$\\.csv$\",\n                \"help_text\": \"This file needs to contain all input samples\",\n                \"exists\": true\n            },\n            \"outdir\": {\n                \"type\": \"string\",\n                \"description\": \"The output directory\",\n                \"format\": \"directory-path\",\n                \"default\": \"results\"\n            }\n        }\n    }\n},\n\"properties\": { // Ungrouped parameters go here\n    \"reference\": {\n        \"type\": \"object\", // A parameter that contains nested parameters is always an \"object\"\n        \"description\": \"A group of parameters to configure the reference sets\",\n        \"properties\": { // All parameters nested in the --reference parameter\n            \"fasta\": {\n                \"type\": \"string\",\n                \"description\": \"The FASTA file\"\n            },\n            \"fai\": {\n                \"type\": \"string\",\n                \"description\": \"The FAI file\"\n            },\n            \"aligners\": {\n                \"type\": \"object\",\n                \"description\": \"A group of parameters specifying the aligner indices\",\n                \"properties\": { // All parameters nested in the --reference.aligners parameter\n                    \"bwa\": {\n                        \"type\": \"string\",\n                        \"description\": \"The BWA index\"\n                    },\n                    \"bowtie\": {\n                        \"type\": \"string\",\n                        \"description\": \"The BOWTIE index\"\n  
                  }\n                }\n            }\n        }\n    },\n    \"type\": {\n        \"type\": \"string\",\n        \"description\": \"The analysis type\",\n        \"enum\": [\"WES\",\"WGS\"]\n    }\n}\n...\n
    --reference  [object]          A group of parameters to configure the reference sets\n--type       [string]          The analysis type (accepted: WES, WGS)\n--help       [boolean, string] Show the help message for all top level parameters. When a parameter is given to `--help`, the full help message of that parameter will be printed.\n--helpFull   [boolean]         Show the help message for all non-hidden parameters.\n--showHidden [boolean]         Show all hidden parameters in the help message. This needs to be used in combination with `--help` or `--helpFull`.\n\nInput parameters\n    --input  [string] The input samplesheet\n    --outdir [string] The output directory [default: results]\n
    --reference.fasta           [string]          The FASTA file\n--reference.fai             [string]          The FAI file\n--reference.aligners.bwa    [string]          The BWA index\n--reference.aligners.bowtie [string]          The BOWTIE index\n--type                      [string]          The analysis type (accepted: WES, WGS)\n--help                      [boolean, string] Show the help message for all top level parameters. When a parameter is given to `--help`, the full help message of that parameter will be printed.\n--helpFull                  [boolean]         Show the help message for all non-hidden parameters.\n--showHidden                [boolean]         Show all hidden parameters in the help message. This needs to be used in combination with `--help` or `--helpFull`.\n\nInput parameters\n    --input                 [string] The input samplesheet\n    --outdir                [string] The output directory [default: results]\n
    --input\n    type       : string\n    description: The input samplesheet\n    format     : file-path\n    pattern    : ^\\S+\\.csv$\n    help_text  : This file needs to contain all input samples\n    exists     : true\n
    --reference.aligners\n    type       : object\n    description: A group of parameters specifying the aligner indices\n    options    :\n        --reference.aligners.bwa    [string] The BWA index\n        --reference.aligners.bowtie [string] The BOWTIE index\n

    The help message will always show the ungrouped parameters first. --help, --helpFull and --showHidden will always be automatically added to the help message. These defaults can be overwritten by adding them as ungrouped parameters to the JSON schema.

    After the ungrouped parameters, the grouped parameters will be printed.

    "},{"location":"parameters/help_text/#hidden-parameters","title":"Hidden parameters","text":"

    Params that are set as hidden in the JSON Schema are not shown in the help message. To show these parameters, pass the --showHidden parameter to the nextflow command.

    "},{"location":"parameters/help_text/#coloured-logs","title":"Coloured logs","text":"

    By default, the help output is coloured using ANSI escape codes.

    If you prefer, you can disable these by setting the validation.monochromeLogs configuration option to true.
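    For example, a minimal configuration snippet to switch the help output to monochrome:

    ```groovy
    // nextflow.config
    validation {
        monochromeLogs = true // strip ANSI colour codes from nf-schema output
    }
    ```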

    Default (coloured)Monochrome logs

    "},{"location":"parameters/help_text/#paramshelp","title":"paramsHelp()","text":"

    Deprecated

    This function has been deprecated in v2.1.0. Use the help configuration instead

    This function returns a help message with the command to run a pipeline and the available parameters. Pass it to log.info to print in the terminal.

    It accepts three arguments:

    1. An example command, typically used to run the pipeline, to be included in the help string
    2. An option to set the file name of a Nextflow Schema file: parameters_schema: <schema.json> (Default: nextflow_schema.json)
    3. An option to hide the deprecation warning: hideWarning: <true/false> (Default: false)

    Note

    paramsHelp() doesn't stop pipeline execution after running. You must add this into your pipeline code if it's the desired functionality.
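    For illustration, all three arguments can be combined as named options; the custom schema filename below is hypothetical:

    ```groovy
    include { paramsHelp } from 'plugin/nf-schema'

    if (params.help) {
        log.info paramsHelp(
            "nextflow run my_pipeline --input input_file.csv", // example command
            parameters_schema: 'my_custom_schema.json',        // hypothetical custom schema file
            hideWarning: true                                  // suppress the deprecation warning
        )
        exit 0 // paramsHelp() does not stop execution by itself
    }
    ```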

    Typical usage:

    main.nfnextflow.confignextflow_schema.json
    include { paramsHelp } from 'plugin/nf-schema'\n\nif (params.help) {\n    log.info paramsHelp(\"nextflow run my_pipeline --input input_file.csv\")\n    exit 0\n}\n
    plugins {\n  id 'nf-schema@2.0.0'\n}\n\nparams {\n  input = \"samplesheet.csv\"\n  outdir = \"results\"\n}\n
    {\n    \"$schema\": \"https://json-schema.org/draft/2020-12/schema\",\n    \"$id\": \"https://raw.githubusercontent.com/nf-core/testpipeline/master/nextflow_schema.json\",\n    \"title\": \"nf-core/testpipeline pipeline parameters\",\n    \"description\": \"this is a test\",\n    \"type\": \"object\",\n    \"$defs\": {\n        \"input_output_options\": {\n            \"title\": \"Input/output options\",\n            \"type\": \"object\",\n            \"fa_icon\": \"fas fa-terminal\",\n            \"description\": \"Define where the pipeline should find input data and save output data.\",\n            \"required\": [\"input\", \"outdir\"],\n            \"properties\": {\n                \"input\": {\n                    \"type\": \"string\",\n                    \"format\": \"file-path\",\n                    \"mimetype\": \"text/csv\",\n                    \"schema\": \"assets/schema_input.json\",\n                    \"pattern\": \"^\\\\S+\\\\.(csv|tsv|yaml|json)$\",\n                    \"description\": \"Path to comma-separated file containing information about the samples in the experiment.\",\n                    \"help_text\": \"You will need to create a design file with information about the samples in your experiment before running the pipeline. Use this parameter to specify its location. It has to be a comma-separated file with 3 columns, and a header row. See [usage docs](https://nf-co.re/testpipeline/usage#samplesheet-input).\",\n                    \"fa_icon\": \"fas fa-file-csv\"\n                },\n                \"outdir\": {\n                    \"type\": \"string\",\n                    \"format\": \"directory-path\",\n                    \"description\": \"The output directory where the results will be saved. 
You have to use absolute paths to storage on Cloud infrastructure.\",\n                    \"fa_icon\": \"fas fa-folder-open\"\n                }\n            }\n        }\n    },\n    \"allOf\": [\n        {\n            \"$ref\": \"#/$defs/input_output_options\"\n        }\n    ]\n}\n

    Output:

    N E X T F L O W  ~  version 23.04.1\nLaunching `pipeline/main.nf` [infallible_turing] DSL2 - revision: 8bf4c8d053\n\nTypical pipeline command:\n\n  nextflow run my_pipeline --input input_file.csv\n\nInput/output options\n  --input  [string]  Path to comma-separated file containing information about the samples in the experiment.\n  --outdir [string]  The output directory where the results will be saved. You have to use absolute paths to storage on Cloud infrastructure.\n\n------------------------------------------------------\n

    Warning

    We shouldn't be using exit as it kills the Nextflow head job in a way that is difficult to handle by systems that may be running it externally, but at the time of writing there is no good alternative. See nextflow-io/nextflow#3984.

    "},{"location":"parameters/summary_log/","title":"Summary log","text":""},{"location":"parameters/summary_log/#paramssummarylog","title":"paramsSummaryLog()","text":"

    This function returns a string that can be logged to the terminal, summarizing the parameters provided to the pipeline.

    Note

    The summary prioritizes displaying only the parameters that differ from the default schema values. Parameters which don't have a default in the JSON Schema and which have a value of null, \"\", false or 'false' won't be returned in the map. This streamlines the extensive parameter lists often associated with pipelines and highlights the customized elements, making it easy for users to verify their configuration, like checking for typos or confirming proper resolution, without wading through an array of default settings.

    The function takes two arguments: the workflow object and, optionally, the path to a Nextflow schema file (default: nextflow_schema.json).

    Typical usage:

    main.nfnextflow.confignextflow_schema.json
    include { paramsSummaryLog } from 'plugin/nf-schema'\n\nlog.info paramsSummaryLog(workflow)\n
    plugins {\n  id 'nf-schema@2.0.0'\n}\n\nparams {\n  input = \"samplesheet.csv\"\n  outdir = \"results\"\n}\n
    {\n    \"$schema\": \"https://json-schema.org/draft/2020-12/schema\",\n    \"$id\": \"https://raw.githubusercontent.com/nf-core/testpipeline/master/nextflow_schema.json\",\n    \"title\": \"nf-core/testpipeline pipeline parameters\",\n    \"description\": \"this is a test\",\n    \"type\": \"object\",\n    \"$defs\": {\n        \"input_output_options\": {\n            \"title\": \"Input/output options\",\n            \"type\": \"object\",\n            \"fa_icon\": \"fas fa-terminal\",\n            \"description\": \"Define where the pipeline should find input data and save output data.\",\n            \"required\": [\"input\", \"outdir\"],\n            \"properties\": {\n                \"input\": {\n                    \"type\": \"string\",\n                    \"format\": \"file-path\",\n                    \"mimetype\": \"text/csv\",\n                    \"schema\": \"assets/schema_input.json\",\n                    \"pattern\": \"^\\\\S+\\\\.(csv|tsv|yaml|json)$\",\n                    \"description\": \"Path to comma-separated file containing information about the samples in the experiment.\",\n                    \"help_text\": \"You will need to create a design file with information about the samples in your experiment before running the pipeline. Use this parameter to specify its location. It has to be a comma-separated file with 3 columns, and a header row. See [usage docs](https://nf-co.re/testpipeline/usage#samplesheet-input).\",\n                    \"fa_icon\": \"fas fa-file-csv\"\n                },\n                \"outdir\": {\n                    \"type\": \"string\",\n                    \"format\": \"directory-path\",\n                    \"description\": \"The output directory where the results will be saved. 
You have to use absolute paths to storage on Cloud infrastructure.\",\n                    \"fa_icon\": \"fas fa-folder-open\"\n                }\n            }\n        }\n    },\n    \"allOf\": [\n        {\n            \"$ref\": \"#/$defs/input_output_options\"\n        }\n    ]\n}\n

    Output:

    N E X T F L O W  ~  version 23.04.1\nLaunching `pipeline/main.nf` [sleepy_goldberg] DSL2 - revision: 7a280216f3\n\nCore Nextflow options\n  runName    : sleepy_goldberg\n  launchDir  : /Users/demo/GitHub/nextflow-io/nf-schema/examples/paramsSummaryLog\n  workDir    : /Users/demo/GitHub/nextflow-io/nf-schema/examples/paramsSummaryLog/work\n  projectDir : /Users/demo/GitHub/nextflow-io/nf-schema/examples/paramsSummaryLog/pipeline\n  userName   : demo\n  profile    : standard\n  configFiles:\n\nInput/output options\n  input      : samplesheet.csv\n  outdir     : results\n\n!! Only displaying parameters that differ from the pipeline defaults !!\n------------------------------------------------------\n
    "},{"location":"parameters/summary_log/#coloured-logs","title":"Coloured logs","text":"

    By default, the summary output is coloured using ANSI escape codes.

    If you prefer, you can disable these by using the argument monochrome_logs, e.g. paramsSummaryLog(workflow, monochrome_logs: true). Alternatively this can be set at a global level via the parameter --monochrome_logs or by adding params.monochrome_logs = true to a configuration file. The forms --monochromeLogs and params.monochromeLogs are also supported.

    Default (coloured)Monochrome logs

    "},{"location":"parameters/summary_log/#paramssummarymap","title":"paramsSummaryMap()","text":"

    This function returns a Groovy Map summarizing the parameters/workflow options used by the pipeline. As above, it only returns the provided parameters that differ from the default values.

    This function takes the same arguments as paramsSummaryLog(): the workflow object and an optional schema file path.

    Note

    Parameters which don't have a default in the JSON Schema and which have a value of null, \"\", false or 'false' won't be returned in the map.

    Typical usage:

    main.nfnextflow.confignextflow_schema.json
    include { paramsSummaryMap } from 'plugin/nf-schema'\n\nprintln paramsSummaryMap(workflow)\n
    plugins {\n  id 'nf-schema@2.0.0'\n}\n\nparams {\n  input = \"samplesheet.csv\"\n  outdir = \"results\"\n}\n
    {\n    \"$schema\": \"https://json-schema.org/draft/2020-12/schema\",\n    \"$id\": \"https://raw.githubusercontent.com/nf-core/testpipeline/master/nextflow_schema.json\",\n    \"title\": \"nf-core/testpipeline pipeline parameters\",\n    \"description\": \"this is a test\",\n    \"type\": \"object\",\n    \"$defs\": {\n        \"input_output_options\": {\n            \"title\": \"Input/output options\",\n            \"type\": \"object\",\n            \"fa_icon\": \"fas fa-terminal\",\n            \"description\": \"Define where the pipeline should find input data and save output data.\",\n            \"required\": [\"input\", \"outdir\"],\n            \"properties\": {\n                \"input\": {\n                    \"type\": \"string\",\n                    \"format\": \"file-path\",\n                    \"mimetype\": \"text/csv\",\n                    \"schema\": \"assets/schema_input.json\",\n                    \"pattern\": \"^\\\\S+\\\\.(csv|tsv|yaml|json)$\",\n                    \"description\": \"Path to comma-separated file containing information about the samples in the experiment.\",\n                    \"help_text\": \"You will need to create a design file with information about the samples in your experiment before running the pipeline. Use this parameter to specify its location. It has to be a comma-separated file with 3 columns, and a header row. See [usage docs](https://nf-co.re/testpipeline/usage#samplesheet-input).\",\n                    \"fa_icon\": \"fas fa-file-csv\"\n                },\n                \"outdir\": {\n                    \"type\": \"string\",\n                    \"format\": \"directory-path\",\n                    \"description\": \"The output directory where the results will be saved. 
You have to use absolute paths to storage on Cloud infrastructure.\",\n                    \"fa_icon\": \"fas fa-folder-open\"\n                }\n            }\n        }\n    },\n    \"allOf\": [\n        {\n            \"$ref\": \"#/$defs/input_output_options\"\n        }\n    ]\n}\n

    Output:

    N E X T F L O W  ~  version 23.04.1\nLaunching `pipeline/main.nf` [happy_lamport] DSL2 - revision: c45338cd96\n\n[Core Nextflow options:[runName:happy_lamport, launchDir:/Users/ewels/GitHub/nextflow-io/nf-schema/examples/paramsSummaryMap, workDir:/Users/ewels/GitHub/nextflow-io/nf-schema/examples/paramsSummaryMap/work, projectDir:/Users/ewels/GitHub/nextflow-io/nf-schema/examples/paramsSummaryMap/pipeline, userName:ewels, profile:standard, configFiles:], Input/output options:[input:samplesheet.csv, outdir:results]]\n
    "},{"location":"parameters/validation/","title":"Validation of pipeline parameters","text":""},{"location":"parameters/validation/#validateparameters","title":"validateParameters()","text":"

    This function takes all pipeline parameters and checks that they adhere to the specifications defined in the JSON Schema.

    The function takes two optional arguments: parameters_schema (the file name of a custom JSON schema, default: nextflow_schema.json) and monochrome_logs (set to true to disable coloured output).

    You can provide the parameters as follows:

    validateParameters(parameters_schema: 'custom_nextflow_parameters.json', monochrome_logs: true)\n

    Monochrome logs can also be set globally providing the parameter --monochrome_logs or adding params.monochrome_logs = true to a configuration file. The form --monochromeLogs is also supported.

    Tip

    As much of the Nextflow ecosystem assumes the nextflow_schema.json filename, it's recommended to stick with the default, if possible.

    See the Schema specification for information about what validation data you can encode within the schema for each parameter.

    "},{"location":"parameters/validation/#example","title":"Example","text":"

    The example below has a deliberate typo in params.input (.txt instead of .csv). The validation function catches this for two reasons: the value does not match the pattern defined in the schema, and since the schema sets \"exists\": true, the missing file fails the existence check.

    The function causes Nextflow to exit immediately with an error.

    Outputmain.nfnextflow.confignextflow_schema.json
    N E X T F L O W  ~  version 23.04.1\nLaunching `pipeline/main.nf` [amazing_crick] DSL2 - revision: 53bd9eac20\n\nERROR ~ Validation of pipeline parameters failed!\n\n -- Check '.nextflow.log' file for details\nThe following invalid input values have been detected:\n\n* --input (samplesheet.txt): \"samplesheet.txt\" does not match regular expression [^\\S+\\.(csv|tsv|yml|yaml)$]\n* --input (samplesheet.txt): the file or directory 'samplesheet.txt' does not exist\n
    include { validateParameters } from 'plugin/nf-schema'\n\nvalidateParameters()\n
    plugins {\n  id 'nf-schema@2.0.0'\n}\n\nparams {\n  input = \"samplesheet.txt\"\n  outdir = \"results\"\n}\n
    {\n    \"$schema\": \"https://json-schema.org/draft/2020-12/schema\",\n    \"$id\": \"https://raw.githubusercontent.com/nf-core/testpipeline/master/nextflow_schema.json\",\n    \"title\": \"nf-core/testpipeline pipeline parameters\",\n    \"description\": \"this is a test\",\n    \"type\": \"object\",\n    \"$defs\": {\n        \"input_output_options\": {\n            \"title\": \"Input/output options\",\n            \"type\": \"object\",\n            \"fa_icon\": \"fas fa-terminal\",\n            \"description\": \"Define where the pipeline should find input data and save output data.\",\n            \"required\": [\"input\", \"outdir\"],\n            \"properties\": {\n                \"input\": {\n                    \"type\": \"string\",\n                    \"format\": \"file-path\",\n                    \"mimetype\": \"text/csv\",\n                    \"schema\": \"assets/schema_input.json\",\n                    \"pattern\": \"^\\\\S+\\\\.(csv|tsv|yaml|json)$\",\n                    \"exists\": true,\n                    \"description\": \"Path to comma-separated file containing information about the samples in the experiment.\",\n                    \"help_text\": \"You will need to create a design file with information about the samples in your experiment before running the pipeline. Use this parameter to specify its location. It has to be a comma-separated file with 3 columns, and a header row. See [usage docs](https://nf-co.re/testpipeline/usage#samplesheet-input).\",\n                    \"fa_icon\": \"fas fa-file-csv\"\n                },\n                \"outdir\": {\n                    \"type\": \"string\",\n                    \"format\": \"directory-path\",\n                    \"description\": \"The output directory where the results will be saved. 
You have to use absolute paths to storage on Cloud infrastructure.\",\n                    \"fa_icon\": \"fas fa-folder-open\"\n                }\n            }\n        }\n    },\n    \"allOf\": [\n        {\n            \"$ref\": \"#/$defs/input_output_options\"\n        }\n    ]\n}\n
    "},{"location":"parameters/validation/#failing-for-unrecognized-parameters","title":"Failing for unrecognized parameters","text":"

    When parameters which are not specified in the JSON Schema are provided, the parameter validation function returns a WARNING. This is because user-specific institutional configuration profiles may make use of params that are unknown to the pipeline.

    The downside of this is that warnings about typos in parameters can go unnoticed.

    To force the pipeline execution to fail with an error instead, you can provide the validation.failUnrecognisedParams = true configuration option:

    Default Fail unrecognised params Outputnextflow.configmain.nf
    N E X T F L O W  ~  version 23.04.1\nLaunching `pipeline/main.nf` [jovial_linnaeus] DSL2 - revision: 53bd9eac20\n\nWARN: The following invalid input values have been detected:\n\n* --foo: bar\n\nHello World!\n
    plugins {\n  id 'nf-schema@2.0.0'\n}\n\nparams {\n  input = \"samplesheet.csv\"\n  outdir = \"results\"\n  foo = \"bar\"\n}\n
    include { validateParameters } from 'plugin/nf-schema'\n\nvalidateParameters()\n\nprintln \"Hello World!\"\n
    Outputnextflow.configmain.nf
    N E X T F L O W  ~  version 23.04.1\nLaunching `pipeline/main.nf` [pedantic_descartes] DSL2 - revision: 53bd9eac20\n\nERROR ~ ERROR: Validation of pipeline parameters failed!\n\n -- Check '.nextflow.log' file for details\nThe following invalid input values have been detected:\n\n* --foo: bar\n
    plugins {\n  id 'nf-schema@2.0.0'\n}\n\nvalidation.failUnrecognisedParams = true\n\nparams {\n  input = \"samplesheet.csv\"\n  outdir = \"results\"\n  foo = \"bar\"\n}\n
    include { validateParameters } from 'plugin/nf-schema'\n\nvalidateParameters()\n\nprintln \"Hello World!\"\n
    "},{"location":"parameters/validation/#ignoring-unrecognized-parameters","title":"Ignoring unrecognized parameters","text":"

    Sometimes, a parameter that you want to set may not be described in the pipeline schema for a good reason. Maybe it's something you're using in your Nextflow configuration setup for your compute environment, or it's a complex parameter that cannot be handled in the schema, such as nested parameters.

    In these cases, to avoid getting warnings when an unrecognised parameter is set, you can use --validationSchemaIgnoreParams / params.validationSchemaIgnoreParams.

    This should be a comma-separated list of strings that correspond to parameter names.
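    For example, a sketch of a configuration line ignoring two hypothetical parameters:

    ```groovy
    // nextflow.config — comma-separated names of parameters to ignore
    params.validationSchemaIgnoreParams = "myClusterOption,customProfileParam" // hypothetical names
    ```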

    "},{"location":"parameters/validation/#variable-type-checking","title":"Variable type checking","text":"

    By default, validateParameters() is strict about expecting parameters to adhere to their expected type. If the schema says that params.foo should be an integer and the user sets params.foo = \"12\" (a string with a number), it will fail.

    If this causes problems, the user can run validation in \"lenient mode\", whereby the JSON Schema validation tries to cast parameters to their correct type. For example, providing an integer as a string will no longer fail validation.

    Note

    The validation does not affect the parameter variable types in your pipeline. It attempts to cast a temporary copy of the params only, during the validation step.

    To enable lenient validation mode, set validation.lenientMode = true in your configuration file.
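    For example:

    ```groovy
    // nextflow.config — a string like "12" is cast to the integer 12 during validation
    validation.lenientMode = true
    ```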

    "},{"location":"samplesheets/examples/","title":"Sample sheet channel manipulation examples","text":""},{"location":"samplesheets/examples/#introduction","title":"Introduction","text":"

    Understanding channel structure and manipulation is critical for getting the most out of Nextflow. nf-schema helps initialise your channels from the text inputs to get you started, but further work might be required to fit your exact use case. In this page we run through some common cases for transforming the output of samplesheetToList().

    "},{"location":"samplesheets/examples/#glossary","title":"Glossary","text":""},{"location":"samplesheets/examples/#default-mode","title":"Default mode","text":"

    Each item in the list emitted by samplesheetToList() is a tuple corresponding to one row of the sample sheet. Each item is composed of a meta value (if present) and any additional elements from columns in the sample sheet, e.g.:

    sample,fastq_1,fastq_2,bed\nsample1,fastq1.R1.fq.gz,fastq1.R2.fq.gz,sample1.bed\nsample2,fastq2.R1.fq.gz,fastq2.R2.fq.gz,\n

    Might create a list where each element consists of 4 items, a map value followed by three files:

    // Columns:\n[ val([ sample: sample ]), file(fastq1), file(fastq2), file(bed) ]\n\n// Resulting in:\n[ [ sample: \"sample1\" ], fastq1.R1.fq.gz, fastq1.R2.fq.gz, sample1.bed ]\n[ [ sample: \"sample2\" ], fastq2.R1.fq.gz, fastq2.R2.fq.gz, [] ] // A missing value from the sample sheet is an empty list\n

    This list can be converted to a channel that can be used as input of a process where the input declaration is:

    tuple val(meta), path(fastq_1), path(fastq_2), path(bed)\n

    It may be necessary to manipulate this channel to fit your process inputs. For more details, check out the Nextflow operator docs; below are some common use cases with samplesheetToList().

    "},{"location":"samplesheets/examples/#using-a-sample-sheet-with-no-headers","title":"Using a sample sheet with no headers","text":"

    Sometimes you only have one possible input in the pipeline sample sheet, in which case a header doesn't make sense. You can support this by removing the properties section from the sample sheet schema and changing the type of the items from object to the desired type:

    {\n  \"$schema\": \"https://json-schema.org/draft/2020-12/schema\",\n  \"description\": \"Schema for the file provided with params.input\",\n  \"type\": \"array\",\n  \"items\": {\n    \"type\": \"string\"\n  }\n}\n

    When using samplesheets like this CSV file:

    test_1\ntest_2\n

    or this YAML file:

    - test_1\n- test_2\n

    The output of samplesheetToList() will look like this:

    test_1\ntest_2\n
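    A minimal sketch of turning such a headerless samplesheet into a channel, assuming the schema is stored at assets/schema_input.json:

    ```groovy
    include { samplesheetToList } from 'plugin/nf-schema'

    // Each line of the headerless samplesheet becomes one channel item
    Channel.fromList(samplesheetToList(params.input, "assets/schema_input.json"))
        .view() // test_1, then test_2
    ```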
    "},{"location":"samplesheets/examples/#changing-the-structure-of-channel-items","title":"Changing the structure of channel items","text":"

    Each item in the list will be a tuple, but some processes take multiple files as a list in their input channel; this is common in nf-core modules. For example, consider the following input declaration in a process, where fastq could be more than one file:

    process ZCAT_FASTQS {\n    input:\n        tuple val(meta), path(fastq)\n\n    \"\"\"\n    zcat $fastq\n    \"\"\"\n}\n

    The output of samplesheetToList() (converted to a channel) can be used by default with a process with the following input declaration:

    tuple val(meta), path(fastq_1), path(fastq_2)\n

    To manipulate each item within a channel, you should use the Nextflow .map() operator. This will apply a function to each element of the channel in turn. Here, we convert the flat tuple into a tuple composed of a meta and a list of FASTQ files:

    Channel.fromList(samplesheetToList(params.input, \"path/to/json/schema\"))\n    .map { meta, fastq_1, fastq_2 -> tuple(meta, [ fastq_1, fastq_2 ]) }\n    .set { input }\n\ninput.view() // Channel has 2 elements: meta, fastqs\n

    This is now compatible with the process defined above and will not raise a warning about input cardinality:

    ZCAT_FASTQS(input)\n
    "},{"location":"samplesheets/examples/#removing-elements-in-channel-items","title":"Removing elements in channel items","text":"

    For example, to remove the BED file from the channel created above, simply do not return it from the map. Note the absence of the bed item in the return value of the closure below:

    Channel.fromList(samplesheetToList(params.input, \"path/to/json/schema\"))\n    .map { meta, fastq_1, fastq_2, bed -> tuple(meta, fastq_1, fastq_2) }\n    .set { input }\n\ninput.view() // Channel has 3 elements: meta, fastq_1, fastq_2\n

    In this way you can drop items from a channel.

    "},{"location":"samplesheets/examples/#separating-channel-items","title":"Separating channel items","text":"

    We could perform this twice to create one channel containing the FASTQs and one containing the BED files, but Nextflow has a native operator to separate channels: .multiMap(). Here, we separate the FASTQs and BEDs into two separate channels using multiMap. Note that both channels are contained in input and are accessed as attributes using dot notation:

    Channel.fromList(samplesheetToList(params.input, \"path/to/json/schema\"))\n    .multiMap { meta, fastq_1, fastq_2, bed ->\n        fastq: tuple(meta, fastq_1, fastq_2)\n        bed:   tuple(meta, bed)\n    }\n    .set { input }\n

    The channel has two attributes, fastq and bed, which can be accessed separately.

    input.fastq.view() // Channel has 3 elements: meta, fastq_1, fastq_2\ninput.bed.view()   // Channel has 2 elements: meta, bed\n

    Importantly, multiMap applies to every item in the channel and returns an item to both channels for every input, i.e. input, input.fastq and input.bed all contain the same number of items, however each item will be different.

    "},{"location":"samplesheets/examples/#separate-items-based-on-a-condition","title":"Separate items based on a condition","text":"

    You can use the .branch() operator to separate the channel entries based on a condition. This is especially useful when you can get multiple types of input data.

    This example shows a channel which can have entries for WES or WGS data. WES data includes a BED file denoting the target regions, but WGS data does not. These analyses are different, so we want to separate the WES and WGS entries from each other. We can separate the two using .branch() based on the presence of the BED file:

    // Channel with four elements - see docs for examples\nparams.input = \"samplesheet.csv\"\n\nChannel.fromList(samplesheetToList(params.input, \"path/to/json/schema\"))\n    .branch { meta, fastq_1, fastq_2, bed ->\n        // If BED does not exist\n        WGS: !bed\n            return [meta, fastq_1, fastq_2]\n        // If BED exists\n        WES: bed\n            // The original channel structure will be used when no return statement is used.\n    }\n    .set { input }\n\ninput.WGS.view() // Channel has 3 elements: meta, fastq_1, fastq_2\ninput.WES.view() // Channel has 4 elements: meta, fastq_1, fastq_2, bed\n

    Unlike .multiMap(), the outputs of .branch() will contain a different number of items.

    "},{"location":"samplesheets/examples/#combining-a-channel","title":"Combining a channel","text":"

    After splitting the channel, it may be necessary to rejoin it. There are many ways to join channels, but here we demonstrate the simplest, using the Nextflow join operator to rejoin any of the channels from above based on the first element in each item, the meta value.

    input.fastq.view() // Channel has 3 elements: meta, fastq_1, fastq_2\ninput.bed.view()   // Channel has 2 elements: meta, bed\n\ninput.fastq\n    .join( input.bed )\n    .set { input_joined }\n\ninput_joined.view() // Channel has 4 elements: meta, fastq_1, fastq_2, bed\n
    "},{"location":"samplesheets/examples/#count-items-with-a-common-value","title":"Count items with a common value","text":"

    This example is based on this code from Marcel Ribeiro-Dantas.

    It's useful to determine the count of channel entries with similar values when you want to merge them later on (to prevent pipeline bottlenecks with .groupTuple()).

    This example contains a channel where multiple samples can be in the same family. Later on in the pipeline we want to merge the analyzed files so one file gets created for each family. The result will be a channel with an extra meta field containing the count of channel entries with the same family name.

    // channel created with samplesheetToList() prior to modification:\n// [[id:example1, family:family1], example1.txt]\n// [[id:example2, family:family1], example2.txt]\n// [[id:example3, family:family2], example3.txt]\n\nparams.input = \"samplesheet.csv\"\n\nChannel.fromList(samplesheetToList(params.input, \"path/to/json/schema\"))\n    .tap { ch_raw }                       // Create a copy of the original channel\n    .map { meta, txt -> meta.family }     // Isolate the value to count on\n    .reduce([:]) { counts, family ->      // Creates a map like this: [family1:2, family2:1]\n        counts[family] = (counts[family] ?: 0) + 1\n        counts\n    }\n    .combine(ch_raw)                     // Add the count map to the original channel\n    .map { counts, meta, txt ->          // Add the counts of the current family to the meta\n        def new_meta = meta + [count:counts[meta.family]]\n        [ new_meta, txt ]\n    }\n    .set { input }\n\ninput.view()\n// [[id:example1, family:family1, count:2], example1.txt]\n// [[id:example2, family:family1, count:2], example2.txt]\n// [[id:example3, family:family2, count:1], example3.txt]\n
    "},{"location":"samplesheets/samplesheetToList/","title":"Create a list from a sample sheet","text":""},{"location":"samplesheets/samplesheetToList/#samplesheettolist","title":"samplesheetToList()","text":"

    This function validates and converts a sample sheet to a Groovy list. This is done using information encoded within a sample sheet schema (see the docs).

    The function has two required arguments:

    1. The path to the samplesheet
    2. The path to the JSON schema file corresponding to the samplesheet.

    These can each be either a string with the relative path (from the root of the pipeline) or a file object.

    samplesheetToList(\"path/to/samplesheet\", \"path/to/json/schema\")\n

    Note

    All data points in CSV and TSV samplesheets will be converted to their inferred type (e.g. \"true\" will be converted to the Boolean true and \"2\" will be converted to the Integer 2). If this is not the expected behaviour, you can convert these values back to a String with .map { val -> val.toString() }
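    For instance, a sketch of forcing one converted value back to a String; the two-element channel shape here is hypothetical:

    ```groovy
    Channel.fromList(samplesheetToList(params.input, "path/to/json/schema"))
        .map { meta, value -> tuple(meta, value.toString()) } // e.g. keep "2" as the string "2"
        .set { input }
    ```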

    This function can be used together with existing channel factories/operators to create one channel entry per samplesheet entry.

    "},{"location":"samplesheets/samplesheetToList/#use-as-a-channel-factory","title":"Use as a channel factory","text":"

    The function can be used with the .fromList channel factory to generate a queue channel:

    Channel.fromList(samplesheetToList(\"path/to/samplesheet\", \"path/to/json/schema\"))\n

    Note

    This mimics the fromSamplesheet channel factory from nf-validation, the predecessor of this plugin.

    "},{"location":"samplesheets/samplesheetToList/#use-as-a-channel-operator","title":"Use as a channel operator","text":"

    Alternatively, the function can be used with the .flatMap channel operator to create a channel from samplesheet paths that are already in a channel:

    Channel.of(\"path/to/samplesheet\").flatMap { samplesheetToList(it, \"path/to/json/schema\") }\n
    "},{"location":"samplesheets/samplesheetToList/#basic-example","title":"Basic example","text":"

    In this example, we create a simple channel from a CSV sample sheet.

    N E X T F L O W  ~  version 23.04.0\nLaunching `pipeline/main.nf` [distraught_marconi] DSL2 - revision: 74f697a0d9\n[mysample1, input1_R1.fq.gz, input1_R2.fq.gz, forward]\n[mysample2, input2_R1.fq.gz, input2_R2.fq.gz, forward]\n
    main.nf, samplesheet.csv, nextflow.config, assets/schema_input.json
    include { samplesheetToList } from 'plugin/nf-schema'\n\nch_input = Channel.fromList(samplesheetToList(params.input, \"assets/schema_input.json\"))\n\nch_input.view()\n
    sample,fastq_1,fastq_2,strandedness\nmysample1,input1_R1.fq.gz,input1_R2.fq.gz,forward\nmysample2,input2_R1.fq.gz,input2_R2.fq.gz,forward\n
    plugins {\n  id 'nf-schema@2.0.0'\n}\n\nparams {\n  input = \"samplesheet.csv\"\n  output = \"results\"\n}\n
    {\n  \"$schema\": \"https://json-schema.org/draft/2020-12/schema\",\n  \"$id\": \"https://raw.githubusercontent.com/nf-schema/example/master/assets/schema_input.json\",\n  \"title\": \"nf-schema example - params.input schema\",\n  \"description\": \"Schema for the file provided with params.input\",\n  \"type\": \"array\",\n  \"items\": {\n    \"type\": \"object\",\n    \"properties\": {\n      \"sample\": {\n        \"type\": \"string\",\n        \"pattern\": \"^\\\\S+$\",\n        \"errorMessage\": \"Sample name must be provided and cannot contain spaces\"\n      },\n      \"fastq_1\": {\n        \"type\": \"string\",\n        \"pattern\": \"^\\\\S+\\\\.f(ast)?q\\\\.gz$\",\n        \"errorMessage\": \"FastQ file for reads 1 must be provided, cannot contain spaces and must have extension '.fq.gz' or '.fastq.gz'\"\n      },\n      \"fastq_2\": {\n        \"errorMessage\": \"FastQ file for reads 2 cannot contain spaces and must have extension '.fq.gz' or '.fastq.gz'\",\n        \"type\": \"string\",\n        \"pattern\": \"^\\\\S+\\\\.f(ast)?q\\\\.gz$\"\n      },\n      \"strandedness\": {\n        \"type\": \"string\",\n        \"errorMessage\": \"Strandedness must be provided and be one of 'forward', 'reverse' or 'unstranded'\",\n        \"enum\": [\"forward\", \"reverse\", \"unstranded\"]\n      }\n    },\n    \"required\": [\"sample\", \"fastq_1\", \"strandedness\"]\n  }\n}\n
    "},{"location":"samplesheets/samplesheetToList/#order-of-fields","title":"Order of fields","text":"

    This example demonstrates that the order of columns in the sample sheet file has no effect on the order of items in the channel.

    Danger

    It is the order of fields in the sample sheet JSON schema which defines the order of items in the channel returned by samplesheetToList(), not the order of fields in the sample sheet file.

    N E X T F L O W  ~  version 23.04.0\nLaunching `pipeline/main.nf` [elated_kowalevski] DSL2 - revision: 74f697a0d9\n[forward, mysample1, input1_R2.fq.gz, input1_R1.fq.gz]\n[forward, mysample2, input2_R2.fq.gz, input2_R1.fq.gz]\n
    samplesheet.csv, assets/schema_input.json, main.nf, nextflow.config
    sample,fastq_1,fastq_2,strandedness\nmysample1,input1_R1.fq.gz,input1_R2.fq.gz,forward\nmysample2,input2_R1.fq.gz,input2_R2.fq.gz,forward\n
    {\n  \"$schema\": \"https://json-schema.org/draft/2020-12/schema\",\n  \"$id\": \"https://raw.githubusercontent.com/nf-schema/example/master/assets/schema_input.json\",\n  \"title\": \"nf-schema example - params.input schema\",\n  \"description\": \"Schema for the file provided with params.input\",\n  \"type\": \"array\",\n  \"items\": {\n    \"type\": \"object\",\n    \"properties\": {\n      \"strandedness\": {\n        \"type\": \"string\",\n        \"errorMessage\": \"Strandedness must be provided and be one of 'forward', 'reverse' or 'unstranded'\",\n        \"enum\": [\"forward\", \"reverse\", \"unstranded\"]\n      },\n      \"sample\": {\n        \"type\": \"string\",\n        \"pattern\": \"^\\\\S+$\",\n        \"errorMessage\": \"Sample name must be provided and cannot contain spaces\"\n      },\n      \"fastq_2\": {\n        \"errorMessage\": \"FastQ file for reads 2 cannot contain spaces and must have extension '.fq.gz' or '.fastq.gz'\",\n        \"type\": \"string\",\n        \"pattern\": \"^\\\\S+\\\\.f(ast)?q\\\\.gz$\"\n      },\n      \"fastq_1\": {\n        \"type\": \"string\",\n        \"pattern\": \"^\\\\S+\\\\.f(ast)?q\\\\.gz$\",\n        \"errorMessage\": \"FastQ file for reads 1 must be provided, cannot contain spaces and must have extension '.fq.gz' or '.fastq.gz'\"\n      }\n    },\n    \"required\": [\"sample\", \"fastq_1\", \"strandedness\"]\n  }\n}\n
    include { samplesheetToList } from 'plugin/nf-schema'\n\nch_input = Channel.fromList(samplesheetToList(params.input, \"assets/schema_input.json\"))\n\nch_input.view()\n
    plugins {\n  id 'nf-schema@2.0.0'\n}\n\nparams {\n  input = \"samplesheet.csv\"\n  output = \"results\"\n}\n
    "},{"location":"samplesheets/samplesheetToList/#channel-with-meta-map","title":"Channel with meta map","text":"

    In this example, we use the schema to mark two columns as meta fields. This returns a channel with a meta map.

    N E X T F L O W  ~  version 23.04.0\nLaunching `pipeline/main.nf` [romantic_kare] DSL2 - revision: 74f697a0d9\n[[my_sample_id:mysample1, my_strandedness:forward], input1_R1.fq.gz, input1_R2.fq.gz]\n[[my_sample_id:mysample2, my_strandedness:forward], input2_R1.fq.gz, input2_R2.fq.gz]\n
    assets/schema_input.json, main.nf, samplesheet.csv, nextflow.config
    {\n  \"$schema\": \"https://json-schema.org/draft/2020-12/schema\",\n  \"$id\": \"https://raw.githubusercontent.com/nf-schema/example/master/assets/schema_input.json\",\n  \"title\": \"nf-schema example - params.input schema\",\n  \"description\": \"Schema for the file provided with params.input\",\n  \"type\": \"array\",\n  \"items\": {\n    \"type\": \"object\",\n    \"properties\": {\n      \"sample\": {\n        \"type\": \"string\",\n        \"pattern\": \"^\\\\S+$\",\n        \"errorMessage\": \"Sample name must be provided and cannot contain spaces\",\n        \"meta\": [\"my_sample_id\"]\n      },\n      \"fastq_1\": {\n        \"type\": \"string\",\n        \"pattern\": \"^\\\\S+\\\\.f(ast)?q\\\\.gz$\",\n        \"errorMessage\": \"FastQ file for reads 1 must be provided, cannot contain spaces and must have extension '.fq.gz' or '.fastq.gz'\"\n      },\n      \"fastq_2\": {\n        \"errorMessage\": \"FastQ file for reads 2 cannot contain spaces and must have extension '.fq.gz' or '.fastq.gz'\",\n        \"type\": \"string\",\n        \"pattern\": \"^\\\\S+\\\\.f(ast)?q\\\\.gz$\"\n      },\n      \"strandedness\": {\n        \"type\": \"string\",\n        \"errorMessage\": \"Strandedness must be provided and be one of 'forward', 'reverse' or 'unstranded'\",\n        \"enum\": [\"forward\", \"reverse\", \"unstranded\"],\n        \"meta\": [\"my_strandedness\"]\n      }\n    },\n    \"required\": [\"sample\", \"fastq_1\", \"strandedness\"]\n  }\n}\n
    include { samplesheetToList } from 'plugin/nf-schema'\n\nch_input = Channel.fromList(samplesheetToList(params.input, \"assets/schema_input.json\"))\n\nch_input.view()\n
    sample,fastq_1,fastq_2,strandedness\nmysample1,input1_R1.fq.gz,input1_R2.fq.gz,forward\nmysample2,input2_R1.fq.gz,input2_R2.fq.gz,forward\n
    plugins {\n  id 'nf-schema@2.0.0'\n}\n\nparams {\n  input = \"samplesheet.csv\"\n  output = \"results\"\n}\n
    "},{"location":"samplesheets/validate_sample_sheet/","title":"Validate a sample sheet file contents","text":"

    When a parameter provides the schema field, the validateParameters() function will automatically parse and validate the provided file contents using this JSON schema. It can validate CSV, TSV, JSON and YAML files.

    The path of the schema file must be relative to the root of the pipeline directory. See the input field in the example schema.json below.

    {\n  \"properties\": {\n    \"input\": {\n      \"type\": \"string\",\n      \"format\": \"file-path\",\n      \"pattern\": \"^\\\\S+\\\\.csv$\",\n      \"schema\": \"src/testResources/samplesheet_schema.json\",\n      \"description\": \"Path to comma-separated file containing information about the samples in the experiment.\"\n    }\n  }\n}\n

    Note

    The samplesheetToList() function also validates the files before converting them, so if you convert the samplesheet with this function, it is not necessary to add a schema to the corresponding parameter.

    For more information about the sample sheet JSON schema, refer to the sample sheet docs.

    "}]} \ No newline at end of file