Added nf-iridanext 0.1.0 #59

Merged 4 commits on Jan 22, 2024

apetkau (Contributor) commented Dec 15, 2023

Plugin information

Name: nf-iridanext
Repository: https://github.com/phac-nml/nf-iridanext

Plugin purpose

This plugin adds pre-processing and post-processing tasks that help us integrate Nextflow pipelines into our larger system, IRIDA Next, which manages microbial genomics data and analysis results. The main post-processing task is to create a JSON file that lists the output files from a Nextflow pipeline along with contextual metadata, for loading into our system.

Example

As an example, this plugin allows the following Nextflow configuration to produce the JSON output below (more examples at https://github.com/phac-nml/nf-iridanext):

nextflow.config

iridanext {
    enabled = true
    output {
        path = "${params.outdir}/iridanext.output.json.gz"
        files {
            global = ["**/summary/summary.txt.gz"]
            samples = ["**/assembly/*.assembly.fa.gz"]
        }
        metadata {
            samples {
                csv {
                    path = "**/output.csv"
                    idcol = "column1"
                }
            }
        }
    }
}

iridanext.output.json.gz

{
    "files": {
        "global": [{"path": "summary/summary.txt.gz"}],
        "samples": {
            "SAMPLE1": [{"path": "assembly/SAMPLE1.assembly.fa.gz"}],
            "SAMPLE2": [{"path": "assembly/SAMPLE2.assembly.fa.gz"}]
        }
    },
    "metadata": {
        "samples": {
            "SAMPLE1": {"key1": "2", "key2": "3"},
            "SAMPLE2": {"key1": "4", "key2": "5"}
        }
    }
}
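
To illustrate how the `csv` block in the configuration relates to the `metadata.samples` section of the JSON, here is a minimal Python sketch. It is a plausible reading of the example only, not the plugin's actual implementation. The CSV contents and the function name `csv_to_sample_metadata` are assumptions made up for illustration, as is the choice to drop the id column from each sample's metadata.

```python
import csv
import io


def csv_to_sample_metadata(csv_text, idcol):
    """Key each CSV row by the value in `idcol`; remaining columns become metadata.

    Assumption: the id column itself is left out of the per-sample metadata,
    which matches the example JSON but is not confirmed by the plugin docs.
    """
    metadata = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        sample_id = row.pop(idcol)
        metadata[sample_id] = dict(row)
    return metadata


# Hypothetical contents of output.csv, chosen to reproduce the example JSON.
example_csv = "column1,key1,key2\nSAMPLE1,2,3\nSAMPLE2,4,5\n"

sample_metadata = csv_to_sample_metadata(example_csv, idcol="column1")
print(sample_metadata)
```

All values stay strings here, as they do in the example JSON.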

The resulting JSON file lists output files associated with the pipeline as a whole (the global section), files associated with individual samples (based on sample identifiers in the samplesheet.csv), and contextual metadata for each sample. This custom JSON file is what we use to load genomics analysis results and sample metadata into our larger system.
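
For readers who want to consume this file outside IRIDA Next, the following Python sketch shows one way to read the gzipped JSON and pair each sample's files with its metadata. It assumes only the structure shown in the example above. The helper names `load_iridanext_output` and `summarize` are hypothetical, and the script writes its own temporary copy of the example so that it runs on its own.

```python
import gzip
import json
import os
import tempfile


def load_iridanext_output(path):
    """Read a gzipped JSON file such as iridanext.output.json.gz."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return json.load(handle)


def summarize(output):
    """Pair each sample's file paths with its metadata (empty dict if absent)."""
    sample_files = output.get("files", {}).get("samples", {})
    sample_metadata = output.get("metadata", {}).get("samples", {})
    return {
        sample_id: {
            "paths": [entry["path"] for entry in entries],
            "metadata": sample_metadata.get(sample_id, {}),
        }
        for sample_id, entries in sample_files.items()
    }


# The example output from this pull request, written to a temporary file.
example = {
    "files": {
        "global": [{"path": "summary/summary.txt.gz"}],
        "samples": {
            "SAMPLE1": [{"path": "assembly/SAMPLE1.assembly.fa.gz"}],
            "SAMPLE2": [{"path": "assembly/SAMPLE2.assembly.fa.gz"}],
        },
    },
    "metadata": {
        "samples": {
            "SAMPLE1": {"key1": "2", "key2": "3"},
            "SAMPLE2": {"key1": "4", "key2": "5"},
        }
    },
}
example_path = os.path.join(tempfile.mkdtemp(), "iridanext.output.json.gz")
with gzip.open(example_path, "wt", encoding="utf-8") as handle:
    json.dump(example, handle)

output = load_iridanext_output(example_path)
global_paths = [entry["path"] for entry in output["files"]["global"]]
summary = summarize(output)

print("global:", global_paths)
for sample_id, info in summary.items():
    print(sample_id, info["paths"], info["metadata"])
```

The paths in the file are relative, so a real consumer would presumably resolve them against the pipeline's output directory. That detail is not covered here.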

Applicability

This plugin is primarily intended for pipeline developers integrating pipelines into our system. More broadly, the goal is to isolate as many of the adaptations needed for that integration as possible, in order to increase pipeline compatibility. Ideally, all the changes we need to adapt a pipeline would live in Nextflow configuration files, but we're not quite there yet.

apetkau (Contributor, Author) commented Jan 22, 2024

Update (January 22, 2024): I added version 0.2.0 of our nf-iridanext plugin, with a Nextflow config syntax that covers everything I currently want the plugin to do.

ewels (Member) left a comment

Many thanks for this!

ewels merged commit 606be93 into nextflow-io:main on Jan 22, 2024
1 check passed

apetkau (Contributor, Author) commented Jan 22, 2024

@ewels thanks so much 😄

apetkau deleted the add/nf-iridanext branch on January 22, 2024 at 21:05