-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
organize for metapipeline #23
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added some comments:
// Calculate HsMetrics | ||
if ( params.collect_metrics ) { | ||
|
||
//Get intervalLists (only needed for collectHsMetrics) | ||
//if you already have the interval list use that, otherwise, generate interval list from BedToIntervalList process |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not necessary in this PR but I'd recommend trying to put the processes under each if statement into their own workflows to simplify main.nf
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also not necessarily in this PR, but I'd recommend using PlantUML to diagram for pipelines as it then allows for the diagram to be tracked as code and is easily modifiable compared to just image files that can be hard to modify and track changes for
Generally, yes we want the outputs under the main tool of the pipeline; since this pipeline really only has one main tool for performing its main function, that should stay as is (as opposed to pipelines like call-sSNV where there are multiple main tools since multiple tools for calling sSNVs are implemented)
It should, through NFTest, become a formal test case; in terms of eventually publicizing the pipeline (either through metapipeline or on its own), consider any potential concerns about whether the targeted sequencing dataset can be public in any way
I would recommend yes; we have a modularized process for it so it should be straight-forward to add - https://github.com/uclahs-cds/pipeline-Nextflow-module/tree/main/modules/PipeVal/generate-checksum |
Can you share an up-to-date implementation example from an existing pipeline? |
Here's an example of it that was used when the module was created and tested: https://github.com/uclahs-cds/pipeline-align-DNA/blob/8b5ba8508df4f3affd9391711190e56f8e2cf1ec/module/align_DNA_BWA_MEM2.nf#L19-L26 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally looks good! There are some things that can be cleaned up but not needed in this PR. Can you move the outputs with the checksums to the dev directory?
Anything else to add @sorelfitzgibbon ?
I think I have a few comments, please give me until tomorrow. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Just a few minor comments.
config/methods.config
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is probably part of the "clean up" @yashpatel6 suggested leaving for another PR: Much of the methods.config
is out of date and could be replaced with those in pipeline-Nextflow-config
as a submodule.
Done! |
Description
Updates to turn this pipeline into a useful component of metapipline.
Closes #15
Urgent things to address before official release:
bedtools slop
and provided with thebedtools
distribution: /hot/user/nzeltser/tools/bedtools2/genomes/human.hg38.genomeSAMtools-1.16.1
as this is the tool behind the main read depth calculation process - do we want to keep this?Testing Results
Testing set is the BZPRGPT2 PRAD targeted panel pilot cohort 2 with original target file.
Checklist
I have read the code review guidelines and the code review best practice on GitHub check-list.
I have reviewed the Nextflow pipeline standards.
The name of the branch is meaningful and well formatted following the standards, using [AD_username (or 5 letters of AD if AD is too long)]-[brief_description_of_branch].
I have set up or verified the branch protection rule following the github standards before opening this pull request.
I have added my name to the contributors listings in the
manifest
block in thenextflow.config
as part of this pull request, am listedalready, or do not wish to be listed. (This acknowledgement is optional.)
I have added the changes included in this pull request to the
CHANGELOG.md
under the next release version or unreleased, and updated the date.I have updated the version number in the
metadata.yaml
andmanifest
block of thenextflow.config
file following semver, or the version number has already been updated. (Leave it unchecked if you are unsure about new version number and discuss it with the infrastructure team in this PR.)I have tested the pipeline on at least one A-mini sample.