-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feat/ensemblvep download #61
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The functionality allows for the controlled caching of VEP files.
The vep cache can be dowloaded as a process and versioning can be more tightly controlled
conf/modules.config
Outdated
@@ -71,6 +71,17 @@ process { | |||
ext.args = { "--output-filename=${meta.id}.exomiser" } | |||
} | |||
|
|||
withName: ENSEMBLVEP_DOWNLOAD { | |||
container = 'quay.io/biocontainers/ensembl-vep:110.0--pl5321h2a3209d_0' // Trying with the conda container |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This container seems to be based on vep version 110. Shouldn't we be using version 111?
If it were, let's say, version 113 downloading the data for version 111, I wouldn't be too concerned, as newer versions typically support older data. However, the reverse might not be true and could potentially cause issues. Ideally, we should use the same version that we are currently using to avoid any compatibility problems.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Woups. Definitely I agree with you. This was a mistake on my side when copying the container link. I'll put the PR back into draft, change this and verify it does not change anything.
@@ -76,7 +76,7 @@ See [docs/output.md](docs/output.md) for more details about pipeline outputs. | |||
|
|||
## Credits | |||
|
|||
Ferlab-Ste-Justine/Post-processing-Pipeline was originally written by Damien Geneste, David Morais, Felix-Antoine Le Sieur, Jeremy Costanza, Lysiane Bouchard. | |||
Ferlab-Ste-Justine/Post-processing-Pipeline was originally written by Damien Geneste, David Morais, Felix-Antoine Le Sieur, Jeremy Costanza, Lysiane Bouchard, Georgette Femerling. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❤️
| `vep_cache_version` | _Optional_ | Version of the vep cache. e.g. `111` | | ||
| `vep_genome` | _Optional_ | Genome assembly version of the vep cache | | ||
| `download_cache` | _Optional_ | Download vep cache (default: false) | | ||
| `outdir_cache` | _Optional_ | Path to write the cache to. If not declared, cache will be written to `<outputdir>/cache/` | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You might want to prefix the download_cache and outdir_cache parameters with "vep" to clarify that these are vep parameters. I leave the decision to you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding the missing vep parameters!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, agreed. Will do that
### `Changed` | ||
- [#59](https://github.com/Ferlab-Ste-Justine/Post-processing-Pipeline/pull/59) Add ".snv" to VEP output filename prefix | ||
|
||
## v2.4.1-dev - 16/01/2025 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great initiative! I wanted to mention this changelog issue before but didn't want to overwhelm you. I hope that you did not suffered too much 😄 .
Previously, the version in the changelog matched the last tag created + dev (e.g., 2.4.1-dev followed tag 2.4.1). To see changes from 2.4.1 compared to 2.4.0, you had to look at 2.4.0-dev, which was confusing.
I noticed this issue a while ago but I was hesitant to change it since we've been using this system for a while. I agree that using the "unreleased" convention, as you did, is clearer.
I support this change. Make sure, however, that it doesn't make the linter crash. I.e. you should verify that you can successfully run the nf-core lint
command. It's possible that it will add an extra warning that we will want to ignore (you might have to edit file .nf-core.yml to do so).
If possible, I would tend to rename the section v2.4.1-dev to v2.4.1, as, if I am not mistaking, it is now accurately reflects changes from v2.4.0 to v2.4.1.
If we go for this change, it would be a good idea to mention it to Felix-Antoine as well if you did not already.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah! makes sense. I was a bit hesitant on how the changes were logged and assumed that 2.4.0 dev listed the changes made in the 2.4.0 release instead of the changes made starting from 2.4.0. I can adapt to any system.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should switch to the convention you are introducing here. It's way more standard I think and intuitive. I actually wanted to propose this change to the team eventually. I'm glad that you introduced it right away.
I would just rename the header "v2.4.1-dev - 16/01/2025" header to "v2.4.1 - 16/01/2025". It's really just this header. I would not touch the others. We can add a note indicating that we change our way of maintaining the changelog from 2.4.1.
nextflow.config
Outdated
@@ -92,7 +96,7 @@ params { | |||
includeConfig 'conf/base.config' | |||
|
|||
// Load nf-core custom profiles from different Institutions | |||
try { | |||
try { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a big deal, but you might want to remove the uncessary space after the {
character.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was going through the errors, but realized better to have another PR for that. So I may have by mistake inserted a space there.
take: | ||
ch_samplesheet | ||
|
||
main: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with moving the constant definitions in the main. I can't remember the reason why they were placed before the main, however. You might want to double check with @FelixAntoineLeSieur to make sure it is ok, i.e. no potential harm.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to nf, if they are outside, it does not follow the convention of declaring a workflow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is probably not standard to put them outside indeed and you had a good thinking in seeking to standardize the workflow. There was a reason, however, I think, for putting them outside ... but I can't remember what it was. @FelixAntoineLeSieur might remember.
Don't worry too much about that. If you tested on juno and it runs smooth, I guess this is not causing problems.
ch_samplesheet | ||
|
||
main: | ||
def HOMO_SAPIENS_SPECIES = "homo_sapiens" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't mind to move this here, but I recommend to use lower case instead upper case, as we typically use upper case for global variables. This one is not global.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I maintained the same naming convention, just moved the line.
Defining variables outside of workflows is not conventional in nextflow so it throws an error in format (not in running).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is also only used within the workflow, so why would it be a global?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I absolutely don't mind if it is a local variable to the workflow. It is used only by the workflow so far indeed. We created a constant to encourage reuse and avoid hard coding the info at multiple places. But hey, I don't think it will ever be used outside the workflow, so your change is really fine.
My comment was just about using upper case for the variable name. I would tend to rename it homo_sapiens_species
instead HOMO_SAPIENS_SPECIES
. You don't have to; it's a suggestion.
3911cfb
to
852f17b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good work :)
Description
Added the
nf-core
module to download the ensemblvep cache.Added parameters:
download_cache
outdir_cache
Tests
Correctly downloads cache data when download_cache is set to true. If it is downloaded,
vep_cache
takes the value of the output. Does not interfere with any other process.PR checklist
nf-core lint
).nextflow run . -profile test,docker --outdir <OUTDIR>
).nextflow run . -profile debug,test,docker --outdir <OUTDIR>
).docs/usage.md
is updated.docs/output.md
is updated.docs/reference_data.md
is updated.CHANGELOG.md
is updated.README.md
is updated (including new tool citations and authors/contributors).