Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

new module : jvarkit/vcffilterjdk #6621

Merged
merged 24 commits into from
Sep 20, 2024

Conversation

lindenb
Copy link
Contributor

@lindenb lindenb commented Sep 10, 2024

This module implements jvarkt/vcffilterjdk https://pubmed.ncbi.nlm.nih.gov/29186339/ which filters VCF files using a java expression compiled at runtime. eg:

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the module conventions in the contribution docs
  • If necessary, include test data in your PR.
  • Remove all TODO statements.
  • Emit the versions.yml file.
  • Follow the naming conventions.
  • Follow the parameters requirements.
  • Follow the input/output options guidelines.
  • Add a resource label
  • Use BioConda and BioContainers if possible to fulfil software requirements.
  • Ensure that the test works with either Docker / Singularity. Conda CI tests can be quite flaky:
    • For modules:
      • nf-core modules test <MODULE> --profile docker
      • nf-core modules test <MODULE> --profile singularity
      • nf-core modules test <MODULE> --profile conda

@lindenb lindenb requested a review from a team as a code owner September 10, 2024 15:37
@lindenb lindenb requested review from Aratz and removed request for a team September 10, 2024 15:37
modules/nf-core/jvarkit/vcffilterjdk/main.nf Show resolved Hide resolved
modules/nf-core/jvarkit/vcffilterjdk/main.nf Outdated Show resolved Hide resolved
modules/nf-core/jvarkit/vcffilterjdk/main.nf Outdated Show resolved Hide resolved
def script_file = code?"--script \"${code}\"":""
def regions_file = regions_file? (tbi ? " --regions-file" : " --targets-file")+" \"${regions_file}\" ":""

extension = getVcfExtension(args3); /* custom function, see below */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would not use a function here, just put it straight in

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jfy133 this statement contradicts what I've been told here https://nfcore.slack.com/archives/CJRH30T6V/p1725629116371819

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

grumble grumble @nvnieuwk have you being saying yes to others on this? I thought we were trying to avoid such complexity in modules as an approximate rule ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought it seemed reasonable 😁 I didn't know about us trying to avoid this, my bad. If that's the case than we shouldn't do this

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My philosphy if you do something more than two times then funciton it, otherwise it's adding more complexity. But #small-mound-i-feel-strongly-about-but-not-enough-to-fight-everyone

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's what's happening here right? (sorry didn't take a closer look to it yet, it's a chaotic day) 😁

modules/nf-core/jvarkit/vcffilterjdk/main.nf Outdated Show resolved Hide resolved
Groovy Map containing code information
- code:
type: file
description: custom user Code . May be empty if script if provided via `task.ext.args2`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
description: custom user Code . May be empty if script if provided via `task.ext.args2`.
description: custom user code. May be empty if script if provided via `task.ext.args2`.

modules/nf-core/jvarkit/vcffilterjdk/main.nf Show resolved Hide resolved
modules/nf-core/jvarkit/vcffilterjdk/main.nf Show resolved Hide resolved
@@ -0,0 +1,8 @@
process {


Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change

Copy link
Member

@jfy133 jfy133 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just the question to @nvnieuwk about the function thing (but I think this not a blocker now), and then the remaining meta.yaml comment about the fasta description and then I think we are good to go

@lindenb
Copy link
Contributor Author

lindenb commented Sep 17, 2024

@jfy133 back to you. I added a test to check that a test with 'bed' is OK.

Nevertheless Conda is failing "The operation was canceled." for an unknown reason.

2024-09-17T10:31:37.0804427Z �[1mTest Process JVARKIT_VCFFILTERJDK�[0m
2024-09-17T10:31:37.0807880Z 
2024-09-17T10:31:37.3147564Z   �[1mTest�[0m [97675082] 'sarscov2 - vcf' 
2024-09-17T10:31:42.7333540Z     > N E X T F L O W  ~  version 24.04.4
2024-09-17T10:31:49.3193163Z     > Launching `/opt/actions-runner/_work/modules/modules/.nf-test-976750823fd503359dcc2752e76f56be.nf` [friendly_einstein] DSL2 - revision: c82d0a3daf
2024-09-17T10:31:54.9997691Z     > Creating env using conda: /opt/actions-runner/_work/modules/modules/modules/nf-core/jvarkit/vcffilterjdk/tests/../environment.yml [cache /home/ubuntu/tests/976750823fd503359dcc2752e76f56be/work/conda/environment-80e529a60994f670fdb6e8744e6ca02d]
2024-09-17T10:32:21.8039184Z     > [bc/53e700] Submitted process > JVARKIT_VCFFILTERJDK (vcf_test)
2024-09-17T10:32:29.1725912Z     �[32mPASSED�[0m (52.083s)
2024-09-17T10:32:29.2465899Z   �[1mTest�[0m [8c58e2fc] 'sarscov2 - vcf+bed' 
2024-09-17T10:32:32.6761650Z     > N E X T F L O W  ~  version 24.04.4
2024-09-17T10:32:38.9065917Z     > Launching `/opt/actions-runner/_work/modules/modules/.nf-test-8c58e2fc3aaff92467e7cd7bdcbfa81c.nf` [silly_sammet] DSL2 - revision: 2b7bee574f
2024-09-17T10:32:43.9372119Z     > Creating env using conda: /opt/actions-runner/_work/modules/modules/modules/nf-core/jvarkit/vcffilterjdk/tests/../environment.yml [cache /home/ubuntu/tests/8c58e2fc3aaff92467e7cd7bdcbfa81c/work/conda/environment-80e529a60994f670fdb6e8744e6ca02d]
2024-09-17T10:32:56.4748922Z     > [27/a7a413] Submitted process > JVARKIT_VCFFILTERJDK (vcf_test)
2024-09-17T10:33:02.7976573Z     �[32mPASSED�[0m (33.622s)
2024-09-17T10:33:02.8394609Z   �[1mTest�[0m [5d5a66a3] 'sarscov2 - vcf - stub' 
2024-09-17T10:33:03.8993562Z ##[error]The runner has received a shutdown signal. This can happen when the runner service is stopped, or a manually started runner is canceled.
2024-09-17T10:33:07.2068586Z     > N E X T F L O W  ~  version 24.04.4
2024-09-17T10:33:13.6338430Z     > Launching `/opt/actions-runner/_work/modules/modules/.nf-test-5d5a66a38876def6cbbcfc8ea9184254.nf` [deadly_ride] DSL2 - revision: c82d0a3daf
2024-09-17T10:33:16.4608064Z ##[error]The operation was canceled.

@jfy133
Copy link
Member

jfy133 commented Sep 18, 2024

@jfy133 back to you. I added a test to check that a test with 'bed' is OK.

Nevertheless Conda is failing "The operation was canceled." for an unknown reason.

2024-09-17T10:31:37.0804427Z �[1mTest Process JVARKIT_VCFFILTERJDK�[0m
2024-09-17T10:31:37.0807880Z 
2024-09-17T10:31:37.3147564Z   �[1mTest�[0m [97675082] 'sarscov2 - vcf' 
2024-09-17T10:31:42.7333540Z     > N E X T F L O W  ~  version 24.04.4
2024-09-17T10:31:49.3193163Z     > Launching `/opt/actions-runner/_work/modules/modules/.nf-test-976750823fd503359dcc2752e76f56be.nf` [friendly_einstein] DSL2 - revision: c82d0a3daf
2024-09-17T10:31:54.9997691Z     > Creating env using conda: /opt/actions-runner/_work/modules/modules/modules/nf-core/jvarkit/vcffilterjdk/tests/../environment.yml [cache /home/ubuntu/tests/976750823fd503359dcc2752e76f56be/work/conda/environment-80e529a60994f670fdb6e8744e6ca02d]
2024-09-17T10:32:21.8039184Z     > [bc/53e700] Submitted process > JVARKIT_VCFFILTERJDK (vcf_test)
2024-09-17T10:32:29.1725912Z     �[32mPASSED�[0m (52.083s)
2024-09-17T10:32:29.2465899Z   �[1mTest�[0m [8c58e2fc] 'sarscov2 - vcf+bed' 
2024-09-17T10:32:32.6761650Z     > N E X T F L O W  ~  version 24.04.4
2024-09-17T10:32:38.9065917Z     > Launching `/opt/actions-runner/_work/modules/modules/.nf-test-8c58e2fc3aaff92467e7cd7bdcbfa81c.nf` [silly_sammet] DSL2 - revision: 2b7bee574f
2024-09-17T10:32:43.9372119Z     > Creating env using conda: /opt/actions-runner/_work/modules/modules/modules/nf-core/jvarkit/vcffilterjdk/tests/../environment.yml [cache /home/ubuntu/tests/8c58e2fc3aaff92467e7cd7bdcbfa81c/work/conda/environment-80e529a60994f670fdb6e8744e6ca02d]
2024-09-17T10:32:56.4748922Z     > [27/a7a413] Submitted process > JVARKIT_VCFFILTERJDK (vcf_test)
2024-09-17T10:33:02.7976573Z     �[32mPASSED�[0m (33.622s)
2024-09-17T10:33:02.8394609Z   �[1mTest�[0m [5d5a66a3] 'sarscov2 - vcf - stub' 
2024-09-17T10:33:03.8993562Z ##[error]The runner has received a shutdown signal. This can happen when the runner service is stopped, or a manually started runner is canceled.
2024-09-17T10:33:07.2068586Z     > N E X T F L O W  ~  version 24.04.4
2024-09-17T10:33:13.6338430Z     > Launching `/opt/actions-runner/_work/modules/modules/.nf-test-5d5a66a38876def6cbbcfc8ea9184254.nf` [deadly_ride] DSL2 - revision: c82d0a3daf
2024-09-17T10:33:16.4608064Z ##[error]The operation was canceled.

The conda error was just Azure saying nope, a restart seemed to have fixed it :)

Copy link
Member

@jfy133 jfy133 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Almost there! Still some confusing meta.yaml docs

e.g. [ id:'test_reference' ]
- fasta:
type: file
description: Groovy Map containing fasta reference genome information
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You have not changed it since I first commented on this - does the tool expect multiple FASTA files?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand the problem your comment here. No, there is only one fasta file. Where is the confusion here ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'Groovy Map containing' is what is confusing me, it's jut the file no?

e.g. [ id:'test_reference' ]
- fai:
type: file
description: Groovy Map containing fasta index information
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above... should this be a map?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... hum.. no, there is only one fai file (?..)

then {
assertAll(
{ assert process.success },
{ assert file(process.out.vcf[0][1]).exists() },
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does .variantsMD5 not work here, like you did above?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The BED provided in the test dataset does not overlap any variant. The output vcf is empty, When I test the variants md5, I then get an error md5sum for empty file

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK fair enough

assertAll(
{ assert process.success },
{ assert snapshot(
path(process.out.vcf[0][1]),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For stubs I believe you can just use

Suggested change
path(process.out.vcf[0][1]),
process.out,

within the snapshot

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jfy133 that didn't work, there is an error (snap) (I keep path(process.out.vcf[0][1]),)

Copy link
Member

@jfy133 jfy133 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is what I mean @lindenb

e.g. [ id:'test_reference' ]
- fasta:
type: file
description: Groovy Map containing fasta reference genome information
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
description: Groovy Map containing fasta reference genome information
description: File containing fasta reference genome information

e.g. [ id:'test_reference' ]
- fai:
type: file
description: Groovy Map containing fasta index information
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
description: Groovy Map containing fasta index information
description: File containing fasta index information

e.g. [ id:'test_reference' ]
- dict:
type: file
description: Groovy Map containing reference genome information for GATK sequence dictionary
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
description: Groovy Map containing reference genome information for GATK sequence dictionary
description: File containing reference genome information for GATK sequence dictionary

e.g. [ id:'test_reference' ]
- code:
type: file
description: custom user code . May be empty if script if provided via `task.ext.args2`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
description: custom user code . May be empty if script if provided via `task.ext.args2`.
description: File containing custom user code. May be empty if script if provided via `task.ext.args2`.

@lindenb
Copy link
Contributor Author

lindenb commented Sep 20, 2024

I kept that function getVcfExtension.

There should be no new error now, but with the new version 3.0.0dev of nf-core-tools there is a new lint error.

( That is quite demoralizing, especially as this version is not available on conda and I cannot test/fix this locally)


╭─ [✗] 2 Module Tests Failed ──────────────────────────────────────────────────╮
│                       ╷                          ╷                           │
│ Module name           │ File path                │ Test message              │
│╶──────────────────────┼──────────────────────────┼──────────────────────────╴│
│ jvarkit/vcffilterjdk  │ modules/nf-core/jvarkit… │ Module meta.yml does not  │
│                       │                          │ match main.nf. Inputs     │
│                       │                          │ should contain: [['meta', │
│                       │                          │ 'vcf', 'tbi',             │
│                       │                          │ 'regions_file'],          │
│                       │                          │ ['meta2', 'fasta'],       │
│                       │                          │ ['meta3', 'fai'],         │
│                       │                          │ ['meta4', 'dict'],        │
│                       │                          │ ['meta5', 'code'],        │
│                       │                          │ ['meta6', 'pedigree']]    │
│                       │                          │ Run nf-core modules lint  │
│                       │                          │ --fix to update the       │
│                       │                          │ meta.yml file.            │
│ jvarkit/vcffilterjdk  │ modules/nf-core/jvarkit… │ Module meta.yml does not  │
│                       │                          │ match main.nf. Outputs    │
│                       │                          │ should contain: {'vcf':   │
│                       │                          │ ['meta',                  │
│                       │                          │ '.${extension}'], 'tbi':  │
│                       │                          │ ['meta', '.tbi'], 'csi':  │
│                       │                          │ ['meta', '*.csi'],        │
│                       │                          │ 'versions':               │
│                       │                          │ ['versions.yml']} Run     │
│                       │                          │ nf-core modules lint      │
│                       │                          │ --fix to update the       │
│                       │                          │ meta.yml file.            │
│                       ╵                          ╵                           │
╰──────────────────────────────────────────────────────────────────────────────╯```

@jfy133
Copy link
Member

jfy133 commented Sep 20, 2024

OK this looks good to me now @lindenb thanks for the patience!

I agree the failing linting is not nice, I've complained on your behalf. I will force merge :D

@jfy133 jfy133 merged commit 24b3817 into nf-core:master Sep 20, 2024
10 of 12 checks passed
@lindenb
Copy link
Contributor Author

lindenb commented Sep 20, 2024

@jfy133 cool ! thanks !

@lindenb lindenb deleted the pl_jvarkit_vcffilterjdk branch September 20, 2024 09:04
@lindenb lindenb mentioned this pull request Sep 20, 2024
14 tasks
herpov pushed a commit to herpov/modules that referenced this pull request Oct 2, 2024
* vcffilterdjdk

* update params

* update params

* oppsss tag and TODO

* target/region

* answers to review

* f...g space

* fix conda problem https://nfcore.slack.com/archives/CJRH30T6V/p1726233311260959

* add test+bed

* reset polyx

* prevent test exception md5sum for empty file

* update main.nf.test

* update meta.yml

* remove suggestion

---------

Co-authored-by: James A. Fellows Yates <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants