Implementation of KmerFinder subworkflow, Custom Quast, and Custom MultiQC Reports #135

Merged
merged 62 commits into from
May 24, 2024


Daniel-VM
Contributor

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the pipeline conventions in the contribution docs?
  • If necessary, also make a PR on the nf-core/bacass branch on the nf-core/test-datasets repository.
  • Make sure your code lints (nf-core lint).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

PR Description

This PR introduces significant enhancements to the nf-core/bacass pipeline by incorporating a KmerFinder subworkflow and improving Quast Assembly QC and MultiQC reports.

Key Implementations:

  1. KmerFinder Subworkflow:
       • Added a local KmerFinder module for read quality control (QC) and purity assessment.
       • Developed a module to compile KmerFinder results into a comprehensive CSV summary.
       • Implemented grouping of input samples based on the KmerFinder-estimated reference genome.
       • Created a module to identify and download reference genomes from the NCBI database, used for retrieving relevant QUAST metrics.
  2. Quast Assembly QC by Grouping Samples:
       • Modified Quast execution to run twice when KmerFinder is invoked:
           • Initial 'general' Quast without reference genome files.
           • Subsequent 'by reference genome' Quast to aggregate samples and their reference genomes.
  3. Custom MultiQC Reports:
       • Incorporated a custom MultiQC module into the workflow.
       • Added multiqc_config.yml files for short, long, and hybrid assembly modes.
       • Generated a custom MultiQC HTML report consolidating metrics from KmerFinder, Quast, and other sources, presented in an overview table.
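
The grouping step and the two Quast passes described above can be illustrated with a minimal Python sketch. It is not the pipeline's actual code (which is written in Nextflow); the input shape, the function names `group_by_reference` and `quast_runs`, and the placeholder accessions `REF_1` and `REF_2` are all assumptions made for illustration.

```python
from collections import defaultdict


def group_by_reference(best_hits):
    """Group sample IDs by the reference assigned to each sample.

    `best_hits` is a hypothetical mapping of sample ID -> reference accession,
    standing in for the top hit reported per sample by KmerFinder.
    """
    groups = defaultdict(list)
    for sample, accession in sorted(best_hits.items()):
        groups[accession].append(sample)
    return dict(groups)


def quast_runs(best_hits):
    """List the Quast invocations implied by the two-pass scheme.

    Returns tuples of (mode, reference accession or None, sample IDs):
    one 'general' run over all samples without a reference, followed by
    one 'by_reference' run per group of samples sharing a reference.
    """
    runs = [("general", None, sorted(best_hits))]
    for accession, samples in group_by_reference(best_hits).items():
        runs.append(("by_reference", accession, samples))
    return runs


# Placeholder data: REF_1 and REF_2 are not real accessions.
best_hits = {"sample_A": "REF_1", "sample_B": "REF_2", "sample_C": "REF_1"}

for mode, accession, samples in quast_runs(best_hits):
    print(mode, accession, samples)
```

With this toy input, the sketch plans one general run over three samples and then one run per reference, so samples sharing a reference are assessed against the same genome.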

Closes #109

RUN TESTS

Tests for this PR can be run via:
nextflow run main.nf -profile test_full,docker --outdir results

or

nextflow run main.nf  \
    -profile test,docker \
    --skip_kmerfinder false \
    --kmerfinderdb 'https://zenodo.org/records/10458361/files/20190108_kmerfinder_stable_dirs.tar.gz?download=1' \
    --ncbi_assembly_metadata 'https://ftp.ncbi.nlm.nih.gov/genomes/ASSEMBLY_REPORTS/assembly_summary_refseq.txt' \
    --outdir results
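
To give a sense of how a file such as the one passed to `--ncbi_assembly_metadata` might be used to locate a reference genome, here is a rough Python sketch. It assumes a tab-separated table whose header is the last `#`-prefixed line and which contains `assembly_accession` and `ftp_path` columns; the helper name `find_ftp_path`, the toy table, and the example accession and URL are hypothetical, and the pipeline's own module may work differently.

```python
def find_ftp_path(summary_text, accession):
    """Return the download path listed for `accession`, or None if absent.

    Assumes a tab-separated assembly summary in which comment lines start
    with '#' and the last comment line before the data holds the column names.
    """
    header = None
    for line in summary_text.splitlines():
        if line.startswith("#"):
            # Keep overwriting: the last comment line is treated as the header.
            header = line.lstrip("#").strip().split("\t")
            continue
        if header is None or not line.strip():
            continue
        row = dict(zip(header, line.split("\t")))
        if row.get("assembly_accession") == accession:
            return row.get("ftp_path")
    return None


# Toy table with only three columns; the real file has many more.
toy_summary = (
    "## description line (ignored)\n"
    "#assembly_accession\torganism_name\tftp_path\n"
    "GCF_EXAMPLE.1\tExample organism\thttps://example.org/genomes/GCF_EXAMPLE.1\n"
)

print(find_ftp_path(toy_summary, "GCF_EXAMPLE.1"))
```

Looking columns up by name rather than by position keeps the sketch tolerant of extra columns, at the cost of depending on the assumed header names.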


github-actions bot commented May 20, 2024

nf-core lint overall result: Passed ✅

Posted for pipeline commit 7b66a5e

✅ 196 tests passed
❔ 4 tests were ignored

Run details

  • nf-core/tools version 2.14.1
  • Run at 2024-05-23 14:52:40

@Daniel-VM Daniel-VM marked this pull request as ready for review May 20, 2024 10:45
@Daniel-VM Daniel-VM requested a review from d4straub May 20, 2024 10:49
Collaborator

@d4straub d4straub left a comment

Hi there, this looks really great (but unfortunately I had no time to properly run a test).
I have a few comments, most of them just typos.

Resolved review comments (now outdated):

  • README.md (2)
  • conf/modules.config (3)
  • modules/local/kmerfinder.nf
  • modules/nf-core/quast/main.nf
  • nextflow_schema.json
  • workflows/bacass.nf (2)

@Daniel-VM
Contributor Author

Many thanks @d4straub! Your suggestions have been very helpful. I have also added fixes because custom_multiqc wasn't working properly after merging into dev.

Collaborator

@d4straub d4straub left a comment

Great!

@Daniel-VM Daniel-VM merged commit 7114a6d into nf-core:dev May 24, 2024
9 checks passed