DSL2 - CLASSIFY_MTDNA_HAPLOGROUP #1134

Closed
wants to merge 73 commits into from

@trianglegrrl trianglegrrl commented May 1, 2025

Work in progress - adding mtDNA haplogroup classification

Haplogrep3 integration

  • --tree needs to be passed in from haplogrep3_tree_id
  • should we be doing this from a consensus sequence vs vcf?
  • verify RSRS support
  • Get tests passing
  • Update manual testing docs
  • Test with full run (ancient0003, haplogrep3 web + CLI = "M32'56")
nextflow run main.nf -profile docker --input full_mito.csv --outdir ./results/ --run_genotyping --genotyping_tool ug --genotyping_source raw --run_mtdna_haplogroup --fasta /references/mito/rCRS.fasta --fasta_fai /references/mito/rCRS.fasta.fai --skip_preprocessing --skip_damagecalculation --skip_qualimap -resume

[... 1h50m passes]

╰─ cat results/haplogrep3/ancient0003.txt
"SampleID"      "Haplogroup"    "Rank"  "Quality"       "Range"
"ancient0003"   "M32'56"        "1"     "0.9579"        "1-16569"
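
For checking a full run against the expected call without opening the file by hand, the result table can be read programmatically. This is a minimal sketch, not part of the PR: it assumes the file is tab-delimited with double-quoted fields, as the output above suggests, and the function name `read_haplogrep_table` is hypothetical.

```python
import csv
import io


def read_haplogrep_table(handle):
    """Return one dict per sample from a haplogrep3-style result table.

    Assumes tab-separated, double-quoted columns named as in the
    output shown above (SampleID, Haplogroup, Rank, Quality, Range).
    """
    reader = csv.DictReader(handle, delimiter="\t", quotechar='"')
    rows = []
    for row in reader:
        # Convert the numeric columns; the rest stay as strings.
        row["Rank"] = int(row["Rank"])
        row["Quality"] = float(row["Quality"])
        rows.append(row)
    return rows


# Example input copied from the run above, with tabs assumed as separators.
example = (
    '"SampleID"\t"Haplogroup"\t"Rank"\t"Quality"\t"Range"\n'
    '"ancient0003"\t"M32\'56"\t"1"\t"0.9579"\t"1-16569"\n'
)

calls = read_haplogrep_table(io.StringIO(example))
for call in calls:
    print(call["SampleID"], call["Haplogroup"], call["Quality"])
```

For a real run, the same function could be given `open("results/haplogrep3/ancient0003.txt", newline="")` in place of the in-memory example.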

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the pipeline conventions in the contribution docs
  • If necessary, also make a PR on the nf-core/eager branch on the nf-core/test-datasets repository.
  • Make sure your code lints (nf-core pipelines lint).
  • Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
  • Check for unexpected warnings in debug mode (nextflow run . -profile debug,test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

@trianglegrrl trianglegrrl changed the base branch from master to dev May 1, 2025 13:31
@nf-core-bot
Member

Warning

Newer version of the nf-core template is available.

Your pipeline is using an old version of the nf-core template: 3.2.0.
Please update your pipeline to the latest version.

For more documentation on how to update your pipeline, please see the nf-core documentation and Synchronisation documentation.

@trianglegrrl trianglegrrl force-pushed the dsl2-haplogrep3 branch 4 times, most recently from 596e6d0 to 148ab55 Compare May 1, 2025 17:34
@trianglegrrl trianglegrrl changed the title VERY WIP for human mtdna haplogroup classification DSL2 - CLASSIFY_MTDNA_HAPLOGROUP May 1, 2025
@trianglegrrl trianglegrrl changed the title DSL2 - CLASSIFY_MTDNA_HAPLOGROUP DSL2 - [WIP] CLASSIFY_MTDNA_HAPLOGROUP May 1, 2025
// https://github.com/nf-core/modules/tree/master/subworkflows
// You can also ask for help via your pull request or on the #subworkflows channel on the nf-core Slack workspace:
// https://nf-co.re/join
// TODO nf-core: A subworkflow SHOULD import at least two modules
Author

Is this true here? We don't really need more than one in this case, but it seems logical to have this as a subworkflow.

(The Y-DNA stuff will definitely offer making a reference genome that can be used by Yleaf as part of a subworkflow.)

Collaborator

Importing only one module is fine 👍
I agree that a subworkflow (SWF) makes more sense.


if (params.run_mtdna_haplogroup) {
    if (!params.run_genotyping) {
        error "Cannot run mtDNA haplogroup classification (--run_mtdna_haplogroup) without running genotyping (--run_genotyping). VCF files are required as input."
Author

There used to be a vcf2genome step in eager2 that output a consensus FASTA. Is that still available? I ask because haplogrep3 also works with mtDNA sequences aligned to rCRS or RSRS.

Collaborator

No. vcf2genome is discontinued and will not be supported in eager 3.*.
Consensus sequence calling will be added soon™ (#1142).

@nf-core nf-core deleted a comment from github-actions bot May 1, 2025
@trianglegrrl trianglegrrl changed the title DSL2 - [WIP] CLASSIFY_MTDNA_HAPLOGROUP DSL2 - CLASSIFY_MTDNA_HAPLOGROUP May 2, 2025
@trianglegrrl
Author

Whoops, rebase disaster... See #1148 for the proper commits.

5 participants