fix: replace ragoo with ragtag (#591)
* replaced ragoo with ragtag and changed env

* changes in benchmarking and report

* fix

* Changed RagTag options

* deactivate non SARS-CoV-2 benchmarking

---------

Co-authored-by: Alexander Thomas <[email protected]>
vBassewitz and alethomas authored Oct 31, 2024
1 parent 02fb4da commit e64e9db
Showing 8 changed files with 16 additions and 17 deletions.
1 change: 0 additions & 1 deletion .github/workflows/main.yml
@@ -203,7 +203,6 @@ jobs:
benchmark_strain_calling,
benchmark_assembly,
benchmark_mixtures,
- benchmark_non_sars_cov_2,
benchmark_reads,
compare_assemblers,
]
2 changes: 1 addition & 1 deletion workflow/envs/ragoo.yaml → workflow/envs/ragtag.yaml
@@ -4,4 +4,4 @@ channels:
- imperial-college-research-computing
- nodefaults
dependencies:
- - ragoo =1.1
+ - ragtag =2.1
2 changes: 1 addition & 1 deletion workflow/report/assembly_illumina.rst
@@ -1,3 +1,3 @@
Assembly of sample {{ snakemake.wildcards.sample }}.
- Reads were assembled with {{ "`metaSPAdes <https://github.com/ablab/spades>`_" if snakemake.params.is_amp else "`Megahit <https://github.com/voutcn/megahit>`_" }}, followed by reference based contig ordering and concatenation with `RaGOO <https://github.com/malonge/RaGOO>`_.
+ Reads were assembled with {{ "`metaSPAdes <https://github.com/ablab/spades>`_" if snakemake.params.is_amp else "`Megahit <https://github.com/voutcn/megahit>`_" }}, followed by reference based contig ordering and concatenation with `RagTag <https://github.com/malonge/RagTag>`_.
Then, assembly was polished by applying variants with an allele frequency of 100% (called by `Varlociraptor <https://varlociraptor.github.io>`_ at FDR 5%).
2 changes: 1 addition & 1 deletion workflow/report/assembly_ont.rst
@@ -1,2 +1,2 @@
Assembly of sample {{ snakemake.wildcards.sample }}.
- Reads were assembled with `SPAdes <https://github.com/ablab/spades>`_, followed by reference based contig ordering and concatenation with `RaGOO <https://github.com/malonge/RaGOO>`_.
+ Reads were assembled with `SPAdes <https://github.com/ablab/spades>`_, followed by reference based contig ordering and concatenation with `RagTag <https://github.com/malonge/RagTag>`_.
2 changes: 1 addition & 1 deletion workflow/report/qc-report.rst
@@ -1,3 +1,3 @@
QC overview for samples from {{ snakemake.wildcards.date }}.
- Readcounts for raw, trimmed and filtered reads, length of initially assembled (`Megahit <https://github.com/voutcn/megahit>`_/`metaSPAdes <https://github.com/ablab/spades>`_) and reference ordered contigs (`RaGOO <https://github.com/malonge/RaGOO>`_).
+ Readcounts for raw, trimmed and filtered reads, length of initially assembled (`Megahit <https://github.com/voutcn/megahit>`_/`metaSPAdes <https://github.com/ablab/spades>`_) and reference ordered contigs (`RagTag <https://github.com/malonge/RagTag>`_).
Percentil overview for contamination in the raw reads (`kraken <https://github.com/DerrickWood/kraken>`_) and called COVID-19-strain including defining SNPs (`Pangolin <https://github.com/cov-lineages/pangolin>`_) and variants of interest from variant calling (`Varlociraptor <https://varlociraptor.github.io>`_).
12 changes: 6 additions & 6 deletions workflow/rules/assembly.smk
@@ -128,17 +128,17 @@ rule order_contigs:
output:
temp("results/{date}/contigs/ordered-unfiltered/{sample}.fasta"),
log:
- "logs/{date}/ragoo/{sample}.log",
+ "logs/{date}/ragtag/{sample}.log",
params:
outdir=get_output_dir,
conda:
- "../envs/ragoo.yaml"
+ "../envs/ragtag.yaml"
shadow:
"minimal"
shell:
"(mkdir -p {params.outdir}/{wildcards.sample} && cd {params.outdir}/{wildcards.sample} &&"
- " ragoo.py ../../../../../{input.contigs} ../../../../../{input.reference} &&"
- " cd ../../../../../ && mv {params.outdir}/{wildcards.sample}/ragoo_output/ragoo.fasta {output})"
+ " ragtag.py scaffold -C -w ../../../../../{input.reference} ../../../../../{input.contigs} &&"
+ " cd ../../../../../ && mv {params.outdir}/{wildcards.sample}/ragtag_output/ragtag.scaffold.fasta {output})"
" > {log} 2>&1"

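The rule above changes into a nested output directory and reaches back to the inputs through five hard-coded `../` segments. A minimal Python sketch of the same idea, assuming the paths are relative to the workflow root, derives those relative paths instead of spelling them out. The function name `build_scaffold_command` and the example paths are hypothetical and not part of this commit; only the `ragtag.py scaffold -C -w <reference> <contigs>` argument order is taken from the diff.

```python
import os
import shlex


def build_scaffold_command(reference, contigs, workdir):
    """Build the RagTag argv as it would be run from inside `workdir`.

    Hypothetical helper: all three paths are assumed to be relative to the
    same root, and the flags mirror the shell command in the diff.
    """
    # Express both inputs relative to the directory the command runs in.
    rel_reference = os.path.relpath(reference, start=workdir)
    rel_contigs = os.path.relpath(contigs, start=workdir)
    # Reference first, contigs second, as in the updated rule.
    return ["ragtag.py", "scaffold", "-C", "-w", rel_reference, rel_contigs]


# Example paths are made up for illustration only.
argv = build_scaffold_command(
    reference="resources/genomes/main.fasta",
    contigs="results/2024-10-31/contigs/sample1.fasta",
    workdir="results/2024-10-31/contigs/ordered-unfiltered/sample1",
)
print(shlex.join(argv))
```

The sketch only assembles the command; it does not invoke RagTag, so it can be checked without the conda environment.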

@@ -148,11 +148,11 @@ rule filter_chr0:
output:
temp("results/{date}/contigs/ordered/{sample}.fasta"),
log:
- "logs/{date}/ragoo/{sample}_cleaned.log",
+ "logs/{date}/ragtag/{sample}_cleaned.log",
conda:
"../envs/python.yaml"
script:
- "../scripts/ragoo-remove-chr0.py"
+ "../scripts/ragtag-remove-chr0.py"


# polish illumina de novo assembly
10 changes: 5 additions & 5 deletions workflow/rules/benchmarking.smk
@@ -277,17 +277,17 @@ rule order_contigs_assembly_comparison:
"results/{date}/assembly/{sample}/{assembler}/{sample}.ordered.contigs.fasta"
),
log:
- "logs/{date}/ragoo/{assembler}/{sample}.log",
+ "logs/{date}/ragtag/{assembler}/{sample}.log",
params:
outdir=get_output_dir,
conda:
- "../envs/ragoo.yaml"
+ "../envs/ragtag.yaml"
shadow:
"minimal"
shell:
"(cd {params.outdir} &&"
- " ragoo.py ../../../../../{input.contigs} ../../../../../{input.reference} &&"
- " cd ../../../../../ && mv {params.outdir}/ragoo_output/ragoo.fasta {output})"
+ " ragtag.py scaffold -C -w ../../../../../{input.reference} ../../../../../{input.contigs} &&"
+ " cd ../../../../../ && mv {params.outdir}/ragtag_output/ragtag.scaffold.fasta {output})"
" > {log} 2>&1"


@@ -297,7 +297,7 @@ use rule filter_chr0 as filter_chr0_assembly_comparison with:
output:
"results/{date}/assembly/{sample}/{assembler}/{sample}.contigs.ordered.filtered.fasta",
log:
- "logs/{date}/ragoo/{assembler}/{sample}_cleaned.log",
+ "logs/{date}/ragtag/{assembler}/{sample}_cleaned.log",


use rule align_contigs as align_contigs_assembly_comparison with:
@@ -14,7 +14,7 @@


def remove_chr0(data_path, out_path):
- """This function removes the Chr0 contig generated by raGOO.
+ """This function removes the Chr0 contig generated by ragTag.
It also renames the id in the FASTA file to the actual sample name.
In the case where no pseudomolecule is constructed other than the Chr0, it ensures,
that the FASTA fill contains a 'filler-contig' with a sequence of 'N'.
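The behaviour this docstring describes can be sketched roughly as follows. This is not the script from the repository: it works on in-memory `(id, sequence)` tuples rather than FASTA files, the name `filter_chr0_records` and the `filler_length` parameter are hypothetical, and it assumes the unplaced-contig record can be recognised by an id starting with `Chr0`.

```python
def filter_chr0_records(records, sample, filler_length=10):
    """Drop the Chr0 record and rename the remaining ids to the sample name.

    Hypothetical sketch: `records` is a list of (id, sequence) tuples, and
    the Chr0 record is assumed to have an id that starts with "Chr0".
    """
    # Keep every record except Chr0, replacing its id with the sample name.
    kept = [
        (sample, sequence)
        for record_id, sequence in records
        if not record_id.startswith("Chr0")
    ]
    # If only Chr0 was present, fall back to a filler contig made of 'N'.
    if not kept:
        kept = [(sample, "N" * filler_length)]
    return kept


# Record ids below are illustrative assumptions, not taken from the commit.
scaffolded = [("NC_045512.2_RagTag", "ACGT"), ("Chr0_RagTag", "TTT")]
print(filter_chr0_records(scaffolded, "sample1"))
print(filter_chr0_records([("Chr0_RagTag", "TTT")], "sample1", filler_length=4))
```

The filler contig keeps downstream steps from failing on an empty FASTA file when no pseudomolecule was constructed.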