doc update dorado workflow
fraser-combe committed Oct 4, 2024
1 parent 5f87c15 commit 9d22c74
Showing 5 changed files with 38 additions and 0 deletions.
32 changes: 32 additions & 0 deletions docs/workflows/standalone/dorado_basecalling.md
@@ -0,0 +1,32 @@
# Dorado Basecalling Workflow - Version 1.0

## Quick Facts

| **Workflow Type** | **Applicable Kingdom** | **Last Known Changes** | **Command-line Compatibility** | **Workflow Level** |
|---|---|---|---|---|
| [Standalone](../../workflows_overview/workflows_type.md/#standalone) | [Any Taxa](../../workflows_overview/workflows_kingdom.md/#any-taxa) | Dorado v1.0 | Yes | Sample-level |

## Dorado Basecalling Overview

The Dorado Basecalling workflow converts Oxford Nanopore `POD5` sequencing files into `FASTQ` format by running Dorado in a GPU-accelerated environment. It is intended for high-throughput applications where both basecalling speed and accuracy matter.
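
As a rough illustration of the conversion step, the sketch below assembles a Dorado command line that basecalls one `POD5` file and emits FASTQ on standard output. It assumes Dorado's `basecaller` subcommand and its `--emit-fastq` flag; the helper name `build_basecall_command`, the `hac` model shorthand, and the file name are hypothetical choices for this example, and the workflow's actual task may invoke Dorado differently.

```python
from typing import List


def build_basecall_command(model: str, pod5_path: str) -> List[str]:
    """Return an argv list for basecalling one POD5 file to FASTQ.

    Illustrative only: the real task may pass additional options.
    """
    if not pod5_path.endswith(".pod5"):
        raise ValueError(f"expected a .pod5 file, got: {pod5_path}")
    # `dorado basecaller <model> <data> --emit-fastq` writes FASTQ to stdout.
    return ["dorado", "basecaller", model, pod5_path, "--emit-fastq"]


if __name__ == "__main__":
    # Hypothetical model and file name; print the command without running it.
    print(" ".join(build_basecall_command("hac", "sample01.pod5")))
```

The command is only built, not executed, so the sketch can be inspected on a machine without a GPU or a Dorado installation.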

### Inputs

| **Terra Task Name** | **Variable** | **Type** | **Description** | **Default Value** | **Terra Status** | **Workflow** |
|---|---|---|---|---|---|---|
| basecall | **input_files** | Array[File] | Array of `POD5` files to be basecalled | None | Required | Dorado |
| basecall | **sample_names** | Array[String] | Array of sample names corresponding to the input files | None | Required | Dorado |
| basecall | **dorado_model** | String | Dorado basecalling model to use, given as a versioned model name of the form `<model>@<version>` | None | Required | Dorado |
| basecall | **output_prefix** | String | Prefix to apply to output files | None | Required | Dorado |
| basecall | **cpu** | Int | Number of CPUs to allocate to the task | 8 | Optional | Dorado |
| basecall | **memory** | String | Amount of memory/RAM to allocate to the task | 32GB | Optional | Dorado |
| basecall | **docker** | String | The Docker container used for this task | us-docker.pkg.dev/general-theiagen/staphb/dorado:0.8.0 | Optional | Dorado |
| basecall | **gpuCount** | Int | Number of GPUs to use for basecalling | 1 | Optional | Dorado |
| basecall | **gpuType** | String | Type of GPU to use | nvidia-tesla-t4 | Optional | Dorado |
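
The required inputs above can be collected into an inputs JSON for a command-line run. The following is a minimal sketch, not the workflow's official interface: the workflow name `dorado_basecalling`, the `<workflow>.<task>.<variable>` key layout, the helper `build_inputs`, and the example paths are all assumptions made for illustration. It also assumes that `input_files` and `sample_names` must have the same length, since the table describes them as corresponding arrays. Optional inputs are left out so that the defaults listed in the table apply.

```python
import json
from typing import Dict, List


def build_inputs(
    input_files: List[str],
    sample_names: List[str],
    dorado_model: str,
    output_prefix: str,
    workflow_name: str = "dorado_basecalling",  # hypothetical workflow name
) -> Dict[str, object]:
    """Build an inputs mapping containing only the required inputs."""
    if not input_files:
        raise ValueError("input_files must contain at least one POD5 file")
    # Assumption: one sample name per input file, in the same order.
    if len(input_files) != len(sample_names):
        raise ValueError("input_files and sample_names differ in length")
    prefix = f"{workflow_name}.basecall"  # assumed key layout
    return {
        f"{prefix}.input_files": input_files,
        f"{prefix}.sample_names": sample_names,
        f"{prefix}.dorado_model": dorado_model,
        f"{prefix}.output_prefix": output_prefix,
    }


if __name__ == "__main__":
    # Hypothetical bucket paths, sample names, and model shorthand.
    inputs = build_inputs(
        ["gs://my-bucket/run1/a.pod5", "gs://my-bucket/run1/b.pod5"],
        ["sample_a", "sample_b"],
        "hac",
        "run1",
    )
    print(json.dumps(inputs, indent=2))
```

Checking the array lengths before submission catches a mismatched sample list early, rather than after a GPU instance has already been provisioned.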

### Outputs

| **Variable** | **Type** | **Description** | **Workflow** |
|---|---|---|---|
| basecalled_fastqs | Array[File] | Array of FASTQ files generated from basecalling | Dorado |
| logs | Array[File] | Array of log files capturing the basecalling process | Dorado |
1 change: 1 addition & 0 deletions docs/workflows_overview/workflows_alphabetically.md
@@ -16,6 +16,7 @@ title: Alphabetical Workflows
| [**Core_Gene_SNP**](../workflows/phylogenetic_construction/core_gene_snp.md) | Pangenome analysis | Bacteria | Set-level | Some optional features incompatible, Yes | v2.1.0 | [Core_Gene_SNP_PHB](https://dockstore.org/workflows/github.com/theiagen/public_health_bioinformatics/Core_Gene_SNP_PHB:main?tab=info) |
| [**Create_Terra_Table**](../workflows/data_import/create_terra_table.md)| Upload data to Terra and then run this workflow to have the table automatically created | Any taxa | | Yes | v2.2.0 | [Create_Terra_Table_PHB](https://dockstore.org/workflows/github.com/theiagen/public_health_bioinformatics/Create_Terra_Table_PHB:main?tab=info) |
| [**CZGenEpi_Prep**](../workflows/phylogenetic_construction/czgenepi_prep.md)| Prepare metadata and fasta files for easy upload to the CZ GEN EPI platform. | Monkeypox virus, SARS-CoV-2, Viral | Set-level | No | v1.3.0 | [CZGenEpi_Prep_PHB](https://dockstore.org/workflows/github.com/theiagen/public_health_bioinformatics/CZGenEpi_Prep_PHB:main?tab=info) |
| [**Dorado_Basecalling**](../workflows/standalone/dorado_basecalling.md)| GPU-accelerated basecalling of Oxford Nanopore sequencing data | Any taxa | Sample-level | Yes | v1.0.0 | [Dorado_Basecalling_PHB](https://dockstore.org/workflows/github.com/theiagen/public_health_bioinformatics/Dorado_Basecalling_PHB:main?tab=info) |
| [**Find_Shared_Variants**](../workflows/phylogenetic_construction/find_shared_variants.md)| Combines and reshapes variant data from Snippy_Variants to illustrate variants shared across multiple samples | Bacteria, Mycotics | Set-level | Yes | v2.0.0 | [Find_Shared_Variants_PHB](https://dockstore.org/workflows/github.com/theiagen/public_health_bioinformatics/Find_Shared_Variants_PHB:main?tab=info) |
| [**Freyja Workflow Series**](../workflows/genomic_characterization/freyja.md)| Recovers relative lineage abundances from mixed sample data and generates visualizations | SARS-CoV-2, Viral | Sample-level, Set-level | Yes | v2.2.0 | [Freyja_FASTQ_PHB](https://dockstore.org/workflows/github.com/theiagen/public_health_bioinformatics/Freyja_FASTQ_PHB:main?tab=info), [Freyja_Plot_PHB](https://dockstore.org/workflows/github.com/theiagen/public_health_bioinformatics/Freyja_Plot_PHB:main?tab=info), [Freyja_Dashboard_PHB](https://dockstore.org/workflows/github.com/theiagen/public_health_bioinformatics/Freyja_Dashboard_PHB:main?tab=info), [Freyja_Update_PHB](https://dockstore.org/workflows/github.com/theiagen/public_health_bioinformatics/Freyja_Update_PHB:main?tab=info) |
| [**GAMBIT_Query**](../workflows/standalone/gambit_query.md)| Taxon identification of genome assembly using GAMBIT | Bacteria, Mycotics | Sample-level | Yes | v2.0.0 | [Gambit_Query_PHB](https://dockstore.org/workflows/github.com/theiagen/public_health_bioinformatics/Gambit_Query_PHB:main?tab=info) |
1 change: 1 addition & 0 deletions docs/workflows_overview/workflows_kingdom.md
@@ -15,6 +15,7 @@ title: Workflows by Kingdom
| [**BaseSpace_Fetch**](../workflows/data_import/basespace_fetch.md)| Import data from BaseSpace into Terra | Any taxa | Sample-level | Yes | v2.0.0 | [BaseSpace_Fetch_PHB](https://dockstore.org/workflows/github.com/theiagen/public_health_bioinformatics/BaseSpace_Fetch_PHB:main?tab=info) |
| [**Concatenate_Column_Content**](../workflows/data_export/concatenate_column_content.md) | Concatenate contents of a specified Terra data table column for many samples ("entities") | Any taxa | Set-level | Yes | v2.1.0 | [Concatenate_Column_Content_PHB](https://dockstore.org/workflows/github.com/theiagen/public_health_bioinformatics/Concatenate_Column_Content_PHB:main?tab=info) |
| [**Create_Terra_Table**](../workflows/data_import/create_terra_table.md)| Upload data to Terra and then run this workflow to have the table automatically created | Any taxa | | Yes | v2.2.0 | [Create_Terra_Table_PHB](https://dockstore.org/workflows/github.com/theiagen/public_health_bioinformatics/Create_Terra_Table_PHB:main?tab=info) |
| [**Dorado_Basecalling**](../workflows/standalone/dorado_basecalling.md)| GPU-accelerated basecalling of Oxford Nanopore sequencing data | Any taxa | Sample-level | Yes | v1.0.0 | [Dorado_Basecalling_PHB](https://dockstore.org/workflows/github.com/theiagen/public_health_bioinformatics/Dorado_Basecalling_PHB:main?tab=info) |
| [**Kraken2**](../workflows/standalone/kraken2.md) | Taxa identification from reads | Any taxa | Sample-level | Yes | v2.0.0 | [Kraken2_PE_PHB](https://dockstore.org/workflows/github.com/theiagen/public_health_bioinformatics/Kraken2_PE_PHB:main?tab=info), [Kraken2_SE_PHB](https://dockstore.org/workflows/github.com/theiagen/public_health_bioinformatics/Kraken2_SE_PHB:main?tab=info) |
| [**NCBI_Scrub**](../workflows/standalone/ncbi_scrub.md)| Runs NCBI's HRRT on Illumina FASTQs | Any taxa | Sample-level | Yes | v2.2.1 | [NCBI_Scrub_PE_PHB](https://dockstore.org/workflows/github.com/theiagen/public_health_bioinformatics/NCBI_Scrub_PE_PHB:main?tab=info), [NCBI_Scrub_SE_PHB](https://dockstore.org/workflows/github.com/theiagen/public_health_bioinformatics/NCBI_Scrub_SE_PHB:main?tab=info) |
| [**RASUSA**](../workflows/standalone/rasusa.md)| Randomly subsample sequencing reads to a specified coverage | Any taxa | Sample-level | Yes | v2.0.0 | [RASUSA_PHB](https://dockstore.org/workflows/github.com/theiagen/public_health_bioinformatics/RASUSA_PHB:main?tab=info) |
1 change: 1 addition & 0 deletions docs/workflows_overview/workflows_type.md
@@ -71,6 +71,7 @@ title: Workflows by Type
| **Name** | **Description** | **Applicable Kingdom** | **Workflow Level** | **Command-line Compatibility**[^1] | **Last Known Changes** | **Dockstore** |
|---|---|---|---|---|---|---|
| [**Cauris_CladeTyper**](../workflows/standalone/cauris_cladetyper.md)| C. auris clade assignment | Mycotics | Sample-level | Yes | v1.0.0 | [Cauris_CladeTyper_PHB](https://dockstore.org/workflows/github.com/theiagen/public_health_bioinformatics/Cauris_CladeTyper_PHB:main?tab=info) |
| [**Dorado_Basecalling**](../workflows/standalone/dorado_basecalling.md)| GPU-accelerated basecalling of Oxford Nanopore sequencing data | Any taxa | Sample-level | Yes | v1.0.0 | [Dorado_Basecalling_PHB](https://dockstore.org/workflows/github.com/theiagen/public_health_bioinformatics/Dorado_Basecalling_PHB:main?tab=info) |
| [**GAMBIT_Query**](../workflows/standalone/gambit_query.md)| Taxon identification of genome assembly using GAMBIT | Bacteria, Mycotics | Sample-level | Yes | v2.0.0 | [Gambit_Query_PHB](https://dockstore.org/workflows/github.com/theiagen/public_health_bioinformatics/Gambit_Query_PHB:main?tab=info) |
| [**Kraken2**](../workflows/standalone/kraken2.md) | Taxa identification from reads | Any taxa | Sample-level | Yes | v2.0.0 | [Kraken2_PE_PHB](https://dockstore.org/workflows/github.com/theiagen/public_health_bioinformatics/Kraken2_PE_PHB:main?tab=info), [Kraken2_SE_PHB](https://dockstore.org/workflows/github.com/theiagen/public_health_bioinformatics/Kraken2_SE_PHB:main?tab=info) |
| [**NCBI-AMRFinderPlus**](../workflows/standalone/ncbi_amrfinderplus.md)| Runs NCBI's AMRFinderPlus on genome assemblies (bacterial and fungal) | Bacteria, Mycotics | Sample-level | Yes | v2.0.0 | [NCBI-AMRFinderPlus_PHB](https://dockstore.org/workflows/github.com/theiagen/public_health_bioinformatics/NCBI-AMRFinderPlus_PHB:main?tab=info) |
3 changes: 3 additions & 0 deletions mkdocs.yml
@@ -52,6 +52,7 @@ nav:
- Zip_Column_Content: workflows/data_export/zip_column_content.md
- Standalone:
- Cauris_CladeTyper: workflows/standalone/cauris_cladetyper.md
- Dorado_Basecalling: workflows/standalone/dorado_basecalling.md
- GAMBIT_Query: workflows/standalone/gambit_query.md
- Kraken2: workflows/standalone/kraken2.md
- NCBI-AMRFinderPlus: workflows/standalone/ncbi_amrfinderplus.md
@@ -67,6 +68,7 @@ nav:
- BaseSpace_Fetch: workflows/data_import/basespace_fetch.md
- Concatenate_Column_Content: workflows/data_export/concatenate_column_content.md
- Create_Terra_Table: workflows/data_import/create_terra_table.md
- Dorado_Basecalling: workflows/standalone/dorado_basecalling.md
- Kraken2: workflows/standalone/kraken2.md
- NCBI-Scrub: workflows/standalone/ncbi_scrub.md
- RASUSA: workflows/standalone/rasusa.md
@@ -126,6 +128,7 @@ nav:
- Core_Gene_SNP: workflows/phylogenetic_construction/core_gene_snp.md
- Create_Terra_Table: workflows/data_import/create_terra_table.md
- CZGenEpi_Prep: workflows/phylogenetic_construction/czgenepi_prep.md
- Dorado_Basecalling: workflows/standalone/dorado_basecalling.md
- Find_Shared_Variants: workflows/phylogenetic_construction/find_shared_variants.md
- Freyja Workflow Series: workflows/genomic_characterization/freyja.md
- GAMBIT_Query: workflows/standalone/gambit_query.md