Viridian 0.9 checklist #50

Open
iqbal-lab opened this issue Apr 22, 2022 · 0 comments
iqbal-lab commented Apr 22, 2022

Unchanged

  • Command-line invocation and main output files (consensus, VCF, JSON)

New features that people are likely to care about:

  1. Coverage on SNP (not indel) alleles reported in VCF
  2. A new summary block in the output JSON, intended to hold everything needed to inspect results and QC, kept separate from the internal/debug output in the same file. Users should generally only need to look here.
  3. The output JSON records how far through the workflow the run has progressed.

Subtle changes in quality of assembly sequence

  1. Slightly fewer Ns called in primers due to improved primer-aware QC
  2. Consensus will include the very first and last primers in the genome
  3. Lengths of runs of Ns are now fixed to match the length of the associated sequence in the reference (i.e. a run of Ns is not intended to imply an indel)
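
To illustrate the N-run rule in item 3, here is a minimal sketch, not Viridian's actual code. It assumes the consensus is stitched from segments that tile the reference, each with 0-based half-open reference coordinates; `build_consensus` and the segment tuple layout are hypothetical names chosen for this example.

```python
def build_consensus(segments):
    """Join consensus segments, masking failed ones with Ns.

    segments: list of (ref_start, ref_end, seq_or_None), tiling the reference
    with 0-based half-open coordinates (hypothetical layout for this sketch).
    A segment whose sequence is None could not be called.
    """
    pieces = []
    for ref_start, ref_end, seq in segments:
        if seq is None:
            # Emit exactly as many Ns as the reference span, so the run
            # of Ns carries no information about insertions or deletions.
            pieces.append("N" * (ref_end - ref_start))
        else:
            pieces.append(seq)
    return "".join(pieces)


example_segments = [(0, 5, "ACGTA"), (5, 12, None), (12, 15, "GGT")]
example_consensus = build_consensus(example_segments)
```

In this example the uncalled segment spans 7 reference bases, so the consensus contains a run of exactly 7 Ns.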

Improved robustness

  • Additional robustness against pathological cases with ARTIC4.1 (which we have not actually ever seen in real life)

Performance/speed

  • High-depth samples are subsampled to 1000x (per amplicon), which caps runtime and RAM, including for the very large samples in the ENA.
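
A rough sketch of what per-amplicon subsampling can look like, under simplifying assumptions: each read is already assigned to one amplicon, and depth is approximated by the number of reads per amplicon. The function name, input layout, and fixed seed are hypothetical and not taken from the Viridian code.

```python
import random
from collections import defaultdict


def subsample_per_amplicon(reads, max_depth=1000, seed=42):
    """Keep at most max_depth reads for each amplicon.

    reads: iterable of (read_id, amplicon_name) pairs (hypothetical layout).
    Returns a dict mapping amplicon name to the list of kept read ids.
    """
    rng = random.Random(seed)  # fixed seed so reruns pick the same reads
    by_amplicon = defaultdict(list)
    for read_id, amplicon in reads:
        by_amplicon[amplicon].append(read_id)

    kept = {}
    for amplicon, read_ids in by_amplicon.items():
        if len(read_ids) > max_depth:
            # Random sample without replacement down to the cap.
            read_ids = rng.sample(read_ids, max_depth)
        kept[amplicon] = read_ids
    return kept


example_reads = [(f"r{i}", "amp1") for i in range(5000)]
example_reads += [(f"s{i}", "amp2") for i in range(10)]
example_kept = subsample_per_amplicon(example_reads)
```

Amplicons already below the cap are left untouched, so only the very deep amplicons are reduced.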

New debug output

  • A new TSV file is produced, showing the coverage of each base at every position
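
For a sense of what such a per-position table contains, here is a small sketch. The column layout (`pos`, `ref`, `A`, `C`, `G`, `T`) and the function name are assumptions made for this example, not the real file format, and reads are treated as ungapped alignments given by a 0-based start.

```python
import csv
import io
from collections import Counter


def base_counts_tsv(ref_seq, aligned_reads):
    """Return a TSV string of per-position base counts.

    aligned_reads: list of (start, seq) with a 0-based start and no gaps
    (a simplification; real alignments contain indels and clipping).
    """
    counts = [Counter() for _ in ref_seq]
    for start, seq in aligned_reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if 0 <= pos < len(ref_seq):
                counts[pos][base] += 1

    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(["pos", "ref", "A", "C", "G", "T"])  # hypothetical columns
    for i, ref_base in enumerate(ref_seq):
        # Positions are written 1-based, as in VCF.
        writer.writerow([i + 1, ref_base] + [counts[i][b] for b in "ACGT"])
    return buf.getvalue()


example_tsv = base_counts_tsv("ACGT", [(0, "ACG"), (1, "CGT"), (1, "TGT")])
```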

Internal changes

  • Large amount of code refactoring

Test process

  1. Unit tests improved
  2. Tests on simulated genomes with specific error-modes in primers
  3. Tests on Nanopore/Illumina ARTIC/Midnight truth samples using the covid-truth-eval framework
  4. Robustness check running on tens of thousands of genomes from ENA.