As a geneticist for rare diseases, I would like to analyze the deletions in a patient's genome so that the events characteristic of the disease can be detected with the help of databases. This helps to narrow down the diagnosis and to initiate therapies tailored to the genotype.
This includes the following aspects:
Input: We want to allow short and long reads and deal with them differently.
This is an overview of all epics.
=> As a python binding writer I want dynamic search configurations so I can handle the different configurations with runtime dispatching. product_backlog#31
=> Balanced Binning Directory product_backlog#32
=> Minimiser View works with reverse k-mers product_backlog#84
Output: We want to output the deletions in VCF format using a VCF parser from SeqAn3 (needs to be implemented).
=> [Search] Use a buffer to cache the intermediate results between search invocations such that memory allocations are reduced on multiple invocations of search. product_backlog#29 ✅
We want to modularise the different parts of iGenVar so that the user can decide which methods to use and so that we can compare different combinations of methods more easily.
=> As application developer I want built application binaries put into a bin directory so I can easily find the binaries after building. product_backlog#44 ✅
Testing: We want to test all functionalities and also prove this with code coverage.
=> Improve the documentation for the search module product_backlog#30 ✅
=> [Search] Add support for generalised CSAs in the sdsl, such that we can index text collections without artificially adding a sentinel character to our indices. product_backlog#24
Input
Differentiating between the inputs will be processed in the course of Issue seqan/product_backlog#17. ✅
Create a Structure for BAM Indexing seqan/product_backlog#88
Algorithms
Call SNPs & Indels:
-> generating candidate haplotypes
-> local realignment using the pair HMM against the candidate haplotypes -> matrix of likelihoods for each read
-> local assembly: assemble these window-aligned reads into an assembly graph of local variation
-> infer variants from assembled haplotypes: "Despite its name, HaplotypeCaller does not actually call haplotypes. Rather, it generates haplotypes as an intermediate step to discover variants at individual loci. Here we describe how the GATK engine determines which alt alleles exist in locally assembled haplotypes." (-> variant quality score model)
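To make the "matrix of likelihoods for each read" step more concrete, here is a minimal sketch of a pair HMM forward algorithm. It is a simplified illustration, not GATK's or iGenVar's implementation: the names `pair_hmm_params`, `read_likelihood` and `likelihood_matrix` are hypothetical, the error and gap probabilities are fixed made-up values instead of per-base qualities, and plain `double` is used without log-space arithmetic, so it is only suitable for short sequences.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model parameters, chosen for illustration only.
struct pair_hmm_params
{
    double base_error{0.01}; // probability that a read base is miscalled
    double gap_open{0.001};  // match -> insertion or match -> deletion
    double gap_extend{0.1};  // insertion -> insertion or deletion -> deletion
};

// Forward algorithm: P(read | haplotype), summed over all alignments of the read.
double read_likelihood(std::string const & read,
                       std::string const & hap,
                       pair_hmm_params const & p = pair_hmm_params{})
{
    std::size_t const n = read.size();
    std::size_t const m = hap.size();
    if (n == 0 || m == 0)
        return 0.0;

    using matrix = std::vector<std::vector<double>>;
    matrix M(n + 1, std::vector<double>(m + 1, 0.0)); // read base aligned to haplotype base
    matrix I = M;                                     // read base inserted relative to the haplotype
    matrix D = M;                                     // haplotype base skipped by the read

    // The read may start at any haplotype position, each with weight 1/m.
    for (std::size_t j = 0; j <= m; ++j)
        D[0][j] = 1.0 / static_cast<double>(m);

    double const match_to_match = 1.0 - 2.0 * p.gap_open;
    double const gap_to_match = 1.0 - p.gap_extend;

    for (std::size_t i = 1; i <= n; ++i)
    {
        for (std::size_t j = 1; j <= m; ++j)
        {
            double const emit = (read[i - 1] == hap[j - 1]) ? 1.0 - p.base_error
                                                            : p.base_error / 3.0;
            M[i][j] = emit * (match_to_match * M[i - 1][j - 1]
                              + gap_to_match * (I[i - 1][j - 1] + D[i - 1][j - 1]));
            I[i][j] = p.gap_open * M[i - 1][j] + p.gap_extend * I[i - 1][j];
            D[i][j] = p.gap_open * M[i][j - 1] + p.gap_extend * D[i][j - 1];
        }
    }

    // The read may end at any haplotype position.
    double total = 0.0;
    for (std::size_t j = 1; j <= m; ++j)
        total += M[n][j] + I[n][j];
    return total;
}

// One row per read, one column per candidate haplotype.
std::vector<std::vector<double>> likelihood_matrix(std::vector<std::string> const & reads,
                                                   std::vector<std::string> const & haplotypes)
{
    std::vector<std::vector<double>> result;
    for (std::string const & read : reads)
    {
        std::vector<double> row;
        for (std::string const & hap : haplotypes)
            row.push_back(read_likelihood(read, hap));
        result.push_back(row);
    }
    return result;
}
```

A read that matches one candidate haplotype exactly should receive a clearly higher likelihood for that haplotype than for one that differs by a base, which is the signal the later variant inference step relies on.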
Call SVs:
Call Deletions from long reads seqan/product_backlog#32 ✅
Call Insertions from long reads seqan/product_backlog#93 ✅
Add all Methods of Vaquita seqan/product_backlog#84
Call SVs in short reads seqan/product_backlog#17
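One common way to find deletions and insertions in long reads is to scan the CIGAR string of each alignment for long `D` and `I` operations. The sketch below assumes that approach; it is not taken from the iGenVar code base, and `sv_call`, `find_indels` and the `min_length` threshold are hypothetical names introduced for illustration. Split-read evidence is not covered.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record for one candidate call.
struct sv_call
{
    char type;           // 'D' for deletion, 'I' for insertion
    std::size_t ref_pos; // 0-based reference position where the event starts
    std::size_t length;  // number of deleted or inserted bases
};

// Walk a CIGAR string (e.g. "100M60D50M") and report every D or I operation
// that is at least min_length bases long.
std::vector<sv_call> find_indels(std::size_t ref_start,
                                 std::string const & cigar,
                                 std::size_t min_length)
{
    std::vector<sv_call> calls;
    std::size_t ref_pos = ref_start;
    std::size_t count = 0;

    for (char const c : cigar)
    {
        if (std::isdigit(static_cast<unsigned char>(c)))
        {
            count = count * 10 + static_cast<std::size_t>(c - '0');
            continue;
        }

        if ((c == 'D' || c == 'I') && count >= min_length)
            calls.push_back({c, ref_pos, count});

        // M, D, N, = and X consume reference bases; I, S, H and P do not.
        if (c == 'M' || c == 'D' || c == 'N' || c == '=' || c == 'X')
            ref_pos += count;

        count = 0;
    }
    return calls;
}
```

For an alignment starting at reference position 1000 with CIGAR `100M60D50M5I20M40I10M` and `min_length` 30, this reports a deletion of 60 bases at position 1100 and an insertion of 40 bases at position 1230; the 5-base insertion falls below the threshold.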
Cluster SVs: seqan/product_backlog#26
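Clustering merges calls from different reads that probably describe the same event. As a rough sketch, the code below groups deletion calls greedily by start position. This is a deliberately simple stand-in and not necessarily the method planned in the linked issue; `deletion`, `cluster`, `cluster_deletions` and `max_distance` are hypothetical names.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct deletion
{
    std::size_t pos;    // reference start position
    std::size_t length; // number of deleted bases
};

struct cluster
{
    std::size_t pos;     // mean start position of the members (rounded down)
    std::size_t length;  // mean length of the members (rounded down)
    std::size_t support; // number of calls merged into this cluster
};

// Sort calls by position and start a new cluster whenever the gap to the
// previous call exceeds max_distance.
std::vector<cluster> cluster_deletions(std::vector<deletion> calls, std::size_t max_distance)
{
    std::sort(calls.begin(), calls.end(),
              [](deletion const & a, deletion const & b) { return a.pos < b.pos; });

    std::vector<cluster> clusters;
    std::size_t i = 0;
    while (i < calls.size())
    {
        std::size_t j = i;
        std::size_t pos_sum = 0;
        std::size_t length_sum = 0;
        do
        {
            pos_sum += calls[j].pos;
            length_sum += calls[j].length;
            ++j;
        } while (j < calls.size() && calls[j].pos - calls[j - 1].pos <= max_distance);

        std::size_t const support = j - i;
        clusters.push_back({pos_sum / support, length_sum / support, support});
        i = j;
    }
    return clusters;
}
```

The `support` count gives a simple confidence measure: a deletion seen in many reads is more trustworthy than one seen in a single read. A real implementation would likely also compare lengths and SV types before merging.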
Refinement
TODO... (sViper, ...)
Output
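As a rough idea of what VCF output for a deletion could look like when it is decoupled from the calling logic, the sketch below formats records independently of where they are written. It does not use the SeqAn3 VCF parser (which this epic says still has to be developed); `deletion`, `vcf_record` and `write_vcf` are hypothetical names, and the reference base is written as `N` because no reference sequence is available in this example.

```cpp
#include <cstddef>
#include <ostream>
#include <string>
#include <vector>

struct deletion
{
    std::string chrom;  // reference sequence name
    std::size_t pos;    // 1-based position of the base before the deleted sequence
    std::size_t length; // number of deleted bases
};

// Format one deletion as a tab-separated VCF data line with a symbolic <DEL> allele.
std::string vcf_record(deletion const & del)
{
    return del.chrom + '\t' + std::to_string(del.pos) + "\t.\tN\t<DEL>\t.\tPASS\t"
         + "SVTYPE=DEL;END=" + std::to_string(del.pos + del.length)
         + ";SVLEN=-" + std::to_string(del.length);
}

// Write a minimal header followed by all records to any output stream,
// e.g. std::cout or a std::ofstream chosen by an output option.
void write_vcf(std::ostream & out, std::vector<deletion> const & deletions)
{
    out << "##fileformat=VCFv4.3\n"
        << "##ALT=<ID=DEL,Description=\"Deletion\">\n"
        << "##INFO=<ID=SVTYPE,Number=1,Type=String,Description=\"Type of structural variant\">\n"
        << "##INFO=<ID=END,Number=1,Type=Integer,Description=\"End position of the variant\">\n"
        << "##INFO=<ID=SVLEN,Number=.,Type=Integer,"
           "Description=\"Difference in length between REF and ALT alleles\">\n"
        << "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n";

    for (deletion const & del : deletions)
        out << vcf_record(del) << '\n';
}
```

Taking a `std::ostream &` keeps the writer independent of the destination, so the same function can serve both standard output and an output file.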
We need to decouple the output from the functionality #6 so that we can write it to an output file #8 with an output option seqan/product_backlog#21. Then a VCF parser has to be developed in SeqAn3 #9 #10, which we want to use for iGenVar #11. ✅
Testing ✅
We want to check the code with CLI seqan/product_backlog#4 and API tests seqan/product_backlog#12 seqan/product_backlog#13 and cover it completely.
-> We now have a code coverage of > 85%! seqan/product_backlog#116 ✅
In order to implement the code coverage, we are waiting for an update in the app template: seqan/app-template#30.
Update: CLI tests are implemented. 🎉
Refinements, bugs, and requests