How are the base quality scores generated? #50
Comments
The quality scoring is indeed a bit of an issue because the input trace qualities are not very useful. The assemble command simply scales a flat quality prior by the fraction of traces supporting the consensus nucleotide. With 2 input traces, only 1 or 2 traces can support the consensus nucleotide, which is why you see just two quality values. With more input traces, you should see a wider range of quality values.
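To illustrate why so few distinct scores appear, here is a minimal sketch of the scheme described above. It is not Tracy's actual code: the linear scaling, the rounding, the function names, and the prior value of 40 are all assumptions made for illustration. The point is only that the number of distinct consensus qualities is bounded by the number of input traces.

```python
from fractions import Fraction


def scaled_quality(flat_prior: int, supporting: int, total: int) -> int:
    """Scale a flat quality prior by the fraction of supporting traces.

    Linear scaling and rounding are assumptions; Tracy's real formula may differ.
    """
    if not 1 <= supporting <= total:
        raise ValueError("supporting must be between 1 and total")
    return round(flat_prior * Fraction(supporting, total))


def possible_qualities(flat_prior: int, total: int) -> list[int]:
    """All distinct qualities reachable when 1..total traces support the base."""
    return sorted({scaled_quality(flat_prior, k, total) for k in range(1, total + 1)})


FLAT_PRIOR = 40  # hypothetical value, not taken from Tracy

for n_traces in (2, 3, 4):
    print(n_traces, possible_qualities(FLAT_PRIOR, n_traces))
```

Under these assumptions, 2 traces can only ever yield two distinct values, and 4 traces at most four, regardless of how clean or noisy the underlying chromatogram peaks are.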
OK thanks - that's interesting. I'm using Tracy to detect errors in sequencing data, which can range from 1 trace (where I use

As per your explanation, it sounds like forming a consensus between 2 traces for a given nucleotide doesn't consider the quality of the base call, and instead just looks at the fraction of traces supporting the consensus. Below summarises my understanding for 4 different base quality configurations for the assembly of 2 trace files - is this accurate?

To my mind, the 3rd and 4th scenarios should have lower quality values than the 1st.

Part of the problem for me is that I want to have some estimate of the per-base quality score, so that I can confidently calculate the per-base error rate. In practice, this is hard using

Is there a workaround?
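For reference, the usual link between a quality score and a per-base error rate is the Phred relation P = 10^(-Q/10). The sketch below applies it to a FASTQ quality string. It assumes the common Phred+33 ASCII encoding, and whether Tracy's consensus scores are calibrated as true Phred values is exactly the open question in this thread, so treat the resulting probabilities as nominal. The function names are hypothetical.

```python
def decode_fastq_qualities(quality_line: str, offset: int = 33) -> list[int]:
    """Decode a FASTQ quality string into integer scores (Phred+33 assumed)."""
    return [ord(char) - offset for char in quality_line]


def phred_to_error_probability(quality: float) -> float:
    """Standard Phred relation: P(error) = 10 ** (-Q / 10)."""
    return 10 ** (-quality / 10)


def expected_errors(qualities: list[int]) -> float:
    """Sum of per-base error probabilities across a read."""
    return sum(phred_to_error_probability(q) for q in qualities)


# '4' and '9' encode Q19 and Q24 under Phred+33, the two values mentioned in the issue.
scores = decode_fastq_qualities("4949")
for q in sorted(set(scores)):
    print(q, phred_to_error_probability(q))
print("expected errors:", expected_errors(scores))
```

If the consensus scores only reflect the fraction of agreeing traces, these probabilities describe trace agreement rather than base-call confidence, which limits how far a per-base error rate derived this way can be trusted.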
This piqued my interest, would you mind expanding on it a bit? In my department, one of the concerns I come across as a proponent of tracy is the lack of informative quality scoring and the fact that Ns appear in our sequences at a very low rate compared to other basecalling algorithms. Combined, these attributes make my colleagues cautious.
Hi,
I am using tracy assemble to assemble between 2 and 4 trace files. I am outputting the consensus as a .fastq file, and then aligning this to a reference sequence.
Downstream, I am performing some analysis that filters on per-nucleotide quality scores, and I am not sure I understand how these are translated from the base signal in the chromatogram to the base quality of the consensus calculated within tracy assemble. Typically, I only see 2 different base quality scores on a consensus (e.g. 19 and 24).
Do you have any insight into this?
I'm calling tracy like so: