N bases #1813
Replies: 8 comments
-
Hi, what command did you run to analyze the data?
-
I used the analyze rnaseq-full-length preset with MiXCR version 4.3.2, I believe. Is it possible that updating to a newer version of MiXCR would solve the issue? Also, if updating is hard to do (we use a predefined workflow), is there a way to modify the parameters to handle this?
-
From looking at the alignment files, it seems that alignment gaps lead to these ambiguous N bases.
-
Not exactly. In the example above, there is no ambiguity, but rather a single nucleotide deletion in FR3, which will shift the reading frame, rendering the clone non-productive. The appearance of “N” occurs during the
-
Did you try using:
Regarding the first case: a deletion of an A nucleotide in FR3 will lead to a frameshift in the translation of CDR3, FR4, and the C gene, so this clone will not be functional.
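To make the frameshift point concrete, here is a minimal Python sketch. The sequence is a made-up toy fragment, not taken from the data in this thread, and the helper names `translate` and `delete_base` are hypothetical. It only shows how removing one nucleotide changes every downstream codon.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (NCBI table 1).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(seq):
    """Translate in frame 0; a trailing incomplete codon is dropped."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


def delete_base(seq, pos):
    """Return seq with the nucleotide at 0-based position pos removed."""
    return seq[:pos] + seq[pos + 1:]


# Toy in-frame fragment (an assumption for illustration only).
original = "GCTGTGTATTACTGTGCGAGA"
mutated = delete_base(original, 7)  # remove a single A

print(translate(original))  # AVYYCAR
print(translate(mutated))   # AVFTVR
```

Everything after the deleted base is read in a different frame, which is why a single-nucleotide deletion in FR3 also changes CDR3, FR4, and the C gene translation.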
-
I have tried using -MassembleContigs.parameters.discardAmbiguousNucleotideCalls=true, and it does discard the ambiguous nucleotides and replace them with the reference sequence. The possibility I am considering is that the variants captured in these reads are real mutations rather than sequencing errors, so I was wondering whether there is a way to output all possible variants at those N positions (rather than having them replaced with an ambiguous base).
-
I see. Generally speaking, there is an algorithm behind assembleContigs that splits a clone if there is enough data to support both variants, which is the output you’re looking for. This algorithm considers the shares of each variant, the Phred quality of the nucleotides, their location on the read, and the surrounding context (for example, if you have NN, there might be multiple possible resolutions), among other things. In some cases, there isn’t enough data to determine whether the clone should be split, and MiXCR will then place an N. Several parameters guide this process, but the main ones are:
These are the default values for MiXCR v4.7 with the
That said, based on our experience, the default parameters work best, as they have been empirically evaluated on hundreds of different datasets.
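As a rough illustration of the three outcomes described here (call one base, split the clone, or place an N), the toy Python sketch below decides on a single position from quality-weighted variant shares. This is not MiXCR's algorithm: the function `resolve_position`, the thresholds `min_share` and `min_weight`, and the use of summed Phred scores as a weight are all assumptions made for the example, and it ignores read position and surrounding context.

```python
from collections import defaultdict


def resolve_position(observations, min_share=0.2, min_weight=60.0):
    """Decide what to do at one position.

    observations: list of (base, phred) tuples from reads covering the position.
    Returns ("call", base), ("split", [base1, base2]) or ("ambiguous", "N").
    Thresholds are hypothetical, not MiXCR parameters.
    """
    # Weight each variant by the summed Phred quality of its supporting reads.
    weights = defaultdict(float)
    for base, phred in observations:
        weights[base] += phred
    total = sum(weights.values())
    if total == 0:
        return ("ambiguous", "N")

    ranked = sorted(weights.items(), key=lambda kv: kv[1], reverse=True)
    top_base, top_weight = ranked[0]

    # A minor variant with a small share is treated as noise.
    if len(ranked) == 1 or ranked[1][1] / total < min_share:
        return ("call", top_base)

    second_base, second_weight = ranked[1]
    # Both variants well supported: enough data to split the clone.
    if top_weight >= min_weight and second_weight >= min_weight:
        return ("split", [top_base, second_base])

    # Two competing variants but too little data to decide.
    return ("ambiguous", "N")


print(resolve_position([("A", 30)] * 6 + [("G", 30)] * 4))
print(resolve_position([("A", 20)] * 2 + [("G", 20)]))
```

With these toy thresholds, the first call has two well-supported variants and returns a split, while the second has too few reads to decide and returns N.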