N bases #1813

bshim181 · 2024-09-23T13:26:19Z

bshim181
Sep 23, 2024

I have noticed that in the report files, if there is an N base in a gene feature sequence, it is translated to amino acid X.
I was wondering if there is a way to handle those N bases. Is there a way to replace them based on the reference?
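
For readers unfamiliar with the N-to-X behaviour described above, here is a minimal sketch of why a single N turns a whole codon into X. It is a simplified stand-alone translator using the standard genetic code, not MiXCR's own translation code, and the example sequences are made up.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (NCBI translation table 1).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nt_seq: str) -> str:
    """Translate in frame 0; any codon containing a non-ACGT base becomes 'X'."""
    nt_seq = nt_seq.upper()
    usable = len(nt_seq) - len(nt_seq) % 3  # ignore a trailing partial codon
    return "".join(
        CODON_TABLE.get(nt_seq[i:i + 3], "X") for i in range(0, usable, 3)
    )


print(translate("GCTAAAGGT"))  # no ambiguous bases
print(translate("GCTNAAGGT"))  # one N in the second codon
```

Only the codon that contains the N is affected; the neighbouring codons translate normally.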

Answered by mizraelson

Oct 2, 2024


mizraelson · 2024-09-24T00:28:24Z

mizraelson
Sep 24, 2024
Collaborator

Hi, what command do you run to analyze the data?


bshim181 · 2024-09-24T14:08:02Z

bshim181
Sep 24, 2024
Author

I used the analyze rnaseq-full-length preset with MiXCR version 4.3.2, I believe. Is it possible that updating to a newer version of MiXCR would solve the issue?

Also, if updating to a new MiXCR version is hard to do (we rely on predefined sets of workflows), is there a way to modify the parameters to handle this?


bshim181 · 2024-09-27T01:49:36Z

bshim181
Sep 27, 2024
Author

From looking at the alignment files, it seems that alignment gaps lead to these ambiguous N bases.


mizraelson · 2024-09-27T03:46:09Z

mizraelson
Sep 27, 2024
Collaborator

Not exactly. In the example above, there is no ambiguity, but rather a single nucleotide deletion in FR3, which will shift the reading frame, rendering the clone non-productive.
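
To illustrate the frameshift point, here is a small sketch that splits a sequence into codons before and after a single-base deletion. The 12-nt sequence is invented for the example and is not taken from the data in this thread.

```python
def codons(nt_seq: str) -> list[str]:
    """Split a sequence into complete codons, dropping any trailing partial codon."""
    usable = len(nt_seq) - len(nt_seq) % 3
    return [nt_seq[i:i + 3] for i in range(0, usable, 3)]


reference = "GCTAAAGGTTGC"                     # made-up 12-nt stretch
with_deletion = reference[:4] + reference[5:]  # drop the base at index 4

print(codons(reference))
print(codons(with_deletion))
```

Every codon downstream of the deletion changes, which is why the downstream translation is disrupted and the clone is reported as non-productive.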

The “N” appears during the assembleContigs step, when MiXCR extends the initially assembled CDR3 clones to cover more regions of the sequence. This is where ambiguity can arise. You can discard such sequences by adding the following to the analyze command:

-MassembleContigs.parameters.discardAmbiguousNucleotideCalls=true
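
For scripted workflows, one way to add such an override is sketched below. The helper `m_overrides` is hypothetical, the file names are placeholders, and the argument order (preset, options, inputs, output prefix) is an assumption about how the analyze command is invoked; other options your run needs are omitted. The snippet only prints the command and does not run MiXCR.

```python
import shlex


def m_overrides(params: dict) -> list[str]:
    """Turn {"a.b.c": value} into ["-Ma.b.c=value"], lower-casing booleans."""
    args = []
    for key, value in params.items():
        if isinstance(value, bool):
            value = str(value).lower()  # True -> "true"
        args.append(f"-M{key}={value}")
    return args


overrides = m_overrides(
    {"assembleContigs.parameters.discardAmbiguousNucleotideCalls": True}
)

# Placeholder file names; preset name as mentioned earlier in this thread.
command = ["mixcr", "analyze", "rnaseq-full-length", *overrides,
           "sample_R1.fastq.gz", "sample_R2.fastq.gz", "sample_out"]
print(shlex.join(command))
```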


bshim181 · 2024-10-01T15:51:19Z

bshim181
Oct 1, 2024
Author

Regarding the image I sent above, I looked at an example where an N base appeared in the sequence.

This is the sequence I looked at, in the FR3 region I believe, with two Ns in the sequence.

I see two different pools of reads. Out of a total of 21 reads that map to this clone, about half have this variation.

At the two N positions, I see a deletion at the first N position and a mismatch at the second N position (reference=G, query=C).

For the other half of the reads, I see a mismatch at the first N position and a match to the reference at the second position.

Rather than replacing these bases with N, is it possible to output all possible sequences with their variants? We are also interested in mutations within VDJ sequences, and this read evidence might point toward potentially biologically relevant targets.
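
As a rough illustration of the two read pools described above, the calls at the two N positions can be tallied per read so that linked variants are counted together. The 10/11 split and the mismatching base "T" are invented for this sketch ("about half" of 21 reads); "-" marks a deletion. This is a manual check outside MiXCR, not something MiXCR outputs.

```python
from collections import Counter

# Hypothetical calls at the two N positions for 21 reads (not real data).
# First pool: deletion at position 1, C at position 2.
# Second pool: mismatch (here "T") at position 1, reference G at position 2.
reads = [("-", "C")] * 10 + [("T", "G")] * 11

haplotypes = Counter(reads)
total = sum(haplotypes.values())
for (first, second), count in haplotypes.most_common():
    print(f"pos1={first} pos2={second}: {count}/{total} reads ({count / total:.0%})")
```

Counting the two positions jointly shows two linked variants rather than four independent combinations.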


mizraelson · 2024-10-01T23:27:43Z

mizraelson
Oct 1, 2024
Collaborator

Did you try using:
-MassembleContigs.parameters.discardAmbiguousNucleotideCalls=true ? Do you still see Ns in the sequences?

Regarding the first case: a deletion of A nucleotide in FR3 will lead to a frameshift in translation of CDR3, FR4 and C gene and this clone will not be functional.


bshim181 · 2024-10-02T13:26:01Z

bshim181
Oct 2, 2024
Author

I have tried using -MassembleContigs.parameters.discardAmbiguousNucleotideCalls=true, and it does discard the ambiguous nucleotides and replaces them with the reference sequence.

The possibility I am considering is that the variants captured in these reads are mutations rather than sequencing errors, so I was wondering if there is a way to output all possible variants at those N positions (rather than having them replaced with an ambiguous base).


mizraelson · 2024-10-02T22:41:36Z

mizraelson
Oct 2, 2024
Collaborator

I see. Generally speaking, there is an algorithm behind assembleContigs that splits a clone if there is enough data to support both variants, which is the output you’re looking for. This algorithm considers the shares of each variant, the Phred quality of the nucleotides, their location on the read, and the surrounding context (for example, if you have NN, there might be multiple possible resolutions) among other things. In some cases, there isn’t enough data to determine if the clone should be split, and MiXCR will then place an N. Several parameters guide this process, but the main ones are:

-MassembleContigs.parameters.branchingMinimalQualityShare=0.1
-MassembleContigs.parameters.branchingMinimalSumQuality=60
-MassembleContigs.parameters.outputMinimalQualityShare=0.75

These are the default values for MiXCR v4.7 with the rna-seq preset. You can find explanations for all parameters on our website. I recommend trying the latest version first and adjusting the parameters if needed (generally, the lower the thresholds, the more likely MiXCR will split a clone into two).
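
To give a feel for how thresholds like these might interact, here is a toy decision rule. It is not MiXCR's algorithm: the real one also weighs read position and surrounding context, and the way each threshold is interpreted below is an assumption based only on the parameter names. The function `classify_position` and the quality sums are hypothetical.

```python
def classify_position(variant_quality: dict,
                      branching_min_share: float = 0.1,
                      branching_min_sum_quality: int = 60,
                      output_min_share: float = 0.75) -> str:
    """Toy rule returning 'split', a single base, or 'N'. Not MiXCR's real logic."""
    total = sum(variant_quality.values())
    # Variants strong enough, in both relative and absolute terms, to justify a branch.
    strong = [base for base, q in variant_quality.items()
              if q >= branching_min_share * total and q >= branching_min_sum_quality]
    if len(strong) >= 2:
        return "split"
    # Otherwise report the best base only if it clearly dominates; else give up with N.
    best_base, best_q = max(variant_quality.items(), key=lambda kv: kv[1])
    return best_base if best_q >= output_min_share * total else "N"


print(classify_position({"C": 300, "G": 280}))  # two well-supported variants
print(classify_position({"C": 300, "G": 5}))    # one dominant variant
print(classify_position({"C": 50, "G": 40}))    # weak, evenly divided evidence
print(classify_position({"C": 50, "G": 40}, branching_min_sum_quality=30))
```

In this toy model, lowering the absolute quality threshold turns the weak, evenly divided case from an N into a split, which mirrors the general advice that lower thresholds make splitting more likely.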

That said, based on our experience, the default parameters work best, as they have been empirically evaluated on hundreds of different datasets.

