Remove the mpileup BAM_CREF_SKIP filter. #2281

Open: wants to merge 1 commit into base: develop


jkbonfield
Contributor

Mpileup removes alignments using the cigar ref skip operator ("N"). This was originally added in 2011 in samtools/samtools#d1643d6 with the commit message of "fixed a bug in indel calling related to unmapped and refskip reads".

Unfortunately I don't know what that bug was, but removing the code shows it still works (at least for some data!). We need better understanding of what's going on and why it was added, so perhaps we should add a command line option to control this instead?

Fixes #2277
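
For readers unfamiliar with the filter under discussion, here is a minimal standalone sketch of this kind of check. It assumes the standard BAM CIGAR packing (operation in the low 4 bits, length in the upper 28 bits) and redefines the relevant constants locally so it compiles without htslib. The helper names `cigar_has_ref_skip` and `cigar_gen` are hypothetical and are not functions in bcftools.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local copies of the BAM CIGAR constants so the sketch builds standalone. */
#define BAM_CIGAR_SHIFT 4
#define BAM_CIGAR_MASK  0xf
#define BAM_CMATCH      0   /* "M" */
#define BAM_CREF_SKIP   3   /* "N", typically an intron in RNA-seq alignments */

/* Hypothetical helper: returns 1 if any CIGAR element is a ref skip, else 0. */
static int cigar_has_ref_skip(const uint32_t *cigar, size_t n_cigar)
{
    assert(cigar != NULL || n_cigar == 0);
    for (size_t k = 0; k < n_cigar; ++k)
        if ((cigar[k] & BAM_CIGAR_MASK) == BAM_CREF_SKIP)
            return 1;
    return 0;
}

/* Hypothetical helper: pack a length and an operation into one CIGAR element. */
static uint32_t cigar_gen(uint32_t len, uint32_t op)
{
    return (len << BAM_CIGAR_SHIFT) | op;
}
```

With a filter of this shape in place, a spliced alignment such as 50M1000N50M is rejected while a plain 100M alignment passes, which is why RNA-seq reads spanning introns were dropped.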

jkbonfield added a commit to jkbonfield/bcftools that referenced this pull request Sep 17, 2024
Mpileup removes alignments using the cigar ref skip operator ("N").
This was originally added in 2011 in samtools/samtools#d1643d6 with
the commit message of "fixed a bug in indel calling related to
unmapped and refskip reads".

Unfortunately I don't know what that bug was, but removing the code
shows it still works (at least for some data!).  We need better
understanding of what's going on and why it was added, but this PR
makes it optional, keeping the default as before.  Note there appears
to be no filtering of BAM_CREF_SKIP in indels-2.0 so the option would
be a nop there.

This is an alternative PR to samtools#2281.  I'll leave it to the project
maintainer as to what is preferable: removing the (no longer needed?)
filtering, or keeping the default behaviour identical and adding a new
option instead (which is safer, but possibly leads to accidental bad
calls due to not noticing a new option has appeared).

Fixes samtools#2277
@jkbonfield
Contributor Author

See also #2282 as an alternative to this PR. That is probably the way to go for compatibility. Then again, I doubt anyone is currently using bcftools on RNA-seq data with ref-skip based alignments, given that all such alignments were simply discarded, so the change in behaviour is unlikely to break any currently working pipelines.

@pd3
Member

pd3 commented Oct 3, 2024

I believe the edited BAM_CREF_SKIP checking code

bcftools/bam2bcf_indel.c

Lines 855 to 859 in ef8b974

for (kk = 0; kk < p->b->core.n_cigar; ++kk)
    if ((cigar[kk]&BAM_CIGAR_MASK) == BAM_CREF_SKIP)
        break;
if (kk < p->b->core.n_cigar)
    continue;
simply skips these reads from realignment. I am imagining a proper fix would be to split the reads into parts and deal with each separately. Unfortunately the current framework might not be bent easily for that.
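
To illustrate what "splitting the reads into parts" could mean in terms of coordinates, here is a rough standalone sketch. It is not the bcftools implementation and does not touch realignment or the indel-calling framework. It only shows the bookkeeping of cutting one alignment at each ref-skip operator. The type `read_part_t` and the function `split_at_ref_skip` are hypothetical names, and the CIGAR constants are redefined locally on the assumption of the standard BAM encoding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BAM_CIGAR_SHIFT 4
#define BAM_CIGAR_MASK  0xf
/* Standard BAM operation codes: M I D N S H P = X */
enum { OP_M = 0, OP_I = 1, OP_D = 2, OP_N = 3, OP_S = 4,
       OP_H = 5, OP_P = 6, OP_EQ = 7, OP_X = 8 };

/* Hypothetical record describing one contiguous part of a spliced read. */
typedef struct {
    int64_t ref_start;  /* 0-based reference start of this part */
    int64_t ref_len;    /* reference bases covered by this part */
    int32_t qry_start;  /* 0-based offset into the read sequence */
    int32_t qry_len;    /* read bases belonging to this part */
} read_part_t;

/* Split an alignment starting at reference position pos into parts separated
 * by ref skips. Writes at most max_parts entries and returns the number
 * written; parts beyond max_parts are silently dropped in this sketch. */
static size_t split_at_ref_skip(const uint32_t *cigar, size_t n_cigar,
                                int64_t pos, read_part_t *parts,
                                size_t max_parts)
{
    size_t n = 0;
    int64_t rpos = pos, rstart = pos;
    int32_t qpos = 0, qstart = 0;

    assert(cigar != NULL || n_cigar == 0);
    for (size_t k = 0; k <= n_cigar; ++k) {
        /* Treat the end of the CIGAR like a final, zero-length ref skip. */
        uint32_t op  = k < n_cigar ? (cigar[k] & BAM_CIGAR_MASK) : OP_N;
        uint32_t len = k < n_cigar ? (cigar[k] >> BAM_CIGAR_SHIFT) : 0;

        if (op == OP_N) {
            /* Close the current part if it covers anything. */
            if ((rpos > rstart || qpos > qstart) && n < max_parts) {
                parts[n].ref_start = rstart;
                parts[n].ref_len   = rpos - rstart;
                parts[n].qry_start = qstart;
                parts[n].qry_len   = qpos - qstart;
                ++n;
            }
            rpos  += len;       /* jump over the skipped reference region */
            rstart = rpos;
            qstart = qpos;
            continue;
        }
        /* M, =, X and D consume reference; M, =, X, I and S consume query. */
        if (op == OP_M || op == OP_EQ || op == OP_X || op == OP_D)
            rpos += len;
        if (op == OP_M || op == OP_EQ || op == OP_X || op == OP_I || op == OP_S)
            qpos += (int32_t)len;
    }
    return n;
}
```

For an alignment 50M1000N50M starting at reference position 100, this yields two parts: one covering reference 100 to 149 with read bases 0 to 49, and one covering reference 1150 to 1199 with read bases 50 to 99. Whether each part could then be fed to the realignment code independently is exactly the open question raised above.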

If we should merge any version of this, we'd need to include a small test to demonstrate where it can be helpful.

Successfully merging this pull request may close these issues.

Significant difference between IDV and AD when calling certain RNA-seq indels with mpileup