Make sure we throw out chimeric reads #14

Open
petercombs opened this issue May 9, 2016 · 0 comments


@petercombs
Contributor

There are at least some reads in my data set that cover multiple SNPs, where those SNPs disagree as to the parental origin of the read. It's rare, which suggests it's probably a sequencing error, but we ought to deal with it properly.

Probably the cleanest thing to do is just toss any read that is ambiguous, but one could imagine going with the consensus when a read covers 3 or more SNPs.

Also, I'm not sure if this should be a separate issue or not, but if a sequencing error leaves a read with neither the annotated reference nor the alternate allele at a SNP, that read should probably be tossed as well.

[Screenshot: screen shot 2016-05-09 at 12 47 48 pm]

In the attached screenshot, red reads are melanogaster, blue reads are simulans, and grey reads are at least somewhat ambiguous: there's one read with both mel and sim SNPs, and another with an unannotated allele.
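
For discussion, here is a minimal sketch of the policy described above. It is not taken from the existing code base: the function name `classify_read`, its arguments, and the `"ref"`/`"alt"` labels are all hypothetical, and it assumes the bases a read shows at each SNP position have already been extracted into a dict. Which parent corresponds to `"ref"` and which to `"alt"` depends on how the SNP file was built.

```python
from collections import Counter


def classify_read(observed, snps, min_snps_for_consensus=None):
    """Assign a read to a parent, or return None to discard it.

    observed: dict of SNP position -> base seen in the read (hypothetical input)
    snps:     dict of SNP position -> (ref_base, alt_base) from the annotation
    min_snps_for_consensus: if set, a read whose SNPs disagree is rescued by a
        strict majority vote when it covers at least this many SNPs.
    """
    votes = Counter()
    for pos, base in observed.items():
        ref, alt = snps[pos]
        if base == ref:
            votes["ref"] += 1
        elif base == alt:
            votes["alt"] += 1
        else:
            # Neither annotated allele: treat as a sequencing error and toss.
            return None

    if not votes:
        return None  # read covers no SNPs, so it is uninformative
    if len(votes) == 1:
        return next(iter(votes))  # every SNP agrees

    # SNPs disagree: discard by default, optionally fall back to consensus.
    total = sum(votes.values())
    if min_snps_for_consensus is not None and total >= min_snps_for_consensus:
        (top, n_top), (_, n_second) = votes.most_common(2)
        if n_top > n_second:
            return top
    return None
```

With the default `min_snps_for_consensus=None`, every ambiguous read is dropped, which is the "cleanest" option; passing `3` enables the consensus fallback, and ties are still discarded.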
