group method: not all UMI-containing reads assigned UG tag #635
Hi @epiliper - Could you please post the |
Command used:
Here is the QNAME of a typical read in my input bamfile:
Where TGTCAGGCTAT is the UMI. When I run the above command, it runs with these options:
Thanks again. |
Update: I think I know what might be happening... Looking at part of the definition of
If I'm understanding this right, Does this sound like a reasonable cause? I apologize if I missed this in the docs; I guess checking the percentage of reads tagged with "UG" after running |
Second time in a week that an issue has been 'solved' by a user before I get around to it. I should be neglectful more often 😉 Yes, you're right that group doesn't add read tags to read2s. These are just written out as soon as they are read in, along with unmapped and/or chimeric reads depending on the options (see lines 242 to 253 of umi_tools/group.py at commit 7e799bc: https://github.com/CGATOxford/UMI-tools/blob/7e799bc120f185128e3983cbc328e180a0b6b263/umi_tools/group.py#L242-L253).
Looking over the documentation, I don't think this is stated anywhere. @IanSudbery, do you agree this is an oversight, or have I missed it too?! I'm happy to rectify. |
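To check whether the untagged half of a grouped file really is the read2s, the UG counts can be split by mate. The following is only a minimal sketch that works on SAM text lines. The function name `count_ug_by_mate` and the demo records are made up for illustration, and it assumes the group tag appears as an optional field starting with `UG:`.

```python
from collections import Counter

READ1_FLAG = 0x40  # SAM FLAG bit: first segment in the template
READ2_FLAG = 0x80  # SAM FLAG bit: last segment in the template


def count_ug_by_mate(sam_lines):
    """Count reads with and without a UG tag, split by mate (hypothetical helper)."""
    counts = Counter()
    for line in sam_lines:
        # Skip blank lines and SAM header lines.
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        if flag & READ1_FLAG:
            mate = "read1"
        elif flag & READ2_FLAG:
            mate = "read2"
        else:
            mate = "unpaired"
        # Optional fields start at column 12; assume the tag is written as UG:<type>:<value>.
        has_ug = any(field.startswith("UG:") for field in fields[11:])
        counts[(mate, "tagged" if has_ug else "untagged")] += 1
    return counts


# Made-up records for illustration only: one tagged read1 and its untagged read2.
demo = [
    "@HD\tVN:1.6\tSO:coordinate",
    "\t".join(["readA_TGTCAGGCTAT", "99", "chr1", "100", "60", "4M",
               "=", "200", "104", "ACGT", "FFFF", "UG:i:0"]),
    "\t".join(["readA_TGTCAGGCTAT", "147", "chr1", "200", "60", "4M",
               "=", "100", "-104", "ACGT", "FFFF"]),
]

for key, n in sorted(count_ug_by_mate(demo).items()):
    print(key, n)
```

On a real file, the lines could come from the text output of `samtools view` on the grouped BAM. If the explanation in this thread holds, nearly all untagged reads should fall under read2, plus any unmapped or chimeric reads that were passed through.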
I don't see it in the documentation either. I think it should definitely be
documented. I also wonder if a tool similar to prepare-for-rsem could add
grouping info to the read2s.
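As a rough illustration of what such a post-processing step might look like, the sketch below copies the UG field from a tagged read onto untagged reads that share its QNAME. This is not an existing UMI-tools feature. The function name `propagate_ug` and the demo records are hypothetical, and it assumes mates share a QNAME and that each QNAME belongs to a single group.

```python
def propagate_ug(sam_lines):
    """Copy the UG field from tagged reads to untagged reads with the same QNAME.

    Hypothetical helper working on SAM text lines; returns a new list of lines.
    """
    lines = [line.rstrip("\n") for line in sam_lines]

    # First pass: remember the first UG field seen for each read name.
    ug_by_qname = {}
    for line in lines:
        if not line or line.startswith("@"):
            continue
        fields = line.split("\t")
        for field in fields[11:]:
            if field.startswith("UG:"):
                ug_by_qname.setdefault(fields[0], field)

    # Second pass: append the UG field to reads that lack one.
    out = []
    for line in lines:
        if line and not line.startswith("@"):
            fields = line.split("\t")
            already_tagged = any(f.startswith("UG:") for f in fields[11:])
            ug = ug_by_qname.get(fields[0])
            if not already_tagged and ug is not None:
                line = line + "\t" + ug
        out.append(line)
    return out


# Made-up records for illustration only.
demo_pairs = [
    "@HD\tVN:1.6\tSO:coordinate",
    "\t".join(["readA_TGTCAGGCTAT", "99", "chr1", "100", "60", "4M",
               "=", "200", "104", "ACGT", "FFFF", "UG:i:7"]),
    "\t".join(["readA_TGTCAGGCTAT", "147", "chr1", "200", "60", "4M",
               "=", "100", "-104", "ACGT", "FFFF"]),
    "\t".join(["readB_TGTCAGGCTAT", "147", "chr1", "300", "60", "4M",
               "=", "250", "-54", "ACGT", "FFFF"]),
]

tagged = propagate_ug(demo_pairs)
```

Holding every line in memory keeps the sketch short but would not scale to a large file. It also ignores secondary or supplementary alignments that reuse a QNAME, which a real tool would need to handle.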
Ian Sudbery
OK, I'll put updating the docs regarding this on the to-do. If we do want to include read tags on read2, I'm torn between allowing this in |
Thank you so much for confirming the cause of half-tagged reads! Peace of mind has been restored. If you're asking me for my thoughts:
All of these are, of course, just proposals. Either way, we'd be interested to hear more about this, considering it would have a massive impact on how we set up our NGS runs. Thanks again!
Hi! After running `group` on a sorted bamfile of roughly 3.4 million reads, I noticed that only half (~1.6 million reads) were actually given UG tags. I checked this by running `samtools view -d UG` on the grouped file and counting the number of reads meeting these filter conditions. All reads in the bamfile I used should contain UMIs. Is this expected behavior, and if so, what might cause some reads to not get assigned to a read group?
Thanks in advance for your patience; I'd really appreciate any help/explanation.
I'm a new user so apologies if I'm missing the obvious.