Hi, please look at the following comment from a closed issue. Opening a new issue here since I haven't heard back from anyone (presumably because commenting on a closed issue doesn't automatically reopen it).
Thanks.
" As a follow up, looking at the code it seems to me that you use 20 as the threshold for this. i.e. if one end is the same, we allow the other end to be up to 20 bases away for it to still be considered a duplicate. Is that correct?
However, even in that case, I'm confused because I see multiple cases where the end is the same and the start is >20 bases away, but these are still not counted separately (i.e., they are considered duplicates) by sinto. e.g. with the following 4 reads:
A00261:525:HK77VDSX3:1:1133:17969:2613 99 chrM 9947 60 150M = 10023 226 GGTTTGACTATTTCTGTATGTCTCCATCTATTGATGAGGGTCTTACTCTTTTAGTATAAATAGTACCGTTAACTTCCAATTAACTAGTTTTGACAACATTCAAAAAAGAGTAATAAACTTCGCCTTAATTTTAATAATCAACACCCTCCT FFFFFFFFFFFFFFFFFFFFFFFFFFFFF::FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF NM:i:0 MD:Z:150 AS:i:150 XS:i:34 CR:Z:ACAGGCTCAGGAGGGT CY:Z:FFFFFFFFFFFFFFFF CB:Z:AAAGCAAGTGGAAACG-1 BC:Z:TCGAATTG QT:Z:FFFFFFFF RG:Z:Sample_output:MissingLibrary:1:HK77VDSX3:1
A00261:525:HK77VDSX3:1:1133:17969:2613 147 chrM 10023 60 150M = 9947 -226 CAATTAACTAGTTTTGACAACATTCAAAAAAGAGTAATAAACTTCGCCTTAATTTTAATAATCAACACCCTCCTAGCCTTACTACTAATAATTATTACATTTTGACTACCACAACTCAACGGCTACATAGAAAAATCCACCCCTTACGAG :FFFFFFFFFFFFFFFF:FFFFFF:FFFF:FFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF NM:i:0 MD:Z:150 AS:i:150 XS:i:0 CR:Z:ACAGGCTCAGGAGGGT CY:Z:FFFFFFFFFFFFFFFF CB:Z:AAAGCAAGTGGAAACG-1 BC:Z:TCGAATTG QT:Z:FFFFFFFF RG:Z:Sample_output:MissingLibrary:1:HK77VDSX3:1
A00261:525:HK77VDSX3:1:1370:20518:3302 99 chrM 10092 60 81M = 10092 81 CTCCTAGCCTTACTACTAATAATTATTACATTTTGACTACCACAACTCAACGGCTACATAGAAAAATCCACCCCTTACGAG FFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF NM:i:0 MD:Z:81 AS:i:81 XS:i:0 CR:Z:ACAGGCTCAGGAGGGT CY:Z:FFFFFF,FFFFFFFFF CB:Z:AAAGCAAGTGGAAACG-1 BC:Z:CGAGTGAT QT:Z:FFFFFFFF RG:Z:Sample_output:MissingLibrary:1:HK77VDSX3:1 TR:Z:CTGTCTCTTATACACATCTCCGAGCCCACGAGACCGAGTGATATCTCGTATGCCGTCTTCTGCTTGAAA TQ:Z:FFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFF
A00261:525:HK77VDSX3:1:1370:20518:3302 147 chrM 10092 60 81M = 10092 -81 CTCCTAGCCTTACTACTAATAATTATTACATTTTGACTACCACAACTCAACGGCTACATAGAAAAATCCACCCCTTACGAG FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF NM:i:0 MD:Z:81 AS:i:81 XS:i:0 CR:Z:ACAGGCTCAGGAGGGT CY:Z:FFFFFF,FFFFFFFFF CB:Z:AAAGCAAGTGGAAACG-1 BC:Z:CGAGTGAT QT:Z:FFFFFFFF RG:Z:Sample_output:MissingLibrary:1:HK77VDSX3:1 TR:Z:CTGTCTCTTATACACATCTGACGCTGCCGACGACAGACGCGACCCTCCTGAGCCTGTGTGTAGATCTCG TQ:Z:::FFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
I would have expected the following two start,end pairs to be considered separate fragments:
9950 10167
10095 10167
but sinto actually only counts the second fragment here (i.e. 10095 10167), and ignores the first. What am I missing?
Thanks
"
Originally posted by @rtyags in #48 (comment)
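To make the expectation concrete, here is a minimal Python sketch of the rule as I understand it from the quoted comment. It is not sinto's code: the function names, the `MAX_DISTANCE` constant, and the +4/-5 Tn5 offset are my own assumptions, chosen because that offset reproduces the two start,end pairs listed above from the read positions.

```python
# Hypothetical illustration of the rule described in the comment above.
# None of these names come from sinto; the threshold and the +4/-5 offset
# are assumptions made only to reproduce the numbers in this issue.
MAX_DISTANCE = 20


def shifted_fragment(left_pos, right_pos, right_aligned_len):
    """Return a 0-based (start, end) pair from 1-based SAM positions.

    left_pos: POS of the forward read (1-based).
    right_pos: POS of the reverse read (1-based).
    right_aligned_len: reference bases covered by the reverse read.
    """
    start = (left_pos - 1) + 4
    end = (right_pos - 1) + right_aligned_len - 5
    return start, end


def is_duplicate(a, b, max_distance=MAX_DISTANCE):
    """True if one end matches exactly and the other is within max_distance."""
    same_start = a[0] == b[0] and abs(a[1] - b[1]) <= max_distance
    same_end = a[1] == b[1] and abs(a[0] - b[0]) <= max_distance
    return same_start or same_end


# The two read pairs from this issue (both reads are fully matched: 150M, 81M).
frag1 = shifted_fragment(9947, 10023, 150)
frag2 = shifted_fragment(10092, 10092, 81)

print(frag1, frag2, is_duplicate(frag1, frag2))
```

Under these assumptions the two fragments share an end but their starts are 145 bases apart, so this rule would keep both, which is why the observed output surprises me.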