odgi untangle multiple reference file - potential bug #581
Comments
Can you try to run a multiple untangling and see if the second best hit is
the part that never gets touched? One thing that we've seen is that when
the identity between two sequences in the reference is 100%, we will not
emit one of them in the untangling, because they have exactly the same match
quality to all sequences.
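One way to check whether the two reference paths really are identical is to compare their sequences directly. The following is a minimal sketch, assuming the path sequences have already been exported to a FASTA file (for example with `odgi paths -f`, which is an assumption and not something stated in this thread). The function name `fasta_seq_checksums` and the file name `ref_paths.fa` are hypothetical.

```shell
# Print one line per FASTA record: record name, a tab, and a checksum of its
# full sequence (line wrapping is ignored). Records that share a checksum
# very likely have identical sequences.
fasta_seq_checksums() {
  awk '
    function emit() { if (name != "") print name "\t" seq }
    /^>/ { emit(); name = substr($1, 2); seq = ""; next }
    { seq = seq $0 }
    END { emit() }
  ' "$1" | while IFS="$(printf '\t')" read -r name seq; do
    printf '%s\t%s\n' "$name" "$(printf '%s' "$seq" | cksum | cut -d' ' -f1)"
  done
}

# Usage (hypothetical file name):
#   fasta_seq_checksums ref_paths.fa
```

If both chm13 paths print the same checksum, that would be consistent with the tie described above. Different checksums would suggest the cause lies elsewhere.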
On Sun, Jul 7, 2024, 23:59 Catriona-Miller wrote:
Hi,
I have been trying to use chm13 instead of grch38 as a reference file for
odgi untangle with the same code as your readthedocs tutorial. Since there
are two chm13 paths for the areas I'm interested in, I've been using the -R
flag with a file that lists the paths (e.g. attached) such as below:
(echo query.name query.start query.end ref.name ref.start ref.end score
inv self.cov n.th | tr ' ' '\t'; odgi untangle -i VKORC1_gene_sorted.og
-R target_chm13.txt --threads 8 -m 256 -P | bedtools sort -i - ) | awk '$8
== "-" { x=$6; $6=$5; $5=x; } { print }' | tr ' ' '\t' >
chr16_chm13_VKORC1_untangle1.bed
However, reading the outputted bed file into R, it only ever uses one of
the two chm13 paths as a reference. I've tried adding an extra blank line
at the start of target_chm13.txt and tried swapping the order of the two
paths but it always uses the second path as a reference. Unsure if this is
a bug or a misunderstanding of the process on my end.
Thanks
target_chm13.txt
<https://github.com/user-attachments/files/16123261/target_chm13.txt>
|
Try setting -n > 1 in odgi untangle:
-n[N], --n-best=[N]  Report up to the Nth best target (reference) mapping for each query segment (default: 1).
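As a sketch of what that could look like, the command from the original post can be wrapped in a small function with `-n` added. All flags are taken from this thread. The function name `untangle_n_best` and the output file name are hypothetical, and the value 2 is only an example.

```shell
# Run odgi untangle, reporting up to N best reference mappings per query segment.
# Arguments: 1 = graph (.og), 2 = file listing reference paths, 3 = N.
# The other flags are copied from the command in the original post.
untangle_n_best() {
  odgi untangle -i "$1" -R "$2" -n "$3" --threads 8 -m 256 -P
}

# Usage (requires odgi on PATH; output file name is hypothetical):
#   untangle_n_best VKORC1_gene_sorted.og target_chm13.txt 2 > untangle_n2.bed
```

With `-n 2`, the `n.th` column of the output should distinguish the best hit from the second-best hit for each query segment.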
________________________________
From: Erik Garrison
Sent: Tuesday, July 9, 2024 15:55
Subject: Re: [pangenome/odgi] odgi untangle multiple reference file - potential bug (Issue #581)
Can you try to run a multiple untangling and see if the second best hit is
the part that never gets touched? One thing that we've seen is that when
the identity between two sequences in the reference is 100%, we will not
emit one of them in the untangling, because they have exactly the same match
quality to all sequences.
|
Thanks both. I set -n 2, but I still only ever get the second path as the ref path in the output. E.g. see the two files that are output for the code below. The only difference is the order in which I've listed the two paths in target_chm13.txt:
(echo query.name query.start query.end ref.name ref.start ref.end score
chr16_chm13_VKORC1_untangle_trial.txt
Hi,
I have been trying to use chm13 instead of grch38 as a reference file for odgi untangle with the same code as your readthedocs tutorial. Since there are two chm13 paths for the areas I'm interested in, I've been using the -R flag with a file that lists the paths (e.g. attached) such as below:
(echo query.name query.start query.end ref.name ref.start ref.end score inv self.cov n.th | tr ' ' '\t'; odgi untangle -i VKORC1_gene_sorted.og -R target_chm13.txt --threads 8 -m 256 -P | bedtools sort -i - ) | awk '$8 == "-" { x=$6; $6=$5; $5=x; } { print }' | tr ' ' '\t' > chr16_chm13_VKORC1_untangle1.bed
However, reading the outputted bed file into R, it only ever uses one of the two chm13 paths as a reference. I've tried adding an extra blank line at the start of target_chm13.txt and tried swapping the order of the two paths but it always uses the second path as a reference. Unsure if this is a bug or a misunderstanding of the process on my end.
Thanks
target_chm13.txt
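A quick way to see which reference paths actually appear in the output, without loading it into R, is to tally the `ref.name` column. This is a minimal sketch, assuming a tab-separated file whose 4th column is `ref.name`, as in the header used above. The function name `count_ref_names` is hypothetical.

```shell
# Count how many untangle records map to each reference path.
# Assumes tab-separated input with ref.name in column 4; a header line
# starting with "query.name" or "#" is skipped.
count_ref_names() {
  awk -F'\t' '
    $1 != "query.name" && $0 !~ /^#/ { n[$4]++ }
    END { for (r in n) print r "\t" n[r] }
  ' "$1" | LC_ALL=C sort
}

# Usage:
#   count_ref_names chr16_chm13_VKORC1_untangle1.bed
```

If only one chm13 path is listed, the other path is never chosen as a target at the current `-n` setting.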