scaffolding ideas #112

Open
jakebiesinger opened this issue Dec 18, 2013 · 0 comments


@anbangx @Elmira88 @JavierJia @Nan-Zhang

I was thinking about scaffolding last night. I mentioned in #101 some ideas and future optimizations but thought I'd do a brain dump here as well.

Here's an idea for a "SplitRepeat" job: take the current framework, but for each walk node that has 2+ back edges (incoming edges), we can request the candidates' scores according to the read sequences in those back edges. When two candidates seem to have nearly equal scores, we can turn to the back-edge scores to try to split the repeated frontier node (or some portion of the walk).

In ascii art:

         b1
           \  ----c1
w1--w2--w3--f<
              ----c2

In the simple version of this algorithm, we request scores of the subkmers f--c1 and f--c2 from b1, which aggregates its score to f just like the walk nodes w1-w3 do. If we get a "strong" signal from b1 about one of the paths (say, b1--f--c1) and it's not the same as the path that's strong in the walk (say, w1--w2--w3--f--c2), then we could assume that f is a repeat node shared by the two different paths. We could then split f into f1 and f2, resulting in the paths b1--f1--c1 and w1--w2--w3--f2--c2.
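A rough sketch of that split decision, to make the idea concrete. Everything here is hypothetical and not part of the existing framework: the names `strong_choice` and `should_split`, the plain-dict score format, and the `min_ratio` threshold used to define a "strong" signal are all assumptions for illustration.

```python
from typing import Dict, Optional


def strong_choice(scores: Dict[str, float], min_ratio: float = 2.0) -> Optional[str]:
    """Return the candidate that clearly dominates, or None if there is no clear winner.

    "Strong" is assumed to mean: the best score is at least `min_ratio`
    times the runner-up score. The real criterion may well differ.
    """
    if not scores:
        return None
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    best, best_score = ranked[0]
    if best_score <= 0:
        return None
    if len(ranked) == 1:
        return best
    runner_up_score = ranked[1][1]
    return best if best_score >= min_ratio * runner_up_score else None


def should_split(
    walk_scores: Dict[str, float],
    backedge_scores: Dict[str, float],
    min_ratio: float = 2.0,
) -> Optional[Dict[str, str]]:
    """Decide whether frontier node f looks like a repeat shared by two paths.

    walk_scores:     candidate -> score aggregated from the walk (w1..w3)
    backedge_scores: candidate -> score aggregated from the back edge (b1)

    Returns a mapping {"f1": candidate for the back-edge path,
                       "f2": candidate for the walk path}
    when both sides give a strong signal for different candidates, else None.
    """
    walk_pick = strong_choice(walk_scores, min_ratio)
    back_pick = strong_choice(backedge_scores, min_ratio)
    if walk_pick is None or back_pick is None or walk_pick == back_pick:
        return None
    return {"f1": back_pick, "f2": walk_pick}


# The walk favors c2 while b1 favors c1, so f would be split into f1 and f2.
decision = should_split({"c1": 1.0, "c2": 5.0}, {"c1": 4.0, "c2": 1.0})
print(decision)
```

With these made-up scores, the result corresponds to the paths b1--f1--c1 and w1--w2--w3--f2--c2. If either side is ambiguous, or both agree on the same candidate, no split is proposed.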
