Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Assembly scoring: fix #158 and solve ties in assembly scoring #161 #163

Merged
merged 3 commits into from
Jan 30, 2017

Conversation

lafita
Copy link
Member

@lafita lafita commented Jan 30, 2017

I changed the final call reason for interfaces as described in #158.
I also set the unassigned value of probability scores to 0.5.
I also implemented a first solution for breaking ties in the highest scoring assemblies of a crystal. Discussion in #161.

Because it is a probability, the NOPRED score should be the most
uncertain value.
General solution for DNA helix described in eppic-team#161
@lafita lafita added this to the 3.0 milestone Jan 30, 2017
indx = 0;
List<Integer> indices = new ArrayList<Integer>();
for (Assembly a:uniques) {
if (a.getScore() == maxScore){
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should I use some epsilon for the comparison, or do we trust score differences even if they are very tiny?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say we should do this for strict ties only (those comparing as equals), let's trust the tiny differences to do the sorting job otherwise.

uniques.get(maxIndx).setCall(CallType.BIO);
} else {
for (Integer i:indices) {
uniques.get(i).setCall(CallType.NO_PREDICTION);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What call should we assign to all the assemblies with the same high score? Here I choose NOPRED.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about we use BIO for several assemblies? Is that possible? or would something else break because of that?

Otherwise if we don't want to do several BIOs, then I'd rather go for a strategy like @sbliven proposed: choose the one with lowest stoichiometry.

Copy link
Member Author

@lafita lafita Jan 30, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I cannot recall anything that would break if two assemblies are assigned as BIO, I am OK with this solution. The only problem I can see is with the user interpretation, maybe we should think what does a user expect or show a help message for such cases.

Copy link
Contributor

@josemduarte josemduarte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Let's decide on the 2 issues on the comments and then merge.

@josemduarte
Copy link
Contributor

Actually for the fist comment there's no changes to make, we do only strict ties. For the second comment there's a decision to be made on whether use NOPRED, BIO or use the lowest stoichiometry for the BIO call.

Let's do that in a separate commit. I'd rather merge this now so that I can do another round of testing with this in.

@josemduarte josemduarte merged commit b8246a6 into eppic-team:master Jan 30, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants