-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Assembly scoring: fix #158 and solve ties in assembly scoring #161 #163
Conversation
Because it is a probability, the NOPRED score should be the most uncertain value.
General solution for DNA helix described in eppic-team#161
indx = 0; | ||
List<Integer> indices = new ArrayList<Integer>(); | ||
for (Assembly a:uniques) { | ||
if (a.getScore() == maxScore){ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should I use some epsilon for the comparison, or do we trust score differences even if they are very tiny?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd say we should do this for strict ties only (those comparing as equals), let's trust the tiny differences to do the sorting job otherwise.
uniques.get(maxIndx).setCall(CallType.BIO); | ||
} else { | ||
for (Integer i:indices) { | ||
uniques.get(i).setCall(CallType.NO_PREDICTION); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What call should we assign to all the assemblies with the same high score? Here I choose NOPRED.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about we use BIO for several assemblies? Is that possible? or would something else break because of that?
Otherwise if we don't want to do several BIOs, then I'd rather go for a strategy like @sbliven proposed: choose the one with lowest stoichiometry.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I cannot recall anything that would break if two assemblies are assigned as BIO, I am OK with this solution. The only problem I can see is with the user interpretation, maybe we should think what does a user expect or show a help message for such cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! Let's decide on the 2 issues on the comments and then merge.
Actually for the fist comment there's no changes to make, we do only strict ties. For the second comment there's a decision to be made on whether use NOPRED, BIO or use the lowest stoichiometry for the BIO call. Let's do that in a separate commit. I'd rather merge this now so that I can do another round of testing with this in. |
I changed the final call reason for interfaces as described in #158.
I also set the unassigned value of probability scores to 0.5.
I also implemented a first solution for breaking ties in the highest scoring assemblies of a crystal. Discussion in #161.