
docs: clarify ppi scoring metrics and add doc strings and tests #498

Merged: 7 commits, Sep 21, 2023

Conversation

@gcroci2 (Collaborator) commented Sep 19, 2023

We have an already implemented function, compute_ppi_scores, which takes as input a decoy (a PPI model that needs evaluation) and a reference PDB structure, considered the native one. It computes lrmsd, irmsd, fnat, dockq, binary, and capri_classes scores. I see two applications of such a function (a usage sketch follows the list below):

  • Adding the score we're interested in (e.g., dockq) to the HDF5 files as a target, then using a regression task (or a classification task for other scores) to train a supervised model to predict the selected score (e.g., dockq). The application could be using the pre-trained model to predict how good an (unlabeled) generated structure is.
  • Evaluating a pool of generated structures for which we have the native ones, using one of the scores listed above.
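
For illustration, a minimal usage sketch (the import path matches deeprank2/tools/target.py touched in this PR; the exact signature and return keys are assumptions, not the confirmed API):

```python
# Hypothetical usage sketch of compute_ppi_scores. The signature and the
# returned dictionary keys are assumptions based on this PR's description.
from deeprank2.tools.target import compute_ppi_scores

decoy_path = "decoys/model_001.pdb"       # docking model to evaluate
reference_path = "references/native.pdb"  # native (reference) structure

scores = compute_ppi_scores(decoy_path, reference_path)

# Assumed keys: lrmsd, irmsd, fnat, dockq, binary, capri_classes
print(f"DockQ: {scores['dockq']:.3f}  fnat: {scores['fnat']:.3f}")
```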

Do we need other features regarding this? @LilySnow
I was evaluating whether it's worth changing compute_ppi_scores to a class, @DaniBodor, but I think a function is perfectly fine and a class would be unnecessary for this scope. What do you think?

gcroci2 linked an issue on Sep 19, 2023 that may be closed by this pull request.
@LilySnow (Collaborator) commented:

Nice work! No, we do not need new metrics for this. The two applications are indeed correct. If we have time, we should test this using data from our DeepRank paper (just the small CAPRI data is enough).

@gcroci2 (Collaborator, Author) commented Sep 19, 2023

> Nice work! No, we do not need new metrics for this. The two applications are indeed correct. If we have time, we should test this using data from our DeepRank paper (just the small CAPRI data is enough).

The code is tested and we're sure it works, so we don't strictly need to verify the previous results :) I'd prefer to use this directly for the application we proposed for this budget, which is ranking antibody-antigen docking models.

@DaniBodor (Collaborator) left a comment

Nice job!

The docstring is a bit long here and I'm not sure this is the first place users should be expected to go to find out how to evaluate their models. Maybe it would be good to add something about this to the documentation as well?

Also, do we need something equivalent for SRV as well?

Review threads on deeprank2/tools/target.py and tests/tools/test_target.py (resolved).
@DaniBodor (Collaborator) commented:

> I was evaluating whether it's worth changing compute_ppi_scores to a class, @DaniBodor, but I think a function is perfectly fine and a class would be unnecessary for this scope. What do you think?

I agree.

@gcroci2 (Collaborator, Author) commented Sep 19, 2023

> Also, do we need something equivalent for SRV as well?

Nope, I double-checked with @LilySnow as well: all these metrics are very specific to single interfaces in decoy models. It gets more complex if you have multiple interfaces (more than two chains), but in such cases the interfaces are usually evaluated one by one, with only one interface selected at a time.

@gcroci2 (Collaborator, Author) commented Sep 19, 2023

Thanks for the suggestions, @DaniBodor :) I added them.

> The docstring is a bit long here and I'm not sure this is the first place users should be expected to go to find out how to evaluate their models. Maybe it would be good to add something about this to the documentation as well?

I agree, a docstring should definitely not contain all this information. I think it would be nice to have a .md section (as we did for docs/features.md) specific to docking, with a code snippet as an example, and to move a big part of the docstring there. Something like this?
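
As an indication of the kind of snippet such a docs page could contain (a sketch only, reusing the assumed compute_ppi_scores API from above):

```python
# Sketch of the second application: scoring a pool of generated structures
# against a native reference and ranking them by DockQ (higher is better).
# File paths are placeholders; the API is assumed, not confirmed.
from pathlib import Path

from deeprank2.tools.target import compute_ppi_scores

reference_path = "references/native.pdb"
results = {
    decoy.name: compute_ppi_scores(str(decoy), reference_path)
    for decoy in sorted(Path("decoys").glob("*.pdb"))
}

# Rank the pool by DockQ and show the top 5 decoys.
ranking = sorted(results.items(), key=lambda kv: kv[1]["dockq"], reverse=True)
for name, scores in ranking[:5]:
    print(f"{name}: DockQ = {scores['dockq']:.3f}")
```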

@DaniBodor (Collaborator) commented:

> I agree, a docstring should definitely not contain all this information. I think it would be nice to have a .md section (as we did for docs/features.md) specific to docking, with a code snippet as an example, and to move a big part of the docstring there. Something like this?

Yes, indeed.

@gcroci2 (Collaborator, Author) commented Sep 20, 2023

See here for the new page, @DaniBodor. Consider that part of the SS budget is meant for creating tutorials for the new applications, including docking. So for now having something indicative is fine; later we can do proper notebooks with more details (and improved code; I don't like the add_targets function at all). But since we'll work on docking tasks as well, we will develop an easier way to add targets to HDF5 files (I opened issue #499 about this).
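
For reference, a rough sketch of what adding a docking score as a target to a processed HDF5 file could look like today (the target_values group name and the pairing of decoys with natives are assumptions about the file layout; issue #499 tracks making this easier):

```python
import h5py

from deeprank2.tools.target import compute_ppi_scores


def add_dockq_targets(hdf5_path: str, pairs: dict[str, tuple[str, str]]) -> None:
    """Write a dockq target for each entry.

    `pairs` maps entry names to (decoy_pdb, reference_pdb) paths; the
    "target_values" group name is an assumed layout detail, not confirmed.
    """
    with h5py.File(hdf5_path, "r+") as f:
        for entry_name, (decoy, reference) in pairs.items():
            scores = compute_ppi_scores(decoy, reference)
            grp = f[entry_name].require_group("target_values")
            if "dockq" in grp:
                del grp["dockq"]  # overwrite any existing value
            grp.create_dataset("dockq", data=scores["dockq"])
```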

@DaniBodor (Collaborator) commented Sep 20, 2023

> See here for the new page, @DaniBodor. Consider that part of the SS budget is meant for creating tutorials for the new applications, including docking. So for now having something indicative is fine; later we can do proper notebooks with more details (and improved code; I don't like the add_targets function at all). But since we'll work on docking tasks as well, we will develop an easier way to add targets to HDF5 files (I opened issue #499 about this).

Indeed, I was going to mention that it might be good to include a notebook/tutorial, but I agree that for now this is fine. Maybe there could be a short comment in the README about this, though, as currently its existence is completely hidden from users.
Apart from that, the changes made all look good to me.

@gcroci2 (Collaborator, Author) commented Sep 20, 2023

> Maybe there could be a short comment in the README about this, though, as currently its existence is completely hidden from users. Apart from that, the changes made all look good to me.

I added a reference to it in the README and in the docs index. I'll wait for all the tests to pass and then I'll merge.

gcroci2 merged commit ada1790 into main on Sep 21, 2023 (7 checks passed).
gcroci2 deleted the 54_docking_metrics_gcroci2 branch on Sep 21, 2023.
gcroci2 added the SS label on Jan 10, 2024.
Successfully merging this pull request may close: Implement Docking Metrics class