Updated results for distribution drift issue #131
Labels
main hurdle/issue
This is an issue that was a pivotal moment during the project.
Comments
jyaacoub added the main hurdle/issue label on Aug 2, 2024
jyaacoub added a commit that referenced this issue on Aug 2, 2024
jyaacoub added a commit that referenced this issue on Aug 2, 2024
revert(splits): use random split to resolve distribution drift (#131)
SUMMARY (see below for stats on distributions - OncoKB vs random split dataset): The distribution looks visually different in terms of highly targeted proteins, but a similarity scoring algorithm run to quantify the difference between the two distributions (random split vs OncoKB split) found no real difference. This could be a fault of the scoring algorithm.
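The commit message does not name the similarity scoring algorithm that was used. As one possible way to put a number on the visual difference, the sketch below computes the Jensen-Shannon divergence between the per-protein frequency distributions of two splits. The choice of metric, the function names (`protein_distribution`, `js_divergence`, `split_drift`) and the input format (one protein ID per sample) are all assumptions made for illustration, not the project's code.

```python
import math
from collections import Counter


def protein_distribution(proteins, support):
    """Relative frequency of each protein ID in `support` (hypothetical helper)."""
    counts = Counter(proteins)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("cannot build a distribution from an empty split")
    return [counts[p] / total for p in support]


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2): 0 for identical, 1 for disjoint."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with zero probability contribute nothing to the sum.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def split_drift(split_a, split_b):
    """Compare two splits, each given as a list with one protein ID per sample."""
    support = sorted(set(split_a) | set(split_b))
    return js_divergence(
        protein_distribution(split_a, support),
        protein_distribution(split_b, support),
    )
```

A score near 0 would agree with the "no real difference" result, while a score near 1 would mean the two splits cover almost entirely different proteins. A frequency-only metric like this ignores sequence similarity between proteins, which is one way a score could miss a real drift.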
jyaacoub changed the title from "Distribution Drift issue with training and test dataset" to "Updated results for distribution drift issue" on Aug 2, 2024
jyaacoub added a commit that referenced this issue on Aug 7, 2024
… index renumbering #103
- Had to make some modifications, since the edge index needs to be updated after applying the mask so that it still points to the right nodes and we don't get an out-of-bounds "IndexError".
- There was also an error due to not removing all proteins without pocket sequences (line 216 saved the old dataset instead of the new one).
- Successfully built pocket datasets for davis and kiba #131 #103
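The renumbering problem described in this commit can be illustrated with a minimal, dependency-free sketch. It assumes the graph is stored as a node count plus an edge index of two parallel lists (sources and targets), and that the mask is one boolean per node. The function name `mask_and_renumber` and this data layout are hypothetical and are not taken from the repository.

```python
def mask_and_renumber(num_nodes, edge_index, keep):
    """Drop masked-out nodes and remap the edge index to the new node ids.

    edge_index: [sources, targets], two equal-length lists of old node ids.
    keep: list of booleans, True for every node that survives the mask.
    Returns (new_num_nodes, new_edge_index).
    """
    if len(keep) != num_nodes:
        raise ValueError("mask must have one entry per node")

    # Old id -> new id, i.e. the node's position among the kept nodes.
    new_id = [-1] * num_nodes
    next_id = 0
    for old, kept in enumerate(keep):
        if kept:
            new_id[old] = next_id
            next_id += 1

    sources, targets = edge_index
    new_sources, new_targets = [], []
    for s, t in zip(sources, targets):
        # Keep an edge only if both endpoints survive the mask.
        if keep[s] and keep[t]:
            new_sources.append(new_id[s])
            new_targets.append(new_id[t])

    return next_id, [new_sources, new_targets]


# Chain 0-1-2-3-4 where nodes 1 and 4 are masked out.
n, edges = mask_and_renumber(
    5, [[0, 1, 2, 3], [1, 2, 3, 4]], [True, False, True, True, False]
)
print(n, edges)  # 3 [[1], [2]]
```

Without the remapping step, the surviving edge would still read (2, 3) while only three nodes remain, so index 3 would be out of bounds, which is the kind of "IndexError" the commit mentions.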
There is a significant distribution drift due to the new training split we had created to exclude any proteins in OncoKB...
TODOs: