Hidden Test Set and Testing Server #3
Forgot to mention: we'll also need a good testing server that groups can submit models to, plus a leaderboard for new models.
This is tricky -- my only idea would work for quantum mechanics datasets only. We can calculate energies with psi4 at some level of theory for molecules selected from a known library. We then release the training set but hide the test set, and strictly limit inference time so that real DFT cannot be run.
Any dataset that requires physical experimentation is too expensive, and there would be too many arguments about the data quality of the assay.
Cross-referencing this with deepchem/deepchem#1903. Would setting up a Jenkins build server be a good design for this? An alternative is a manual once-a-month update process. This could perhaps be done automatically with an AWS cron job (https://docs.aws.amazon.com/AmazonECS/latest/developerguide/scheduled_tasks.html).
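For the AWS route, the once-a-month update could be wired up with an EventBridge schedule rule -- a sketch only; the rule name and the ECS task it would trigger (`moleculenet-monthly-update`) are hypothetical placeholders:

```shell
# Fire at midnight UTC on the 1st of every month.
aws events put-rule \
    --name moleculenet-monthly-update \
    --schedule-expression "cron(0 0 1 * ? *)"

# The rule's target would then point at the ECS task definition that
# rebuilds the leaderboard (configured per the scheduled_tasks docs above).
```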
One of the biggest limitations of the original MoleculeNet was that there was no hidden test set. This means that many of the papers that have used MoleculeNet datasets test their methods on subsets of the datasets, and it's very hard to do an apples-to-apples comparison of different methods. The next generation of MoleculeNet should feature a common benchmark challenge with a hidden test set that can be used to evaluate models proposed by different research groups on a fair playing field.
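The key property of a hidden test set is that the server only ever returns an aggregate score, never the labels themselves. A minimal sketch of that scoring step, with made-up molecule IDs and energies standing in for the real hidden data:

```python
import math

# Hidden ground-truth labels; these live only on the server and are
# never released. Values here are fabricated placeholders.
HIDDEN_LABELS = {"mol_001": -76.4, "mol_002": -113.3}

def score_submission(predictions):
    """Score a submitted {molecule_id: predicted_energy} dict.

    Returns the mean absolute error against the hidden labels, which is
    all a submitting group ever sees -- the labels stay private.
    """
    missing = set(HIDDEN_LABELS) - set(predictions)
    if missing:
        raise ValueError(f"missing predictions for {sorted(missing)}")
    return sum(
        abs(predictions[mid] - HIDDEN_LABELS[mid]) for mid in HIDDEN_LABELS
    ) / len(HIDDEN_LABELS)
```

Because every group is scored by the same function on the same hidden labels, leaderboard entries become directly comparable, which is exactly the apples-to-apples property the original MoleculeNet lacked.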