Improve Trainer and DeeprankDataset for production testing #510

Closed
5 tasks done
gcroci2 opened this issue Oct 6, 2023 · 1 comment · Fixed by #515
Assignees
Labels
stale issue not touched from too much time

Comments

@gcroci2
Collaborator

gcroci2 commented Oct 6, 2023

There are some issues when using the package for testing a pre-trained model on newly generated data:

  • The `GraphDataset` class requires `dataset_train` as input even in such cases (whenever `train` is `False`). We should be able to use a test dataset without needing the original model's training dataset. We can use the information stored in the pre-trained model to inherit the needed attributes (see `_check_inherited_params` in `dataset.py`).
  • In the `Trainer` class's `__init__`, there is a check for the target before the parameters and the pre-trained model are loaded; in the pre-trained model case the target may not be present at all.
  • The `Trainer` class expects the attribute `epoch_saved_model`, which should be saved within the state of the pre-trained model.
  • If the test dataset has no labels, the output exporter doesn't work (`ValueError("All arrays must be of the same length")`).
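
One way to read the first and third points is that everything the test dataset needs could travel with the saved model. Below is a minimal sketch of that idea in plain Python, not the package's actual code: the helper `inherit_params`, the `INHERITED_KEYS` tuple, and the layout of the `state` dict are all hypothetical and only illustrate the fallback from training dataset to pre-trained state.

```python
from typing import Optional

# Hypothetical names of the attributes a test dataset has to share with training.
INHERITED_KEYS = ("node_features", "edge_features", "target", "task")


def inherit_params(
    pretrained_state: Optional[dict] = None,
    train_params: Optional[dict] = None,
) -> dict:
    """Return the attributes a test dataset should inherit.

    Uses the training dataset's parameters when they are given, otherwise
    falls back to what was stored with the pre-trained model.
    """
    source = train_params if train_params is not None else pretrained_state
    if source is None:
        raise ValueError(
            "Provide either the training parameters or a pre-trained model state."
        )
    missing = [key for key in INHERITED_KEYS if key not in source]
    if missing:
        raise KeyError(f"Missing inherited attributes: {missing}")
    # Only the inherited keys are returned; other entries stay in the state.
    return {key: source[key] for key in INHERITED_KEYS}


# Mock state, as it might be saved next to the weights (assumed layout).
state = {
    "node_features": ["feature_a"],
    "edge_features": ["feature_b"],
    "target": "binary",
    "task": "classif",
    "epoch_saved_model": 12,
}
params = inherit_params(pretrained_state=state)
```

With something along these lines, a test dataset would not need `dataset_train` at all as long as the saved state carries the same keys, and `epoch_saved_model` could be read from that same state.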

To make sensible changes, I think we need to cover all the possible scenarios with a mock example:

  • No pre-trained model, train, valid, and test. (should be good)
  • No pre-trained model, train, valid, no test. (should be good)
  • No pre-trained model, train only. (should be good)
  • Pre-trained model, test only, with labels. (the one to improve the code for)
  • Pre-trained model, test only, with no labels. (the one to improve the code for)
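
The scenarios above could be written down as data, so a mock test can loop over them. The sketch below is only an illustration under assumptions: `Scenario`, `needs_target`, and `build_output_columns` are hypothetical names, not part of the package. It also shows one possible way to keep the exported columns the same length when the test set has no labels, since unequal column lengths are what trigger the "All arrays must be of the same length" error.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    pretrained: bool
    train: bool
    valid: bool
    test: bool
    test_labels: bool = True


# The five scenarios listed in this issue, in the same order.
SCENARIOS = [
    Scenario(pretrained=False, train=True, valid=True, test=True),
    Scenario(pretrained=False, train=True, valid=True, test=False),
    Scenario(pretrained=False, train=True, valid=False, test=False),
    Scenario(pretrained=True, train=False, valid=False, test=True, test_labels=True),
    Scenario(pretrained=True, train=False, valid=False, test=True, test_labels=False),
]


def needs_target(scenario: Scenario) -> bool:
    """A target is required when training, or when the test set is labelled."""
    return scenario.train or (scenario.test and scenario.test_labels)


def build_output_columns(entries, outputs, targets=None):
    """Build equal-length columns, filling targets with None when labels are absent."""
    if len(entries) != len(outputs):
        raise ValueError("entries and outputs must have the same length")
    if targets is None:
        targets = [None] * len(entries)
    elif len(targets) != len(entries):
        raise ValueError("targets must match the number of entries")
    return {"entry": list(entries), "output": list(outputs), "target": list(targets)}


# Unlabelled test set: the target column is padded instead of left empty.
columns = build_output_columns(["entry_1", "entry_2"], [0.1, 0.9])
```

Only the last scenario runs without a target, which is the case where the target check in the `Trainer` would have to be skipped or deferred.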
@gcroci2 gcroci2 changed the title Create Tester class Improve Trainer and DeeprankDataset for production testing Oct 6, 2023
@gcroci2 gcroci2 added the priority Solve this first label Oct 9, 2023
@gcroci2 gcroci2 self-assigned this Oct 19, 2023

This issue is stale because it has been open for 30 days with no activity.

@github-actions github-actions bot added the stale issue not touched from too much time label Nov 24, 2023
@gcroci2 gcroci2 closed this as completed Jan 3, 2024
@gcroci2 gcroci2 removed the priority Solve this first label Mar 19, 2024
@gcroci2 gcroci2 moved this to Done in Development Jul 12, 2024
Projects
Status: Done

1 participant