ENH: Adds selective hyperparameter optimization #58

Open
wants to merge 2 commits into
base: main
58 changes: 35 additions & 23 deletions src/ageml/commands.py
@@ -100,7 +100,7 @@ def configure_parser(self):
         self.parser.add_argument(
             "-ht",
             "--hyperparameter_tuning",
-            nargs=1,
+            nargs="+",
             default=["0"],
             help=messages.hyperparameter_grid_description,
         )
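For reference, a small sketch of what switching to `nargs="+"` means for the parsed value. The parser below is a hypothetical stand-alone one; only the `-ht` option mirrors the diff, and the sample tokens are illustrative.

```python
import argparse

# Hypothetical stand-alone parser; only the -ht option mirrors the diff.
parser = argparse.ArgumentParser()
parser.add_argument("-ht", "--hyperparameter_tuning", nargs="+", default=["0"])

# With nargs="+", every token after the flag is collected into one list of strings.
args = parser.parse_args(["-ht", "10", "C=1,3"])
tuning = args.hyperparameter_tuning

# When the flag is omitted, the default list is returned unchanged.
default_tuning = parser.parse_args([]).hyperparameter_tuning

print(tuning, default_tuning)
```

The first element is still the grid-point count, so any code that reads the value has to index `[0]` and treat the remaining elements as optional `name=values` items.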
@@ -164,12 +164,28 @@ def configure_args(self, args):
         else:
             args.model_params = {}

-        # Set hyperparameter grid search value
-        if len(args.hyperparameter_tuning) > 1 or not args.hyperparameter_tuning[0].isdigit():
+        # Parse hyperparameter_tuning values
+        hyperparam_tuning = args.hyperparameter_tuning
+        if not hyperparam_tuning[0].isdigit():
             raise ValueError("Hyperparameter grid points must be a non negative integer.")
         else:
-            args.hyperparameter_tuning = args.hyperparameter_tuning[0]
-            args.hyperparameter_tuning = int(convert(args.hyperparameter_tuning))
+            args.hyperparameter_tuning = int(convert(hyperparam_tuning[0]))
+
+        hyperparameter_params = {}
+        if len(hyperparam_tuning) > 1:
+            for item in hyperparam_tuning[1:]:
+                if item.count("=") != 1:
+                    err_msg = (
+                        "Hyperparameter tuning parameters must be in the format "
+                        "param1=value1_low,value1_high param2=value2_low, value2_high..."
+                    )
Contributor

This should also check that exactly two values are given, a low and a high. What happens if someone passes C=1,2,3? That should throw an error, since the user should write C=1,3 with ht=3 to get the 3 hyperparameter points 1, 2, 3. Also, ht should be at least 2.

Contributor

What about hyperparameters like kernels, where you want to choose between different kernels? These should not be affected by ht.

Contributor Author

Addressed! Thanks for this.
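A minimal sketch of the checks requested in this thread, not the code that was actually pushed. The function name `parse_tuning_args` is hypothetical, `float` stands in for the project's `convert` helper (whose behavior is not shown in the diff), and enforcing the minimum of 2 grid points only when parameters are supplied is an assumption made so that the default `["0"]` keeps working.

```python
def parse_tuning_args(tokens):
    """Split ['<n_points>', 'name=low,high', 'name=a,b,c', ...] into
    (n_points, numeric_ranges, categorical_choices)."""
    if not tokens or not tokens[0].isdigit():
        raise ValueError("Hyperparameter grid points must be a non negative integer.")
    n_points = int(tokens[0])
    # Assumption: the minimum of 2 only applies when parameters are being tuned.
    if tokens[1:] and n_points < 2:
        raise ValueError("At least 2 grid points are needed to tune a range.")

    ranges, choices = {}, {}
    for item in tokens[1:]:
        if item.count("=") != 1:
            raise ValueError(f"Expected name=value1,value2 but got {item!r}.")
        key, raw = item.split("=")
        values = raw.split(",")
        if not key or any(v == "" for v in values):
            raise ValueError(f"Empty name or value in {item!r}.")
        try:
            numbers = [float(v) for v in values]
        except ValueError:
            # Non-numeric values (e.g. kernel names) are a fixed list of choices,
            # so any number of them is allowed and n_points does not apply.
            choices[key] = values
            continue
        if len(numbers) != 2:
            raise ValueError(f"{key} needs exactly two values (low,high), got {len(numbers)}.")
        ranges[key] = numbers
    return n_points, ranges, choices


n_points, ranges, choices = parse_tuning_args(["3", "C=1,3", "kernel=linear,rbf,poly"])
print(n_points, ranges, choices)
```

With this split, `C=1,2,3` raises a `ValueError`, while `kernel=linear,rbf,poly` is accepted as three categorical choices regardless of the grid-point count.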

+                    raise ValueError(err_msg)
+                key, value = item.split("=")
+                low, high = value.split(",")
+                hyperparameter_params[key] = [convert(low), convert(high)]
+        # Add attribute to args
+        args.hyperparameter_params = hyperparameter_params

         # Set polynomial feature extension value
         if len(args.feature_extension) > 1 or not args.feature_extension[0].isdigit():
             raise ValueError("Polynomial feature extension degree must be a non negative integer.")
@@ -227,13 +243,11 @@ def configure_parser(self):
             help=messages.factors_long_description,
         )

-        self.parser.add_argument("--covariates", metavar="FILE", required=False,
-                                 help=messages.covar_long_description)
-        self.parser.add_argument("--clinical", metavar="FILE", required=False,
-                                 help=messages.clinical_long_description)
-        self.parser.add_argument("--covcorr_mode", metavar="MODE", required=False,
-                                 choices=["cn", "each", "all"],
-                                 help=messages.covcorr_mode_long_description)
+        self.parser.add_argument("--covariates", metavar="FILE", required=False, help=messages.covar_long_description)
+        self.parser.add_argument("--clinical", metavar="FILE", required=False, help=messages.clinical_long_description)
+        self.parser.add_argument(
+            "--covcorr_mode", metavar="MODE", required=False, choices=["cn", "each", "all"], help=messages.covcorr_mode_long_description
+        )


 class ClinicalGroups(Interface):
@@ -284,11 +298,10 @@ def configure_parser(self):
         )

         # Optional arguments
-        self.parser.add_argument("--covariates", metavar="FILE", required=False,
-                                 help=messages.covar_long_description)
-        self.parser.add_argument("--covcorr_mode", metavar="MODE", required=False,
-                                 choices=["cn", "each", "all"],
-                                 help=messages.covcorr_mode_long_description)
+        self.parser.add_argument("--covariates", metavar="FILE", required=False, help=messages.covar_long_description)
+        self.parser.add_argument(
+            "--covcorr_mode", metavar="MODE", required=False, choices=["cn", "each", "all"], help=messages.covcorr_mode_long_description
+        )


 class ClinicalClassification(Interface):
@@ -372,12 +385,11 @@ def configure_parser(self):
         )

         # Optional arguments
-        self.parser.add_argument("--covariates", metavar="FILE", required=False,
-                                 help=messages.covar_long_description)
-        self.parser.add_argument("--covcorr_mode", metavar="MODE", required=False,
-                                 choices=["cn", "each", "all"],
-                                 help=messages.covcorr_mode_long_description)
-
+        self.parser.add_argument("--covariates", metavar="FILE", required=False, help=messages.covar_long_description)
+        self.parser.add_argument(
+            "--covcorr_mode", metavar="MODE", required=False, choices=["cn", "each", "all"], help=messages.covcorr_mode_long_description
+        )
+
     def configure_args(self, args):
         """Configure argumens with required fromatting for modelling.

4 changes: 2 additions & 2 deletions src/ageml/messages.py
@@ -108,8 +108,8 @@

 hyperparameter_grid_description = (
     "Number of points for which the hyperparameter optimization Grid Search will train\n"
-    "a model. The parameter ranges are predefined. An integer is required.\n"
-    "(e.g. -ht 100 / --hyperparameter_tuning 100)"
+    "a model, and parameter ranges to sample from. An integer is required, followed \n"
+    "by the parameters to optimize. (e.g. -ht 100 C=1,2,3 kernel=linear,rbf)"
Contributor

Shouldn't this be only C=1,2, a low and a high value? Also, how do you handle someone passing 3 kernel types? And 100 hyperparameter grid points is unrealistic; an example that large may lead users to request too many.

Contributor Author

True, changed the C=1,2,3 to C=1,3. Also changed 100 to 10.
Working on dealing with string hyperparams.
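
To make the agreed convention concrete, here is a hedged sketch of how `-ht 3 C=1,3 kernel=linear,rbf` could expand into a grid. It assumes evenly spaced (linear) points between low and high, which matches the "1, 2, 3" example in the review but is not confirmed by the diff; `grid_points` and `build_grid` are hypothetical names.

```python
def grid_points(low, high, n_points):
    """Return n_points evenly spaced values from low to high, endpoints included."""
    if n_points < 2:
        raise ValueError("n_points must be at least 2.")
    step = (high - low) / (n_points - 1)
    return [low + i * step for i in range(n_points)]


def build_grid(n_points, ranges, choices):
    """Expand numeric (low, high) ranges; pass categorical choices through as-is."""
    grid = {name: grid_points(low, high, n_points) for name, (low, high) in ranges.items()}
    grid.update(choices)  # string options such as kernels are not resampled
    return grid


grid = build_grid(3, {"C": (1, 3)}, {"kernel": ["linear", "rbf"]})
print(grid)
```

The result is a dict mapping each parameter name to a list of candidate values, so the number of kernels stays independent of the grid-point count.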

 )

 thr_long_description = "Threshold for classification. Default: 0.5 \n" "The threshold is used for assingning hard labels. (e.g. --thr 0.5)"