ML hyperparameters? #1

Open
ahundt opened this issue Feb 8, 2018 · 3 comments

ahundt commented Feb 8, 2018

Could this actually be used to optimize machine learning hyperparameters? Must there be an existing dataset of samples?

It would be cool to try optimizing my Keras/TF network with this. I know this is research code, so I totally understand if it simply isn’t set up for actual use of that sort.

In particular, can it deal with both discrete and continuous parameters, for example the choice of optimizer and the learning rate, respectively? I know one can map between continuous and discrete with a categorical encoding, but I’m not sure that’s appropriate. I’m here because I was looking at GPyOpt, which looks like it can do what I need, but I came across the paper and it seemed interesting, especially considering the comparison and claims in it.
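
For reference, this is roughly what a mixed continuous/discrete search space looks like in GPyOpt; the parameter names, ranges, and objective below are just placeholders standing in for my actual training run, not anything from this repo:

    import numpy as np
    import GPyOpt

    def objective(x):
        # GPyOpt passes a 2-D array with one row per suggested point.
        lr, opt_id = x[0, 0], int(x[0, 1])
        # ... here one would train the Keras/TF model with these settings and
        # return the validation loss; a random placeholder is returned instead.
        return np.array([[np.random.rand()]])

    domain = [
        {'name': 'learning_rate', 'type': 'continuous', 'domain': (1e-5, 1e-1)},
        # 0/1/2 could index e.g. SGD/Adam/RMSprop; GPyOpt also offers a
        # 'categorical' type that is one-hot encoded internally.
        {'name': 'optimizer', 'type': 'discrete', 'domain': (0, 1, 2)},
    ]

    bo = GPyOpt.methods.BayesianOptimization(f=objective, domain=domain)
    bo.run_optimization(max_iter=20)
    print(bo.x_opt, bo.fx_opt)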

This isn’t a topic I’m super familiar with, so please forgive the naive questions, and thanks for your consideration.

nrontsis commented Feb 27, 2018

Hey, thanks for your interest and sorry for the delay in replying.

Yes, it is research code, but you can definitely do that. As a matter of fact, in an updated version of the paper I performed BO to tune OpenAI's PPO baselines. In the coming days I plan to update both the paper on arXiv and the code to include the PPO example, so it will be easier for you to test my algorithm. Other, more polished options include Spearmint, Scikit-optimize, GPyOpt, DiceOptim and Cornell-MOE.

A couple of remarks, in case you try the algorithm:

  • OEI is by definition more explorative. Using rough kernels (Matérn 3/2 or 5/2) is recommended, as they make interpolating regions more interesting.
  • Although not properly tested, the code supports noisy objectives, using the plug-in heuristic y_min = min(posterior mean at the observations); see the sketch after this list. Keep in mind that I haven't evaluated the algorithm's performance on noisy objectives, as the relative performance of the different heuristics for handling the noisy case is a discussion on its own (see e.g. Picheny et al.).
  • Finally, a considerable speedup could be achieved by parallelising the gradient-descent restarts in the optimization function.
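
To make the plug-in heuristic in the second bullet concrete, here is a generic sketch (not this repo's code) of computing y_min from a GP fitted with GPy and a Matérn 5/2 kernel; the data are synthetic placeholders for noisy hyperparameter evaluations:

    import numpy as np
    import GPy

    # Noisy synthetic observations standing in for objective evaluations.
    X = np.random.uniform(-2.0, 2.0, size=(20, 1))
    Y = np.sin(3.0 * X) + 0.1 * np.random.randn(20, 1)

    # Fit a GP with a rough Matern 5/2 kernel.
    kernel = GPy.kern.Matern52(input_dim=1)
    model = GPy.models.GPRegression(X, Y, kernel)
    model.optimize()

    # Plug-in incumbent: minimum of the posterior mean at the observed inputs.
    mu, _ = model.predict(X)
    y_min = mu.min()
    print(y_min)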


nrontsis commented Feb 27, 2018

The code is now updated and includes the PPO case (see these lines). You could run it like this:

python run.py --function=RoboschoolHopper-v1 --noise=1e-6

The paper is under peer review now; I will update it on arXiv once that is over.


ahundt commented Mar 2, 2018

Cool, thanks! Yeah, I’ve been running with GPyOpt, and I just ended up running the random-search stage for 2/3 of the steps before enabling the actual optimization for the final third. These SGD-trained objectives are definitely extremely noisy.
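
For anyone reading later, that split is easy to express in GPyOpt; the budget, domain, and objective below are placeholders for my actual setup, not anything specific to this repo:

    import numpy as np
    import GPyOpt

    def objective(x):
        # Placeholder for the real (noisy) training/validation objective.
        return np.array([[float(np.sum(x ** 2)) + 0.1 * np.random.randn()]])

    domain = [{'name': 'learning_rate', 'type': 'continuous', 'domain': (1e-5, 1e-1)}]

    budget = 60
    n_random = 2 * budget // 3      # random-search stage (2/3 of the budget)
    n_bo = budget - n_random        # model-based stage (final third)

    bo = GPyOpt.methods.BayesianOptimization(
        f=objective,
        domain=domain,
        initial_design_numdata=n_random,
        initial_design_type='random',
        exact_feval=False,          # observations are noisy
    )
    bo.run_optimization(max_iter=n_bo)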
