This repository was archived by the owner on Oct 14, 2018. It is now read-only.

Asynchronous algorithms and "Good enough" RandomSearchCV #32

Open

mrocklin opened this issue Mar 10, 2017 · 7 comments

@mrocklin

It would be interesting to start exploring asynchronous algorithms within this project using the dask.distributed API. Because this API is somewhat different, it might be wise to start with something simple.

One simple application would be to build a variant of RandomSearchCV that, rather than taking a fixed number of candidates to try, takes a stopping criterion like "have tried 100 options without improving more than 1%" and keeps submitting computations until that criterion is met.

My initial approach to do this would be to periodically check the number of cores I had

client = distributed.client.default_client()
ncores = sum(client.ncores().values())

and try to keep roughly twice that many candidates in flight

candidate_pool = create_infinite_candidates(parameterspace)
futures = client.map(try_and_score, list(toolz.take(ncores * 2, candidate_pool)))

Then I would consume those futures as they finished

af = distributed.as_completed(futures)
for future in af:
    score, params = future.result()
    if score > best:
        best = score
        best_params = params
        ...

and then submit new futures as necessary

    future = client.submit(try_and_score, next(candidate_pool))
    af.add(future)

If we wanted to be cool, we could also check the number of cores periodically and submit more or less accordingly.
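
Putting those pieces together, a rough end-to-end sketch might look like the following (create_infinite_candidates, parameter_space, and try_and_score are hypothetical placeholders, and the "100 options without improving more than 1%" criterion is simplified here to "100 candidates without any improvement"):

import toolz
import distributed

client = distributed.client.default_client()
ncores = sum(client.ncores().values())

# Keep roughly twice as many candidates in flight as there are cores.
candidate_pool = create_infinite_candidates(parameter_space)
futures = client.map(try_and_score, list(toolz.take(ncores * 2, candidate_pool)))

best, best_params, stale = None, None, 0
af = distributed.as_completed(futures)
for future in af:
    score, params = future.result()
    if best is None or score > best:
        best, best_params, stale = score, params, 0
    else:
        stale += 1
    if stale >= 100:  # stopping criterion: no improvement in the last 100 results
        break
    # Replace each finished future with a fresh candidate to keep the pool full.
    af.add(client.submit(try_and_score, next(candidate_pool)))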

cc @jcrist @amueller

@amueller commented Mar 11, 2017

Which stopping criterion to use is independent of distributing it, right?
This sounds like a nice simple example.

If you're looking for something more complex, the parallelization here would be interesting: https://papers.nips.cc/paper/4522-practical-bayesian-optimization-of-machine-learning-algorithms.pdf

I think everything except the parallelization is implemented in scikit-optimize. You might need some ML help, maybe @MechCoder is interested ;)
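
(Not from the original thread, but to make the connection concrete: scikit-optimize exposes an ask/tell Optimizer, and one way to drive it with dask.distributed futures is sketched below. The search space and objective are toy placeholders, and the batching scheme is an assumption, not spearmint's actual algorithm.)

import distributed
from skopt import Optimizer

client = distributed.Client()
opt = Optimizer(dimensions=[(-5.0, 5.0), (-5.0, 5.0)])  # toy 2-d search space

def objective(x):
    return sum(xi ** 2 for xi in x)  # toy objective

# Seed a small batch so several evaluations run concurrently.
futures = {}
for x in opt.ask(n_points=4):
    f = client.submit(objective, x, pure=False)  # pure=False so repeated points get distinct futures
    futures[f] = x

af = distributed.as_completed(futures)
n_done = 0
for future in af:
    x = futures.pop(future)
    opt.tell(x, future.result())  # update the surrogate model with the new observation
    n_done += 1
    if n_done >= 50:  # simple fixed evaluation budget
        break
    x_new = opt.ask()
    f_new = client.submit(objective, x_new, pure=False)
    futures[f_new] = x_new
    af.add(f_new)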

@eriknw commented Apr 3, 2017

I have begun the dask-patternsearch project, which may be useful here. You can think of it as a line-search algorithm generalized to multiple dimensions without using derivatives. Its distinguishing feature is that you can throw as many cores at it as you want. In principle, optimizing a 100-dimensional problem (that fits in memory) on 100,000 cores should be just fine. The algorithm isn't particularly fancy, smart, or optimal, but I would say it's better than grid search and Nelder-Mead simplex.

Read more about it here:
http://www.cs.wm.edu/~va/research/sirev.pdf
http://www.cs.wm.edu/~va/research/deto.pdf

Feel free to direct questions to the repo. Again, the project is brand-spanking-new and is missing such nice things as documentation and tests.

Example usage:

import numpy as np
import distributed
from dask_patternsearch import search
client = distributed.Client()

def f(x):
    return x.dot(x)

x0 = np.array([12.3, 4.5])
stepsize = np.array([1, 1])
results = search(client, f, x0, stepsize, stopratio=1e-5)

https://github.com/eriknw/dask-patternsearch

@thomasgreg commented May 16, 2017

Hi, I've been working on an online/async version of the code here: https://github.com/thomasgreg/dask-searchcv/tree/online-fit.

It ended up involving a refactoring of much of the code in model_selection.py to allow updates to the graph without generating extra tasks (the existing code assumes that the full parameter space is known beforehand).

If this approach looks useful I can clean up the branch and put in a PR (it's a bit messy right now as I was hesitant to just replace the existing batch implementation fully without feedback).

Most tests are passing, apart from a few edge cases I'm still working out (a dask delayed object in fit_params, the graph size without caching, ...). There's an example script at examples/async_randomsearchcv.py.

Best regards,
Greg

@jcrist commented May 16, 2017

Thanks for taking a look at this. I suggest you submit a WIP PR early and we can discuss it more there. At first glance, I'm not sure what your intent is here. What problem are you trying to solve? If the goal is to solve this issue, I think there are simpler solutions that don't involve major changes to the existing code. If the goal is to make the existing classes asynchronous, I also think there are simpler ways to accomplish that.

thomasgreg mentioned this issue May 16, 2017
@thomasgreg
Thanks for the quick reply - I've submitted a PR. I understand that a big code change is a headache, and in hindsight I can see how it might be done in a simpler way with minimal change.

@amueller
@eriknw you should run a benchmark against spearmint, which is an implementation of the paper I linked to. I think they have a distributed implementation, though that might not be open source.

@eriknw commented May 16, 2017

Thanks for the suggestion @amueller, and for letting me know they have a distributed implementation. I plan to run comparative benchmarks against a couple of notable packages and will definitely include spearmint.

I plan to implement adaptive step sizes in dask-patternsearch before putting much effort into benchmarks. Medium term, I would like to allow other optimizers to be easily used in combination with dask-patternsearch. Pattern search optimization readily allows "oracles" to propose trial points, and the method you linked to should be an excellent candidate for an oracle. Longer term, I'd like to see a multi-armed bandit approach perform real-time resource allocation across optimizers (i.e., allocate more cores to the methods most likely to yield the most improvement).

I've had a rigorous travel schedule the past month, but I'll continue working on dask-patternsearch soon and should have it reasonably functional and presentable in time for the SciPy conference. I expect we can get dask-searchcv to use dask-patternsearch some time this summer.
