
Initial point samplers #808

Merged: 28 commits into develop from uri/split_initial_point_generation, Feb 8, 2024

Conversation

@uri-granta (Collaborator) commented Jan 16, 2024

Related issue(s)/PRs:

Summary

Add support for custom samplers for generating the initial point candidates for optimization. This solves two problems:

  1. It allows including pre-computed points among the candidates.
  2. It allows batching the random sampling, to avoid running out of memory on high-dimensional problems.

Fully backwards compatible: yes

Given how often generate_continuous_optimizer is used (and often with keyword arguments), I think we should make an extra effort to maintain backwards compatibility here. The current approach is simply to expand num_initial_samples so that it can take either an int or a sampler. An alternative approach would be to add an optional initial_point_sampler parameter, but the downside of that is that it wouldn't be possible to catch cases where people accidentally pass in both num_initial_samples and an initial_point_sampler.
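
To make the two calling styles concrete, here is a minimal sketch. It assumes the sampler interface visible in the test excerpt further down (a callable taking the search space and yielding batches of points); mixed_sampler and its constants are illustrative, not the merged code:

```python
from typing import Iterable

import tensorflow as tf

from trieste.space import SearchSpace
from trieste.types import TensorType


def mixed_sampler(space: SearchSpace) -> Iterable[TensorType]:
    # A batch of pre-computed candidate points first...
    yield tf.constant([[0.5], [1.5]], dtype=tf.float64)
    # ...then random candidates, in memory-friendly batches rather than
    # one huge space.sample call.
    for _ in range(10):
        yield space.sample(1_000)


# Old style (still supported): an int means "this many random samples".
# optimizer = generate_continuous_optimizer(num_initial_samples=1_000)
# New style: pass a sampler instead of an int.
# optimizer = generate_continuous_optimizer(num_initial_samples=mixed_sampler)
```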

PR checklist

  • The quality checks are all passing
  • The bug case / new feature is covered by tests
  • Any new features are well-documented (in docstrings or notebooks)

@hstojic (Collaborator) commented Jan 26, 2024

This can work as a short-term solution, but we would want to make it more flexible: we could pass generators of initial samples to the optimizer and provide a nice function/class for random sample generators. That would then allow us to pass other types of generators when needed, e.g. query points generated by some other optimization process, which would provide better starting points.

@uri-granta (Collaborator, Author) replied:

> This can work as a short-term solution, but we would want to make it more flexible: we could pass generators of initial samples to the optimizer and provide a nice function/class for random sample generators. That would then allow us to pass other types of generators when needed, e.g. query points generated by some other optimization process, which would provide better starting points.

Are generators definitely the way to go here? Space.sample doesn't currently support this, and I worry that a generator yielding 10,000 samples one at a time would be much less efficient than a single call to sample(10_000).

Another approach would be to provide an additional optional argument initial_sampler: Callable[[int], TensorType] that lets you specify a function returning a given number of initial sample points. The default behaviour would be equivalent to initial_sampler = space.sample, but we could provide a helper function so people could write something like initial_sampler = select_from_samples(generator) if they wanted to. This would work nicely with split_initial_samples and, more importantly, would not break the current generate_continuous_optimizer API.
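
A sketch of what that helper might look like. select_from_samples is hypothetical (it is named only in this comment and was never merged), and the batching via itertools.islice is an assumption:

```python
from itertools import islice
from typing import Callable, Iterator

import tensorflow as tf

from trieste.types import TensorType


def select_from_samples(generator: Iterator[TensorType]) -> Callable[[int], TensorType]:
    """Adapt a per-point generator to the proposed Callable[[int], TensorType] interface."""

    def initial_sampler(num_samples: int) -> TensorType:
        # Draw the next num_samples points from the generator and stack
        # them into a single [num_samples, D] tensor.
        return tf.stack(list(islice(generator, num_samples)))

    return initial_sampler
```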

@hstojic (Collaborator) commented Jan 26, 2024

> > This can work as a short-term solution, but we would want to make it more flexible: we could pass generators of initial samples to the optimizer and provide a nice function/class for random sample generators. That would then allow us to pass other types of generators when needed, e.g. query points generated by some other optimization process, which would provide better starting points.
>
> Are generators definitely the way to go here? Space.sample doesn't currently support this, and I worry that a generator yielding 10,000 samples one at a time would be much less efficient than a single call to sample(10_000).
>
> Another approach would be to provide an additional optional argument initial_sampler: Callable[[int], TensorType] that lets you specify a function returning a given number of initial sample points. The default behaviour would be equivalent to initial_sampler = space.sample, but we could provide a helper function so people could write something like initial_sampler = select_from_samples(generator) if they wanted to. This would work nicely with split_initial_samples and, more importantly, would not break the current generate_continuous_optimizer API.

It doesn't have to be generators; that's just what seemed like it could accommodate a variety of initial samples. initial_sampler could do the trick, though it would perhaps need to be a list of samplers, e.g. we might want some random samples and some pre-optimised initial points.

@vpicheny any thoughts?

@uri-granta (Collaborator, Author) commented:

I've added something that I think is flexible enough but still easy to use. Opinions?

@hstojic (Collaborator) left a review comment:

Left a few comments for improvement, but overall it looks much better; I think this would fit the bill. With a sampler, I can create a custom one that mixes random samples with points obtained in other ways, to provide a better start to the optimiser.

Comment on lines 192 to 195
```python
if len(sequence) < offset + num_samples:
    raise ValueError(
        f"Insufficient samples ({offset + num_samples} required, {len(sequence)} available)"
    )
```
@hstojic (Collaborator):

Should we also do a basic check that the dimensionality is correct?
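
A rough sketch of what such a check might look like, continuing the excerpt above. It assumes the search space (`space` here) is in scope, which the excerpt doesn't show, and that SearchSpace's dimension property is available:

```python
# Hypothetical dimensionality check, sitting alongside the length check above.
if len(sequence) > 0 and sequence[0].shape[-1] != space.dimension:
    raise ValueError(
        f"Samples have dimension {sequence[0].shape[-1]}, expected {space.dimension}"
    )
```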

Outdated review thread on trieste/acquisition/optimizer.py (resolved).
```python
def generate_continuous_optimizer(
    num_initial_samples: int = NUM_SAMPLES_MIN,
    num_optimization_runs: int = 10,
    num_recovery_runs: int = 10,
    optimizer_args: Optional[dict[str, Any]] = None,
    split_initial_samples: Optional[int] = 100_000,
```
@hstojic (Collaborator):

I'm thinking that with initial samplers we can simplify the arguments here, i.e. subsume split_initial_samples and num_initial_samples within the samplers: each call of the sampler would then give one split of the samples, and we'd call it until it is exhausted. What do you think? It would be a breaking change, but we are releasing 3.0 anyway :)

@uri-granta (Collaborator, Author) replied:

That is a much better idea! I've pushed a first attempt at this but will see what breaks and write some tests before asking for a review.

As an aside, it's actually possible to do this without breaking the interface, by still letting people specify an int for num_initial_samples if they want to (roughly as sketched below). There's a case for renaming the argument from num_initial_samples to initial_samples, but I'm worried that this would break a lot of code (and not just ours). Another option is to add a separate initial_samples argument and make people specify exactly one of it and num_initial_samples.
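
A minimal sketch of that backwards-compatible normalisation, assuming a sampler is a callable from search space to an iterable of sample batches (per the tests below); the names _to_sampler and InitialPointSampler and the batch_size default are illustrative, not the merged implementation:

```python
from typing import Callable, Iterable, Union

from trieste.space import SearchSpace
from trieste.types import TensorType

# Hypothetical alias; the merged code may name this differently.
InitialPointSampler = Callable[[SearchSpace], Iterable[TensorType]]


def _to_sampler(
    num_initial_samples: Union[int, InitialPointSampler],
    batch_size: int = 100_000,
) -> InitialPointSampler:
    """Normalise an int (old API) or a sampler (new API) into a sampler."""
    if not isinstance(num_initial_samples, int):
        return num_initial_samples  # already a sampler: use it as-is

    def sampler(space: SearchSpace) -> Iterable[TensorType]:
        # Emit the requested number of random samples in batches, so
        # high-dimensional spaces don't exhaust memory in one sample() call.
        remaining = num_initial_samples
        while remaining > 0:
            yield space.sample(min(remaining, batch_size))
            remaining -= batch_size

    return sampler
```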

Further review threads on trieste/acquisition/optimizer.py (all resolved).
@uri-granta uri-granta changed the title Split initial point generation Initial point samplers Feb 3, 2024
@uri-granta uri-granta marked this pull request as ready for review February 4, 2024 11:50
@hstojic (Collaborator) left a review comment:

Great work! Left a few minor comments, but I think this is it :)

It would still be good to run it past @khurram-ghani as well before merging.

P.S. This will make it much easier to test alternative generate_initial_points implementations, e.g. with clustering, to achieve better diversity of starting points between the runs.

```python
@pytest.mark.parametrize("num_initial_points", [0, 1, 2, 3, 4])
def test_generate_initial_points(num_initial_points: int) -> None:
    def sampler(space: SearchSpace) -> Iterable[TensorType]:
        assert space == Box([-1], [2])
```
@hstojic (Collaborator):

You are using this space in multiple tests; perhaps create a constant at the top and reuse it? Or a fixture?
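
Either version of that suggestion is a one-liner; a sketch, with illustrative names:

```python
import pytest

from trieste.space import Box

# Module-level constant shared across the tests...
SEARCH_SPACE = Box([-1], [2])


# ...or, equivalently, a pytest fixture.
@pytest.fixture
def search_space() -> Box:
    return SEARCH_SPACE
```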

Review threads on tests/unit/acquisition/test_optimizer.py (resolved).
@khurram-ghani (Collaborator) left a review comment:

One important comment I think (on -inf), and some minor ones.

Review threads on trieste/acquisition/optimizer.py and tests/unit/acquisition/test_optimizer.py (resolved).
@khurram-ghani (Collaborator) left a review comment:

It is a really good and flexible implementation!

@uri-granta uri-granta merged commit 3a12b33 into develop Feb 8, 2024
12 checks passed
@uri-granta uri-granta deleted the uri/split_initial_point_generation branch February 8, 2024 09:46