George De Ath 1, Richard M. Everson 1, Alma A. M. Rahat 2, Jonathan E. Fieldsend 1
1 University of Exeter, United Kingdom, 2 Swansea University, United Kingdom
This repository contains the Python3 code for the ε-greedy strategies presented in:
George De Ath, Richard M. Everson, Alma A. M. Rahat, and Jonathan E. Fieldsend. 2021. Greed Is Good: Exploration and Exploitation Trade-offs in Bayesian Optimisation. ACM Trans. Evol. Learn. Optim. 1, 1, Article 1 (May 2021), 22 pages.
> Paper: https://doi.org/10.1145/3425501
> Preprint: https://arxiv.org/abs/1911.12809
The repository also contains all the training data used to initialise each of the 51 optimisation runs carried out to evaluate each method, the optimisation results of every run for each of the evaluated methods, and the code both to generate new training data and to perform the optimisation runs themselves. Two jupyter notebooks are also included that reproduce all figures shown in the paper and supplementary material.
The remainder of this document details:
- The steps needed to install the package and related python modules on your system: docker / manual
- The format of the training data and saved runs.
- How to repeat the experiments.
- How to reproduce the figures in the paper.
- How to add your own acquisition functions and test problems.
If you use any part of this code in your work, please cite our ACM TELO paper:
@article{death:egreedy:2021,
title = {Greed is Good: Exploration and Exploitation Trade-offs in {B}ayesian Optimisation},
author = {George {De Ath} and Richard M. Everson and Alma A. M. Rahat and Jonathan E. Fieldsend},
year = {2021},
journal = {ACM Transactions on Evolutionary Learning and Optimization},
volume = {1},
number = {1},
publisher = {Association for Computing Machinery},
address = {New York}
}
The easiest way to set up the environment needed to run the optimisation library and to repeat the experiments carried out in this work is to use docker. Installation instructions for docker on many popular operating systems can be found here. Once docker has been installed, the docker container can be downloaded and run as follows:
> # download the docker container
> docker pull georgedeath/egreedy
> # run the container
> docker run -it georgedeath/egreedy
Welcome to the OpenFOAM v5 Docker Image
..
Once the above commands have been run you will be at the container's command prompt. Run the following commands to test the functionality of the code (CTRL+C to prematurely halt the run):
> # run an example optimisation run
> python -m egreedy.optimizer
Loaded training data from: training_data/Branin_1.npz
Loaded test problem: Branin
Using acquisition function: eFront
with optional arguments: {'epsilon': 0.1}
Training a GP model with 4 data points.
Optimising the acquisition function.
..
Manual installation of the optimisation library is straightforward, apart from the configuration of the PitzDaily test problem, which requires the installation and compilation of OpenFOAM®. If you do not wish to use the PitzDaily test problem, the library will work fine without the optional instructions included at the end of this section. The following instructions assume that Anaconda3 has been installed and that you are running the commands from the command prompt/console:
> conda install -y scipy numpy matplotlib statsmodels swig jupyter
> conda install -y ipopt=3.12.12 pygmo --channel conda-forge
> pip install pyDOE2 pygame box2d-py GPy numpy-stl
Note that, on Windows, it may be necessary to also install the Visual C++ build tools in order to install swig and pygame.
Once the above python modules have been installed, clone this repository to a location of your choosing (in the following we assume you are installing to /egreedy/) and test that it works (CTRL+C to cancel the optimisation run):
> git clone https://github.com/georgedeath/egreedy/ /egreedy
> cd /egreedy
> python -m egreedy.optimizer
Loaded training data from: training_data/Branin_1.npz
..
PitzDaily (CFD) instructions (optional, Linux only) - other test problems will work without this:
> pip install pyfoam
Now follow the linked instructions to install OpenFOAM5 (this will take 30 minutes to 3 hours). Note that this has only been tested with the Ubuntu 12.04 and 18.04 instructions. Once OpenFOAM has been successfully installed, the command of5x has to be run before the PitzDaily test problem can be evaluated.
Finally, compile the pressure calculation function and check that the test problem works correctly:
> of5x
> cd /egreedy/egreedy/test_problems/Exeter_CFD_Problems/data/PitzDaily/solvers/
> wmake calcPressureDifference
> # test the PitzDaily solver
> cd /egreedy
> python -m egreedy.test_problems.pitzdaily
PitzDaily successfully instantiated..
Generated valid solution, evaluating..
Fitness value: [0.24748876]
Please ignore errors like Getting LinuxMem: [Errno 2] No such file or directory: '/proc/621/status' as these come from OpenFOAM and do not impact the optimisation process.
The initial training locations for each of the 51 sets of Latin hypercube samples are located in the training_data directory of this repository, with the filename structure ProblemName_number, e.g. the first set of training locations for the Branin problem is stored in Branin_1.npz. Each of these files is a compressed numpy file created with numpy.savez, containing two numpy.ndarrays with the 2*D initial locations and their corresponding fitness values.
To load and inspect these values use the following instructions:
> cd /egreedy
> python
>>> import numpy as np
>>> with np.load('training_data/Branin_1.npz') as data:
...     Xtr = data['arr_0']
...     Ytr = data['arr_1']
>>> Xtr.shape, Ytr.shape
((4, 2), (4, 1))
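New training files in the same arr_0/arr_1 format can be generated with a Latin hypercube sample. The repository contains its own data-generation code; the snippet below is only an illustrative sketch in which the Branin function is defined inline and the output filename is an arbitrary example.

```python
import numpy as np
from pyDOE2 import lhs  # installed during the setup above

# Branin defined inline for illustration (domain: x1 in [-5, 10], x2 in [0, 15]).
def branin(X):
    x1, x2 = X[:, 0], X[:, 1]
    b, c = 5.1 / (4 * np.pi ** 2), 5.0 / np.pi
    return (x2 - b * x1 ** 2 + c * x1 - 6.0) ** 2 \
        + 10.0 * (1.0 - 1.0 / (8.0 * np.pi)) * np.cos(x1) + 10.0

D = 2
lb, ub = np.array([-5.0, 0.0]), np.array([10.0, 15.0])

# 2*D Latin hypercube samples in [0, 1]^D, rescaled to the problem domain.
Xtr = lb + (ub - lb) * lhs(D, samples=2 * D, criterion='maximin')
Ytr = branin(Xtr).reshape(-1, 1)

# np.savez stores positional arguments as arr_0 and arr_1, matching the
# layout of the files in training_data/. The run number 52 is arbitrary.
np.savez('training_data/Branin_52.npz', Xtr, Ytr)
```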
The robot pushing test problems (push4 and push8) have a third array 'arr_2' that contains their instance-specific parameters:
> cd /egreedy
> python
>>> import numpy as np
>>> with np.load('training_data/push4_1.npz', allow_pickle=True) as data:
...     Xtr = data['arr_0']
...     Ytr = data['arr_1']
...     instance_params = data['arr_2']
>>> instance_params
array({'t1_x': -4.268447250704135, 't1_y': -0.6937799887556437}, dtype=object)
These are automatically passed to the problem function when it is instantiated, creating a specific problem instance.
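As a purely toy illustration of this pattern (the real push4 simulation is implemented in the library's test-problem code, not here), the stored dictionary can be unpacked into whatever function builds the problem instance; the helper below is hypothetical and continues from the session above.

```python
import numpy as np

# Hypothetical stand-in for push4: the instance parameters (the target
# position t1_x, t1_y) fix one particular problem instance.
def make_push_like_problem(t1_x, t1_y):
    def f(x):
        # toy fitness: distance from a candidate point to the instance's target
        return np.hypot(x[0] - t1_x, x[1] - t1_y)
    return f

params = instance_params.item()  # unwrap the 0-d object array into a dict
f = make_push_like_problem(**params)
print(f(np.array([0.0, 0.0])))
```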
The results of all optimisation runs can be found in the results directory. The filenames have the structure ProblemName_Run_TotalBudget_Method.npz, with the ε-greedy methods having the format ProblemName_Run_TotalBudget_Method_eps0.XX.npz, where XX corresponds to the value of ε used. Similar to the training data, these are compressed numpy files containing two numpy.ndarrays, Xtr and Ytr, corresponding to the locations evaluated during the optimisation run and their function evaluations. Note that the evaluated locations and their function values also include the initial 2*D training locations at the beginning of the arrays, and that the methods ε-RS and ε-PF have results files named eRandom and eFront respectively.
The following example loads the first optimisation run on the Branin test problem with the ε-PF method using ε = 0.1:
> cd /egreedy
> python
>>> import numpy as np
>>> with np.load('results_paper/Branin_1_250_eFront_eps0.1.npz', allow_pickle=True) as data:
...     Xtr = data['Xtr']
...     Ytr = data['Ytr']
>>> Xtr.shape, Ytr.shape
((250, 2), (250, 1))
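Because the problems in the paper are minimised, the best value seen after each function evaluation (the quantity plotted in the convergence figures) can be recovered from a loaded run with a cumulative minimum. The snippet below is a minimal example of this, not code from the repository.

```python
import numpy as np

with np.load('results_paper/Branin_1_250_eFront_eps0.1.npz', allow_pickle=True) as data:
    Ytr = data['Ytr']

# best function value found after each evaluation (assumes minimisation)
best_so_far = np.minimum.accumulate(Ytr.ravel())
print(best_so_far[-1])  # best value found by the end of the run
```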
The python file run_experiment.py provides a convenient way to reproduce an individual experimental evaluation carried out in the paper. It has the following syntax:
> python run_experiment.py -h
usage: run_experiment.py [-h] -p PROBLEM_NAME -b BUDGET -r RUN_NO -a
ACQUISITION_NAME
[-aa ACQUISITION_ARGS [ACQUISITION_ARGS ...]]
egreedy optimisation experimental evaluation
--------------------------------------------
Example:
Running the ePF method on the Branin test function with the training data
"1" for a budget (including 2*D training points) of 250 and with a value
of epsilon = 0.1 :
> python run_experiment.py -p Branin -b 250 -r 1 -a eFront -aa epsilon:0.1
Running EI on push4 method (note the lack of -aa argument):
> python run_experiment.py -p push4 -b 250 -r 1 -a EI
optional arguments:
-h, --help show this help message and exit
-p PROBLEM_NAME Test problem name. e.g. Branin, logGSobol
-b BUDGET Budget. Default: 250 (including training points). Note
that the corresponding npz file containing the initial
training locations must exist in the "training_data"
directory.
-r RUN_NO Run number
-a ACQUISITION_NAME Acquisition function name. e.g: Explore, EI, PI UCB,
PFRandom, eRandom (e-RS), eFront (e-PF) or Exploit
-aa ACQUISITION_ARGS [ACQUISITION_ARGS ...]
Acquisition function parameters, must be in pairs of
parameter:values, e.g. for the e-greedy methods:
epsilon:0.1 [Note: optional]
Similarly, run_all_experiments.py provides an easy interface to run all experiments for a specific set of test problems, either the synthetic, robot pushing, or pipe shape optimisation problems, in the following manner:
> python run_all_experiments.py -h
usage: run_all_experiments.py [-h] {synthetic,robot,pitzdaily}
Evaluate all methods on a set of functions.
--------------------------------------------
Examples:
Evaluate all methods in the paper on the synthetic functions:
> python run_all_experiments.py synthetic
Evaluate all methods in the paper on the robot pushing functions:
> python run_all_experiments.py robot
Evaluate all methods in the paper on the PitzDaily test function:
(Note that this can only be performed if OpenFOAM has been set up correctly)
> python run_all_experiments.py pitzdaily
positional arguments:
{synthetic,robot,pitzdaily}
Set of test problems to evaluate.
Note that each test problem is evaluated approximately 250000 times for the 20 methods in the script. The synthetic functions and robot pushing problems have trivial computational cost, but a Gaussian process still needs to be trained and the corresponding acquisition function optimised for each function evaluation. The PitzDaily test problem, however, takes around 1 minute per evaluation, meaning that the total time spent evaluating the computational fluid dynamics solver is approximately 160 days. Given that there is no interaction between optimisation runs, we suggest using run_experiment.py in a batch setting across multiple cores/machines. This could be accomplished, for example, by calling run_experiment.py with different sets of arguments (corresponding to each experiment to be carried out) on multiple machines and/or in different instances of python (e.g. with screen on Linux).
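One possible way to do this on a single multi-core machine is to launch several run_experiment.py processes in parallel from a small driver script; the sketch below is only an example, and the particular problems, run numbers and ε value are placeholders.

```python
import itertools
import subprocess
from concurrent.futures import ThreadPoolExecutor

# example subset of experiments; adapt the problems, runs, method and budget as needed
problems = ['Branin', 'logGSobol']
runs = range(1, 6)
budget = 250

def launch(args):
    problem, run = args
    cmd = ['python', 'run_experiment.py',
           '-p', problem, '-b', str(budget), '-r', str(run),
           '-a', 'eFront', '-aa', 'epsilon:0.1']
    subprocess.run(cmd, check=True)

if __name__ == '__main__':
    # each worker simply waits on its child process, so threads are sufficient
    with ThreadPoolExecutor(max_workers=4) as pool:
        list(pool.map(launch, itertools.product(problems, runs)))
```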
The jupyter notebook Non_results_figure_generation.ipynb contains the code to generate the following figures:
- Figure 1: Showing an example Gaussian process model and its corresponding Pareto front and set.
- Figure 2: Contours of acquisition function values for EI, UCB and PI.
- Figure 3: Contours of weighted EI for three values of ω.
- Figure 1 (Supplementary material): Landscape of the WangFreitas test problem.
The jupyter notebook Process_results_and_generate_figures_for_paper.ipynb contains the code to load and process the optimisation results (stored in the results directory), as well as the code to produce all results figures and tables used in the paper and supplementary material.
The jupyter notebook New_fitness_functions_and_acquisition_functions.ipynb contains examples and instructions on how to include your own test problems (fitness functions) and acquisition functions.
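As a rough, hypothetical illustration of the kind of information a new fitness function must provide (a callable together with its dimensionality and box constraints), a minimal problem might look like the sketch below. The exact class structure and attribute names expected by the library are described in the notebook, so treat every name here as a placeholder.

```python
import numpy as np

class MyProblem:
    """Hypothetical 2D test problem; attribute names are placeholders and
    the interface the library actually expects is documented in the notebook."""

    def __init__(self):
        self.dim = 2
        self.lb = np.array([-5.0, -5.0])  # lower bounds
        self.ub = np.array([5.0, 5.0])    # upper bounds

    def __call__(self, x):
        x = np.atleast_2d(x)
        # simple quadratic bowl as a stand-in fitness function (minimisation)
        return np.sum((x - 1.0) ** 2, axis=1, keepdims=True)
```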