title | tags | authors | affiliations | date | bibliography | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
CBX: Python and Julia Packages for Consensus-Based Interacting Particle Methods |
|
|
|
22 March, 2024 |
paper.bib |
We introduce CBXPy and ConsensusBasedX.jl, Python and Julia implementations of consensus-based interacting particle systems (CBX), which generalise consensus-based optimization methods (CBO) for global, derivative-free optimisation. The raison d'être of our libraries is twofold: on the one hand, to offer high-performance implementations of CBX methods that the community can use directly, while on the other, providing a general interface that can accommodate and be extended to further variations of the CBX family. Python and Julia were selected as the leading high-level languages in terms of usage and performance, as well as for their popularity among the scientific computing community. Both libraries have been developed with a common ethos, ensuring a similar API and core functionality, while leveraging the strengths of each language and writing idiomatic code.
Consensus-based optimisation (CBO) is an approach to solve, for a given (continuous) objective function
i.e., the task of finding the point
In some cases, so-called gradient-based methods (those that involve updating a guess of
CBO uses a finite number
for some
Once the consensus point is computed, the particles evolve in time following the stochastic differential equation (SDE)
$$ \mathrm{d}x_t^i = -\lambda\ \underbrace{ \left( x_t^i - c_\alpha(x_t) \right) \mathrm{d}t }_{ \text{consensus drift} }
- \sigma\ \underbrace{ \left| x_t^i - c_\alpha(x_t) \right| \mathrm{d}B_t^i }_{ \text{scaled diffusion} }, $$
where
In practice, the solution to the SDE above cannot be found exactly. Instead, an Euler--Maruyama scheme [@KP1992] is used to update the position of the agents. The update is given by
$$ x^i \gets x^i -\lambda ,\Delta t \left( x^i - c_\alpha(x) \right)
- \sigma\sqrt{\Delta t}
\left| x^i - c_\alpha(x) \right| \xi^i, $$
where
As a particle-based family of methods, CBX is conceptually related to other optimisation approaches which take inspiration from biology, like particle-swarm optimisation (PSO) [@kennedy1995particle], from physics, like simulated annealing (SA) [@henderson2003theory], or from other heuristics [@mohan2012survey;@karaboga2014comprehensive;@yang2009firefly;@bayraktar2013wind]. However, unlike many such methods, CBX has been designed to be compatible with rigorous convergence analysis at the mean-field level [the infinite-particle limit, see @huang2021MFLCBO]. Many convergence results have been shown, whether in the original formulation [@carrillo2018analytical;@fornasier2021consensus], for CBO with anisotropic noise [@carrillo2021consensus;@fornasier2021convergence], with memory effects [@riedl2022leveraging], with truncated noise [@fornasier2023consensus], for polarised CBO [@bungert2022polarized], and PSO [@qiu2022PSOconvergence]. The relation between CBO and stochastic gradient descent has been recently established by @riedl2023gradient, which suggests a previously unknown yet fundamental connection between derivative-free and gradient-based approaches.
CBX methods have been successfully applied and extended to several different settings, such as constrained optimisation problems [@fornasier2020consensus_sphere_convergence;@borghi2021constrained], multi-objective optimisation [@borghi2022adaptive;@klamroth2022consensus], saddle-point problems [@huang2022consensus], federated learning tasks [@carrillo2023fedcbo], uncertainty quantification [@althaus2023consensus], or sampling [@carrillo2022consensus].
In general, very few implementations of CBO already exist, and none have been designed with the generality of other CBX methods in mind. Here, we summarise the related software:
Regarding Python, we refer to PyPop7
[@duan2023pypop7] and scikit-opt
[@scikitopt] for a collection of various derivative-free optimisation strategies. For packages connected to Bayesian optimisation, we refer to BayesO
[@Kim2023], bayesian-optimization
[@Bayesian14], GPyOpt
[@gpyopt2016], GPflowOpt
[@GPflowOpt2017], pyGPGO
[@Jiménez2017], PyBADS
[@Singh2024] and BoTorch
[@balandat2020botorch]. Furthermore, CMA-ES [@hansen1996adapting] was implemented in pycma
[@hansen2019pycma]. To the best of our knowledge the connection between consensus-based methods and evolution strategies is not fully understood, and is therefore an interesting future direction. PSO and SA implementations are already available in PySwarms
[@miranda2018pyswarms], scikit-opt
[@scikitopt], DEAP
[@deapJMLR2012] and pagmo
[@pagmo2017]. They are widely used by the community and provide a rich framework for the respective methods. However, adjusting these implementations to CBO is not straightforward. The first publicly available Python packages implementing CBX algorithms were given by some of the authors together with collaborators. @Igor_CBOinPython implement standard CBO [@pinnau2017consensus], and the package PolarCBO
[@Roith_polarcbo] provides an implementation of polarised CBO [@bungert2022polarized]. CBXPy is a significant extension of the latter, which was tailored to the polarised variant. The code architecture was generalised, which allowed the implementation of the whole CBX family within a common framework.
Regarding Julia, PSO and SA methods are, among others, implemented in Optim.jl
[@mogensen2018optim], Metaheuristics.jl
[@mejia2022metaheuristics], and Manopt.jl
[@Bergmann2022]. PSO and SA are also included in the meta-library Optimization.jl
[@DR2023], as well as Nelder--Mead, which is a direct search method. The latter is also implemented in Manopt.jl
[@Bergmann2022], which further provides a manifold variant of CMA-ES [@colutto2009cma]. One of the authors gave the first specific Julia implementation of standard CBO Consensus.jl
[@Bailo_consensus]. That package has now been deprecated in favour of ConsensusBasedX.jl, which improves the performance of the CBO implementation with a type-stable and allocation-free implementation. The package also adds a CBS implementation, and overall presents a more general interface that accomodates the wider CBX class of methods.
CBXPy and ConsensusBasedX.jl provide a lightweight and high-level interface. An existing function can be optimised with just one call. Method selection, parameters, different approaches to particle initialisation, and termination criteria can be specified directly through this interface, offering a flexible point of entry for the casual user. Some of the methods provided are standard CBO [@pinnau2017consensus], CBO with mini-batching [@carrillo2021consensus], polarised CBO [@bungert2022polarized], CBO with memory effects [@grassi2020particle;@riedl2022leveraging], and consensus-based sampling (CBS) [@carrillo2022consensus]. Parallelisation tools are available.
A more proficient user will benefit from the fully documented interface, which allows the specification of advanced options (e.g., debug output, the noise model, or the numerical approach to the matrix square root of the weighted ensemble covariance matrix). Both libraries offer performance evaluation methods as well as visualisation tools.
Ultimately, a low-level interface (including documentation and full-code examples) is provided. Both libraries have been designed to express common abstractions in the CBX family while allowing customisation. Users can easily implement new CBX methods or modify the behaviour of the existing implementation by strategically overriding certain hooks. The stepping of the methods can also be controlled manually.
Most of the CBXPy implementation uses basic Python functionality, and the agents are handled as an array-like structure. For certain specific features, like broadcasting-behaviour, array copying, and index selection, we fall back to the numpy
implementation [@harris2020array]. However, it should be noted that an adaptation to other array or tensor libraries like PyTorch [@paszke2019pytorch] is straightforward. Compatibility with the latter enables gradient-free deep learning directly on the GPU, as demonstrated in the documentation.\
The library is available on GitHub and can be installed via pip
. It is licensed under the MIT license. Below, we provide a short example on how to optimise a function with CBXPy.
from cbx.dynamics import CBO # import the CBO class
f = lambda x: x[0]**2 + x[1]**2 # define the function to minimise
x = CBO(f, d=2).optimize() # run the optimisation
More examples and details on the implementation are available in the documentation.
ConsensusBasedX.jl has been almost entirely written in native Julia (with the exception of a single call to LAPACK). The code has been developed with performance in mind, thus the critical routines are fully type-stable and allocation-free. A specific tool is provided to benchmark a typical method iteration, which can be used to detect allocations. Through this tool, unit tests are in place to ensure zero allocations in all the provided methods. The benchmarking tool is also available to users, who can use it to test their implementations of
Basic function minimisation can be performed by running:
using ConsensusBasedX # load the ConsensusBasedX package
f(x) = x[1]^2 + x[2]^2 # define the function to minimise
x = minimise(f, D = 2) # run the minimisation
The library is available on GitHub. It has been registered in the general Julia registry, and therefore it can be installed by running ]add ConsensusBasedX
. It is licensed under the MIT license. More examples and full instructions are available in the documentation.
We thank the Lorentz Center in Leiden for their kind hospitality during the workshop "Purpose-driven particle systems" in Spring 2023, where this work was initiated. RB was supported by the Advanced Grant Nonlocal-CPD (Nonlocal PDEs for Complex Particle Dynamics: Phase Transitions, Patterns and Synchronisation) of the European Research Council Executive Agency (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No. 883363) and by the EPSRC grant EP/T022132/1 "Spectral element methods for fractional differential equations, with applications in applied analysis and medical imaging". KR acknowledges support from the German Federal Ministry of Education and Research and the Bavarian State Ministry for Science and the Arts. TR acknowledges support from DESY (Hamburg, Germany), a member of the Helmholtz Association HGF. This research was supported in part through the Maxwell computational resources operated at Deutsches Elektronen-Synchrotron DESY, Hamburg, Germany. UV acknowledges support from the Agence Nationale de la Recherche under grant ANR-23-CE40-0027 (IPSO).