# Numerical Optimization

## Summary

## Papers

### Convex Optimization

### Local Optimization

### Global Optimization with Deterministic Approaches

### Global Optimization with Stochastic Approaches

- GP-UCB: Gaussian Process Optimization in the Bandit Setting: No Regret and Experimental Design, N. Srinivas et al., 2010.
  - GP-UCB uses an upper confidence bound (UCB) term to decide which point to sample next (the acquisition function), and provides a regret bound of O(sqrt(T * gamma_T)), where gamma_T is the maximum information gain between the function values and the noisy samples after T observations.
  - The guarantee ensures sublinear regret for several kernels of interest; however, it lacks a matching algorithm-independent lower bound. A sketch of the selection rule follows below.
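  A minimal sketch of the GP-UCB selection rule, assuming a squared-exponential kernel, a fixed finite candidate grid, and a simplified beta_t schedule standing in for the exact constants of the paper:

  ```python
  import numpy as np

  def se_kernel(A, B, ell=0.2):
      """Squared-exponential kernel matrix between 1-D point sets A and B."""
      d = A[:, None] - B[None, :]
      return np.exp(-0.5 * (d / ell) ** 2)

  def gp_posterior(X, y, Xs, noise=1e-2):
      """GP posterior mean and standard deviation at test points Xs."""
      K = se_kernel(X, X) + noise * np.eye(len(X))
      Ks = se_kernel(X, Xs)
      mu = Ks.T @ np.linalg.solve(K, y)
      var = np.diag(se_kernel(Xs, Xs)) - np.sum(Ks * np.linalg.solve(K, Ks), axis=0)
      return mu, np.sqrt(np.maximum(var, 0.0))

  def gp_ucb_next(X, y, grid, t, delta=0.1):
      """Pick the next query: argmax of mu(x) + sqrt(beta_t) * sigma(x)."""
      # Simplified finite-domain schedule, beta_t = 2 log(|D| t^2 pi^2 / 6 delta).
      beta_t = 2 * np.log(len(grid) * (t * np.pi) ** 2 / (6 * delta))
      mu, sigma = gp_posterior(X, y, grid)
      return grid[np.argmax(mu + np.sqrt(beta_t) * sigma)]
  ```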
- A Tutorial on Bayesian Optimization of Expensive Cost Functions, with Application to Active User Modeling and Hierarchical Reinforcement Learning, E. Brochu et al., 2010.
  - Bayesian optimization (BO) is built on Bayes' rule: we place a prior on the objective function f to be optimized, then update it to a posterior as evidence accumulates from all previous observations.
  - The GP prior is one of the most investigated priors for BO. It can be viewed as a "random function" which, instead of returning a single function value, returns the mean and variance of a Gaussian distribution over the possible values of f at the sampled point x.
  - Since a GP is characterized by its mean and covariance functions, the choice of kernel/covariance function matters greatly. Typical kernels are the SE (squared exponential) kernel and the Matérn kernel.
  - The next step is to choose an acquisition/utility function that we optimize over the GP posterior to decide which point to sample next. Common acquisition functions include PI (probability of improvement), EI (expected improvement), and GP-UCB. A minimal end-to-end loop is sketched below.
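  A minimal, self-contained BO loop tying these pieces together: a GP surrogate with a Matérn-5/2 (or SE) kernel and the EI acquisition maximized over a grid. The toy objective and all settings are illustrative:

  ```python
  import numpy as np
  from scipy.stats import norm

  def se_kernel(A, B, ell=0.2):
      d = np.abs(A[:, None] - B[None, :])
      return np.exp(-0.5 * (d / ell) ** 2)

  def matern52_kernel(A, B, ell=0.2):
      d = np.abs(A[:, None] - B[None, :])
      s = np.sqrt(5.0) * d / ell
      return (1.0 + s + s ** 2 / 3.0) * np.exp(-s)

  def gp_posterior(X, y, Xs, kernel, noise=1e-4):
      K = kernel(X, X) + noise * np.eye(len(X))
      Ks = kernel(X, Xs)
      mu = Ks.T @ np.linalg.solve(K, y)
      var = np.diag(kernel(Xs, Xs)) - np.sum(Ks * np.linalg.solve(K, Ks), axis=0)
      return mu, np.sqrt(np.maximum(var, 1e-12))

  def expected_improvement(mu, sigma, best):
      z = (mu - best) / sigma
      return (mu - best) * norm.cdf(z) + sigma * norm.pdf(z)

  f = lambda x: np.sin(3 * x) + 0.5 * x      # toy objective (to be maximized)
  grid = np.linspace(0.0, 2.0, 400)
  X = np.array([0.3, 1.5]); y = f(X)         # initial design
  for _ in range(10):
      mu, sigma = gp_posterior(X, y, grid, matern52_kernel)
      x_next = grid[np.argmax(expected_improvement(mu, sigma, y.max()))]
      X, y = np.append(X, x_next), np.append(y, f(x_next))
  print("best point found:", X[np.argmax(y)])
  ```

  Swapping `expected_improvement` for the PI or GP-UCB formula changes a single line of the loop; the surrogate machinery stays the same.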
- Maximizing Acquisition Functions for Bayesian Optimization, J. Wilson et al., 2018.
  - The authors investigate two approaches to BO's inner optimization problem (maximizing the acquisition function): the differentiability of Monte Carlo acquisition estimators and the submodularity of parallel querying. The first idea is sketched below.
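  A sketch under illustrative assumptions: with a fixed set of base samples z, the reparameterized Monte Carlo estimate of parallel (q-point) EI is a deterministic, almost-everywhere differentiable function of the batch locations, so it can be handed to a gradient-based optimizer (scipy's numerical gradients here, for brevity):

  ```python
  import numpy as np
  from scipy.optimize import minimize

  def se_kernel(A, B, ell=0.3):
      d = A[:, None] - B[None, :]
      return np.exp(-0.5 * (d / ell) ** 2)

  def joint_posterior(X, y, Xq, noise=1e-4):
      """Joint GP posterior N(mu, Sigma) at the q batch locations Xq."""
      K = se_kernel(X, X) + noise * np.eye(len(X))
      Ks = se_kernel(X, Xq)
      mu = Ks.T @ np.linalg.solve(K, y)
      Sigma = se_kernel(Xq, Xq) - Ks.T @ np.linalg.solve(K, Ks)
      return mu, Sigma

  rng = np.random.default_rng(0)
  X = np.array([0.1, 0.6, 1.4]); y = np.sin(3 * X)   # toy observations
  Z = rng.standard_normal((256, 2))                  # fixed base samples, q = 2

  def neg_qei(Xq):
      mu, Sigma = joint_posterior(X, y, Xq)
      L = np.linalg.cholesky(Sigma + 1e-9 * np.eye(len(Xq)))
      samples = mu + Z @ L.T            # reparameterization: f = mu + L z
      improvement = np.maximum(samples.max(axis=1) - y.max(), 0.0)
      return -improvement.mean()        # Monte Carlo estimate of q-EI (negated)

  res = minimize(neg_qei, x0=np.array([0.3, 1.0]))   # gradient-based batch search
  print("batch to evaluate in parallel:", res.x)
  ```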
- Tight Regret Bounds for Bayesian Optimization in One Dimension, J. Scarlett, 2018.
  - The author proves a lower bound of Omega(sqrt(T)) and an upper bound of O(sqrt(T * log T)) for BO over one-dimensional functions under mild assumptions on the kernel (e.g., the SE and Matérn kernels satisfy them).
  - The lower bound is the first one in the noisy Bayesian setting.
  - The algorithm constructs a sequence of increasingly closely packed subsets of a set of potential maximizers, resampling and eliminating points according to UCB and LCB terms, roughly as in the sketch below.
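  A rough sketch of the elimination idea, with generic 2-sigma UCB/LCB widths and a grid spacing halved each epoch standing in for the paper's exact confidence terms and schedules:

  ```python
  import numpy as np

  def se_kernel(A, B, ell=0.2):
      d = A[:, None] - B[None, :]
      return np.exp(-0.5 * (d / ell) ** 2)

  def gp_posterior(X, y, Xs, noise=1e-2):
      K = se_kernel(X, X) + noise * np.eye(len(X))
      Ks = se_kernel(X, Xs)
      mu = Ks.T @ np.linalg.solve(K, y)
      var = np.diag(se_kernel(Xs, Xs)) - np.sum(Ks * np.linalg.solve(K, Ks), axis=0)
      return mu, np.sqrt(np.maximum(var, 0.0))

  f = lambda x: -(x - 0.7) ** 2                  # toy objective on [0, 1]
  rng = np.random.default_rng(1)
  candidates = np.linspace(0.0, 1.0, 9)          # coarse initial grid
  spacing = candidates[1] - candidates[0]
  X, y = np.empty(0), np.empty(0)
  for epoch in range(4):
      # Resample every surviving candidate (noisy observations).
      X = np.append(X, candidates)
      y = np.append(y, f(candidates) + 0.05 * rng.standard_normal(len(candidates)))
      mu, sigma = gp_posterior(X, y, candidates)
      ucb, lcb = mu + 2 * sigma, mu - 2 * sigma
      # Eliminate candidates whose UCB falls below the best LCB ...
      keep = candidates[ucb >= lcb.max()]
      # ... and pack the survivors' neighborhoods more closely (halved spacing).
      spacing /= 2
      candidates = np.unique(np.concatenate([keep - spacing, keep, keep + spacing]))
      candidates = candidates[(candidates >= 0.0) & (candidates <= 1.0)]
  print("surviving candidates near the maximizer:", np.round(candidates, 3))
  ```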
- A-GP-UCB: No-regret Bayesian Optimization with Unknown Hyperparameters, F. Berkenkamp et al., 2019.
  - A-GP-UCB adapts GP-UCB with respect to its hyperparameters, namely the lengthscales theta of the kernel and the RKHS norm bound B (the assumption being that the objective function f has bounded RKHS norm). Roughly speaking, it increases B or decreases theta over time so that the function class keeps expanding, which avoids the hyperparameter misspecification problem while retaining a sublinear regret bound; see the schematic below.
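  A schematic of the adaptation rule, assuming an SE kernel; the growth schedules and the confidence parameter beta_t below are simplified placeholders for the reference functions b(t) and g(t) analyzed in the paper:

  ```python
  import numpy as np

  def se_kernel(A, B, ell):
      d = A[:, None] - B[None, :]
      return np.exp(-0.5 * (d / ell) ** 2)

  def gp_posterior(X, y, Xs, ell, noise=1e-2):
      K = se_kernel(X, X, ell) + noise * np.eye(len(X))
      Ks = se_kernel(X, Xs, ell)
      mu = Ks.T @ np.linalg.solve(K, y)
      var = np.diag(se_kernel(Xs, Xs, ell)) - np.sum(Ks * np.linalg.solve(K, Ks), axis=0)
      return mu, np.sqrt(np.maximum(var, 0.0))

  f = lambda x: np.sin(5 * x)                    # toy objective
  grid = np.linspace(0.0, 1.0, 200)
  B0, ell0 = 1.0, 0.5                            # initial, possibly misspecified guesses
  X = np.array([0.5]); y = f(X)
  for t in range(1, 21):
      B_t = B0 * (1 + np.log(t))                 # slowly grow the RKHS norm bound ...
      ell_t = ell0 / (1 + np.log(t))             # ... and shrink the lengthscales,
                                                 # expanding the candidate function class
      mu, sigma = gp_posterior(X, y, grid, ell_t)
      beta_t = B_t + np.sqrt(np.log(t + 1))      # simplified confidence parameter
      x_next = grid[np.argmax(mu + beta_t * sigma)]
      X, y = np.append(X, x_next), np.append(y, f(x_next))
  print("queried points:", np.round(np.sort(X), 3))
  ```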