Kochenderfer & Wheeler, Chapter 19

Surrogate Optimization

Use the GP's uncertainty to decide where to evaluate next. Balance exploitation (low mean) with exploration (high variance).

Prerequisites: Chapter 18 (Probabilistic Surrogates).
9
Chapters
0
Simulations
9
Quizzes

Chapter 0: Bayesian Optimization

You have a GP surrogate of an expensive function. Where should you evaluate next? This is the central question of Bayesian optimization. The answer must balance two competing goals:

The exploration-exploitation tradeoff: Exploitation says: evaluate where the predicted mean is lowest (probably a good point). Exploration says: evaluate where uncertainty is highest (learn something new). Pure exploitation can miss the global minimum. Pure exploration wastes evaluations on uninformative regions. The acquisition function balances both.
Bayesian optimization uses the GP to decide:

Chapter 1: Prediction-Based

Prediction-based exploration simply evaluates at the point with the lowest predicted mean: xnext = argmin μ̂(x). This is pure exploitation.

The problem: This can get stuck sampling the same region repeatedly. If the GP predicts x=3 as the minimum, you evaluate there, confirm it, and the GP still predicts x=3 as the minimum. You learn nothing new. No exploration means you might miss a better minimum elsewhere.
Prediction-based exploration fails because:

Chapter 2: Error-Based

Error-based exploration evaluates where uncertainty is highest: xnext = argmax σ̂(x). This is pure exploration.

The problem: This wastes evaluations exploring regions that may have nothing interesting. If the GP is uncertain about a flat plateau, error-based exploration will exhaustively map that plateau without making progress toward the minimum. No exploitation means no progress.
Error-based exploration wastes effort because:

Chapter 3: Lower Confidence Bound

The lower confidence bound (LCB) acquisition function balances exploitation and exploration:

LCB(x) = μ̂(x) − βσ̂(x)

Evaluate at xnext = argmin LCB(x). The parameter β controls the balance: β = 0 gives pure exploitation, large β gives more exploration. The LCB is optimistic: it asks "what is the best this point could be?"

The intuition: LCB picks the point with the lowest optimistic estimate. A point with low mean and low variance gets selected for exploitation. A point with high mean but very high variance also gets selected because "it could be amazing." The β parameter controls how optimistic you are.
Increasing β in LCB shifts the balance toward:

Chapter 4: Probability of Improvement

The probability of improvement (PI) asks: what is the probability that evaluating at x yields a value better than the current best ymin?

PI(x) = P(f(x) < ymin) = Φ((ymin − μ̂(x)) / σ̂(x))

where Φ is the standard normal CDF. Evaluate at xnext = argmax PI(x).

Key insight: PI tends to be too exploitative. It favors points with a high probability of marginal improvement over the current best. A point that could potentially improve by a lot but has only a 30% chance gets less weight than a point with a 50% chance of a tiny improvement. Expected improvement addresses this.
PI tends to be too exploitative because it:

Chapter 5: Expected Improvement

The expected improvement (EI) accounts for both the probability and the magnitude of improvement:

EI(x) = (ymin − μ̂)Φ(z) + σ̂φ(z),    z = (ymin − μ̂) / σ̂

where Φ and φ are the standard normal CDF and PDF. The first term rewards exploitation (low predicted mean). The second term rewards exploration (high predicted variance).

Why EI is the gold standard: EI is the most widely used acquisition function in Bayesian optimization. It naturally balances exploration and exploitation with no tuning parameter (unlike LCB's β). It asks the right question: "how much improvement do I expect from evaluating here?" — weighting both the chance and size of improvement.
Expected improvement improves over probability of improvement by:

Chapter 6: Comparing Strategies

StrategyBalanceTuning Required?
Prediction-basedPure exploitationNo
Error-basedPure explorationNo
Lower confidence boundTunable via βYes (β)
Probability of improvementMostly exploitationNo
Expected improvementAutomatic balanceNo
Practical recommendation: Start with expected improvement. If you need more exploration (e.g., high-dimensional, many local minima), switch to LCB with a moderate β. PI is rarely the best choice but can be useful when any improvement matters.
Expected improvement is the most popular acquisition function because:

Chapter 7: Safe Optimization

Sometimes evaluating at certain points is dangerous (testing a controller that might crash a robot, evaluating a drug dosage that might be harmful). SafeOpt restricts evaluation to points that are confidently below a safety threshold.

SafeOpt maintains three sets:

SetDescription
Safe set SPoints whose upper confidence bound is below ymax
Potential minimizers MSafe points whose lower bound is below the best upper bound
Expanders ESafe points whose evaluation could expand the safe set
The constraint: SafeOpt can never reach the global minimum if it lies beyond an unsafe region. It finds the best optimum reachable through safe evaluations. This is the price of safety — you may miss better solutions in unexplored dangerous territory.
SafeOpt restricts evaluations to points that are:

Chapter 8: Summary

ConceptKey Fact
Acquisition functionScores each point for how valuable it would be to evaluate
Expected improvementGold standard; balances exploration-exploitation automatically
LCBOptimistic; exploration controlled by β
SafeOptBayesian optimization with safety constraints
Bayesian optimization loopFit GP → optimize acquisition → evaluate → repeat
Looking ahead: Chapter 20 addresses optimization under uncertainty — when the objective function itself is uncertain due to noise, model approximations, or varying conditions.
The Bayesian optimization loop repeats: