Use the GP's uncertainty to decide where to evaluate next. Balance exploitation (low mean) with exploration (high variance).
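You have a GP surrogate of an expensive function. Where should you evaluate next? This is the central question of Bayesian optimization. The answer must balance two competing goals: exploitation, which samples where the predicted mean is low, and exploration, which samples where the predicted uncertainty is high.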
Prediction-based exploration simply evaluates at the point with the lowest predicted mean: xnext = argmin μ̂(x). This is pure exploitation.
Error-based exploration evaluates where uncertainty is highest: xnext = argmax σ̂(x). This is pure exploration.
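As a minimal sketch of these two extremes, the following assumes the GP's predicted means and standard deviations have already been computed on a finite grid of candidates. The function names and the numbers are illustrative and do not come from any particular library.

```python
def prediction_based(candidates, mean):
    """Pure exploitation: pick the candidate with the lowest predicted mean."""
    best = min(range(len(candidates)), key=lambda i: mean[i])
    return candidates[best]


def error_based(candidates, std):
    """Pure exploration: pick the candidate with the highest predicted std."""
    best = max(range(len(candidates)), key=lambda i: std[i])
    return candidates[best]


# Hypothetical GP predictions at five candidate points.
candidates = [0.0, 0.25, 0.5, 0.75, 1.0]
mean = [1.2, 0.4, 0.9, 0.6, 1.5]
std = [0.1, 0.05, 0.6, 0.3, 0.2]

x_exploit = prediction_based(candidates, mean)  # 0.25, the lowest mean
x_explore = error_based(candidates, std)        # 0.5, the highest uncertainty
```

The two rules choose different points here, which is the tension the acquisition functions in the rest of this section try to resolve.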
The lower confidence bound (LCB) acquisition function balances exploitation and exploration by subtracting a multiple of the predicted standard deviation from the predicted mean: LCB(x) = μ̂(x) − β σ̂(x), with β ≥ 0.
Evaluate at xnext = argmin LCB(x). The parameter β controls the balance: β = 0 gives pure exploitation, and larger β gives more exploration. The LCB is optimistic: it asks "what is the best this point could be?"
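The effect of β can be sketched on a small candidate grid, assuming the common form LCB(x) = μ̂(x) − β σ̂(x). The helper names and the predicted values below are hypothetical.

```python
def lower_confidence_bound(mean, std, beta):
    """LCB at each candidate: predicted mean minus beta times predicted std."""
    return [m - beta * s for m, s in zip(mean, std)]


def argmin(values):
    """Index of the smallest entry."""
    return min(range(len(values)), key=lambda i: values[i])


# Hypothetical GP predictions at five candidate points.
candidates = [0.0, 0.25, 0.5, 0.75, 1.0]
mean = [1.2, 0.4, 0.9, 0.6, 1.5]
std = [0.1, 0.05, 0.6, 0.3, 0.2]

# beta = 0 ignores uncertainty and reduces to prediction-based selection.
x_beta0 = candidates[argmin(lower_confidence_bound(mean, std, beta=0.0))]  # 0.25

# beta = 2 makes the uncertain point at 0.5 look most promising.
x_beta2 = candidates[argmin(lower_confidence_bound(mean, std, beta=2.0))]  # 0.5
```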
The probability of improvement (PI) asks: what is the probability that evaluating at x yields a value better than the current best ymin? Under the GP's Gaussian prediction, PI(x) = P(y < ymin) = Φ((ymin − μ̂(x)) / σ̂(x)),
where Φ is the standard normal CDF. Evaluate at xnext = argmax PI(x).
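A minimal sketch of PI, assuming a Gaussian prediction at each candidate so that PI(x) = Φ((ymin − μ̂(x)) / σ̂(x)). It uses `statistics.NormalDist` from the Python standard library for Φ. The function name, the zero-variance convention, and the numbers are assumptions made for illustration.

```python
from statistics import NormalDist


def probability_of_improvement(mean, std, y_min):
    """PI at each candidate: Phi((y_min - mean) / std)."""
    standard_normal = NormalDist()
    pi = []
    for m, s in zip(mean, std):
        if s == 0.0:
            # Assumed convention: with no uncertainty, improvement is
            # either certain or impossible.
            pi.append(1.0 if m < y_min else 0.0)
        else:
            pi.append(standard_normal.cdf((y_min - m) / s))
    return pi


# Hypothetical GP predictions and current best observation.
candidates = [0.0, 0.25, 0.5, 0.75, 1.0]
mean = [1.2, 0.4, 0.9, 0.6, 1.5]
std = [0.1, 0.05, 0.6, 0.3, 0.2]
y_min = 0.5

pi = probability_of_improvement(mean, std, y_min)
x_next = candidates[max(range(len(pi)), key=lambda i: pi[i])]  # 0.25
```

PI favors the candidate at 0.25 because it is almost certain to beat ymin, even though the expected gain there is only about 0.1. This is why PI tends toward exploitation.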
The expected improvement (EI) accounts for both the probability and the magnitude of improvement: EI(x) = (ymin − μ̂(x)) Φ(z) + σ̂(x) φ(z), with z = (ymin − μ̂(x)) / σ̂(x),
where Φ and φ are the standard normal CDF and PDF. The first term rewards exploitation (low predicted mean). The second term rewards exploration (high predicted variance).
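The closed form of EI under a Gaussian prediction can be sketched as follows, assuming EI(x) = (ymin − μ̂(x)) Φ(z) + σ̂(x) φ(z) with z = (ymin − μ̂(x)) / σ̂(x). The function name and the two candidate predictions are hypothetical.

```python
from statistics import NormalDist


def expected_improvement(mean, std, y_min):
    """EI at each candidate: (y_min - mean) * Phi(z) + std * phi(z)."""
    standard_normal = NormalDist()
    ei = []
    for m, s in zip(mean, std):
        if s == 0.0:
            # No uncertainty: the improvement is known exactly.
            ei.append(max(y_min - m, 0.0))
        else:
            z = (y_min - m) / s
            ei.append((y_min - m) * standard_normal.cdf(z)
                      + s * standard_normal.pdf(z))
    return ei


# Two hypothetical candidates with y_min = 0.5:
#   A: slightly better mean, almost no uncertainty
#   B: slightly worse mean, large uncertainty
y_min = 0.5
mean = [0.45, 0.6]
std = [0.01, 0.5]

ei = expected_improvement(mean, std, y_min)
# ei[0] is about 0.05 and ei[1] is about 0.153, so EI prefers B.
```

Candidate A is almost certain to improve on ymin, so PI would rank it first, but the improvement is small. EI ranks B higher because its possible improvement is much larger.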
| Strategy | Balance | Tuning Required? |
|---|---|---|
| Prediction-based | Pure exploitation | No |
| Error-based | Pure exploration | No |
| Lower confidence bound | Tunable via β | Yes (β) |
| Probability of improvement | Mostly exploitation | No |
| Expected improvement | Automatic balance | No |
Sometimes evaluating at certain points is dangerous, for example when testing a controller that might crash a robot or evaluating a drug dosage that might be harmful. SafeOpt restricts evaluation to points whose function value is, with high confidence, below a safety threshold ymax.
SafeOpt maintains three sets:
| Set | Description |
|---|---|
| Safe set S | Points whose upper confidence bound is below ymax |
| Potential minimizers M | Safe points whose lower confidence bound is below the lowest upper confidence bound among safe points |
| Expanders E | Safe points whose evaluation could expand the safe set |
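The three sets can be sketched on a finite candidate grid. This is an illustration under stated assumptions, not a reference implementation of SafeOpt: the confidence bounds are taken to be μ̂(x) ± β σ̂(x), and the expander test assumes a known Lipschitz constant (the hypothetical `lipschitz` argument), which is one common way to decide whether an evaluation could certify a currently unsafe point.

```python
def safeopt_sets(candidates, mean, std, beta, y_max, lipschitz):
    """Return index lists (safe, minimizers, expanders) for scalar candidates."""
    lower = [m - beta * s for m, s in zip(mean, std)]
    upper = [m + beta * s for m, s in zip(mean, std)]
    n = len(candidates)

    # Safe set S: upper confidence bound at or below the safety threshold.
    safe = [i for i in range(n) if upper[i] <= y_max]
    if not safe:
        return [], [], []

    # Potential minimizers M: safe points that could still beat the
    # lowest upper bound found among safe points.
    best_upper = min(upper[i] for i in safe)
    minimizers = [i for i in safe if lower[i] <= best_upper]

    # Expanders E (assumed Lipschitz formulation): if the true value at a
    # safe point equaled its lower bound, some unsafe point would be
    # certified as safe.
    unsafe = [j for j in range(n) if j not in safe]
    expanders = [
        i for i in safe
        if any(lower[i] + lipschitz * abs(candidates[i] - candidates[j]) <= y_max
               for j in unsafe)
    ]
    return safe, minimizers, expanders


# Hypothetical GP predictions at five candidate points.
candidates = [0.0, 0.25, 0.5, 0.75, 1.0]
mean = [1.2, 0.4, 0.9, 0.6, 1.5]
std = [0.1, 0.05, 0.6, 0.3, 0.2]

safe, minimizers, expanders = safeopt_sets(
    candidates, mean, std, beta=2.0, y_max=1.5, lipschitz=2.0)
# safe = [0, 1, 3], minimizers = [1, 3], expanders = [1, 3]
```

The point at index 0 is safe but is neither a potential minimizer nor an expander, so evaluating it would yield little new information under these assumptions.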

| Concept | Key Fact |
|---|---|
| Acquisition function | Scores each point for how valuable it would be to evaluate |
| Expected improvement | Widely used default; balances exploration and exploitation without a tuning parameter |
| LCB | Optimistic; exploration controlled by β |
| SafeOpt | Bayesian optimization with safety constraints |
| Bayesian optimization loop | Fit GP → optimize acquisition → evaluate → repeat |
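The loop in the last row can be sketched end to end in plain Python. Everything below is an assumption made to keep the example self-contained: a zero-mean GP with a squared-exponential kernel and a fixed length scale, acquisition optimization by scoring a finite grid, expected improvement as the acquisition function, and a toy quadratic as the "expensive" objective. A practical implementation would use a GP library and fit the kernel hyperparameters.

```python
import math
from statistics import NormalDist


def kernel(a, b, length_scale=0.3):
    """Squared-exponential kernel between two scalars."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def gp_predict(X, y, x_new, noise=1e-6):
    """Posterior mean and std of a zero-mean GP at x_new."""
    K = [[kernel(a, b) + (noise if i == j else 0.0)
          for j, b in enumerate(X)] for i, a in enumerate(X)]
    k_star = [kernel(a, x_new) for a in X]
    alpha = solve(K, y)        # K^-1 y
    v = solve(K, k_star)       # K^-1 k_star
    mean = sum(k * a for k, a in zip(k_star, alpha))
    var = kernel(x_new, x_new) - sum(k * w for k, w in zip(k_star, v))
    return mean, math.sqrt(max(var, 0.0))


def expected_improvement(mean, std, y_min):
    """Closed-form EI for a Gaussian prediction (minimization)."""
    if std == 0.0:
        return max(y_min - mean, 0.0)
    z = (y_min - mean) / std
    n = NormalDist()
    return (y_min - mean) * n.cdf(z) + std * n.pdf(z)


def bayesian_optimization(f, candidates, X_init, n_iter=10):
    """Fit GP -> optimize acquisition on the grid -> evaluate -> repeat."""
    X = list(X_init)
    y = [f(x) for x in X]
    for _ in range(n_iter):
        remaining = [x for x in candidates if x not in X]
        if not remaining:
            break
        y_min = min(y)
        scores = [expected_improvement(*gp_predict(X, y, x), y_min)
                  for x in remaining]
        x_next = remaining[max(range(len(scores)), key=lambda i: scores[i])]
        X.append(x_next)
        y.append(f(x_next))
    best = min(range(len(y)), key=lambda i: y[i])
    return X[best], y[best]


def objective(x):
    """Toy stand-in for an expensive function; minimum at x = 0.7."""
    return (x - 0.7) ** 2


grid = [i / 50 for i in range(51)]
x_best, y_best = bayesian_optimization(objective, grid, [0.0, 0.5, 1.0])
```

Already-evaluated candidates are skipped so that the kernel matrix never contains duplicate rows. Refitting the GP for every candidate is wasteful but keeps the sketch short.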