What if your robot doesn’t just follow a path — it reacts to where it is? Policies, vector fields, potential functions, and the one method with no local minima.
You planned a perfect path from start to goal. Your robot begins to execute it. Halfway through, another robot bumps it slightly off course. The planned path is now useless — it was computed for a specific start position, and the robot is no longer there.
This is the core fragility of open-loop planning: compute a path once, follow it blindly. Any deviation — actuator noise, model error, unexpected obstacle — causes the robot to leave the planned trajectory with no recovery mechanism.
Feedback planning is the alternative. Instead of computing a single path, we compute a policy: a function that takes the robot’s current state and returns the action to take right now. The robot doesn’t commit to a sequence of waypoints. It continually asks: “Given where I am, what should I do?” and acts accordingly.
Imagine two robots navigating a maze. Robot A pre-computes a path and follows it step by step. Robot B evaluates its current position at every moment and picks an action from a policy. When a crate falls and blocks part of the maze, Robot A walks into it; Robot B sees its current position, consults its policy, and routes around — the policy accounts for every configuration, including configurations reachable only after unexpected detours.
Watch what happens when an obstacle appears mid-execution. Click “Add Obstacle” while the robots are running.
A plan is a sequence of actions: (a1, a2, …, ak). Execute action 1, then action 2, and so on. The plan ignores state — it is a fixed script regardless of what actually happens.
A policy (LaValle calls it a feedback plan or strategy) is a function π: X → U that maps every possible state x ∈ X to an action u ∈ U. It is not a sequence to execute once — it is a lookup table valid everywhere. When the robot finds itself at state x, it applies π(x) to get its next action, regardless of how it arrived at x.
In a discrete grid world with 4-directional movement, a policy assigns one of {North, South, East, West, Stay} to every cell. You can visualize it as an arrow at each grid position. The robot always follows the arrow at its current cell. This is exactly a navigation function on a discrete graph — and it’s the topic of Chapter 2.
In a continuous C-space, the policy becomes a vector field: a smoothly varying arrow at each point in space. Following the arrows traces out a curve through C-space from any starting configuration to the goal. The mathematics of how that curve behaves is called integral curve theory — Chapters 3 and 4 cover it.
A plan is complete if it reaches the goal when executed perfectly. A policy is complete if following it from any reachable start state guarantees eventual goal arrival. Completeness is harder to guarantee for policies — the robot could cycle, converge to the wrong place, or stall. The rest of this chapter is about designing policies that provably avoid these failures.
Start with the simplest possible setting: a finite grid graph. Cells are either free or blocked. The goal is one special free cell. We want a policy — an arrow at every free cell — such that following any arrow, then the next, then the next, always leads to the goal.
The construction is a navigation function Cfree → ℝ, a scalar value at each free cell such that (1) the goal has the minimum value, and (2) the function decreases monotonically along any trajectory that follows the policy. If you always move to the neighbor with the smallest value, you will reach the goal.
The cleanest way to build a navigation function on a discrete grid is wavefront propagation (also called BFS from the goal):
The resulting values are exactly the shortest-path distances from each cell to the goal. The policy derived from them is the shortest-path policy. Every free cell reachable from the goal gets a finite value; unreachable cells stay at infinity (effectively, they lie in a disconnected component).
Click a cell to place the goal (orange). Watch the wavefront spread. Then click “Show Policy” to see the arrows.
The discrete navigation function assigns an arrow to each grid cell. Moving to continuous C-space, the natural generalization is a vector field: a smoothly varying vector assigned to every point in C-space. At each configuration q, the field gives a direction of motion f(q) ∈ TqC, a vector in the tangent space at q.
Formally, a vector field on C-space is a function f: Cfree → T(C) where T(C) is the tangent bundle. In Euclidean C-space (C = Rn), this simplifies to f: Rn → Rn — each point maps to a displacement vector.
Not every vector field makes a good policy. We need three properties:
| Property | What it means | Why it matters |
|---|---|---|
| Goal attracting | f(qG) = 0 and trajectories converge to qG | The robot actually reaches the goal |
| Obstacle avoiding | Field points away from or parallel to obstacle boundaries | The robot doesn’t collide |
| No spurious equilibria | The only point where f(q) = 0 is the goal | The robot doesn’t get stuck |
The third property is the hardest to guarantee. A field can have saddle points (where the robot stalls unless exactly on the saddle), local minima (where the robot spirals inward to a wrong location), or limit cycles (where the robot orbits forever). Avoiding all three simultaneously is the central challenge of Chapter 8.
A convenient way to build a vector field is to define a scalar function φ: Cfree → R and take its negative gradient: f(q) = −∇φ(q). The gradient always points in the direction of steepest ascent; negating it gives steepest descent. A robot following this field rolls downhill on the surface φ.
This choice is powerful because a scalar function is much easier to design than a vector field directly. The field “falls out” of the scalar automatically. But it also introduces a constraint: gradient fields can only have the critical points that φ has — and scalar functions on Rn generically have local minima, saddle points, and maxima, not just global minima.
We have a vector field f: Cfree → Rn. A robot starts at q0 and, at each moment, moves in the direction f(q). The path it traces is called an integral curve of the field — the curve whose tangent at every point agrees with the field direction at that point.
Formally, an integral curve through q0 is a function γ: [0, ∞) → Cfree satisfying:
This is just the ordinary differential equation for following the field. By the Picard-Lindelöf theorem, if f is Lipschitz continuous, a unique integral curve exists through every starting point. This uniqueness is critical: it means two integral curves never cross (otherwise, the same point would map to two different directions, contradicting uniqueness).
In practice, we can’t integrate continuously. We approximate with small steps:
Normalize the field vector first (so step size is always ε), then step. This is Euler integration of the ODE. The step size ε trades accuracy against computation. As ε → 0, the discrete path converges to the true integral curve. In practice, ε is chosen relative to the obstacle clearance.
Each colored line is an integral curve starting from a different point. All converge to the goal (orange dot). Try clicking to start a new curve from any point.
The most popular approach to feedback planning combines two forces: attraction toward the goal and repulsion away from obstacles. This is the artificial potential field method, introduced by Khatib in 1986 and refined extensively since.
We define a total potential φ(q) = φatt(q) + φrep(q) and move along its negative gradient. The attractive part pulls the robot toward the goal; the repulsive part pushes it away from obstacles.
The most common attractive potential is a paraboloid centered at the goal qG:
Its gradient is ∇φatt(q) = η(q − qG), which points directly away from the goal with magnitude proportional to distance. The negative gradient −∇φatt(q) = η(qG − q) is a vector pointing toward the goal, getting weaker as the robot approaches. η > 0 is a gain controlling the strength of attraction.
Obstacles generate repulsive potentials that grow as the robot gets close. LaValle uses Khatib’s formulation: let d(q) be the distance from configuration q to the nearest obstacle. For a threshold distance ρ0:
Here ν > 0 is the repulsive gain. The function is zero outside radius ρ0 (obstacles don’t matter at long range), rises steeply as d(q) → 0 (approaching the obstacle surface), and goes to infinity at d(q) = 0 (inside an obstacle).
The policy is: f(q) = −∇φ(q) = −∇φatt(q) − ∇φrep(q). Near the goal with no obstacles nearby, the attractive term dominates and the robot converges smoothly. Near an obstacle, the repulsive term overwhelms and the robot steers away. In the intermediate region, the two gradients balance and the robot curves around obstacles toward the goal.
Goal is orange. Obstacles are red. Arrows show the combined gradient field. Drag sliders to tune the gains, then trace robot paths by clicking.
The artificial potential field is elegant and fast to compute. It also has a fatal flaw. The combined potential φ = φatt + φrep is a scalar function on C-space, and scalar functions on Rn almost always have local minima — points where ∇φ = 0 but φ is not at its global minimum (the goal).
At a local minimum, the robot halts. It is at an equilibrium: the attractive force toward the goal and the repulsive force away from obstacles are perfectly balanced. The robot sits there forever, unable to reach the goal, with no indication from the field that it needs to do anything differently.
Local minima typically appear in concave obstacles. Imagine a C-shaped obstacle with the goal inside the C and the robot approaching from outside. The repulsive walls push the robot back, but the attractive force toward the interior goal pulls it forward. There is often an equilibrium just outside the opening of the C where these forces balance.
Another common case: a narrow passage. Two obstacles close together with the goal on the other side. The repulsive forces from both walls push the robot backward; if the passage is narrow enough, the combined repulsion exceeds the attraction at the entry point, creating a local minimum right at the passage entrance.
| Workaround | Idea | Failure mode |
|---|---|---|
| Random perturbation | Add random noise to escape minima | Slow; may re-enter minimum; not reliable |
| Increase attraction gain | Make goal pull stronger than repulsion | Robot crashes into obstacles when close |
| Decrease repulsion radius | Repulsion only active very close to obstacles | Robot grazes obstacles, safety compromised |
| Path libraries | Use a planner when stuck, then resume field | Hybrid; loses completeness guarantee |
None of these make the potential field method complete. Local minima remain. The only rigorous fix is to redesign the potential function entirely — which is the topic of Chapter 7.
Click a start position. The robot follows the gradient field but gets stuck before reaching the goal. The orange star shows where it halts.
Koditschek and Rimon (1990) asked a remarkable question: can we design a potential function on C-space that has exactly one critical point — the goal? Such a function would have no local minima by construction. A robot following its gradient would reach the goal from anywhere in the free space.
They proved the answer is yes, for a specific class of worlds: a robot moving in a sphere world (a spherical outer boundary containing spherical obstacles). They called the resulting function a navigation function (LaValle §8.4).
For a sphere world with spherical obstacles O1, …, Ok, the navigation function takes the form:
Here γ(q) = ‖q − qG‖2 is the goal distance squared; β(q) = ∏i βi(q) is a product of obstacle functions (βi = 0 on obstacle i’s boundary); and κ is a tuning parameter. As κ → ∞, the function approaches the pure goal-distance gradient in free space. For large enough finite κ, the product term dominates near obstacles, pushing trajectories away.
The critical insight: for sufficiently large κ, this function is a Morse function with exactly one local minimum. The proof uses Morse theory to count critical points — for sphere worlds, the topology guarantees the count comes out to exactly one.
| Property | Potential Field | Navigation Function |
|---|---|---|
| Local minima | Yes, many possible | None (proven) |
| Complete | No | Yes (for sphere worlds) |
| Environment restriction | None (but no guarantees) | Must be sphere world |
| Computation | Simple, analytic | More complex, needs κ tuning |
| Obstacle shapes | Any (with distance query) | Must be spherical |
Left panel: Artificial potential field (has local minima). Right panel: Navigation function (no local minima).
Click anywhere to place the goal (orange). Then click “Trace All Starts” to see where robots launched from a grid of starts end up. Watch the left panel get stuck; the right panel always reaches the goal.
Navigation functions are complete and elegant, but they require sphere worlds — a severe geometric restriction. Real environments have walls, corners, furniture, and irregular obstacle shapes. How do we get feedback completeness without restricting the geometry?
The answer is to combine the best of both worlds: use a sampling-based planner (like RRT or PRM from Chapter 5) to build a roadmap over free C-space, then derive a discrete policy over that roadmap. The result is a sampling-based feedback plan (LaValle §8.5).
Run a PRM-style sampler over Cfree to build a graph G = (V, E). Vertices V are collision-free configurations; edges E connect nearby vertices with collision-free local paths. This is the same structure as Chapter 5, but now we will use it differently: not to answer a single query, but to define a global policy.
With the goal configuration qG as the root, run Dijkstra’s algorithm on the graph G. Every vertex v gets a shortest-path distance d(v, qG). This is the discrete navigation function on the roadmap — identical to the wavefront propagation from Chapter 2, but on the PRM graph instead of a grid.
For every vertex v, the policy says: move to the neighboring vertex u with smallest d(u, qG). Between vertices, use local path following (e.g., straight-line interpolation if the edge is a straight segment). The full policy π(q) works as follows: from current configuration q, find the nearest roadmap vertex v, then follow the edge toward the best neighbor.
This approach is probabilistically complete: as the number of samples grows, the probability that the roadmap fails to cover Cfree goes to zero. For any fixed environment, enough samples will eventually connect every navigable region to the goal. In practice, this means the feedback plan can handle arbitrary (non-spherical) environments, at the cost of needing enough samples.
Click “Build Roadmap” to sample and connect vertices. Click “Solve” to run Dijkstra from the goal (orange). Then click any free point to trace a robot path using the feedback policy.
Chapter 8 opened with a simple idea — replace a plan with a policy — and developed it into navigation functions, vector fields, integral curves, potential fields, and sampling-based feedback. Here’s how it all connects to the broader landscape of planning and control.
| Chapter | Topic | Relation to Ch 8 |
|---|---|---|
| Ch 2 | Discrete planning | Wavefront propagation is the discrete nav function |
| Ch 5 | Sampling-based planning | PRM provides the roadmap for sampling-based feedback |
| Ch 7 | Extensions | Multiple-robot policies and time-varying obstacles use the same feedback framework |
| Ch 9 | Decision-theoretic planning | Policies under uncertainty generalize Ch 8 policies to MDPs |
| Ch 14 | Trajectory planning | Kinodynamic constraints on the vector field |
| Open-Loop Plan | Feedback Policy | |
|---|---|---|
| Robustness | Fragile: any deviation breaks it | Robust: defined for all states |
| Computation | Single query, then done | Pre-compute once; evaluate cheaply |
| Optimality | Can optimize the single path | Harder to optimize (Bellman equations needed) |
| Completeness | Complete if planner is | Complete if nav function exists or roadmap covers |
| Replanning | Required when environment changes | Often self-correcting for small changes |
Autonomous driving uses potential-field-inspired methods for lane centering, with replanning for large deviations. Drone swarms use gradient fields for collision avoidance (each drone repels neighbors). Robotic arms use navigation functions for joint-space control, ensuring the end-effector reaches targets even with joint limits.
“The goal is not to plan the perfect path once, but to design a system that behaves well from wherever it finds itself.”
— paraphrasing the central message of LaValle, Chapter 8