LaValle, Chapter 8

Feedback Motion Planning

What if your robot doesn’t just follow a path — it reacts to where it is? Policies, vector fields, potential functions, and the one method with no local minima.

Prerequisites: Chapters 4–5 (C-space, sampling-based planning). That’s it.
10
Chapters
6+
Simulations
10
Quizzes

Chapter 0: Why Feedback?

You planned a perfect path from start to goal. Your robot begins to execute it. Halfway through, another robot bumps it slightly off course. The planned path is now useless — it was computed for a specific start position, and the robot is no longer there.

This is the core fragility of open-loop planning: compute a path once, follow it blindly. Any deviation — actuator noise, model error, unexpected obstacle — causes the robot to leave the planned trajectory with no recovery mechanism.

Feedback planning is the alternative. Instead of computing a single path, we compute a policy: a function that takes the robot’s current state and returns the action to take right now. The robot doesn’t commit to a sequence of waypoints. It continually asks: “Given where I am, what should I do?” and acts accordingly.

The fundamental shift: Open-loop planning asks “What path should I follow?” Feedback planning asks “What should I do from here?” The second question is answered for every possible location — not just the planned one.

A tale of two robots

Imagine two robots navigating a maze. Robot A pre-computes a path and follows it step by step. Robot B evaluates its current position at every moment and picks an action from a policy. When a crate falls and blocks part of the maze, Robot A walks into it; Robot B sees its current position, consults its policy, and routes around — the policy accounts for every configuration, including configurations reachable only after unexpected detours.

Open-Loop vs. Feedback Recovery

Watch what happens when an obstacle appears mid-execution. Click “Add Obstacle” while the robots are running.

■ Open-loop robot ■ Feedback robot ■ Obstacle
A robot executing a pre-computed path gets nudged sideways by a person. Why does open-loop planning fail here?

Chapter 1: Policies vs. Plans

A plan is a sequence of actions: (a1, a2, …, ak). Execute action 1, then action 2, and so on. The plan ignores state — it is a fixed script regardless of what actually happens.

A policy (LaValle calls it a feedback plan or strategy) is a function π: X → U that maps every possible state x ∈ X to an action u ∈ U. It is not a sequence to execute once — it is a lookup table valid everywhere. When the robot finds itself at state x, it applies π(x) to get its next action, regardless of how it arrived at x.

Key analogy: A plan is like written directions (“turn left at the library, then right at the park”). A policy is like a GPS — it tells you what to do from wherever you are right now, even if you took a wrong turn earlier.

What does a policy look like?

In a discrete grid world with 4-directional movement, a policy assigns one of {North, South, East, West, Stay} to every cell. You can visualize it as an arrow at each grid position. The robot always follows the arrow at its current cell. This is exactly a navigation function on a discrete graph — and it’s the topic of Chapter 2.

In a continuous C-space, the policy becomes a vector field: a smoothly varying arrow at each point in space. Following the arrows traces out a curve through C-space from any starting configuration to the goal. The mathematics of how that curve behaves is called integral curve theory — Chapters 3 and 4 cover it.

Sense current state x
Where is the robot right now? (position, heading, joint angles, …)
Evaluate policy π(x)
Look up or compute the action for this specific state
Execute action u
Apply force, move joint, turn wheel — robot moves to a new state
↻ repeat every timestep

Completeness for policies

A plan is complete if it reaches the goal when executed perfectly. A policy is complete if following it from any reachable start state guarantees eventual goal arrival. Completeness is harder to guarantee for policies — the robot could cycle, converge to the wrong place, or stall. The rest of this chapter is about designing policies that provably avoid these failures.

Formal definition (LaValle §8.1): A feedback plan π is complete if for every initial state x0 ∈ Xfree, the trajectory produced by iterating xt+1 = f(xt, π(xt)) reaches the goal set XG in finite time.
A policy maps every state to an action. What key property does this give it that a fixed plan lacks?

Chapter 2: Navigation Functions (Discrete)

Start with the simplest possible setting: a finite grid graph. Cells are either free or blocked. The goal is one special free cell. We want a policy — an arrow at every free cell — such that following any arrow, then the next, then the next, always leads to the goal.

The construction is a navigation function Cfree → ℝ, a scalar value at each free cell such that (1) the goal has the minimum value, and (2) the function decreases monotonically along any trajectory that follows the policy. If you always move to the neighbor with the smallest value, you will reach the goal.

Intuition: Think of a drainage landscape. The goal is a valley. Every other free cell sits above it. Water (the robot) always flows downhill. If the landscape has only one valley, water always reaches it from anywhere.

Wavefront propagation: building the function

The cleanest way to build a navigation function on a discrete grid is wavefront propagation (also called BFS from the goal):

Assign value 0 to goal
The goal cell gets navigation value 0
Propagate outward in BFS order
Every free neighbor of a cell with value k gets value k+1 (if not already assigned)
Derive the policy
At each cell, the policy arrow points to the neighbor with the smallest navigation value

The resulting values are exactly the shortest-path distances from each cell to the goal. The policy derived from them is the shortest-path policy. Every free cell reachable from the goal gets a finite value; unreachable cells stay at infinity (effectively, they lie in a disconnected component).

Wavefront Propagation on a Grid

Click a cell to place the goal (orange). Watch the wavefront spread. Then click “Show Policy” to see the arrows.

In wavefront propagation, what value does a free cell receive?

Chapter 3: Vector Fields

The discrete navigation function assigns an arrow to each grid cell. Moving to continuous C-space, the natural generalization is a vector field: a smoothly varying vector assigned to every point in C-space. At each configuration q, the field gives a direction of motion f(q) ∈ TqC, a vector in the tangent space at q.

Formally, a vector field on C-space is a function f: Cfree → T(C) where T(C) is the tangent bundle. In Euclidean C-space (C = Rn), this simplifies to f: Rn → Rn — each point maps to a displacement vector.

Think of it as: Weather maps show wind as a vector field — every point on the map has an arrow showing wind speed and direction. A navigation vector field assigns a “which way to go” arrow at every point in free C-space.

Properties we need

Not every vector field makes a good policy. We need three properties:

PropertyWhat it meansWhy it matters
Goal attractingf(qG) = 0 and trajectories converge to qGThe robot actually reaches the goal
Obstacle avoidingField points away from or parallel to obstacle boundariesThe robot doesn’t collide
No spurious equilibriaThe only point where f(q) = 0 is the goalThe robot doesn’t get stuck

The third property is the hardest to guarantee. A field can have saddle points (where the robot stalls unless exactly on the saddle), local minima (where the robot spirals inward to a wrong location), or limit cycles (where the robot orbits forever). Avoiding all three simultaneously is the central challenge of Chapter 8.

Gradient of a scalar potential

A convenient way to build a vector field is to define a scalar function φ: Cfree → R and take its negative gradient: f(q) = −∇φ(q). The gradient always points in the direction of steepest ascent; negating it gives steepest descent. A robot following this field rolls downhill on the surface φ.

This choice is powerful because a scalar function is much easier to design than a vector field directly. The field “falls out” of the scalar automatically. But it also introduces a constraint: gradient fields can only have the critical points that φ has — and scalar functions on Rn generically have local minima, saddle points, and maxima, not just global minima.

If a vector field is defined as f(q) = −∇φ(q), where does the robot stop moving?

Chapter 4: Integral Curves

We have a vector field f: Cfree → Rn. A robot starts at q0 and, at each moment, moves in the direction f(q). The path it traces is called an integral curve of the field — the curve whose tangent at every point agrees with the field direction at that point.

Formally, an integral curve through q0 is a function γ: [0, ∞) → Cfree satisfying:

dγ/dt = f(γ(t)),  γ(0) = q0

This is just the ordinary differential equation for following the field. By the Picard-Lindelöf theorem, if f is Lipschitz continuous, a unique integral curve exists through every starting point. This uniqueness is critical: it means two integral curves never cross (otherwise, the same point would map to two different directions, contradicting uniqueness).

Non-crossing is powerful: If integral curves can’t cross, and the goal attracts all curves, then the entire free space “drains” into the goal like a watershed into a single lake. No closed loops, no dead ends — provided there are no spurious equilibria.

Discretizing integral curves

In practice, we can’t integrate continuously. We approximate with small steps:

qt+1 = qt + ε · f(qt) / ‖f(qt)‖

Normalize the field vector first (so step size is always ε), then step. This is Euler integration of the ODE. The step size ε trades accuracy against computation. As ε → 0, the discrete path converges to the true integral curve. In practice, ε is chosen relative to the obstacle clearance.

Visualizing integral curves

Integral Curves of a 2D Vector Field

Each colored line is an integral curve starting from a different point. All converge to the goal (orange dot). Try clicking to start a new curve from any point.

Why is uniqueness of integral curves (the fact that two integral curves never cross) useful for robot navigation?

Chapter 5: Potential Fields

The most popular approach to feedback planning combines two forces: attraction toward the goal and repulsion away from obstacles. This is the artificial potential field method, introduced by Khatib in 1986 and refined extensively since.

We define a total potential φ(q) = φatt(q) + φrep(q) and move along its negative gradient. The attractive part pulls the robot toward the goal; the repulsive part pushes it away from obstacles.

Attractive potential

The most common attractive potential is a paraboloid centered at the goal qG:

φatt(q) = ½ η ‖q − qG2

Its gradient is ∇φatt(q) = η(q − qG), which points directly away from the goal with magnitude proportional to distance. The negative gradient −∇φatt(q) = η(qG − q) is a vector pointing toward the goal, getting weaker as the robot approaches. η > 0 is a gain controlling the strength of attraction.

Repulsive potential

Obstacles generate repulsive potentials that grow as the robot gets close. LaValle uses Khatib’s formulation: let d(q) be the distance from configuration q to the nearest obstacle. For a threshold distance ρ0:

φrep(q) = ½ ν (1/d(q) − 1/ρ0)2 if d(q) ≤ ρ0, else 0

Here ν > 0 is the repulsive gain. The function is zero outside radius ρ0 (obstacles don’t matter at long range), rises steeply as d(q) → 0 (approaching the obstacle surface), and goes to infinity at d(q) = 0 (inside an obstacle).

Why (1/d − 1/ρ0)2? The offset by 1/ρ0 makes φrep and its gradient go to zero smoothly at d = ρ0. A plain 1/d2 would have a discontinuous gradient at the boundary, which causes jerky motion.

Combining the potentials

The policy is: f(q) = −∇φ(q) = −∇φatt(q) − ∇φrep(q). Near the goal with no obstacles nearby, the attractive term dominates and the robot converges smoothly. Near an obstacle, the repulsive term overwhelms and the robot steers away. In the intermediate region, the two gradients balance and the robot curves around obstacles toward the goal.

Artificial Potential Field

Goal is orange. Obstacles are red. Arrows show the combined gradient field. Drag sliders to tune the gains, then trace robot paths by clicking.

Attraction η 1.5
Repulsion ν 1.0
In the artificial potential field, what does the repulsive potential do as the robot approaches an obstacle surface (d → 0)?

Chapter 6: The Local Minima Problem

The artificial potential field is elegant and fast to compute. It also has a fatal flaw. The combined potential φ = φatt + φrep is a scalar function on C-space, and scalar functions on Rn almost always have local minima — points where ∇φ = 0 but φ is not at its global minimum (the goal).

At a local minimum, the robot halts. It is at an equilibrium: the attractive force toward the goal and the repulsive force away from obstacles are perfectly balanced. The robot sits there forever, unable to reach the goal, with no indication from the field that it needs to do anything differently.

The core problem: Local minima are topological invariants. You cannot eliminate them from an arbitrary smooth potential by tuning η and ν alone. The obstacles’ shape determines how many local minima exist, and for complex environments they are unavoidable with the basic formulation.

Where local minima appear

Local minima typically appear in concave obstacles. Imagine a C-shaped obstacle with the goal inside the C and the robot approaching from outside. The repulsive walls push the robot back, but the attractive force toward the interior goal pulls it forward. There is often an equilibrium just outside the opening of the C where these forces balance.

Another common case: a narrow passage. Two obstacles close together with the goal on the other side. The repulsive forces from both walls push the robot backward; if the passage is narrow enough, the combined repulsion exceeds the attraction at the entry point, creating a local minimum right at the passage entrance.

Workarounds (and why they fail)

WorkaroundIdeaFailure mode
Random perturbationAdd random noise to escape minimaSlow; may re-enter minimum; not reliable
Increase attraction gainMake goal pull stronger than repulsionRobot crashes into obstacles when close
Decrease repulsion radiusRepulsion only active very close to obstaclesRobot grazes obstacles, safety compromised
Path librariesUse a planner when stuck, then resume fieldHybrid; loses completeness guarantee

None of these make the potential field method complete. Local minima remain. The only rigorous fix is to redesign the potential function entirely — which is the topic of Chapter 7.

Finding a Local Minimum

Click a start position. The robot follows the gradient field but gets stuck before reaching the goal. The orange star shows where it halts.

A robot following an artificial potential field comes to rest at a point that is not the goal. What has happened?

Chapter 7: Navigation Functions ★ Showcase

Koditschek and Rimon (1990) asked a remarkable question: can we design a potential function on C-space that has exactly one critical point — the goal? Such a function would have no local minima by construction. A robot following its gradient would reach the goal from anywhere in the free space.

They proved the answer is yes, for a specific class of worlds: a robot moving in a sphere world (a spherical outer boundary containing spherical obstacles). They called the resulting function a navigation function (LaValle §8.4).

The key property: A navigation function φ: Cfree → [0, 1] is Morse (no degenerate critical points), equals 1 on all obstacle boundaries, equals 0 only at the goal, and has exactly one local minimum (the goal). Every integral curve converges to the goal.

The Koditschek-Rimon construction

For a sphere world with spherical obstacles O1, …, Ok, the navigation function takes the form:

φ(q) = γ(q) / (γ(q)κ + β(q))1/κ

Here γ(q) = ‖q − qG2 is the goal distance squared; β(q) = ∏i βi(q) is a product of obstacle functions (βi = 0 on obstacle i’s boundary); and κ is a tuning parameter. As κ → ∞, the function approaches the pure goal-distance gradient in free space. For large enough finite κ, the product term dominates near obstacles, pushing trajectories away.

The critical insight: for sufficiently large κ, this function is a Morse function with exactly one local minimum. The proof uses Morse theory to count critical points — for sphere worlds, the topology guarantees the count comes out to exactly one.

Comparison: Potential Field vs. Navigation Function

PropertyPotential FieldNavigation Function
Local minimaYes, many possibleNone (proven)
CompleteNoYes (for sphere worlds)
Environment restrictionNone (but no guarantees)Must be sphere world
ComputationSimple, analyticMore complex, needs κ tuning
Obstacle shapesAny (with distance query)Must be spherical

Interactive Showcase: Navigation Function vs. Potential Field

Navigation Function vs. Potential Field — Live Comparison

Left panel: Artificial potential field (has local minima). Right panel: Navigation function (no local minima).
Click anywhere to place the goal (orange). Then click “Trace All Starts” to see where robots launched from a grid of starts end up. Watch the left panel get stuck; the right panel always reaches the goal.

κ (nav fn) 8
What you should see: In the left (potential field) panel, some trajectories spiral into local minima near the spherical obstacles and never reach the goal. In the right (navigation function) panel, every single trajectory converges to the goal regardless of start position. That is the theorem made visible.

Chapter 8: Sampling-Based Feedback Planning

Navigation functions are complete and elegant, but they require sphere worlds — a severe geometric restriction. Real environments have walls, corners, furniture, and irregular obstacle shapes. How do we get feedback completeness without restricting the geometry?

The answer is to combine the best of both worlds: use a sampling-based planner (like RRT or PRM from Chapter 5) to build a roadmap over free C-space, then derive a discrete policy over that roadmap. The result is a sampling-based feedback plan (LaValle §8.5).

Step 1: Build a roadmap

Run a PRM-style sampler over Cfree to build a graph G = (V, E). Vertices V are collision-free configurations; edges E connect nearby vertices with collision-free local paths. This is the same structure as Chapter 5, but now we will use it differently: not to answer a single query, but to define a global policy.

Step 2: Assign navigation values to vertices

With the goal configuration qG as the root, run Dijkstra’s algorithm on the graph G. Every vertex v gets a shortest-path distance d(v, qG). This is the discrete navigation function on the roadmap — identical to the wavefront propagation from Chapter 2, but on the PRM graph instead of a grid.

Step 3: Derive the policy

For every vertex v, the policy says: move to the neighboring vertex u with smallest d(u, qG). Between vertices, use local path following (e.g., straight-line interpolation if the edge is a straight segment). The full policy π(q) works as follows: from current configuration q, find the nearest roadmap vertex v, then follow the edge toward the best neighbor.

Sample PRM over Cfree
Build collision-free graph G = (V, E)
Dijkstra from goal
Assign navigation values d(v, qG) to all vertices
Derive policy
Each vertex points to its best neighbor; robot follows nearest vertex’s edge
Execute with feedback
At each timestep, re-query nearest vertex; robustly handles small perturbations

Completeness and coverage

This approach is probabilistically complete: as the number of samples grows, the probability that the roadmap fails to cover Cfree goes to zero. For any fixed environment, enough samples will eventually connect every navigable region to the goal. In practice, this means the feedback plan can handle arbitrary (non-spherical) environments, at the cost of needing enough samples.

Robustness to perturbation: A key advantage over open-loop planning is that small perturbations don’t break the plan. If the robot is nudged off a roadmap edge, it simply re-queries its nearest vertex and resumes following the policy. The policy is defined for all of Cfree, not just the roadmap.
Sampling-Based Feedback Plan

Click “Build Roadmap” to sample and connect vertices. Click “Solve” to run Dijkstra from the goal (orange). Then click any free point to trace a robot path using the feedback policy.

Samples 60
In sampling-based feedback planning, what is the role of Dijkstra’s algorithm?

Chapter 9: Connections & What Comes Next

Chapter 8 opened with a simple idea — replace a plan with a policy — and developed it into navigation functions, vector fields, integral curves, potential fields, and sampling-based feedback. Here’s how it all connects to the broader landscape of planning and control.

Where this lives in the book

ChapterTopicRelation to Ch 8
Ch 2Discrete planningWavefront propagation is the discrete nav function
Ch 5Sampling-based planningPRM provides the roadmap for sampling-based feedback
Ch 7ExtensionsMultiple-robot policies and time-varying obstacles use the same feedback framework
Ch 9Decision-theoretic planningPolicies under uncertainty generalize Ch 8 policies to MDPs
Ch 14Trajectory planningKinodynamic constraints on the vector field

Feedback vs. open-loop: the big picture

Open-Loop PlanFeedback Policy
RobustnessFragile: any deviation breaks itRobust: defined for all states
ComputationSingle query, then donePre-compute once; evaluate cheaply
OptimalityCan optimize the single pathHarder to optimize (Bellman equations needed)
CompletenessComplete if planner isComplete if nav function exists or roadmap covers
ReplanningRequired when environment changesOften self-correcting for small changes

Real-world uses

Autonomous driving uses potential-field-inspired methods for lane centering, with replanning for large deviations. Drone swarms use gradient fields for collision avoidance (each drone repels neighbors). Robotic arms use navigation functions for joint-space control, ensuring the end-effector reaches targets even with joint limits.

Limitations and honest caveats

Related micro-lessons

MDP & Value Functions: Chapter 9 of LaValle extends policies to probabilistic settings. The connection is the Bellman equation: the navigation value satisfies a recursive optimality condition identical to the value function in an MDP. See the MDP micro-lesson.
Sampling-Based Planning: The roadmap in Chapter 8’s sampling-based feedback plan comes directly from PRM (Chapter 5). Revisit the Ch 5 lesson for the construction details.
“The goal is not to plan the perfect path once, but to design a system that behaves well from wherever it finds itself.”
— paraphrasing the central message of LaValle, Chapter 8
Which method from Chapter 8 provides a provably complete feedback plan with no local minima?