Ch 8: Feedback Motion Planning — LaValle, Planning Algorithms

Chapter 0: Why Feedback?

You planned a perfect path from start to goal. Your robot begins to execute it. Halfway through, another robot bumps it slightly off course. The planned path is now useless — it was computed for a specific start position, and the robot is no longer there.

This is the core fragility of open-loop planning: compute a path once, follow it blindly. Any deviation — actuator noise, model error, unexpected obstacle — causes the robot to leave the planned trajectory with no recovery mechanism.

Feedback planning is the alternative. Instead of computing a single path, we compute a policy: a function that takes the robot’s current state and returns the action to take right now. The robot doesn’t commit to a sequence of waypoints. It continually asks: “Given where I am, what should I do?” and acts accordingly.

The fundamental shift: Open-loop planning asks “What path should I follow?” Feedback planning asks “What should I do from here?” The second question is answered for every possible location — not just the planned one.

A tale of two robots

Imagine two robots navigating a maze. Robot A pre-computes a path and follows it step by step. Robot B evaluates its current position at every moment and picks an action from a policy. When a crate falls and blocks part of the maze, Robot A walks into it; Robot B sees its current position, consults its policy, and routes around — the policy accounts for every configuration, including configurations reachable only after unexpected detours.

Open-Loop vs. Feedback Recovery

Watch what happens when an obstacle appears mid-execution. Click “Add Obstacle” while the robots are running.

■ Open-loop robot ■ Feedback robot ■ Obstacle

A robot executing a pre-computed path gets nudged sideways by a person. Why does open-loop planning fail here?

The path was too short to reach the goal. The plan was computed for a specific trajectory; the robot’s new position was never accounted for, so the plan gives no guidance from there. The robot’s sensors failed to detect the nudge.

Chapter 1: Policies vs. Plans

A plan is a sequence of actions: (a₁, a₂, …, a_k). Execute action 1, then action 2, and so on. The plan ignores state — it is a fixed script regardless of what actually happens.

A policy (LaValle calls it a feedback plan or strategy) is a function π: X → U that maps every possible state x ∈ X to an action u ∈ U. It is not a sequence to execute once — it is a lookup table valid everywhere. When the robot finds itself at state x, it applies π(x) to get its next action, regardless of how it arrived at x.

Key analogy: A plan is like written directions (“turn left at the library, then right at the park”). A policy is like a GPS — it tells you what to do from wherever you are right now, even if you took a wrong turn earlier.

What does a policy look like?

In a discrete grid world with 4-directional movement, a policy assigns one of {North, South, East, West, Stay} to every cell. You can visualize it as an arrow at each grid position. The robot always follows the arrow at its current cell. This is exactly a navigation function on a discrete graph — and it’s the topic of Chapter 2.

In a continuous C-space, the policy becomes a vector field: a smoothly varying arrow at each point in space. Following the arrows traces out a curve through C-space from any starting configuration to the goal. The mathematics of how that curve behaves is called integral curve theory — Chapters 3 and 4 cover it.

Sense current state x

Where is the robot right now? (position, heading, joint angles, …)

↓

Evaluate policy π(x)

Look up or compute the action for this specific state

↓

Execute action u

Apply force, move joint, turn wheel — robot moves to a new state

↻ repeat every timestep

Completeness for policies

A plan is complete if it reaches the goal when executed perfectly. A policy is complete if following it from any reachable start state guarantees eventual goal arrival. Completeness is harder to guarantee for policies — the robot could cycle, converge to the wrong place, or stall. The rest of this chapter is about designing policies that provably avoid these failures.

Formal definition (LaValle §8.1): A feedback plan π is complete if for every initial state x₀ ∈ X_free, the trajectory produced by iterating x_t+1 = f(x_t, π(x_t)) reaches the goal set X_G in finite time.

A policy maps every state to an action. What key property does this give it that a fixed plan lacks?

It always produces shorter paths than a plan would. It can be computed faster than a plan. It can recover from deviations — it provides guidance from any state, not just the originally planned states.

Chapter 2: Navigation Functions (Discrete)

Start with the simplest possible setting: a finite grid graph. Cells are either free or blocked. The goal is one special free cell. We want a policy — an arrow at every free cell — such that following any arrow, then the next, then the next, always leads to the goal.

The construction is a navigation function C_free → ℝ, a scalar value at each free cell such that (1) the goal has the minimum value, and (2) the function decreases monotonically along any trajectory that follows the policy. If you always move to the neighbor with the smallest value, you will reach the goal.

Intuition: Think of a drainage landscape. The goal is a valley. Every other free cell sits above it. Water (the robot) always flows downhill. If the landscape has only one valley, water always reaches it from anywhere.

Wavefront propagation: building the function

The cleanest way to build a navigation function on a discrete grid is wavefront propagation (also called BFS from the goal):

Assign value 0 to goal

The goal cell gets navigation value 0

↓

Propagate outward in BFS order

Every free neighbor of a cell with value k gets value k+1 (if not already assigned)

↓

Derive the policy

At each cell, the policy arrow points to the neighbor with the smallest navigation value

The resulting values are exactly the shortest-path distances from each cell to the goal. The policy derived from them is the shortest-path policy. Every free cell reachable from the goal gets a finite value; unreachable cells stay at infinity (effectively, they lie in a disconnected component).

Wavefront Propagation on a Grid

Click a cell to place the goal (orange). Watch the wavefront spread. Then click “Show Policy” to see the arrows.

In wavefront propagation, what value does a free cell receive?

Its shortest-path distance to the goal, in grid steps. Its Euclidean distance to the goal. A random value assigned for tiebreaking.

Chapter 3: Vector Fields

The discrete navigation function assigns an arrow to each grid cell. Moving to continuous C-space, the natural generalization is a vector field: a smoothly varying vector assigned to every point in C-space. At each configuration q, the field gives a direction of motion f(q) ∈ T_qC, a vector in the tangent space at q.

Formally, a vector field on C-space is a function f: C_free → T(C) where T(C) is the tangent bundle. In Euclidean C-space (C = Rⁿ), this simplifies to f: Rⁿ → Rⁿ — each point maps to a displacement vector.

Think of it as: Weather maps show wind as a vector field — every point on the map has an arrow showing wind speed and direction. A navigation vector field assigns a “which way to go” arrow at every point in free C-space.

Properties we need

Not every vector field makes a good policy. We need three properties:

Property	What it means	Why it matters
Goal attracting	f(q_G) = 0 and trajectories converge to q_G	The robot actually reaches the goal
Obstacle avoiding	Field points away from or parallel to obstacle boundaries	The robot doesn’t collide
No spurious equilibria	The only point where f(q) = 0 is the goal	The robot doesn’t get stuck

The third property is the hardest to guarantee. A field can have saddle points (where the robot stalls unless exactly on the saddle), local minima (where the robot spirals inward to a wrong location), or limit cycles (where the robot orbits forever). Avoiding all three simultaneously is the central challenge of Chapter 8.

Gradient of a scalar potential

A convenient way to build a vector field is to define a scalar function φ: C_free → R and take its negative gradient: f(q) = −∇φ(q). The gradient always points in the direction of steepest ascent; negating it gives steepest descent. A robot following this field rolls downhill on the surface φ.

This choice is powerful because a scalar function is much easier to design than a vector field directly. The field “falls out” of the scalar automatically. But it also introduces a constraint: gradient fields can only have the critical points that φ has — and scalar functions on Rⁿ generically have local minima, saddle points, and maxima, not just global minima.

If a vector field is defined as f(q) = −∇φ(q), where does the robot stop moving?

Wherever φ is largest. Wherever ∇φ = 0, i.e., at critical points of φ (which include local minima, not just the goal). Wherever the obstacles are closest.

Chapter 4: Integral Curves

We have a vector field f: C_free → Rⁿ. A robot starts at q₀ and, at each moment, moves in the direction f(q). The path it traces is called an integral curve of the field — the curve whose tangent at every point agrees with the field direction at that point.

Formally, an integral curve through q₀ is a function γ: [0, ∞) → C_free satisfying:

dγ/dt = f(γ(t)), γ(0) = q₀

This is just the ordinary differential equation for following the field. By the Picard-Lindelöf theorem, if f is Lipschitz continuous, a unique integral curve exists through every starting point. This uniqueness is critical: it means two integral curves never cross (otherwise, the same point would map to two different directions, contradicting uniqueness).

Non-crossing is powerful: If integral curves can’t cross, and the goal attracts all curves, then the entire free space “drains” into the goal like a watershed into a single lake. No closed loops, no dead ends — provided there are no spurious equilibria.

Discretizing integral curves

In practice, we can’t integrate continuously. We approximate with small steps:

q_t+1 = q_t + ε · f(q_t) / ‖f(q_t)‖

Normalize the field vector first (so step size is always ε), then step. This is Euler integration of the ODE. The step size ε trades accuracy against computation. As ε → 0, the discrete path converges to the true integral curve. In practice, ε is chosen relative to the obstacle clearance.

Visualizing integral curves

Integral Curves of a 2D Vector Field

Each colored line is an integral curve starting from a different point. All converge to the goal (orange dot). Try clicking to start a new curve from any point.

Why is uniqueness of integral curves (the fact that two integral curves never cross) useful for robot navigation?

It means the robot will always take the shortest path. It means the robot can teleport between curves. It means from any start, the robot follows a deterministic, non-intersecting path — if the goal attracts all curves, the whole space drains to it with no cycles possible.

Chapter 5: Potential Fields

The most popular approach to feedback planning combines two forces: attraction toward the goal and repulsion away from obstacles. This is the artificial potential field method, introduced by Khatib in 1986 and refined extensively since.

We define a total potential φ(q) = φ_att(q) + φ_rep(q) and move along its negative gradient. The attractive part pulls the robot toward the goal; the repulsive part pushes it away from obstacles.

Attractive potential

The most common attractive potential is a paraboloid centered at the goal q_G:

φ_att(q) = ½ η ‖q − q_G‖²

Its gradient is ∇φ_att(q) = η(q − q_G), which points directly away from the goal with magnitude proportional to distance. The negative gradient −∇φ_att(q) = η(q_G − q) is a vector pointing toward the goal, getting weaker as the robot approaches. η > 0 is a gain controlling the strength of attraction.

Repulsive potential

Obstacles generate repulsive potentials that grow as the robot gets close. LaValle uses Khatib’s formulation: let d(q) be the distance from configuration q to the nearest obstacle. For a threshold distance ρ₀:

φ_rep(q) = ½ ν (1/d(q) − 1/ρ₀)² if d(q) ≤ ρ₀, else 0

Here ν > 0 is the repulsive gain. The function is zero outside radius ρ₀ (obstacles don’t matter at long range), rises steeply as d(q) → 0 (approaching the obstacle surface), and goes to infinity at d(q) = 0 (inside an obstacle).

Why (1/d − 1/ρ₀)²? The offset by 1/ρ₀ makes φ_rep and its gradient go to zero smoothly at d = ρ₀. A plain 1/d² would have a discontinuous gradient at the boundary, which causes jerky motion.

Combining the potentials

The policy is: f(q) = −∇φ(q) = −∇φ_att(q) − ∇φ_rep(q). Near the goal with no obstacles nearby, the attractive term dominates and the robot converges smoothly. Near an obstacle, the repulsive term overwhelms and the robot steers away. In the intermediate region, the two gradients balance and the robot curves around obstacles toward the goal.

Artificial Potential Field

Goal is orange. Obstacles are red. Arrows show the combined gradient field. Drag sliders to tune the gains, then trace robot paths by clicking.

Attraction η 1.5

Repulsion ν 1.0

In the artificial potential field, what does the repulsive potential do as the robot approaches an obstacle surface (d → 0)?

It decreases, allowing the robot to pass through. It stays constant at ν. It grows toward infinity, creating a repulsive force that prevents the robot from entering the obstacle.

Chapter 6: The Local Minima Problem

The artificial potential field is elegant and fast to compute. It also has a fatal flaw. The combined potential φ = φ_att + φ_rep is a scalar function on C-space, and scalar functions on Rⁿ almost always have local minima — points where ∇φ = 0 but φ is not at its global minimum (the goal).

At a local minimum, the robot halts. It is at an equilibrium: the attractive force toward the goal and the repulsive force away from obstacles are perfectly balanced. The robot sits there forever, unable to reach the goal, with no indication from the field that it needs to do anything differently.

The core problem: Local minima are topological invariants. You cannot eliminate them from an arbitrary smooth potential by tuning η and ν alone. The obstacles’ shape determines how many local minima exist, and for complex environments they are unavoidable with the basic formulation.

Where local minima appear

Local minima typically appear in concave obstacles. Imagine a C-shaped obstacle with the goal inside the C and the robot approaching from outside. The repulsive walls push the robot back, but the attractive force toward the interior goal pulls it forward. There is often an equilibrium just outside the opening of the C where these forces balance.

Another common case: a narrow passage. Two obstacles close together with the goal on the other side. The repulsive forces from both walls push the robot backward; if the passage is narrow enough, the combined repulsion exceeds the attraction at the entry point, creating a local minimum right at the passage entrance.

Workarounds (and why they fail)

Workaround	Idea	Failure mode
Random perturbation	Add random noise to escape minima	Slow; may re-enter minimum; not reliable
Increase attraction gain	Make goal pull stronger than repulsion	Robot crashes into obstacles when close
Decrease repulsion radius	Repulsion only active very close to obstacles	Robot grazes obstacles, safety compromised
Path libraries	Use a planner when stuck, then resume field	Hybrid; loses completeness guarantee

None of these make the potential field method complete. Local minima remain. The only rigorous fix is to redesign the potential function entirely — which is the topic of Chapter 7.

Finding a Local Minimum

Click a start position. The robot follows the gradient field but gets stuck before reaching the goal. The orange star shows where it halts.

A robot following an artificial potential field comes to rest at a point that is not the goal. What has happened?

The robot reached the goal but the display is wrong. The robot fell into a local minimum where ∇φ = 0 but φ ≠ φ(q_G); the attractive and repulsive forces balanced at a non-goal location. The repulsion gain ν was set too low.

Chapter 7: Navigation Functions ★ Showcase

Koditschek and Rimon (1990) asked a remarkable question: can we design a potential function on C-space that has exactly one critical point — the goal? Such a function would have no local minima by construction. A robot following its gradient would reach the goal from anywhere in the free space.

They proved the answer is yes, for a specific class of worlds: a robot moving in a sphere world (a spherical outer boundary containing spherical obstacles). They called the resulting function a navigation function (LaValle §8.4).

The key property: A navigation function φ: C_free → [0, 1] is Morse (no degenerate critical points), equals 1 on all obstacle boundaries, equals 0 only at the goal, and has exactly one local minimum (the goal). Every integral curve converges to the goal.

The Koditschek-Rimon construction

For a sphere world with spherical obstacles O₁, …, O_k, the navigation function takes the form:

φ(q) = γ(q) / (γ(q)^κ + β(q))^1/κ

Here γ(q) = ‖q − q_G‖² is the goal distance squared; β(q) = ∏_i β_i(q) is a product of obstacle functions (β_i = 0 on obstacle i’s boundary); and κ is a tuning parameter. As κ → ∞, the function approaches the pure goal-distance gradient in free space. For large enough finite κ, the product term dominates near obstacles, pushing trajectories away.

The critical insight: for sufficiently large κ, this function is a Morse function with exactly one local minimum. The proof uses Morse theory to count critical points — for sphere worlds, the topology guarantees the count comes out to exactly one.

Comparison: Potential Field vs. Navigation Function

Property	Potential Field	Navigation Function
Local minima	Yes, many possible	None (proven)
Complete	No	Yes (for sphere worlds)
Environment restriction	None (but no guarantees)	Must be sphere world
Computation	Simple, analytic	More complex, needs κ tuning
Obstacle shapes	Any (with distance query)	Must be spherical

Interactive Showcase: Navigation Function vs. Potential Field

Navigation Function vs. Potential Field — Live Comparison

Left panel: Artificial potential field (has local minima). Right panel: Navigation function (no local minima).
Click anywhere to place the goal (orange). Then click “Trace All Starts” to see where robots launched from a grid of starts end up. Watch the left panel get stuck; the right panel always reaches the goal.

κ (nav fn) 8

What you should see: In the left (potential field) panel, some trajectories spiral into local minima near the spherical obstacles and never reach the goal. In the right (navigation function) panel, every single trajectory converges to the goal regardless of start position. That is the theorem made visible.

Chapter 8: Sampling-Based Feedback Planning

Navigation functions are complete and elegant, but they require sphere worlds — a severe geometric restriction. Real environments have walls, corners, furniture, and irregular obstacle shapes. How do we get feedback completeness without restricting the geometry?

The answer is to combine the best of both worlds: use a sampling-based planner (like RRT or PRM from Chapter 5) to build a roadmap over free C-space, then derive a discrete policy over that roadmap. The result is a sampling-based feedback plan (LaValle §8.5).

Step 1: Build a roadmap

Run a PRM-style sampler over C_free to build a graph G = (V, E). Vertices V are collision-free configurations; edges E connect nearby vertices with collision-free local paths. This is the same structure as Chapter 5, but now we will use it differently: not to answer a single query, but to define a global policy.

Step 2: Assign navigation values to vertices

With the goal configuration q_G as the root, run Dijkstra’s algorithm on the graph G. Every vertex v gets a shortest-path distance d(v, q_G). This is the discrete navigation function on the roadmap — identical to the wavefront propagation from Chapter 2, but on the PRM graph instead of a grid.

Step 3: Derive the policy

For every vertex v, the policy says: move to the neighboring vertex u with smallest d(u, q_G). Between vertices, use local path following (e.g., straight-line interpolation if the edge is a straight segment). The full policy π(q) works as follows: from current configuration q, find the nearest roadmap vertex v, then follow the edge toward the best neighbor.

Sample PRM over C_free

Build collision-free graph G = (V, E)

↓

Dijkstra from goal

Assign navigation values d(v, q_G) to all vertices

↓

Derive policy

Each vertex points to its best neighbor; robot follows nearest vertex’s edge

↓

Execute with feedback

At each timestep, re-query nearest vertex; robustly handles small perturbations

Completeness and coverage

This approach is probabilistically complete: as the number of samples grows, the probability that the roadmap fails to cover C_free goes to zero. For any fixed environment, enough samples will eventually connect every navigable region to the goal. In practice, this means the feedback plan can handle arbitrary (non-spherical) environments, at the cost of needing enough samples.

Robustness to perturbation: A key advantage over open-loop planning is that small perturbations don’t break the plan. If the robot is nudged off a roadmap edge, it simply re-queries its nearest vertex and resumes following the policy. The policy is defined for all of C_free, not just the roadmap.

Sampling-Based Feedback Plan

Click “Build Roadmap” to sample and connect vertices. Click “Solve” to run Dijkstra from the goal (orange). Then click any free point to trace a robot path using the feedback policy.

Samples 60

In sampling-based feedback planning, what is the role of Dijkstra’s algorithm?

To sample collision-free configurations. To assign shortest-path distances to all roadmap vertices, defining a navigation value (discrete navigation function) over the graph. To detect collisions between the robot and obstacles.

Chapter 9: Connections & What Comes Next

Chapter 8 opened with a simple idea — replace a plan with a policy — and developed it into navigation functions, vector fields, integral curves, potential fields, and sampling-based feedback. Here’s how it all connects to the broader landscape of planning and control.

Where this lives in the book

Chapter	Topic	Relation to Ch 8
Ch 2	Discrete planning	Wavefront propagation is the discrete nav function
Ch 5	Sampling-based planning	PRM provides the roadmap for sampling-based feedback
Ch 7	Extensions	Multiple-robot policies and time-varying obstacles use the same feedback framework
Ch 9	Decision-theoretic planning	Policies under uncertainty generalize Ch 8 policies to MDPs
Ch 14	Trajectory planning	Kinodynamic constraints on the vector field

Feedback vs. open-loop: the big picture

	Open-Loop Plan	Feedback Policy
Robustness	Fragile: any deviation breaks it	Robust: defined for all states
Computation	Single query, then done	Pre-compute once; evaluate cheaply
Optimality	Can optimize the single path	Harder to optimize (Bellman equations needed)
Completeness	Complete if planner is	Complete if nav function exists or roadmap covers
Replanning	Required when environment changes	Often self-correcting for small changes

Real-world uses

Autonomous driving uses potential-field-inspired methods for lane centering, with replanning for large deviations. Drone swarms use gradient fields for collision avoidance (each drone repels neighbors). Robotic arms use navigation functions for joint-space control, ensuring the end-effector reaches targets even with joint limits.

Limitations and honest caveats

Navigation functions only proven complete for sphere worlds; extension to general environments is an active research area.
Potential fields are fast but not complete; they work well in practice for open environments and poorly in cluttered ones.
Sampling-based feedback requires dense coverage; in very high-dimensional C-spaces, getting enough samples everywhere is infeasible.
All methods in this chapter assume perfect state knowledge. Chapter 11 (sensors and uncertainty) relaxes this.

Related micro-lessons

MDP & Value Functions: Chapter 9 of LaValle extends policies to probabilistic settings. The connection is the Bellman equation: the navigation value satisfies a recursive optimality condition identical to the value function in an MDP. See the MDP micro-lesson.

Sampling-Based Planning: The roadmap in Chapter 8’s sampling-based feedback plan comes directly from PRM (Chapter 5). Revisit the Ch 5 lesson for the construction details.

“The goal is not to plan the perfect path once, but to design a system that behaves well from wherever it finds itself.”
— paraphrasing the central message of LaValle, Chapter 8

Which method from Chapter 8 provides a provably complete feedback plan with no local minima?

Artificial potential fields (Khatib 1986). Navigation functions (Koditschek-Rimon 1990) — for sphere worlds, proven to have exactly one critical point (the goal). Sampling-based feedback using a PRM roadmap.