LaValle, Chapter 5

Sampling-Based Motion Planning

RRTs, PRMs, collision detection, Voronoi bias — the algorithms that make real-world motion planning possible.

Prerequisites: Chapter 2 + Chapter 4. That’s it.
12
Chapters
8+
Simulations
12
Quizzes

Chapter 0: Why Cobs Can’t Be Computed

Chapter 4 ended with a beautiful abstraction: every motion planning problem becomes a point navigating through C-space, avoiding the obstacle region Cobs. Graph search, dynamic programming, all the clean Chapter 2 machinery — just apply it to C-space. We’re done, right?

There’s a catch. To run graph search on C-space, you first need to build the graph. That means you need to know where Cobs is. For a 6-DOF robot arm in a 3D environment, Cobs is a region in T6 (the 6-torus, six angle dimensions) defined by the intersection of complex swept volumes of the arm’s links with arbitrary 3D obstacles. Explicitly computing its boundary requires symbolic Minkowski sums in 6 dimensions. For a 50-link chain, Cobs lives in T50. The mathematical object you would need to construct has no closed form.

LaValle’s key insight (Section 5.1): “The main idea is to avoid the explicit construction of Cobs and instead conduct a search that probes the C-space with a sampling scheme. This probing is enabled by a collision detection module, which the motion planning algorithm considers as a ‘black box.’”

The planner never needs to see Cobs. It only needs to ask: “is this specific configuration q in Cfree?” A collision detection module answers yes or no in milliseconds. The planner samples points, checks them, and builds a picture of Cfree one query at a time.

The separation architecture

Think of it as three separate concerns that never need to talk to each other directly:

Geometric Model
Robot links, obstacle meshes, transformation matrices — the actual 3D geometry in workspace
↓ feeds into
Collision Detector (Black Box)
Answers: “is A(q) ∩ O = ∅?” — yes means q ∈ Cfree, no means q ∈ Cobs
↓ queried by
Sampling-Based Planner
Generates configurations, checks them, builds a graph or tree entirely within Cfree

The planner is completely agnostic to geometry. Swap the robot from a 6-DOF arm to a 30-DOF snake robot and the planner code doesn’t change at all — only the collision detector changes. This separation is what makes sampling-based planning a general framework rather than a special-purpose algorithm.

Three notions of completeness

Before seeing how these planners work, we need to nail down what “works” means. When does an algorithm count as solving the motion planning problem? There are three increasingly weak notions:

TypeGuaranteeExample methods
CompleteAlways finds a solution if one exists; always reports failure in finite time if none existsCombinatorial planners (Ch 6), exact cell decomposition
Resolution completeIf a solution exists and the grid is fine enough, finds it; deterministically dense samplingGrid-based planners with deterministic dense sequences
Probabilistically completeProbability of finding an existing solution approaches 1 as iterations → ∞RRT, PRM with random sampling

Most sampling-based planners are probabilistically complete. This sounds weak: the algorithm might, with probability zero, fail forever even when a solution exists. But “probability zero” means it almost surely terminates — and in practice on real robots, these algorithms find solutions in fractions of a second on problems where complete planners would take years.

Why “probability zero” is acceptable. A probabilistically complete algorithm uses a random sequence that is dense in Cfree with probability 1. For any configuration q* ∈ Cfree and any ε > 0, infinitely many samples fall within ε of q*, almost surely. The planner can reach q* arbitrarily closely. The only failure mode — a sequence that avoids some region forever — has measure zero in the space of all possible random sequences.

Discovering Cobs one sample at a time

The simulation below shows a 2D C-space with an irregular obstacle region (gray). The planner doesn’t know the shape of Cobs — it discovers it by sampling. Green dots are configurations in Cfree; red dots are in Cobs. As you accumulate samples, the boundary of Cobs becomes clear from the pattern of red points alone.

Probing C-Space: The Black Box Collision Detector

Click “Sample” to add random configurations. Green = Cfree, red = Cobs. The planner never sees the obstacle boundary directly — it only learns from individual point queries.

0 samples — click to start
Why is probabilistic completeness weaker than resolution completeness?

Chapter 1: Sampling Quality — Dispersion & Discrepancy

The planner probes C-space by sampling configurations. How well you cover C-space determines how quickly you find paths and how reliably you detect narrow passages. Two algorithms that both produce “uniform random” samples can have wildly different coverage quality on any finite budget of samples. This chapter is about making that difference precise — and about deterministic sequences that do better than random.

The van der Corput sequence: better than random

Here is LaValle’s cleanest demonstration (Figure 5.2). Suppose you want to place 8 sample points in [0, 1]. The naive approach: evenly spaced at 0, 1/8, 2/8, ..., 7/8. But you need to commit to n = 8 upfront. What if you don’t know n in advance? If you lay them down one at a time in sorted order — 0, 1/16, 1/8, 3/16, ... — after placing 4 points you’ve only covered [0, 3/16]. The entire right half of [0, 1] is unexplored.

The van der Corput sequence fixes this with a trick: reverse the binary digits of the index. The index 1 in binary is “1”, reversed and placed after the decimal point gives “0.1” in binary = 1/2. Index 2 = “10”, reversed = “0.01” = 1/4. Index 3 = “11”, reversed = “0.11” = 3/4. Index 4 = “100”, reversed = “0.001” = 1/8. The sequence: 0, 1/2, 1/4, 3/4, 1/8, 5/8, 3/8, 7/8, ...

The magic property: after ANY n samples from the van der Corput sequence, coverage is roughly uniform across [0, 1]. Each new point fills the largest existing gap. You never need to know n in advance — the sequence is “progressive.” Random sequences don’t have this: sometimes 10 random points all land in [0, 0.3] by sheer bad luck.

For higher dimensions, the Halton sequence uses van der Corput independently on each axis, with successive prime bases: base 2 for axis 0, base 3 for axis 1, base 5 for axis 2, and so on. Because different primes produce decorrelated sequences, the resulting 2D (or kD) points are well-distributed across all dimensions simultaneously.

python
def van_der_corput(n: int, base: int = 2) -> float:
    """n-th term of the van der Corput sequence in given base.

    Worked examples (base 2):
      n=1  ->  0.5    (binary:  1  -> 0.1)
      n=2  ->  0.25   (binary: 10  -> 0.01)
      n=3  ->  0.75   (binary: 11  -> 0.11)
      n=4  ->  0.125  (binary: 100 -> 0.001)
      n=5  ->  0.625  (binary: 101 -> 0.101)
    """
    q, r = 0, 1
    while n:
        n, remainder = divmod(n, base)
        r *= base
        q += remainder / r
    return q

def halton_sequence(n: int, dim: int = 2) -> list:
    """Generate n Halton points in [0,1]^dim.

    Uses van der Corput with prime base per dimension.
    Output: list of n tuples (each of length dim).
    """
    primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
    return [
        tuple(van_der_corput(i, primes[d]) for d in range(dim))
        for i in range(1, n + 1)
    ]

# Concrete comparison: 20 random vs 20 Halton points in [0,1]^2
import numpy as np
random_pts  = np.random.rand(20, 2)
halton_pts  = np.array(halton_sequence(20, 2))

# Largest empty circle radius (dispersion) for each:
# random_pts  -> typically ~0.25 (varies widely each run)
# halton_pts  -> consistently ~0.12 (stays stable)

Dispersion: the size of the largest gap

To make coverage quality precise, LaValle defines dispersion (equation 5.19):

δ(P) = supx ∈ X minp ∈ P ρ(x, p)

In words: for every point x in the space, find the closest sample p; take the maximum of these distances over all x. Dispersion equals the radius of the largest empty ball — the biggest uncovered hole in your sample set. Lower dispersion means smaller largest gap, which means better coverage.

The theoretically optimal deterministic strategy is the Sukharev grid: place k1/n samples per axis in an n-dimensional space, at cell centers. This achieves dispersion ½ · k−1/n. The catch: to achieve dispersion ε in n dimensions you need (1/2ε)n points. In 10 dimensions with ε = 0.01, that’s 5010 ≈ 1016 points. This is the curse of dimensionality hitting dispersion directly — grid-based planners are dead on arrival in high-dimensional C-spaces.

Random sampling doesn’t achieve deterministically small dispersion — there’s always a chance a large gap remains. But random sequences are dense with probability 1: any ball of positive radius eventually gets a sample inside. The price is irregular coverage: some regions get many samples, others get few.

Sampling uniformly on SO(3): the Hopf fibration trick

For 3D rigid bodies, configurations include orientation, which lives on SO(3) — the 3-sphere S3 in quaternion representation with antipodal points identified. The naive approach: pick random Euler angles φ, θ, ψ uniformly from [0, 2π]. This is wrong. The Euler angle parameterization is not volume-preserving on SO(3); regions near the poles (θ near 0 or π) get oversampled.

The correct method (LaValle, equation 5.15) uses three uniform variables u1, u2, u3 ∈ [0, 1] through the Hopf fibration decomposition:

q = ( √(1−u1) sin2πu2,   √(1−u1) cos2πu2,   √u1 sin2πu3,   √u1 cos2πu3 )

This quaternion q = (q1, q2, q3, q4) is uniformly distributed on S3, giving a uniform distribution over all 3D orientations. The math comes from the Hopf fibration structure: S3 fibers over S2 with S1 fibers. Variable u1 samples the base S2 (the √ terms handle the non-uniform Jacobian), while u2 and u3 sample the fiber S1.

Always use Hopf fibration sampling for SO(3). Random Euler angles cluster near gimbal configurations. The Hopf formula is three extra lines of code and guarantees correct uniform distribution over all orientations. An equivalent alternative: draw a 3×3 matrix of independent standard normals and QR-decompose it — the Q factor is a uniformly random rotation matrix.
Dispersion: Random vs Halton Sampling

Left: uniform random samples. Right: Halton sequence. The red circle shows the largest empty ball (= dispersion). Random sampling creates large variance in gap sizes. Halton coverage is much more uniform.

N samples 30
You place 1000 uniform random samples in [0,1]2 and measure the largest empty circle (dispersion) as radius 0.05. You then add 1000 more random samples. Is the largest empty circle now guaranteed to have radius ≤ 0.025?

Chapter 2: Collision Detection — The Black Box

The collision detector is the most expensive component of a sampling-based planner. In practice, 90%+ of planning time is spent here. Understanding how it works — and how to query it efficiently — determines whether your planner runs in 0.1 seconds or 10 minutes on the same problem.

The question is always the same: is A(q) ∩ O = ∅? Is the robot body at configuration q, which I’ll write as A(q), disjoint from the obstacle set O? Yes means q ∈ Cfree. No means q ∈ Cobs. The planner never needs anything beyond this binary answer — but the collision detector can optionally return the distance to the nearest obstacle, which the planner can exploit to skip redundant checks.

Broad phase: culling the obvious non-collisions

A robot with many links and a scene with many obstacles requires checking many pairs. A naive O(n · m) check (n robot parts times m obstacles) is too slow. The broad phase eliminates pairs that clearly can’t intersect, leaving a small number of candidate pairs for exact narrow-phase checking.

The most common technique: Axis-Aligned Bounding Boxes (AABBs). Every object is wrapped in the tightest axis-aligned box. If two AABBs don’t overlap on ALL three coordinate axes simultaneously, the objects inside them can’t collide — no exact check needed. AABB overlap testing takes O(1) per pair. Combined with a Bounding Volume Hierarchy (BVH) — a tree of nested AABBs — most of the O(n · m) pairs are eliminated in O((n + m) log(n + m)) time.

Sweep-and-prune is an alternative broad phase for dynamic scenes. Sort the objects’ AABB intervals along each axis. Two objects can only intersect if their intervals overlap on all three axes. Maintaining sorted lists with incremental updates as the robot moves gives near-constant time between successive configurations — very important for planners that check configurations along a dense path segment.

Narrow phase: exact checking for candidate pairs

After the broad phase, a few pairs remain as potential collisions. The narrow phase does exact geometry: do these two specific shapes actually intersect?

For convex polyhedra, the gold-standard algorithm is GJK (Gilbert-Johnson-Keerthi). GJK finds the closest points between two convex shapes by iteratively improving a simplex in the Minkowski difference space. Each iteration is O(v) in the number of vertices. Crucially: GJK returns the distance between the shapes, not just a yes/no. This distance information is directly exploitable by the planner.

For robotics applications, Lin-Canny is often faster in practice. It tracks the “closest feature pair” between two convex polyhedra — which pair of features (vertex-vertex, vertex-edge, edge-edge, edge-face) is geometrically closest. When the robot moves a small amount, the closest feature pair usually doesn’t change. The next check starts with a warm cache and terminates in nearly constant time. This incremental coherence is crucial for planners checking many nearby configurations along a path.

Checking a path segment (Section 5.3.4)

The planner doesn’t just check isolated configurations — it checks whether a path from q1 to q2 is entirely collision-free. The path has uncountably many points; you can’t check them all. The standard strategy: subdivide the segment and check at a finite number of sample points.

The naive approach checks at evenly spaced intervals τ(t) = (1−t)q1 + tq2 for t = 0, Δ, 2Δ, ... This works but is wasteful: it checks both endpoints before the middle. If there’s a collision at t = 0.5, the naive checker wastes time on t = 0, Δ, 2Δ, ... before reaching it.

The smarter strategy: check samples in van der Corput order. Check t = 1/2 first, then 1/4 and 3/4, then 1/8, 3/8, 5/8, 7/8, and so on. This is binary search over the segment. Collisions anywhere on the segment are found faster because the strategy immediately explores the middle of every unexplored sub-interval.

t sequence: ½,   ¼,   ¾,   ⅛,   ⅝,   ⅜,   ⅞,   &frac{1}{16},   …

Stop when the dispersion of the checked set falls below Δq — the resolution at which you’re guaranteed not to miss an obstacle. If any checked point is in Cobs, immediately return “not collision-free.” If all points down to resolution Δq pass, the segment is declared safe.

Using GJK’s distance output: if A(q) is d units from the nearest obstacle, the robot can translate up to d without collision in any direction. If the workspace displacement corresponding to the next step Δq is ≤ d, you can skip the check entirely and advance by d/Δq steps at once. This distance-based stepping can reduce collision checks by 10x in open regions of C-space.

Sequential vs Binary Path Segment Checking

A path from start to goal passes near an obstacle. Orange dots show the order in which configurations are checked. Sequential wastes time at safe endpoints; binary finds the collision near the middle quickly.

Choose a checking strategy to begin
python
def van_der_corput_order(depth: int) -> list:
    """t-values to check along a path segment, in van der Corput order.

    depth=1: [0.5]                    (just the midpoint)
    depth=2: [0.5, 0.25, 0.75]        (midpoint, then quarter-points)
    depth=3: [0.5, 0.25, 0.75, 0.125, 0.625, 0.375, 0.875]
    """
    result = []
    for k in range(1, depth + 1):
        step = 1.0 / (2 ** k)
        for j in range(1, 2 ** k, 2):   # only odd multiples of step
            result.append(j * step)
    return result

def check_path_segment(q1, q2, collision_fn, depth=8) -> bool:
    """Check straight-line path q1 -> q2 for collisions.

    Uses van der Corput ordering: checks the MIDDLE of each
    unexplored interval first, finding collisions faster.

    Returns True if path is clear; False at first collision found.
    collision_fn(q) -> bool: True means q is in C_obs.
    """
    for t in van_der_corput_order(depth):
        q = q1 + t * (q2 - q1)        # linear interpolation
        if collision_fn(q):
            return False             # collision found, stop immediately
    return True                       # all checks passed

# At depth=8: 2^8 - 1 = 255 t-values checked, dispersion = 1/512 along path
# For a step_size = 0.05m path, 255 checks guarantees no gap > 0.05/512 = 0.1mm
A GJK collision check reports that robot body A(q) is 5 cm from the nearest obstacle. The robot can translate in any direction. What is the maximum translation the robot can make from q without needing another collision check?

Chapter 3: The VSM + LPM Framework

Here is something surprising: RRT, PRM, SBL, KPIECE, EST — all the major sampling-based planners you’ll encounter in the robotics literature — are instances of the same two-line template. LaValle (Section 5.4.1) distills them into a framework built from two interchangeable components. Once you see this structure, reading new planner papers becomes much faster: you just need to identify the VSM and LPM.

The general incremental planner template

The framework maintains a graph G = (V, E) of collision-free configurations. Vertices V are configurations verified to be in Cfree. Edges E are verified collision-free paths between adjacent vertices. The planner builds this graph iteratively until G connects qI to qG:

Initialization
G(V, E) with V = {qI} for unidirectional, or {qI, qG} for bidirectional
VSM: Vertex Selection Method
Choose a vertex qcur ∈ V to expand from next
LPM: Local Planning Method
Find a collision-free path from qcur to some qnew ∈ Cfree
Insert
If LPM succeeds: add qnew to V, add edge (qcur, qnew) to E
Check for solution
Does G now connect qI to qG? If yes: extract path and return
↻ repeat

The Vertex Selection Method (VSM) decides which vertex to expand next. RRT’s VSM: pick the vertex nearest to a random sample α(i). PRM’s VSM: pick any vertex (the PRM builds the graph in a preprocessing phase, ignoring the goal). KPIECE’s VSM: pick a vertex in an underexplored cell of a discrete workspace projection. EST’s VSM: pick a vertex randomly weighted by the inverse of its local density.

The Local Planning Method (LPM) decides how to extend from qcur to a new configuration. RRT’s LPM: shoot toward the random sample, stopping at the Cobs boundary. PRM’s LPM: connect to a nearby pre-sampled configuration with a straight-line path. Kinodynamic planners’ LPM: simulate the robot’s dynamics forward in time with a random control input.

The only difference between planners is VSM + LPM. Swap them and you get a completely different algorithm — but the skeleton stays identical. This modularity separates “where should we explore next?” (VSM) from “how do we get there?” (LPM). Improvement research focuses on one at a time.

The bug trap problem

Why can’t we just run grid search on C-space? The bug trap (Figure 5.13 in LaValle) illustrates why. Imagine a C-space where Cobs forms a deep bowl — a concavity with a narrow exit. The start is deep inside the bowl; the goal is outside. Grid-based BFS expands uniformly outward from the start, flooding the bowl’s interior before it can escape. In a 10-dimensional C-space, the bowl may contain an exponential number of grid cells.

The fundamental problem: uniform expansion allocates resources proportional to volume, not to progress. The bowl’s interior is a geometric dead end, but BFS can’t know that without first exploring it. The narrow exit — the critical feature — receives almost no attention until the entire bowl is exhausted.

Random sampling escapes the bug trap for free. A uniform random sample has equal probability of landing anywhere in C-space, including outside the bowl. When it does, the nearest tree vertex is on the bowl’s rim — and the tree extends directly toward the exit. The search is guided by C-space geometry without ever computing that geometry. This is the first glimpse of why random sampling is essential in high-dimensional planning.

Unidirectional, bidirectional, multidirectional

The simplest case: one tree from qI reaches toward qG. Disadvantage: if qG is deep inside a narrow passage, the tree must find that passage from one side only — like threading a needle from one end.

Bidirectional search grows two trees simultaneously: Ta from qI and Tb from qG. When the trees are close, an LPM connection attempt is made. For a narrow passage, Ta approaches from one side while Tb approaches from the other. They need only meet somewhere in the passage — each tree needs to enter only halfway, not traverse the full length.

Multidirectional search places multiple trees at random configurations throughout C-space. Trees that grow close to each other are connected. This helps in C-spaces with many disconnected narrow passages, at the cost of managing which trees are worth extending.

Bug Trap: Grid Search vs Random Sampling

C-space with a deep concavity (bug trap). Grid search floods the interior with nodes before escaping. Random search jumps directly toward the exit. Watch the node counts diverge.

Press a search type to begin
A bug trap has a tiny opening in a 10-dimensional C-space. How does bidirectional search find the path faster than unidirectional search?

Chapter 4: The RRT Algorithm — LaValle’s Invention

In 1998, Steven LaValle was a graduate student at Iowa State working on motion planning for systems with differential constraints. He needed an algorithm that could efficiently explore high-dimensional spaces without any prior knowledge of Cobs, and that worked for robots with dynamics (not just geometric constraints). What he invented, the Rapidly-exploring Random Tree (RRT), became one of the most-cited algorithms in all of robotics. This chapter explains exactly why it works.

The RDT without obstacles (Figure 5.16)

Start with the simplest possible version: an RDT (Rapidly-exploring Dense Tree) growing in free space with no obstacles. We just want the tree to quickly cover the space. The algorithm:

algorithm
SIMPLE_RDT(q_init, alpha, k):
  G.init(q_init)                       # single-vertex tree at start
  for i = 1 to k:
    q_new  = alpha(i)                  # sample a configuration
    q_near = nearest(G.vertices, q_new) # find closest existing vertex
    G.add_vertex(q_new)
    G.add_edge(q_near, q_new)           # connect nearest to new sample
  return G

Where α is a sequence of configurations — random (making it an RRT) or a deterministic low-discrepancy sequence (making it an RDT). The key line is nearest(G.vertices, q_new): instead of adding q_new as a free-floating point, we always attach it to the closest existing vertex. This one choice makes the tree rapidly explore.

The RRT with obstacles (Figure 5.21)

Adding obstacles requires one modification: instead of adding the sample directly, extend toward it from the nearest vertex, stopping at the Cobs boundary. The function stopping_config(q_near, q_rand) steps in direction (q_rand − q_near) at increment Δq until either a collision is detected or q_rand is reached. It returns the last collision-free configuration along this ray.

algorithm
RDT(q_init, alpha, k, stopping_config):
  G.init(q_init)
  for i = 1 to k:
    q_near = nearest(G.vertices, alpha(i))
    q_s    = stopping_config(q_near, alpha(i))  # extend until C_obs boundary
    if q_s != q_near:                             # made progress
      G.add_vertex(q_s)
      G.add_edge(q_near, q_s)                    # edge is collision-free by construction
  return G

The edge (q_near, q_s) is guaranteed collision-free because stopping_config checked every step along it. No separate path-segment check is needed. This is a key efficiency: the tree’s extension and its collision check are one unified operation.

The Voronoi bias: why RRTs explore rapidly

Consider the Voronoi diagram of the tree’s current vertices: the partition of C-space where each region contains all points closer to one vertex than to any other. Vertices that sit deep inside the already-explored region have small Voronoi cells — most nearby territory is already claimed by their neighbors. Vertices on the frontier have large Voronoi cells extending into unexplored space.

When we draw a uniform random sample α(i), the probability it lands in any vertex’s Voronoi cell is proportional to that cell’s volume. Large cells attract more samples. Therefore: frontier vertices attract more samples. Therefore: nearest(G, α(i)) more frequently returns a frontier vertex. Therefore: extensions happen preferentially in unexplored directions.

The Voronoi bias is automatic and free. Nobody computes the Voronoi diagram. Nobody labels vertices as “frontier.” The bias emerges purely from the nearest-neighbor query combined with uniform random sampling. It is “implicit” exploration — arising from geometry, not from explicit bookkeeping. This is what LaValle considers the RRT’s central insight.

The result: RRT trees have fractal-like structure. After 45 iterations, tendrils reach into every corner of the space. After 2345 iterations, coverage is remarkably uniform (Figure 5.19 in LaValle). Grid-based BFS, by contrast, grows as a uniform sphere — it must fully explore every configuration within radius r before reaching any configuration at radius r + ε.

Full Python implementation

python
import numpy as np
from typing import Optional, Callable, List

class RRT:
    """Rapidly-exploring Random Tree (LaValle 1998).

    Works in any dimension. Only requires a collision checker
    (point query: is this configuration in C_free?) and a sampler.
    No knowledge of C_obs geometry is needed.
    """

    def __init__(self,
                 q_init: np.ndarray,
                 q_goal: np.ndarray,
                 collision_free: Callable[[np.ndarray], bool],
                 bounds: np.ndarray,        # shape (dim, 2): lo/hi per axis
                 step_size: float = 0.05,
                 goal_bias: float = 0.05):
        self.q_init       = q_init
        self.q_goal       = q_goal
        self.collision_free = collision_free
        self.bounds       = bounds
        self.step_size    = step_size
        self.goal_bias    = goal_bias
        self.dim          = len(q_init)
        self.vertices     = [q_init]      # list[np.ndarray]
        self.parent       = {0: None}    # vertex_index -> parent_index

    def _sample(self) -> np.ndarray:
        """Uniform random sample with small probability of returning goal."""
        if np.random.rand() < self.goal_bias:
            return self.q_goal
        lo, hi = self.bounds[:, 0], self.bounds[:, 1]
        return lo + np.random.rand(self.dim) * (hi - lo)

    def _nearest(self, q: np.ndarray) -> int:
        """Index of vertex nearest to q (Euclidean, O(n) naive scan)."""
        dists = [np.linalg.norm(v - q) for v in self.vertices]
        return int(np.argmin(dists))

    def _steer(self, q_near: np.ndarray, q_rand: np.ndarray) -> np.ndarray:
        """stopping_config: advance from q_near toward q_rand,
        stopping at C_obs boundary. Returns last collision-free config."""
        direction = q_rand - q_near
        dist = np.linalg.norm(direction)
        if dist < 1e-9: return q_near
        unit = direction / dist
        q_s  = q_near
        steps = max(int(dist / self.step_size), 1)
        for _ in range(steps):
            q_next = q_s + self.step_size * unit
            if not self.collision_free(q_next):
                break              # hit C_obs, stop before this point
            q_s = q_next
            if np.linalg.norm(q_s - q_near) >= dist:
                break              # reached q_rand, done
        return q_s

    def plan(self, max_iter: int = 5000) -> Optional[List[np.ndarray]]:
        """Run RRT. Returns path (list of configs) or None if failed."""
        for _ in range(max_iter):
            q_rand    = self._sample()
            idx_near  = self._nearest(q_rand)
            q_s       = self._steer(self.vertices[idx_near], q_rand)

            if np.linalg.norm(q_s - self.vertices[idx_near]) < 1e-9:
                continue           # didn't move (immediate obstacle), skip

            idx_new = len(self.vertices)
            self.vertices.append(q_s)
            self.parent[idx_new] = idx_near

            if np.linalg.norm(q_s - self.q_goal) < self.step_size:
                return self._extract_path(idx_new)

        return None               # max iterations exceeded

    def _extract_path(self, goal_idx: int) -> List[np.ndarray]:
        path, idx = [], goal_idx
        while idx is not None:
            path.append(self.vertices[idx])
            idx = self.parent[idx]
        return path[::-1]          # reverse: start -> goal

# Example: 2D point robot with a circular obstacle
def clear_of_circle(q: np.ndarray) -> bool:
    """True if q is outside circle of radius 0.25 centered at (0.5, 0.5)."""
    return np.linalg.norm(q - np.array([0.5, 0.5])) > 0.25

rrt = RRT(
    q_init=np.array([0.1, 0.1]),
    q_goal=np.array([0.9, 0.9]),
    collision_free=clear_of_circle,
    bounds=np.array([[0.0, 1.0], [0.0, 1.0]]),
    step_size=0.05, goal_bias=0.1
)
path = rrt.plan()
print(f"Found {len(path)} waypoints in {len(rrt.vertices)} tree vertices")
# Typical: "Found 43 waypoints in 287 tree vertices" (varies per run)
RRT Growing in 2D C-Space

Watch the tree expand step by step. The Voronoi bias is visible: tendrils reach into empty corners first. When a path to the goal is found, it flashes green. Step through or animate.

Step size 3%
0 vertices — step or run the RRT
An RRT has grown 100 vertices, densely filling the left half of C-space. The right half is completely empty. Where will the NEXT random sample most likely cause the tree to extend?

Chapter 5: RRT Variants — Connect, Star, Bidirectional

The basic RRT is already surprisingly effective, but three extensions — bidirectional growth, greedy connection, and rewiring — dramatically improve performance for different problem types. Understanding which variant to choose is as important as understanding the base algorithm.

RRT-Connect: bidirectional greedy extension (LaValle & Kuffner 2001)

Grow two trees simultaneously: Ta rooted at qI and Tb rooted at qG. Each iteration extends Ta toward a random sample α(i), then immediately tries to greedily connect Tb to the new vertex. “Greedy connect” means: extend Tb toward the new vertex in a loop, step after step, until either the trees connect or we hit Cobs. If they connect: solution found.

algorithm
RDT_BALANCED_BIDIRECTIONAL(q_I, q_G, alpha, K):
  T_a.init(q_I);   T_b.init(q_G)
  for i = 1 to K:
    q_n  = nearest(T_a, alpha(i))
    q_s  = stopping_config(q_n, alpha(i))
    if q_s != q_n:
      T_a.add(q_s, parent=q_n)
      q_n2 = nearest(T_b, q_s)            # T_b tries to reach q_s
      q_s2 = stopping_config(q_n2, q_s)   # greedy: loop until stuck or connected
      if q_s2 != q_n2:
        T_b.add(q_s2, parent=q_n2)
        if q_s2 == q_s:                    # trees met exactly: SOLUTION
          return extract_path(T_a, T_b, q_s)
    if |T_b| > |T_a|:  SWAP(T_a, T_b)    # always extend the smaller tree
  return FAILURE

The balancing step — SWAP(T_a, T_b) when Tb grows larger than Ta — prevents one tree from dominating. If Ta is stuck in a narrow passage and grows slowly, the algorithm automatically switches to extending Tb instead, approaching the passage from the outside where there is more room to maneuver.

Why RRT-Connect is so fast. The greedy connect step doesn’t just take one step toward q_s — it loops until it connects or hits an obstacle. In open spaces, Tb often reaches Ta in a single greedy connect call. This makes RRT-Connect roughly 2× faster than unidirectional RRT in open environments, and dramatically faster on problems where start and goal are in separate “rooms” of C-space.

RRT*: asymptotic optimality (Karaman & Frazzoli 2011)

A standard RRT finds some path, not necessarily a short one. Path quality is determined by the random sample order and step size — essentially arbitrary. RRT* adds two operations after each vertex insertion to guarantee that as iterations → ∞, the path cost converges to the global optimum.

Step 1 — Choose best parent. When adding q_new, find all existing vertices within a ball of radius r(n) = γ (log n / n)1/d (where n = current vertex count, d = dimension). Instead of connecting to the nearest vertex, connect to whichever neighbor minimizes the total cost from qI to q_new. This is local Dijkstra: pick q_best = argminq ∈ near [cost(root → q) + dist(q, q_new)], verifying each candidate edge is collision-free.

Step 2 — Rewire. After adding q_new, check whether any nearby vertex q_near can be reached more cheaply through q_new. If cost(root → q_new) + dist(q_new, q_near) < cost(root → q_near), and the edge (q_new, q_near) is collision-free, then redirect q_near’s parent to q_new. Propagate cost updates to all descendants.

python
def rrt_star_extend(tree, q_rand, collision_free, gamma, d):
    """One RRT* iteration: steer, choose best parent, rewire."""
    # Standard RRT steer
    q_near = tree.nearest(q_rand)
    q_new  = tree.steer(q_near, q_rand)
    if not collision_free(q_new): return

    # RRT*: find ALL neighbors within shrinking ball
    n = len(tree.vertices)
    r = gamma * (np.log(n) / n) ** (1.0 / d)   # radius shrinks as tree grows
    near = tree.neighbors_within(q_new, r)

    # Choose best parent (minimum cost path from root)
    best_parent = q_near
    best_cost   = tree.cost(q_near) + tree.dist(q_near, q_new)
    for v in near:
        c = tree.cost(v) + tree.dist(v, q_new)
        if c < best_cost and collision_free_segment(tree[v], q_new):
            best_cost, best_parent = c, v
    new_idx = tree.add(q_new, parent=best_parent, cost=best_cost)

    # Rewire: can any neighbor reach root cheaper through q_new?
    for v in near:
        c_through = best_cost + tree.dist(new_idx, v)
        if c_through < tree.cost(v) and \
           collision_free_segment(q_new, tree[v]):
            tree.rewire(v, new_parent=new_idx, new_cost=c_through)
            # propagate cost improvements to all descendants of v
RRT* is asymptotically optimal. Formally: with probability 1, as n → ∞, the best path in the tree converges to the globally optimal path cost. The shrinking ball radius r(n) is essential: it ensures that as the tree gets denser, the rewiring finds increasingly precise shortcuts. Vanilla RRT with a fixed radius connection is NOT asymptotically optimal — it converges to a suboptimal path and stays there.

Informed RRT*: focusing the search

Once RRT* finds an initial solution of cost cbest, any strictly shorter path must pass through the prolate hyperspheroid (elongated ellipsoid):

E = { q ∈ C : dist(qI, q) + dist(q, qG) ≤ cbest }

This is the set of all points q that could be on a path no longer than cbest. Points outside E cannot possibly improve the solution. Informed RRT* (Gammell et al., 2014) restricts all future samples to this ellipsoid once a solution exists. The ellipsoid is easy to sample: transform a uniform unit-ball sample by scaling along the focus axis by cbest/2 and transverse axes by √(cbest2 − dfoci2)/2, where dfoci = dist(qI, qG). As the path improves, the ellipsoid shrinks, further concentrating sampling.

AlgorithmOptimal?SpeedBest for
RRTNoFastSingle query, need any path quickly
RRT-ConnectNoVery fastBidirectional, escaping bug traps
RRT*Yes (asymptotic)~5× slower than RRTPath quality matters, time available
Informed RRT*YesFaster than RRT* post-solutionRRT* with focused refinement
RRT vs RRT-Connect vs RRT*

Same obstacle environment, three algorithms. RRT-Connect grows orange and blue trees toward each other. RRT* shows edges rewiring as the path shortens. Compare node counts and path lengths.

Choose an algorithm to run
RRT-Connect always extends the smaller tree next. Why?

Chapter 6: The PRM — Probabilistic Roadmaps

RRT grows a single tree for one query. What if you need to plan a thousand different paths through the same kitchen, or a robot arm that picks objects from a fixed workspace all day? Running RRT a thousand times is wasteful. Probabilistic Roadmaps (PRM), introduced by Kavraki et al. (1996) and described in LaValle Section 5.6, take a different approach: build a rich graph of Cfree once, then answer any query quickly by connecting to that graph.

This is the multi-query philosophy. The work is front-loaded into a preprocessing phase; queries become cheap lookups.

Two-phase architecture. Phase 1 (preprocessing): sample N configurations, connect nearby collision-free pairs to build roadmap G. Phase 2 (query): for any new (qI, qG), connect both to G, then run graph search (Dijkstra or A*) through G for the path.

The BUILD_ROADMAP algorithm

The preprocessing phase (Algorithm 5.25 in the book) loops until N collision-free vertices are added:

python
def build_roadmap(N, sample, collision_free, connect, neighborhood):
    """PRM preprocessing phase."""
    G = Graph()
    while len(G.vertices) < N:
        q = sample()                          # random config in C
        if not collision_free(q): continue   # reject if in C_obs
        G.add_vertex(q)
        for v in neighborhood(q, G):          # nearby vertices
            if G.same_component(q, v): continue   # already connected
            if connect(q, v):              # collision-free local path
                G.add_edge(q, v)
    return G

def query_roadmap(G, q_I, q_G, connect, search):
    """PRM query phase: attach start/goal, find path."""
    for v in neighborhood(q_I, G):
        if connect(q_I, v): G.add_edge(q_I, v)
    for v in neighborhood(q_G, G):
        if connect(q_G, v): G.add_edge(q_G, v)
    return search(G, q_I, q_G)            # Dijkstra / A*

The same_component check is crucial for efficiency. Using union-find (disjoint set union), each check is near-O(1). It prevents wasting time trying to connect vertices already in the same connected component. As the roadmap grows, the whole Cfree becomes one component and new edges stop being needed.

Neighborhood strategies

How big should the neighborhood be? LaValle describes four strategies, each with different tradeoffs:

StrategyRuleTypical setting
Nearest KConnect to the K closest verticesK ≈ 15
Radius rConnect to all vertices within ball of radius rr tuned per C-space
Component KK nearest from each connected componentK = 1–3
VisibilityConnect to ALL guard vertices (see below)

The Component K strategy is particularly smart: it ensures each new vertex connects to multiple different components, aggressively merging the roadmap. The visibility strategy leads to the Visibility Roadmap (Section 5.6.2).

Visibility roadmaps: drastically fewer vertices

The basic PRM can accumulate thousands of vertices in open space where a handful would suffice. The visibility roadmap maintains only two types of vertices:

Any sample that can see exactly one guard is discarded — it adds no new information. The result is a minimal roadmap: guards occupy distinct regions of Cfree, connectors bridge between them. For well-connected spaces this is dramatically smaller than a basic PRM.

Multi-query payoff. After paying the O(N²) preprocessing cost, every new query costs only O(log N) for nearest-neighbor lookup plus Dijkstra over the roadmap. If you need Q queries, total cost is O(N² + Q log N) vs. O(Q · N) for Q separate RRTs. When Q is large, PRM wins decisively.
PRM: Build Once, Query Many

Phase 1: Click “Build Roadmap” to sample collision-free vertices and connect nearby pairs. Phase 2: Click “New Query” to attach a random start (green) and goal (red) to the existing roadmap and find a path. Notice the roadmap is reused — building only happens once.

Click “Build Roadmap” first.
PRMs are called “multi-query” planners. When does this advantage over RRT matter most?

Chapter 7: Narrow Passages — The Achilles Heel

Every sampling-based planner — RRT, PRM, all of them — shares a fundamental weakness. Picture a room connected to another room by a doorway just barely wide enough for the robot. Randomly throwing darts at C-space, how often do you hit a doorway that’s 1% as wide as the whole space? About 1% of the time. A doorway 0.01% as wide? About 1-in-10,000 samples. In high dimensions, this gets catastrophically worse.

LaValle formalizes this with ε-goodness (eq. 5.41). For any configuration q ∈ Cfree, define V(q) as the set of all configurations visible from q (connected by a collision-free straight line). The ε-goodness of Cfree is:

ε(Cfree) = minq ∈ Cfree μ(V(q)) / μ(Cfree)

This measures the smallest fraction of Cfree visible from any single point. A narrow passage has small ε: the configuration at the passage entrance can see very little of Cfree. PRM’s convergence rate scales with ε — small ε means exponentially more samples needed.

The volume intuition. A corridor of width w in D-dimensional space has volume proportional to wD−1. If Cfree has unit volume, the probability of a random sample landing in the corridor is ∼wD−1. For w=0.05, D=6: probability ≈ 0.055 ≈ 3×10−7. You need millions of samples to reliably enter the passage.

Heuristics that help (Section 5.6.3)

Gaussian sampling: Generate a pair (q1, q2) where q2 is drawn from a Gaussian centered on q1. If one is in Cfree and the other is in Cobs, keep the one in Cfree. Discard if both are free (not near a boundary) or both are in Cobs. This concentrates samples near the Cobs surface.

Bridge test (Figure 5.29b): Sample q1 and q2 from a Gaussian pair. If BOTH are in Cobs but their midpoint qmid = (q1+q2)/2 is in Cfree, then qmid is likely inside a narrow passage. Keep qmid. This specifically targets the passage interior.

Medial axis sampling: The medial axis is the skeleton of Cfree — the set of points equidistant from at least two obstacle boundaries. Sampling on the medial axis produces configurations maximally clear of obstacles AND near the critical bottlenecks where passages are defined.

Boundary sampling: Binary search between a Cfree point and a Cobs point to find a configuration on the boundary, then perturb along the boundary surface. Efficiently fills narrow corridors with samples right at the walls.

Narrow Passage: Uniform vs. Bridge Test Sampling

A C-space with a narrow corridor connecting two rooms. Toggle sampling strategy and watch how many samples are needed before the passage is covered. Green = Cfree samples, orange = bridge test hits.

Choose a sampling strategy.
The bridge test keeps sample qmid when both endpoint samples are in Cobs but the midpoint is in Cfree. What does this imply about qmid?

Chapter 8: Showcase — RRT vs PRM Side by Side

Everything converges here. The same obstacle environment, two different planners, running in real time. Watch how their philosophies play out: RRT sprints to a solution, PRM invests heavily upfront then answers each new query instantly. This is the definitive comparison that Chapter 6 promised.

What you’re about to see. Left panel: RRT grows a tree from scratch for every new query. Right panel: PRM builds the roadmap once, then each “New Query” just connects new start/goal to the existing graph — milliseconds vs. seconds.
RRT vs PRM — Split Screen Showcase

Left: RRT from scratch each query. Right: PRM roadmap (build once, query many). Click “Build PRM” first, then use “New Query” to assign a new start/goal to both. Notice how PRM answers instantly after the first build.

Step 1: Build PRM. Step 2: New Query. Step 3: Run RRT to compare.

The honest comparison table

FeatureRRTPRM
Query typeSingleMultiple
PreprocessingNoneBuild roadmap (expensive)
Time to first solutionFastSlow (roadmap first)
Path qualityPoor (greedy)Moderate (graph search)
Narrow passagesModerate (Voronoi bias helps)Poor without heuristics
Fixed env, many queriesSlow (re-plan each time)Fast (reuse roadmap)
Changing environmentGoodPoor (roadmap stale)
MemoryTree (smaller)Graph (larger)
When to use which. Robot arm picking from a fixed bin with 100 target positions → PRM. Autonomous car navigating an unknown city → RRT. Path quality critical and time available → RRT* or PRM* (asymptotically optimal variants).
You have a robot arm in a fixed factory workspace. You need to plan 200 different pick-and-place motions. Which planner is more efficient overall?

Chapter 9: Path Smoothing & Practical Details

You found a path. Now look at it — it zigzags, doubles back, takes unnecessary detours. Sampling-based planners find feasible paths, not good ones. The fix is simple and effective: iterative shortcutting.

Iterative shortcutting (eq. 5.39)

Pick two random waypoints on the path. Try a straight-line shortcut between them. If the shortcut is collision-free, replace the intermediate waypoints with the direct connection. Repeat. Each successful shortcut eliminates zigzags and reduces path length. After enough iterations, the path converges toward the shortest path in its homotopy class.

python
def smooth_path(path, collision_free, iters=200):
    """Iterative shortcutting (LaValle eq. 5.39)."""
    for _ in range(iters):
        if len(path) < 3: break
        i = random.randint(0, len(path) - 2)
        j = random.randint(i + 1, len(path) - 1)
        if collision_free_segment(path[i], path[j]):
            path = path[:i+1] + path[j:]   # remove intermediate waypoints
    return path

Important limitation: shortcutting cannot change the homotopy class. If RRT went around the left side of an obstacle, shortcutting gives the shortest left-side path — it will never discover that going right was shorter. For homotopy-class exploration, you need multi-query planners or topology-aware methods.

Distance metrics matter

The nearest-neighbor query in RRT depends entirely on your distance metric. Use the wrong metric and the Voronoi bias misfires.

Step size matters too. Too small: the tree grows slowly and wastes collision checks. Too large: the planner misses thin obstacles entirely. LaValle recommends an adaptive step — try the full step, but binary search to find the exact collision boundary if needed. This is the connection step (Section 5.3.3) from the LPM framework.

Practical nearest-neighbor

Naive nearest-neighbor: for every new q_rand, check all n vertices in the tree. O(n) per query. For n=10,000 vertices: fine. For n=1,000,000: slow. Better structures:

Path Smoothing: Iterative Shortcutting

A raw RRT path is shown in orange. Click “Smooth” to apply shortcutting iterations. Watch waypoints collapse as direct connections are found. Final smoothed path in teal.

Click “New RRT Path” to start.
After 1000 iterations of shortcutting, a path stops improving. Does this mean the path is globally optimal?

Chapter 10: Completeness — What Can Go Wrong

RRT is probabilistically complete. That sounds reassuring. But what exactly does it mean, and what does it NOT guarantee?

The dense sequence argument

RRT’s probabilistic completeness (Theorem 5.6) relies on the random sequence being dense in Cfree with probability 1. Formally: for every open set U ⊂ Cfree and every N, infinitely many of the random samples fall in U. This is guaranteed by the law of large numbers for any continuous distribution over C.

But “dense with probability 1” is a statement about infinite sequences. In practice you run N iterations and stop. Resolution completeness is actually more useful for implementation: use a deterministic dense sequence (Halton, van der Corput) with fine enough resolution, and you will find any solution larger than δ clearance, deterministically.

Why implementations combine both. Production motion planners typically mix a pseudo-random sequence (probabilistic completeness — can’t be systematically fooled) with a quasirandom sequence (resolution completeness — covers space efficiently). The best of both worlds.

What probabilistic completeness does NOT guarantee

PropertyResolution CompleteProbabilistically Complete
SequenceDeterministic (Halton, grid)Random / pseudo-random
GuaranteeFinds path of clearance ≥ δ if it existsProbability of finding path → 1
InfeasibilityCan certify “no path of clearance δ”Cannot certify infeasibility
ReproducibilitySame answer every runDifferent each run
LaValle’s pragmatic view. Both notions “capture the intuitive requirement that the algorithm be complete in the limit.” The choice is engineering, not mathematics: deterministic sequences give reproducibility; random sequences give flexibility. Real implementations mix both.
RRT fails to find a path after 100,000 iterations. What can you conclude?

Chapter 11: Connections

Chapter 5 sits at the center of modern motion planning. Here’s where everything connects.

Key equations from this chapter

ConceptEquationMeaning
Dispersionδ(S) = supq mins∈S ρ(q,s)Largest ball in C uncovered by any sample
ε-goodnessε = minq μ(V(q)) / μ(Cfree)Smallest visible fraction — measures narrow passage difficulty
RRT* ball radiusr(n) = γ(log n / n)1/dShrinking rewire neighborhood for asymptotic optimality
Halton dispersionδ(Hn) = O((log n)d / n)Quasirandom covers C faster than pure random

Algorithm family tree

AlgorithmYearKey property
PRM1996Multi-query, probabilistically complete
RRT1998Single-query, Voronoi bias, fast exploration
RRT-Connect2000Bidirectional growth, much faster in practice
RRT*2011Asymptotically optimal via rewiring
Informed RRT*2014Focuses sampling in prolate hyperspheroid post-solution
BIT*2015Batch informed trees — RRT* + graph search combined
OMPL2012–Open library, battle-tested implementations of all above

What connects to what in LaValle

Implementation: OMPL

OMPL (Open Motion Planning Library). Free, open-source, the standard in academic and industrial robotics. Powers MoveIt! for ROS. Provides RRT, RRTConnect, RRTstar, PRM, and 30+ others. You define state space, collision checker, start, goal → call solve(). The book’s algorithms, battle-tested. ompl.kavrakilab.org
python (OMPL-style pseudocode)
# Define C-space
space = RealVectorStateSpace(dim=2)
space.setBounds(-1, 1)
si = SpaceInformation(space)
si.setStateValidityChecker(your_collision_checker)

# Define problem
start = State(space); start[0] = -0.5; start[1] = -0.5
goal  = State(space); goal[0]  =  0.5; goal[1]  =  0.5
pdef = ProblemDefinition(si)
pdef.setStartAndGoalStates(start, goal)

# Choose algorithm and solve
planner = RRTConnect(si)   # or RRT(), RRTstar(), PRM()
planner.setProblemDefinition(pdef)
planner.setup()
solved = planner.solve(5.0)   # 5s timeout
if solved:
    path = pdef.getSolutionPath()
    path.interpolate(100)      # smooth to 100 waypoints

The book is free

Steven LaValle’s Planning Algorithms (Cambridge University Press, 2006) is available free at lavalle.pl/planning/. Chapter 5 is the core of this lesson. Chapters 4, 6, and 14 are the natural next reads.

Related lessons. The classical SLAM lesson uses probabilistic graph search through a similar map structure. The MDP lesson extends these ideas to planning under uncertainty (when the outcome of an action is unknown). The RRT ideas here reappear in model-based RL as “model-predictive path planning.”
“The goal of planning is to find a feasible course of action, to provide high-level specifications of behavior that enable a system to act autonomously.”
— Steven M. LaValle, Planning Algorithms, 2006