Ch 12: Planning Under Sensing Uncertainty — LaValle Planning Algorithms

Chapter 0: Planning Without Knowing Where You Are

You're a robot dropped into a building. Your task: reach the exit. You have a map of the hallways, a motor, and a sensor that tells you how far the nearest wall is. But you don't know your starting position. The sensor gives you distance readings — but multiple rooms could produce identical readings. Where are you?

This is the core challenge of Chapter 12: planning when your state is unknown. In every prior chapter, the robot knew its position. Planning was hard enough then — now we strip away that luxury. The robot must simultaneously figure out where it is and what to do. These problems are entangled: a good action depends on your location, but your location depends on which actions you've taken and what you've observed in response.

The canvas below illustrates the predicament. A robot (blue) starts at an unknown position in a corridor. Black walls give distance readings. Multiple candidate positions (shown in gray) are all consistent with the first reading. Watch what happens when the robot moves and takes new measurements — some hypotheses survive, others are eliminated.

Ambiguous Localization — Multiple Consistent Hypotheses

Three corridors are identical in shape. The robot (blue dot) gets a distance reading but can't distinguish which corridor it's in. Move the robot and watch hypotheses (gray) collapse.

Step 0 — 3 hypotheses active

Sensing uncertainty is different from action uncertainty. In Chapter 10, we knew our state but actions had uncertain outcomes. Here, actions may be deterministic — but we can't see where we ended up. The uncertainty lives in what we know, not in what the world does. Chapter 12 marries both: uncertain actions and uncertain observations.

Three problems in one chapter

LaValle groups this chapter around three compounding problems. Localization: the map is known, but the robot's position is unknown — estimate it from observations. Mapping: the robot's path is known, but the environment is unknown — build the map from observations. SLAM (Simultaneous Localization and Mapping): neither is known — solve both at once. Each is harder than the last.

Problem	Known	Unknown	Key technique
Localization	Map	Robot pose	Particle filter, Bayes filter
Mapping	Trajectory	Environment	Occupancy grid, ray tracing
SLAM	Nothing	Both	EKF-SLAM, graph SLAM
Pursuit-evasion	Map	Intruder position	Visibility planning, lion & man

A robot has a perfect map of its building and perfect motors (actions are deterministic), but its sensor is noisy. Which chapter 12 problem does this primarily illustrate?

Localization — the map is known but robot position is uncertain due to sensor noise Mapping — the sensor noise makes the environment structure unknown SLAM — whenever there is any noise we always need SLAM

Chapter 1: General Methods — Planning in I-Space

Chapter 11 introduced the information space (I-space): instead of planning over physical states X, we plan over information states η — all the knowledge the robot currently holds. An information state summarizes the full history of observations and actions into a compact form that's sufficient for decision-making.

The key insight: sensing uncertainty doesn't make planning impossible. It makes the space of things to plan over larger and stranger. Instead of asking "what action is optimal in state x?", we ask: "what action is optimal given information state η?" The planning algorithm's structure is identical — only the state space has changed from X to I.

Information state = sufficient statistic. You don't need to remember every observation ever made. You need only the summary that's sufficient to predict future observations and compute optimal actions. For Gaussian noise, the mean and covariance are sufficient. For discrete environments, the belief distribution over states is sufficient. The art is finding the smallest sufficient summary.

The I-space planning formulation

LaValle's formulation (Section 12.1) frames I-space planning as a standard planning problem where the state is now an information state η ∈ I_hist. The transition: when you apply action u in information state η and observe y, the new information state is η' = η_trans(η, u, y). The goal is a set of information states I_G that are "good enough" — perhaps all η where the robot is confident it's at the goal.

η' = η_trans(η, u, y)

This is called the I-state transition equation. It's the counterpart to x' = f(x, u) in deterministic planning, but now the new state depends on the observation y as well — because observations update our knowledge. Planning in I-space means finding an action policy π(η) that drives information states toward I_G.

Nondeterministic vs. probabilistic I-space

Two flavors of I-space appear in Chapter 12. In the nondeterministic case (Section 12.1.2), we assume worst-case nature: the robot tracks the set of all states still consistent with every observation. This set is the nondeterministic I-state, sometimes written η_nd ⊆ X. It shrinks as observations rule out impossible states and grows when actions could send the robot anywhere in a set.

In the probabilistic case (Section 12.1.3), we maintain a full probability distribution b(x) = P(x | history) — the belief state. Belief state updates follow Bayes' rule: apply action, propagate uncertainty forward, then condition on the new observation. This is the foundation of all the filter-based methods that follow.

Prior belief b(x)

What we believed before acting

↓ apply action u

Predicted belief b̅(x')

Uncertainty spreads through motion model

↓ receive observation y

Updated belief b'(x')

Multiply by sensor model P(y | x'), renormalize

↻ repeat each timestep

Bayes filter in two steps: Prediction multiplies by the motion model (uncertainty grows). Correction multiplies by the sensor likelihood (uncertainty shrinks). The product of thousands of such steps drives the robot to confident localization — if the environment is informative enough.

The nondeterministic I-state η_nd tracks a set of possible states. What happens to this set when the robot makes an observation that rules out half the candidates?

It grows — more observations mean more possible explanations It shrinks — impossible states are eliminated from the consistent set It stays the same — observations only update probabilities, not sets

Chapter 2: Localization — Where Am I?

Localization is the problem where the map is given and the robot must determine its own position. Think of waking up in a hotel room in a foreign city. You can see out the window. You know the city's street layout. But you don't know which room you're in. Every observation narrows down the possibilities until — ideally — only one location is consistent.

Formally: the robot operates in a known environment X with known map M. It takes action u_k and receives observation y_k at each step. The goal is to maintain a belief b_k(x) = P(x_k = x | y_0..k, u_0..k-1) that concentrates on the robot's true position over time.

Three flavors of localization

Global localization: the robot starts completely lost — initial belief is uniform across all positions. Must figure out location from scratch. Hardest case.

Position tracking: the robot knows where it started (initial belief is sharp), but sensor and motion noise accumulate over time. The filter prevents drift from growing unbounded. Easier — the filter never has to "find" the robot, only keep track.

Kidnapped robot: the robot knew where it was, but was suddenly moved without warning. The belief still peaks at the old location, but the true position has changed. The filter must detect the inconsistency and relocalize. Hardest case for tracking-based systems.

The kidnapped robot is a stress test. A localization system that only does position tracking will fail catastrophically if the robot is picked up and placed elsewhere — its belief confidently says "I'm here" while the robot is actually somewhere else. Robust systems must detect when their confident belief is consistently contradicted by observations, and recover by spreading belief back out.

Sensor models

The observation model P(y | x) is the heart of localization — it says how likely observation y is if the robot is truly at position x. For a range sensor (laser or sonar), a simple model: expected range r_exp(x, θ) is computed by ray-casting through the known map in direction θ. Actual measurement r_meas is modeled as Gaussian: P(r_meas | x) = Ν(r_exp(x,θ), σ²). A robot at the right position will get readings close to expected. A robot in the wrong position gets large discrepancies — low likelihood — and its weight in the belief drops.

P(y | x) = ∏_i Ν(rⁱ_meas ; rⁱ_exp(x), σ²)

The product is over all range readings i (multiple beams). This model assumes independent noise on each beam — a simplification, but it works well in practice.

Motion models

The motion model P(x' | x, u) says: if I'm at x and apply action u, where do I end up? For a wheeled robot with velocity commands (v, ω), the odometry model computes the expected new pose and adds Gaussian noise in translation and rotation. The noise parameters are tuned to match the actual wheel slippage and encoder error of the specific robot.

A robot has been tracking its position confidently for 10 minutes. It is then picked up and placed across the room without warning. What happens to its belief state?

It immediately spreads to cover all possible locations — the filter detects the jump It stays exactly where it was — belief is unchanged by physical movement It remains peaked at the old (now wrong) location — the kidnapped robot problem

Chapter 3: Particle Filter Localization — SHOWCASE

The Kalman filter is elegant but works only for Gaussian distributions over continuous state. For localization on a grid — where the belief can be multimodal (three corridors, all look identical) — we need something more flexible. Enter the particle filter, also called Monte Carlo Localization (MCL).

A particle filter represents the belief not as a formula but as a cloud of samples. Each particle is a hypothesis: "the robot might be here." Particles in likely locations multiply; particles in unlikely locations die out. After enough observations, the cloud collapses onto the true location.

Particles = weighted hypotheses. Start with 200 particles spread uniformly across the map. Each has weight 1/200. After an observation, reweight each particle by P(observation | that position). Resample: particles with high weight are duplicated; particles with low weight are dropped. Apply the motion model with noise to each surviving particle. Repeat. The cloud tracks the robot.

The three-step particle filter loop

1. Sample Motion

Move each particle by the action + Gaussian noise: x' = f(x, u) + ε

↓

2. Weight by Observation

w_i = P(y | x_i) — particles matching the observation get high weight

↓

3. Resample

Draw N new particles from the weighted distribution — likely particles survive

↻ next timestep

The simulation below runs a full particle filter on a 1D corridor. The robot (true position: red line) moves along a corridor with walls at known locations. The sensor measures distance to the nearest wall (noisy). Particles (blue dots) start uniformly distributed. Watch them converge.

Particle Filter Localization — 1D Corridor

The corridor has walls at fixed positions. Robot (red) moves right with small noise. Sensor measures nearest wall distance. N particles (blue) track the belief. Watch convergence from global uncertainty.

Particles N 100

Sensor noise σ 0.15

Step 0 — particles: uniform

Why resampling works

After weighting, particles near the wrong location have tiny weights. If you never resample, you'd have 200 particles with weights [0.001, 0.001, ..., 0.9, ...] — still wasting compute on dead hypotheses. Resampling with replacement from the weight distribution draws more samples from high-weight regions and fewer from low-weight regions. The effect: computation automatically concentrates where it matters.

Particle depletion. With too few particles and too many modes, resampling can accidentally drop all particles near the true location. The filter then diverges irreversibly. Fix: use more particles, or add random "survival" particles. A practical trick — inject a small fraction of random particles each step as insurance against the kidnapped robot problem.

python
import numpy as np

def particle_filter_step(particles, weights, action, obs, walls, sigma):
    # 1. Apply motion model with noise
    motion_noise = 0.05
    particles = particles + action + np.random.randn(len(particles)) * motion_noise
    particles = np.clip(particles, 0, 1)  # stay in corridor [0,1]

    # 2. Weight by sensor model P(obs | particle)
    for i, p in enumerate(particles):
        expected = min(abs(p - w) for w in walls)
        weights[i] = np.exp(-0.5 * ((obs - expected) / sigma)**2)
    weights += 1e-10  # prevent all-zero
    weights /= weights.sum()

    # 3. Resample
    indices = np.random.choice(len(particles), len(particles), p=weights)
    particles = particles[indices]
    weights = np.ones(len(particles)) / len(particles)
    return particles, weights

After the weighting step, particle A has weight 0.001 and particle B has weight 0.5. After resampling (drawing 100 new particles), approximately how many copies of B appear?

1 — each original particle gets exactly one copy 0 — only the single best particle survives ~50 — it has 50% of the total weight, so it gets ~50% of new particles

Chapter 4: Environment Uncertainty — Mapping

Flip the localization problem: now the robot knows exactly where it is (GPS, perfect odometry) but has never seen the environment before. The environment is unknown. The task: build a map from observations. This is the mapping problem.

The most common representation for 2D environments is the occupancy grid. The environment is discretized into a grid of cells. Each cell stores a probability P(occupied | observations so far) — the probability that there's an obstacle there. Start all cells at 0.5 (unknown). As the laser scanner sweeps, update each cell using Bayes' rule.

Log-odds occupancy update

Working with raw probabilities causes numerical issues (products of many small numbers underflow). Instead, work in log-odds: l(x) = log(p/(1-p)). Positive l means likely occupied; negative l means likely free; zero means unknown. The Bayes update becomes addition in log-odds — cheap and numerically stable.

l_t(x) = l_t-1(x) + log(P(occ | hit)) - log(P(occ | free)) + log(prior/(1-prior))

A laser ray that hits a cell increments l (occupied evidence). A ray that passes through a cell decrements l (free-space evidence). After many sweeps, occupied cells lock in at high positive l, free cells at high negative l, and only truly ambiguous cells stay near zero.

Occupancy Grid Building — Ray Casting Update

A robot (blue) at the center sweeps a laser. White = free, dark = occupied, gray = unknown. Click to add a wall obstacle, then rescan to see the grid update.

Click canvas to place walls, then sweep

Ray casting is the inner loop. For each laser beam angle, march along the ray cell by cell: update each traversed cell as "free," then update the terminal cell (where the beam hit) as "occupied." This Bresenham-style traversal is fast and the update is a single addition per cell in log-odds form.

Mapping assumptions and violations

The standard occupancy grid has three hidden assumptions. First, cell independence: each cell is updated independently, ignoring that neighboring cells tend to be correlated (walls are contiguous). Second, known pose: the robot knows exactly where it is when each scan is taken. Third, static environment: obstacles don't move between scans. Violate any of these and the map degrades: ghosting from moving objects, blurring from unknown pose, jagged edges from independent cell updates.

A laser beam passes through cell A and hits cell B. How should the occupancy grid be updated?

Both A and B increase in occupancy probability A increases, B decreases (the beam bounced off B, so A must be where the obstacle is) A decreases (free space evidence), B increases (occupied evidence)

Chapter 5: SLAM — The Chicken-and-Egg Problem

You need the map to localize. You need to localize to build the map. Both are unknown at once. This is Simultaneous Localization and Mapping (SLAM) — one of the central open problems in robotics for decades, now largely solved in practice but still theoretically rich.

The key insight that unlocks SLAM: landmarks. If the robot can identify the same physical feature (a corner, a door frame, a unique reflector) across multiple observations from different positions, it gains a constraint: "wherever I was when I saw landmark 3 the first time, and wherever I was the second time, these two poses are geometrically related by the landmark's position." Accumulate enough such constraints and the system is overdetermined — you can solve for all poses and landmark positions simultaneously.

SLAM is a graph optimization problem. Nodes are poses (robot positions at each timestep) and landmarks. Edges are constraints: "from pose i, landmark j appeared at angle θ and distance r." Find the assignment of positions to nodes that best satisfies all edges simultaneously. This is a sparse nonlinear least-squares problem — computationally tractable because the constraint graph is sparse (you only see nearby landmarks).

EKF-SLAM: the classical solution

The original SLAM solution (Smith, Self, Cheeseman, 1987) used the Extended Kalman Filter. The state vector x = [robot pose, landmark₁ position, ..., landmark_N position]. The covariance matrix P captures uncertainty in all variables and — crucially — the correlations between them. When you see landmark 3 again, you update not just its position estimate but also the robot pose and all other landmarks (because if you were wrong about landmark 3's position, you were probably wrong about nearby landmarks too).

x = [ x_robot, y_robot, θ_robot, x_L1, y_L1, x_L2, y_L2, ... ]

EKF-SLAM has O(N²) complexity per step where N is the number of landmarks — the covariance matrix is N×N and every update touches it all. For 100 landmarks: 10,000 numbers updated per step. Fine. For 10,000 landmarks: 100 million numbers per step. Intractable. This motivated the shift to sparse graph-based methods.

Graph SLAM and loop closure

Modern SLAM (2005+) builds an explicit pose graph: each robot pose is a node, each observation constraint is an edge. After collecting data, run a global nonlinear optimizer (Gauss-Newton, Levenberg-Marquardt) to find the best consistent assignment. The dramatic moment: loop closure. The robot returns to a previously-visited location. Recognizing this (via scan matching or visual place recognition) adds a long-range edge in the graph. The optimizer then "pulls" the entire trajectory into geometric consistency — the accumulated drift from odometry errors is corrected globally in one shot.

SLAM Pose Graph — Loop Closure Effect

The robot traverses a loop. Odometry drift accumulates (red path). When loop closure is detected, graph optimization corrects all poses at once (blue corrected path). Toggle between drifted and corrected.

Showing: drifted odometry path

EKF-SLAM with N landmarks has O(N²) complexity per step. Why does the full N×N covariance matrix need to be updated when you observe just one landmark?

It doesn't — only the observed landmark's row and column change All landmarks are correlated through the robot pose — a pose update propagates to all landmark estimates The update rule requires computing the inverse of the full matrix, touching all entries

Chapter 6: Visibility-Based Pursuit-Evasion

Imagine a security guard (the pursuer) searching a museum at night. Somewhere in the building is an intruder (the evader) who moves adversarially — always trying to avoid detection. The guard can see everything in its line of sight. Can the guard guarantee to spot the intruder, no matter how fast or clever the intruder is?

This is the pursuit-evasion problem (Section 12.4 in LaValle). It's a game between the pursuer and an adversarial evader. Sensing is via visibility — the pursuer sees any point in its unobstructed line of sight. The evader is "caught" when it enters the pursuer's visible region. The question is combinatorial: does a winning strategy for the pursuer even exist in a given environment, and if so, what is it?

This isn't a tracking problem. The pursuer doesn't know where the evader is. The evader might not even exist — we just need a strategy that guarantees the evader is spotted, even if the evader moves optimally to avoid detection. The solution is a pursuit schedule that "clears" the environment — every point is eventually visible simultaneously with no escapes.

The lion and man problem

The classical version (continuous, circular arena): a lion and a man both run at speed 1 inside a circle. Can the lion always catch the man? Surprisingly: in the continuous-time, continuous-space version, the lion can always match the man's radius from the center and converge — but the geometry is subtle. The discrete grid version used in robotics is cleaner: the evader moves one cell per step, the pursuer moves one cell per step, and the pursuer wins when it enters the evader's cell or the evader enters the pursuer's visible region.

Recontamination and clear regions

A region is cleared if the pursuer can guarantee no evader is hidden there. A region becomes contaminated if the evader could have slipped back in while the pursuer was looking elsewhere. The pursuit strategy must systematically expand the cleared region without allowing recontamination.

In polygonal environments, this leads to a beautiful structure: the pursuer must guard certain critical edges (corridors, doorways) while sweeping the cleared region. LaValle shows (via results from Suzuki and Tanigawa, 1992) that the minimum number of pursuers needed to clear a polygonal environment equals the minimum number of "guards" needed to block all escapes during the sweep — a quantity related to the topology of the free space.

Pursuit-Evasion in a Simple Polygon

Pursuer (blue) sweeps through a L-shaped room. Green = cleared region. Red = contaminated. The evader (red dot) tries to stay in uncleared territory. Step through the pursuit strategy.

Step 0 — all regions contaminated

The pursuer clears a room, then moves down a hallway. While the pursuer is in the hallway, the evader (who was hiding in an adjacent alcove) slips back into the cleared room. What happened?

The pursuer caught the evader — entering the cleared room is losing for the evader Recontamination — the cleared region became contaminated again, so the strategy must prevent this This is impossible — once a room is cleared it cannot be recontaminated

Chapter 7: Manipulation Under Uncertainty

A robot arm must pick up a bolt. The bolt's orientation is unknown — it might be at any angle in [0°, 360°]. The gripper has no force sensor, only a camera that's occluded by the gripper itself during grasping. How does the robot guarantee a successful grasp?

This problem (Section 12.5) is manipulation under uncertainty. Unlike localization — where we track a continuous state — manipulation under uncertainty often uses a nondeterministic I-state: we track the set of all possible object configurations still consistent with what the robot has done and observed. A good grasp strategy must succeed for every configuration in this set.

Sensorless manipulation is possible. The remarkable result: by choosing actions carefully, a robot can reduce the set of possible object configurations to a single point — without any sensing at all. The key is that actions have different effects on different configurations: pushing an object against a fence rotates it to a unique orientation regardless of its initial angle. This is configuration-reducing action design.

The pushing example

A polygonal object rests on a table with unknown orientation θ ∈ [0°, 360°). Push it against a flat fence. After the push, the longest flat face of the polygon is flush against the fence — orientation is now determined by geometry, not initial conditions. Push in a perpendicular direction against a second fence. Now the position is also fixed. The robot has localized the object using only two pushes and no sensors.

LaValle formalizes this as I-space planning. The I-state is a subset of configuration space — the set of all (x, y, θ) values still possible. Each action reduces this set (ideally). The goal is an I-state equal to a single configuration: {(x*, y*, θ*)}. The sequence of actions that achieves this is called a sensorless plan.

|η_nd,k+1| ≤ |η_nd,k| — each action should (in the best case) shrink the consistent set

Grasping under uncertainty

The same principle extends to grasping. A parallel-jaw gripper closing on an object with unknown orientation: some gripper widths will definitely miss (too narrow), some will definitely catch (the polygon's narrowest chord is less than gripper width), some will sometimes catch and sometimes not. A compliant grasp strategy uses finger forces and geometry to guarantee the object is captured and re-oriented to a fixed configuration inside the gripper, regardless of initial orientation — no vision required.

Compliant motion. Rather than commanding exact positions, command forces. The object's geometry then acts as a passive "sensor" — it physically forces the gripper into the correct relative configuration. This is called passive feedback, and it's extremely robust to uncertainty.

Planning with sets. Computing the image of a set under an action is the key operation: given I-state (a set), what's the new I-state after action u? For polygons and linear pushes, this is a geometric intersection problem. For continuous distributions, it requires propagating probability through nonlinear dynamics.

A robot pushes an object with unknown angle θ ∈ [0°, 360°) against a flat wall. After the push, the object's longest face is flush with the wall and θ is now fixed. What did this action accomplish in I-space terms?

Nothing — the action moved the object but didn't change what the robot knows It eliminated θ as a degree of freedom — but the I-state is still a set because x, y are still unknown It reduced the nondeterministic I-state from a continuous set in θ to a single value of θ, though x, y remain uncertain

Chapter 8: Information Gathering — Acting to Know

All previous planning problems had information as a side effect: we acted to reach a goal, and observations updated our belief as a byproduct. But sometimes gathering information is the goal itself. An exploration robot must visit unknown terrain. A scientific instrument must be pointed to gather the most informative data. A surgeon must probe to determine tumor margins before cutting.

Information gathering planning (Section 12.6) explicitly optimizes the information value of actions. The robot chooses actions not to minimize distance to a goal, but to minimize uncertainty — to drive the belief distribution toward certainty as efficiently as possible.

Measuring uncertainty: entropy

Entropy H(b) of a belief distribution b quantifies how uncertain the robot is: H(b) = -∑_x b(x) log b(x). A uniform belief (completely lost) has maximum entropy. A delta distribution (perfectly localized) has zero entropy. An information-gathering policy minimizes expected entropy after observations.

H(b) = -∑_x b(x) · log₂ b(x)

The information gain of an action u in belief state b is I(b, u) = H(b) - E_y[H(b' | y)]: entropy before minus expected entropy after taking action u and seeing observation y. Choose the action that maximizes information gain — this is the one-step greedy information gathering policy.

Greedy information gathering has a flaw. A camera can resolve which room the robot is in by looking left. But looking left might create a dead end — after that observation, the optimal action sequence changes completely. One-step greedy ignores this: it picks the action that helps most right now, even if it makes future information gathering harder. True optimal information gathering requires lookahead — planning a sequence of informative actions, not just the best single action.

Exploration vs. exploitation

In a partially-known map, the robot must balance exploration (visiting unknown regions to reduce uncertainty) with exploitation (navigating efficiently through what's already known). Frontier-based exploration is a classic heuristic: identify the boundaries between known-free and unknown regions (frontiers), navigate to the nearest frontier, and repeat. Simple, effective in practice, and provably complete for finite environments.

Frontier-Based Exploration — Information Gathering

Gray = unknown, white = explored free, dark = wall. Blue robot navigates to nearest frontier (orange). Watch the known region expand. Entropy decreases with each exploration step.

Step 0 — entropy: max (all unknown)

Active SLAM

Combining information gathering with SLAM yields Active SLAM: the robot plans its trajectory to build the best possible map in minimum time. At each step, the robot chooses its next waypoint by balancing three criteria: information gain (how much will this reduce map uncertainty?), loop closure opportunity (does this allow me to revisit a known area and correct drift?), and travel cost (how far is it?). Active SLAM turns map building from passive data collection into deliberate scientific inquiry.

A robot's belief is: {room A: 0.5, room B: 0.5}. It can take action u₁ (which will definitely reveal if it's in room A) or u₂ (which gives a 50/50 observation regardless of true room). Which action has higher information gain?

u₁ — it resolves all uncertainty, driving entropy to zero after the observation u₂ — a 50/50 observation is maximally informative because it's perfectly balanced They're equal — both have the same effect on average entropy

Chapter 9: Connections — Where Chapter 12 Leads

Chapter 12 sits at the intersection of robotics, estimation theory, game theory, and information theory. The ideas here are foundational to almost every real deployed robot and to several currently active research fronts.

How Chapter 12 relates to the rest of the book

Concept	LaValle source	Chapter 12 role
I-space framework	Ch 11	Foundation — planning is done in I-space
Stochastic transitions	Ch 10 (MDPs)	Needed for probabilistic belief propagation
POMDPs	Ch 10.6	Localization is a POMDP; belief state = I-state
Combinatorial planning	Ch 6	Pursuit-evasion uses visibility graph methods
Sampling-based planning	Ch 5	Particle filters ARE sampling-based belief propagation

Modern descendants

FastSLAM (2002) replaced EKF-SLAM's O(N²) covariance with a Rao-Blackwellized particle filter: each particle maintains its own set of independent landmark Kalman filters. O(N log N) per step. Enabled real-time SLAM for the first time on large maps.

iSAM / g²o / GTSAM (2008-2012) brought sparse matrix methods to graph SLAM. The pose graph is solved incrementally as new constraints arrive, rather than from scratch. Now the dominant paradigm in production SLAM systems (Google Cartographer, LiDAR-based autonomous vehicle stacks).

Neural SLAM / NeRF-based SLAM (2020+) replaces the occupancy grid and landmark map with neural implicit representations. The robot builds a differentiable 3D model of its environment. Gradient descent on rendered images vs. camera frames jointly optimizes the map and robot trajectory.

Connections to other micro-lessons

Bayes Filter — the prediction-correction loop from this chapter is exactly the Bayes filter in its general form. See bayes-filter for the continuous derivation.

Classical SLAM — this lesson covered the theory. See classical-slam for the algorithmic implementation details, EKF-SLAM code, and loop closure detection.

POMDP — localization is a special case of a POMDP where the reward function depends only on reaching the goal. See pomdp for the full generalization to reward-optimizing behavior under partial observability.

UKF / Particle Filter — Chapter 12 used the particle filter for localization. See ukf for the Unscented Kalman Filter, which handles nonlinearity more elegantly than the EKF for smooth unimodal distributions.

What Chapter 12 can't do

Limitation	What breaks	Modern fix
Grid localization scales poorly	O(\|X\|) per step — exponential in dimensions	Particle filter, KD-tree belief
EKF-SLAM is O(N²)	Slow for large maps (N > 1000 landmarks)	FastSLAM, graph SLAM
Pursuit-evasion is PSPACE-hard	Exponential in environment complexity	Heuristic guards, learned policies
Occupancy grids ignore structure	Cells treated independently — walls aren't smooth	Gaussian process maps, TSDF, NeRF
Sensorless manipulation assumes rigid bodies	Deformable objects, slipping, contact uncertainty	Compliant control, tactile sensing, sim-to-real

"The goal of robotics is to close the loop between sensing and acting in an uncertain world. Every technique in Chapter 12 is a different way to maintain that loop in the face of ignorance — and to act optimally despite it."
— paraphrasing LaValle's framing throughout Ch 12

FastSLAM replaced the EKF-SLAM covariance matrix with a Rao-Blackwellized particle filter. What is the core reason this is faster?

It uses fewer landmarks — FastSLAM's particle filter skips distant landmarks It runs on a GPU — the particle operations are parallelized efficiently Each particle has independent landmark filters — so landmarks decouple conditioned on the particle's trajectory, avoiding the N×N covariance

Planning Under Uncertainty