Estimation & Robotics

Particle Filter

When the Kalman filter’s single bell curve isn’t enough — when your belief is multimodal or your system wildly nonlinear — represent it instead with a cloud of weighted guesses. Predict, weight, resample. The estimator that lets a robot say “I might be here, or here, or here.”

Prerequisites: A Bayes filter updates a belief with prediction and measurement. You can sample from a distribution. That’s it.

Chapter 0: One Bell Curve Isn’t Enough

The Kalman filter (and its EKF and UKF cousins) is brilliant — but it makes one rigid assumption: the belief about the state is a single Gaussian, one bell curve with a mean and a covariance. For tracking a ball or fusing GPS, perfect. But the world isn’t always one bell curve.

Imagine a robot that wakes up in a building of identical hallways. From its sensors, it could be in any of three identical-looking corridors. Its true belief is multimodal: three separate bumps of probability. A single Gaussian cannot represent this. Forced to fit one bell, it smears probability across the walls between the corridors, and its mean can land in empty space where the robot definitely isn’t. Add severe nonlinearity or non-Gaussian noise, and the Kalman family breaks down further. We need a belief that can take any shape.

The particle filter represents the belief not by a formula, but by a cloud of samples — hundreds or thousands of guesses (“particles”), each a hypothesis about the state, each with a weight. A cloud can be one blob, three blobs, a banana, a ring — any distribution. This nonparametric flexibility is the whole point, and this lesson builds the filter from it.

The trap: “just use a Kalman filter, it’s optimal.” It’s optimal only for linear-Gaussian systems. The moment the belief is multimodal — global localization, multi-target tracking, ambiguous sensors — a Gaussian is the wrong shape, and its mean lands somewhere impossible. Particles pay more compute to represent the belief honestly, warts and all.
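
To make the failure concrete, here is a minimal sketch that fits one Gaussian to samples from a three-corridor belief. The corridor positions in `CORRIDORS`, the in-corridor uncertainty `SPREAD`, and the helper names are illustrative assumptions, not values from the lesson's simulation.

```python
import random
import statistics

# Hypothetical 1-D corridor centres (metres), chosen only for illustration.
CORRIDORS = [2.0, 5.0, 11.0]
SPREAD = 0.2  # assumed position uncertainty inside a corridor


def sample_belief(per_mode, rng):
    """Draw the same number of samples around every corridor centre."""
    return [rng.gauss(c, SPREAD) for c in CORRIDORS for _ in range(per_mode)]


def fit_gaussian(samples):
    """Moment-match one Gaussian: return its mean and standard deviation."""
    return statistics.fmean(samples), statistics.pstdev(samples)


rng = random.Random(0)
samples = sample_belief(1000, rng)
mean, std = fit_gaussian(samples)
nearest = min(abs(mean - c) for c in CORRIDORS)
print(f"fitted mean = {mean:.2f} m, std = {std:.2f} m")
print(f"distance from mean to nearest corridor = {nearest:.2f} m")
```

With these assumed positions the fitted mean sits several corridor-widths away from every corridor, while the samples themselves stay on the three peaks.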

A single Gaussian can’t fit a multimodal belief

The true belief (teal) has three peaks — the robot could be in any of three corridors. The best single Gaussian (orange) smears across them, putting its mean in the empty middle. Particles (dots) sit on the real peaks. Drag to separate the modes.

mode separation: 0.60

Why can’t a Kalman filter handle a robot that might be in any of three identical corridors?

Its sensors are too slow
Its belief is a single Gaussian, which can’t represent a multimodal (multi-peak) distribution; its mean lands between the modes
It needs more memory

Chapter 1: Belief as a Cloud of Particles

The core representation: approximate the probability distribution over the state by a set of particles. Each particle is a complete guess at the state (a position, a pose, whatever you’re estimating), and it carries a weight — how plausible that guess is. Where the particles cluster densely, the belief is high; where they’re sparse, it’s low. With enough particles, this cloud can approximate any distribution to arbitrary accuracy.

This is a nonparametric representation — it doesn’t commit to a fixed shape like a Gaussian’s mean and covariance. The trade-off is immediate and central: more particles means a more faithful belief but more computation. A 1-D problem might need a few hundred; a high-dimensional one might need impossibly many (we’ll meet this “curse of dimensionality” later). The beauty is that the same simple machinery — move the particles, weight them, resample — works for any motion model and any sensor, linear or not, Gaussian or not. You trade the Kalman filter’s elegant closed-form math for brute-force flexibility.
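
As a rough sketch of what "the cloud is the belief" means in code, the example below stores particles as state-plus-weight pairs and reads a probability off the cloud by summing weights. The `Particle` class, the two-bump test belief, and the function names are assumptions made for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class Particle:
    x: float       # hypothesised state (here a 1-D position)
    weight: float  # how plausible this hypothesis is


def make_cloud(n, rng):
    """Equal-weight particles drawn from a two-bump belief (illustrative)."""
    centres = [2.0, 8.0]
    return [Particle(rng.gauss(rng.choice(centres), 0.5), 1.0 / n) for _ in range(n)]


def probability_in(cloud, lo, hi):
    """Approximate P(lo <= x <= hi) as the total weight of particles inside."""
    return sum(p.weight for p in cloud if lo <= p.x <= hi)


rng = random.Random(1)
for n in (20, 200, 2000):
    cloud = make_cloud(n, rng)
    # The true mass of the left bump is about 0.5; larger clouds get closer.
    print(n, round(probability_in(cloud, 0.0, 4.0), 3))
```

No formula for the distribution appears anywhere: every question about the belief is answered by counting weighted samples, which is why accuracy is bought with particle count.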

The particle cloud is the belief

Each dot is a hypothesis; density = probability. Drag the particle count: more particles trace the true belief (teal curve) more faithfully, at more compute. Too few and the cloud is a ragged approximation.

number of particles: 120

How does a particle filter represent the belief?

As a single Gaussian with mean and covariance
As a cloud of weighted samples (particles) whose density approximates any distribution
As a fixed grid of probabilities

Chapter 2: Predict — move the particles

The filter runs a loop, and the first step is prediction (the motion update). Take each particle and push it through the motion model — if the robot drove forward 1 meter and turned 10 degrees, move every particle that way. But not deterministically: add random noise to each, reflecting the uncertainty in the motion (wheels slip, commands are imperfect). So the whole cloud both shifts (following the motion) and spreads out (the noise diffuses it).

This spreading is the prediction step’s honest admission: “I moved, but I’m now less sure exactly where I am.” Without a measurement to correct it, repeated prediction makes the cloud diffuse wider and wider — uncertainty grows, exactly as it should when you act without observing. And because we apply the motion model directly to each particle, the model can be arbitrarily nonlinear — no linearization, no Jacobians (unlike the EKF). You just simulate each particle forward. That’s a big part of the particle filter’s power: motion models can be as complex and nonlinear as reality.
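
The predict step might be sketched as follows for a planar robot with state (x, y, heading). The motion model, the single `noise` value shared by distance and heading, and the `spread` helper are simplifying assumptions for illustration.

```python
import math
import random


def predict(particles, distance, turn, noise, rng):
    """Move every (x, y, heading) particle by a noisy copy of the command."""
    moved = []
    for x, y, heading in particles:
        # Each particle gets its own noise draw, so the cloud spreads.
        d = distance + rng.gauss(0.0, noise)
        h = heading + turn + rng.gauss(0.0, noise)
        # Nonlinear motion model applied directly: no Jacobian needed.
        moved.append((x + d * math.cos(h), y + d * math.sin(h), h))
    return moved


def spread(particles):
    """Root-mean-square distance of the cloud from its own centre."""
    n = len(particles)
    cx = sum(p[0] for p in particles) / n
    cy = sum(p[1] for p in particles) / n
    return math.sqrt(sum((p[0] - cx) ** 2 + (p[1] - cy) ** 2 for p in particles) / n)


rng = random.Random(0)
cloud = [(0.0, 0.0, 0.0)] * 500  # start perfectly certain
for step in range(1, 6):
    cloud = predict(cloud, distance=1.0, turn=math.radians(10), noise=0.05, rng=rng)
    print(step, round(spread(cloud), 3))  # printed spread after each step
```

Because no measurement ever arrives in this loop, the printed spread should keep growing, which is the "less sure after moving" behaviour described above.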

Predict: shift + spread

Each particle moves by the motion command plus random noise. The cloud shifts (it moved) and spreads (more uncertain). Drag the motion noise: more noise, more spread. Repeated prediction without measurement keeps widening the cloud.

motion noise: 0.30

In the predict step, why does the particle cloud spread out?

Particles repel each other
Random motion noise is added to each particle, reflecting growing uncertainty when you move without observing
New particles are created

Chapter 3: Weight — score by the measurement

Now a measurement arrives (a sensor reading). The second step is the update, done by weighting each particle. For each particle, ask: if the true state were this particle’s guess, how likely is the measurement we just got? That likelihood sets the particle’s new weight: the old weight is multiplied by the likelihood, so when every particle starts out equally weighted, the likelihood alone decides. Particles whose guess is consistent with the sensor get high weight; particles that disagree with the sensor get low weight.

This is importance weighting, and it’s how the measurement reshapes the belief without moving any particle — it just re-scores them. If the sensor says “I see a door 2 meters ahead,” particles that predict a door 2 meters ahead light up; particles in the middle of a wall go dim. As with prediction, the measurement model is applied directly per particle, so it can be arbitrarily nonlinear and non-Gaussian — you just evaluate the likelihood of the observation for each hypothesis. After weighting, the cloud is the same particles, but now their weights encode the corrected belief: dense-and-heavy where the state probably is.
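
Here is a minimal sketch of the weighting step, assuming a sensor that reads position directly with Gaussian noise of standard deviation `sigma`. Both the sensor model and the function names are assumptions; a real filter would substitute its own likelihood.

```python
import math


def gaussian_likelihood(measured, expected, sigma):
    """p(measurement | particle) for an assumed Gaussian sensor model."""
    z = (measured - expected) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def weight(particles, measured, sigma):
    """Score each particle by the likelihood, then normalise to sum to 1."""
    raw = [gaussian_likelihood(measured, x, sigma) for x in particles]
    total = sum(raw)
    if total == 0.0:  # every particle is wildly inconsistent with the sensor
        return [1.0 / len(particles)] * len(particles)
    return [w / total for w in raw]


particles = [0.0, 1.0, 2.0, 3.0, 4.0]  # hypothesised positions, equally weighted
weights = weight(particles, measured=2.2, sigma=0.5)
for x, w in zip(particles, weights):
    print(f"x = {x:.1f}  weight = {w:.3f}")
```

The particle positions never change here; only the scores do. The sketch assumes the particles start out equally weighted, so the likelihood alone determines each new weight.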

Weight by observation likelihood

A measurement (orange marker) arrives. Each particle is re-weighted (dot size) by how well its guess matches the measurement — consistent particles grow, inconsistent ones shrink. Drag the measurement and watch the weights shift.

measurement location: 0.50

How does a measurement update the belief in a particle filter?

It moves each particle toward the measurement
It re-weights each particle by the likelihood of the measurement given that particle’s state (importance weighting)
It deletes all particles and starts over

Chapter 4: Resample — survival of the fittest

There’s a problem with just re-weighting forever: after a few steps, one particle ends up with almost all the weight and the rest become negligible, so you’re effectively tracking the belief with one sample. This is degeneracy, and the cure is resampling. Draw a brand-new set of particles from the current set, with replacement, each draw picking a particle with probability proportional to its weight. High-weight particles get duplicated (possibly many times); low-weight particles tend to disappear. Then reset all weights to equal.

It’s natural selection for hypotheses: the fit survive and multiply, the unfit die out. After resampling, you have the same number of particles, now concentrated where the belief is high. And because high-weight particles get duplicated, the next predict step (with its noise) will scatter those duplicates into slightly different nearby states, re-exploring the promising region. So resampling focuses the filter’s “attention” on plausible states without permanently collapsing diversity. It’s the step that keeps the particle budget spent where it matters, and getting it right (when and how to resample) is much of the art of particle filtering.

Why duplicate, not just delete? Resampling concentrates particles on the good regions, but the duplicates aren’t wasted — the very next prediction step jitters each one by motion noise, so they spread back out to explore the neighborhood. Delete-and-duplicate, then re-jitter, is how the filter both focuses and keeps exploring.
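
Resampling as described above can be sketched with the standard library's weighted sampling. This is plain multinomial resampling, one of several schemes in use, and the labelled stand-in particles are purely illustrative.

```python
import random
from collections import Counter


def resample(particles, weights, rng):
    """Multinomial resampling: draw with replacement, proportional to weight."""
    n = len(particles)
    survivors = rng.choices(particles, weights=weights, k=n)
    return survivors, [1.0 / n] * n  # weights reset to equal


particles = ["A", "B", "C", "D"]  # stand-ins for state hypotheses
weights = [0.70, 0.20, 0.09, 0.01]
rng = random.Random(0)
# Repeat the four hypotheses 250 times to get a 1000-particle cloud.
new_particles, new_weights = resample(particles * 250, weights * 250, rng)
print(Counter(new_particles).most_common())
```

The heavy hypothesis "A" should dominate the new set while "D" nearly vanishes. In a real filter the duplicates are identical only until the next predict step adds noise to each copy.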

Resample: heavy particles multiply, light ones vanish

Before (top): particles of varying weight (size). Press resample: a new equal-weight set is drawn proportional to weight — heavy particles duplicate, light ones disappear (bottom). The cloud concentrates where the belief is strong.

What problem does resampling solve, and how?

Slow sensors; by speeding them up
Degeneracy (one particle hoards all weight); by drawing a new equal-weight set proportional to weight, so heavy particles duplicate and light ones vanish
Nonlinearity; by linearizing

Chapter 5: The Full Loop

Put the three steps in a cycle and you have the particle filter, running once per timestep:

1. Predict

push every particle through the motion model + noise (cloud shifts & spreads)

↓ a measurement arrives

2. Weight

score each particle by the observation likelihood (importance weighting)

↓

3. Resample

draw a new equal-weight set proportional to weight (focus on plausible states)

↻ repeat next step

And to report an estimate at any moment, summarize the cloud: the weighted mean of the particles is your best single guess, and their spread is the uncertainty. (For a multimodal belief, the mean might be meaningless — you might report the largest cluster, or all the modes.) This whole loop is a direct, sample-based implementation of the Bayes filter: predict = the motion prior, weight = the measurement likelihood, and the particle density is the posterior. The Kalman filter assumes everything is Gaussian and does this in closed form; the particle filter does it by Monte Carlo — same Bayesian recursion, no distribution assumptions.
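
Putting the three steps together, a complete 1-D filter might look like the sketch below. The constant-velocity target, the noise levels, and every function name are assumptions chosen to keep the example small; the estimate is taken as the weighted mean just before resampling.

```python
import math
import random


def predict(particles, velocity, motion_noise, rng):
    """Step 1: shift every particle and add its own motion noise."""
    return [x + velocity + rng.gauss(0.0, motion_noise) for x in particles]


def weight(particles, measured, sensor_noise):
    """Step 2: normalised Gaussian likelihood of the reading per particle."""
    raw = [math.exp(-0.5 * ((measured - x) / sensor_noise) ** 2) for x in particles]
    total = sum(raw)
    if total == 0.0:  # nothing matches: fall back to equal weights
        return [1.0 / len(particles)] * len(particles)
    return [w / total for w in raw]


def resample(particles, weights, rng):
    """Step 3: redraw the whole set in proportion to weight."""
    return rng.choices(particles, weights=weights, k=len(particles))


def estimate(particles, weights):
    """Weighted mean of the cloud: the filter's single best guess."""
    return sum(x * w for x, w in zip(particles, weights))


def run_filter(steps, n, rng):
    true_x = 0.0
    particles = [rng.uniform(-10.0, 10.0) for _ in range(n)]  # broad initial belief
    history = []
    for _ in range(steps):
        true_x += 1.0                                  # simulated truth
        measured = true_x + rng.gauss(0.0, 0.5)        # simulated noisy sensor
        particles = predict(particles, 1.0, 0.2, rng)  # 1. predict
        weights = weight(particles, measured, 0.5)     # 2. weight
        history.append((true_x, estimate(particles, weights)))
        particles = resample(particles, weights, rng)  # 3. resample
    return history


for true_x, guess in run_filter(steps=10, n=500, rng=random.Random(0)):
    print(f"truth = {true_x:5.2f}   estimate = {guess:5.2f}")
```

Each pass through the loop is one Bayes-filter update done by sampling: the predict call plays the role of the motion prior, the weights carry the measurement likelihood, and the resampled set stands in for the posterior.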

The three repeating steps of a particle filter are:

Encode, attend, decode
Predict (move + noise), Weight (by measurement likelihood), Resample (proportional to weight)
Sort, filter, average

Chapter 6: Monte Carlo Localization

The classic application is Monte Carlo Localization (MCL): a robot figuring out where it is in a known map. It’s where particle filters shine over Kalman filters, because of one capability: global localization. The robot can start with no idea where it is — particles spread uniformly across the entire map. As it moves and senses (predict, weight, resample), the particles that consistently match the sensor readings survive, and the cloud gradually collapses onto the true location. A Kalman filter, needing an initial Gaussian guess, simply can’t start from “I’m somewhere on this map.”

And it handles ambiguity gracefully. In symmetric environments, the particles may form several clusters — “I’m in one of these three identical rooms” — and stay multimodal until a distinctive observation (a unique landmark) breaks the tie and one cluster wins. It even survives the kidnapped robot problem (someone picks the robot up and moves it): with a few random particles always injected, the filter can recover by re-localizing. This robustness to global uncertainty and ambiguity is exactly what the Gaussian assumption forbids — and it’s why particle filters power real robot localization, and why FastSLAM (a particle filter for SLAM) became a landmark.
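
Two MCL-specific ingredients mentioned above, the uniform start and the injection of random particles, can be sketched for a 1-D map as follows. The map length, the 5% injection fraction, and the function names are illustrative assumptions rather than recommended settings.

```python
import random


def init_uniform(n, map_length, rng):
    """Global localization start: no prior, so cover the whole map."""
    return [rng.uniform(0.0, map_length) for _ in range(n)]


def inject_random(particles, fraction, map_length, rng):
    """Replace a random subset of particles with uniform draws over the map."""
    out = list(particles)
    k = int(len(out) * fraction)
    for i in rng.sample(range(len(out)), k):
        out[i] = rng.uniform(0.0, map_length)
    return out


rng = random.Random(0)
start = init_uniform(1000, 20.0, rng)
print(f"initial cloud spans {min(start):.1f} m to {max(start):.1f} m")

# A cloud that has already converged near x = 3 m ...
converged = [rng.gauss(3.0, 0.1) for _ in range(1000)]
# ... keeps a few scouts elsewhere, in case the robot gets kidnapped.
refreshed = inject_random(converged, 0.05, 20.0, rng)
scouts = sum(1 for x in refreshed if abs(x - 3.0) > 0.5)
print(f"{scouts} particles now sit away from the converged cluster")
```

The scouts cost a little accuracy while the robot is well localized, but they are what gives the weighting step something to reward if the true pose suddenly jumps.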

Global localization: from uniform to converged

A robot in a hallway starts with particles spread everywhere (no idea where it is). Press step: as it moves and senses, particles matching the readings survive and the cloud converges on the truth. Watch global localization happen.

What can Monte Carlo Localization do that a Kalman filter cannot?

Run faster
Global localization from total uncertainty (particles spread over the whole map) and represent ambiguity (multiple clusters)
Use fewer sensors

Chapter 7: A Particle Filter, Live (showcase)

A 2-D particle filter tracking a moving target. The true target moves; the filter only gets noisy distance/bearing-style measurements. Watch the particle cloud predict (spread), weight against the measurement, resample (concentrate), and track the target. Crank the noise, or “kidnap” the target, and see the cloud cope — or scramble to recover.

2-D particle filter tracking

Teal cloud = particles, orange = true target, the cross = the filter’s estimate (weighted mean). Press Run to track; the cloud predicts, weights, and resamples each step. Add measurement noise to loosen it; press Kidnap to teleport the target and watch the filter recover.

measurement noise: 0.08

particles: 200

Watch the cloud breathe: it spreads on predict, concentrates on resample, and chases the target. With few particles or high noise it gets ragged and can lose the target (particle deprivation); with enough it tracks tightly. Kidnap the target and the cloud is briefly lost, then the random injected particles let it re-acquire — the recovery the Gaussian filters can’t do.
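
For a 2-D tracker fed distance and bearing readings, the weighting step needs a likelihood over both quantities. The sketch below assumes a sensor fixed at the origin with independent Gaussian errors in range and bearing; this is not necessarily what the simulation above uses, and the function names are hypothetical.

```python
import math


def wrap_angle(a):
    """Map any angle to the interval [-pi, pi)."""
    return (a + math.pi) % (2.0 * math.pi) - math.pi


def range_bearing_likelihood(particle, measured, sigma_r, sigma_b):
    """Unnormalised likelihood of a (range, bearing) reading, sensor at origin."""
    x, y = particle
    r_err = measured[0] - math.hypot(x, y)
    # Wrap the difference so bearings on either side of +/-pi compare as close.
    b_err = wrap_angle(measured[1] - math.atan2(y, x))
    return math.exp(-0.5 * ((r_err / sigma_r) ** 2 + (b_err / sigma_b) ** 2))


reading = (math.sqrt(2.0), math.pi / 4)  # target seen at about (1, 1)
for particle in [(1.0, 1.0), (1.2, 0.9), (-1.0, 1.0)]:
    score = range_bearing_likelihood(particle, reading, sigma_r=0.2, sigma_b=0.1)
    print(particle, f"{score:.4f}")
```

The measurement model is nonlinear in the particle's position, yet nothing is linearized: each hypothesis is simply converted to the range and bearing it would produce and compared with the reading.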

Chapter 8: Pitfalls & the Estimation Family

Degeneracy: without resampling, one particle hoards the weight. Resampling fixes it — but resampling too often causes the next problem.
Sample impoverishment: resample too aggressively and the particles all become copies of a few survivors — diversity collapses, and if the truth wasn’t among them, you can’t recover. Motion noise (re-jittering) and not over-resampling combat this.
Particle deprivation: too few particles to cover the belief, especially after a surprise — the true state has no nearby particle and is lost. More particles, or injecting random ones, helps.
The curse of dimensionality: the killer. The number of particles needed to fill a space grows exponentially with the state’s dimension. Particle filters are wonderful in low dimensions (2-D/3-D localization) but become hopeless in high dimensions — which is exactly where Gaussian filters (or Rao-Blackwellized hybrids) win.
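
For the first two pitfalls, one common heuristic (an addition here, not something this lesson's simulations are stated to use) is to measure degeneracy with the effective sample size and resample only when it falls too low. The 0.5 threshold below is an assumed default.

```python
def effective_sample_size(weights):
    """N_eff = 1 / sum(w_i^2) for weights normalised to sum to 1."""
    return 1.0 / sum(w * w for w in weights)


def should_resample(weights, threshold=0.5):
    """Resample only when N_eff falls below a fraction of the particle count."""
    return effective_sample_size(weights) < threshold * len(weights)


healthy = [0.25, 0.25, 0.25, 0.25]  # every particle contributes equally
degenerate = [1.0, 0.0, 0.0, 0.0]   # one particle hoards all the weight
for name, w in (("healthy", healthy), ("degenerate", degenerate)):
    print(name, effective_sample_size(w), should_resample(w))
```

Equal weights give an effective sample size equal to the particle count, and a single dominant particle gives 1. Skipping the resample while the weights are still healthy is one way to avoid needlessly discarding diversity.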

So the estimation family divides cleanly. Kalman filter: linear-Gaussian, exact, efficient, unimodal. EKF / UKF: mildly nonlinear, still Gaussian/unimodal. Particle filter: arbitrary nonlinearity, arbitrary (multimodal) distributions — at the cost of many samples and the dimensionality curse. Choose by your belief’s shape and your state’s dimension: a low-dimensional, multimodal, nonlinear problem (robot localization) is the particle filter’s home turf; a high-dimensional smooth one belongs to the Kalman family. And FastSLAM famously combined both — particles for the robot path, tiny Kalman filters for each landmark — to beat the curse.
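
The exponential growth behind the curse can be made concrete with a back-of-envelope count: if covering one axis takes some number of particles, covering every axis at that resolution takes that number raised to the dimension. The figure of 10 particles per axis is an arbitrary assumption, and real filters concentrate particles rather than filling a grid, so treat this as a scaling illustration only.

```python
def particles_to_cover(dimension, per_axis=10):
    """Particles needed for `per_axis` samples along every state dimension."""
    return per_axis ** dimension


for d in (1, 2, 3, 6, 12):
    print(f"{d:>2}-D state: {particles_to_cover(d):,} particles")
```

At this resolution a 2-D state needs 100 particles and a 6-D state needs a million, which is why hybrids that hand part of the state to small Kalman filters are attractive.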

Particles needed vs. dimension (the curse)

Particles required to cover the state space, as the state dimension grows. It climbs exponentially — fine in 2-D/3-D, hopeless in high dimensions. Drag the dimension and watch the requirement explode.

state dimension: 2

What is the particle filter’s fundamental limitation?

It can’t handle nonlinearity
The curse of dimensionality: the particles needed grow exponentially with state dimension, so it’s impractical in high dimensions
It only works with Gaussian noise

Chapter 9: Cheat Sheet & Connections

belief

a cloud of weighted particles (any distribution, multimodal OK)

↓ 1. predict

motion update

move each particle by motion model + noise (shift & spread; any nonlinearity)

↓ 2. weight (measurement)

importance weighting

weight ∝ likelihood of the observation given each particle

↓ 3. resample

survival of the fittest

redraw ∝ weight; estimate = weighted mean; fights degeneracy

Filter	Belief	Best for
Kalman	single Gaussian	linear-Gaussian, any dimension
EKF / UKF	Gaussian (approx)	mildly nonlinear, unimodal
Particle filter	weighted samples (any shape)	nonlinear, multimodal, low dimension
FastSLAM	particles + per-landmark Kalman	SLAM (beats the curse)

Keep exploring

→ Bayes Filter — the recursion the particle filter implements
→ Kalman Filter — the Gaussian, closed-form counterpart
→ EKF / UKF — nonlinear Gaussian filters
→ Classical SLAM — where FastSLAM and localization live

“What I cannot create, I do not understand.” You just rebuilt the particle filter: represent the belief as a cloud of weighted guesses, move them through the motion model, weight them by the measurement, and resample so the fit survive. No Gaussian assumption — just enough particles to trace whatever shape the truth takes.