The modern SLAM back-end. Stop filtering one state at a time — instead optimize the whole trajectory and map at once as a giant, sparse least-squares problem. Pose-graph optimization, bundle adjustment, and the loop closure that snaps a drifted map back into place.
A filter — Kalman, particle — is recursive: it keeps only the latest state estimate, folds in each new measurement, and throws the past away. That’s efficient and real-time, but it has two costs. It accumulates linearization error (each step linearizes once, forever), and when a robot returns to a place it’s seen before — a loop closure — a filter can’t easily go back and fix the whole drifted path between then and now, because it already discarded those past states.
Smoothing takes the opposite approach: keep all the states — the entire trajectory of poses, plus the map — and find the configuration of everything at once that best explains all the measurements. It’s a single, giant optimization over the whole history. More expensive, but far more accurate: it can re-linearize everything, and a loop closure constraint can ripple back and correct the entire trajectory. This is how modern SLAM back-ends work, and the data structure that makes it tractable is the factor graph.
A filter tracks only the current pose (one highlighted dot); a smoother holds the entire trajectory in memory and optimizes it jointly. Drag the timeline — the filter forgets the past; the smoother keeps it all, ready to correct.
A factor graph is a bipartite graph with two kinds of nodes. Variables — the unknowns we’re solving for: the robot’s poses at each timestep, and the positions of landmarks in the map. And factors — the constraints, one per measurement, each connecting the variables it relates. An odometry factor connects two consecutive poses (“I moved roughly this much between them”). A landmark observation factor connects a pose to a landmark (“from here, I saw that landmark over there”). A prior factor pins down a single variable (“I started at the origin”). And a loop closure factor connects two non-consecutive poses (“I’ve been here before”).
So the graph literally is the problem: the variables are what you want to find, and the factors are everything you know about them. Each factor encodes a measurement and its uncertainty. The whole estimation task becomes: find the assignment of all the variables that’s most consistent with all the factors. If this feels like a graph of nodes passing information — it is; factor graphs are cousins of the probabilistic graphical models and message-passing you saw with GNNs. But here the goal is to optimize the variables, not classify them.
Big nodes = robot poses (a trajectory); small nodes = landmarks; squares = factors (odometry between poses, observations pose→landmark, a prior). Click factor types to highlight them. The graph IS the estimation problem.
Concretely, each factor defines an error (a residual): the difference between what the measurement said and what the current variable estimates would predict. An odometry factor’s error is “the relative pose my variables imply between these two poses” minus “the relative pose odometry measured.” A landmark factor’s error is “where my variables say that landmark should appear from this pose” minus “where I actually observed it.” If the variables are perfect and noise-free, every error is zero. They never are, so each factor contributes a nonzero residual.
And each error is weighted by the measurement’s uncertainty — a precise sensor’s factor is stiff (small errors cost a lot), a noisy sensor’s factor is loose (errors are tolerated). The total cost of a configuration is the sum of all the squared, weighted errors across every factor. Minimizing that sum is the goal. This is exactly maximum-a-posteriori estimation under Gaussian noise: the most probable trajectory and map, given all measurements, is the one that minimizes this weighted sum of squared residuals. Every factor “pulls” the variables toward satisfying its measurement; the solution is the equilibrium of all those pulls.
A measurement (teal) says where a landmark should be relative to a pose; the current estimate (orange) disagrees. The gap is the residual — its squared, uncertainty-weighted length is this factor’s cost. Drag the estimate and watch the cost rise and fall.
So we have a big nonlinear least-squares problem: find the variable values (all poses, all landmarks — possibly thousands of dimensions) that minimize the sum of squared residuals. We solve it iteratively with Gauss-Newton or Levenberg-Marquardt. The recipe: linearize every factor’s error around the current estimate; this turns the problem into a single big linear least-squares system; solve that system for the best update step; apply the step; and repeat until the total error stops shrinking. Each iteration nudges all the variables together toward the configuration that satisfies all the constraints best.
You start from an initial guess (often the raw odometry, which has drifted), and the optimizer pulls the whole trajectory and map into global consistency. Because it re-linearizes at every iteration — unlike a filter, which linearizes once and moves on — smoothing avoids the filter’s accumulated linearization error. The catch: nonlinear least squares can get stuck in local minima, so a decent initial guess matters (Levenberg-Marquardt adds damping to stay robust). But given a reasonable start, this iterate-linearize-solve loop reliably finds the most consistent explanation of all your data.
Total error vs. iteration: linearize, solve, step, repeat. The error plunges as the variables snap into consistency. Press iterate and watch the cost fall; a drifted start converges to the consistent solution.
A trajectory with 10,000 poses and 50,000 landmarks has a state of hundreds of thousands of dimensions. Solving a dense linear system that big each iteration would be hopeless — cost grows cubically. What saves us is sparsity. Each factor touches only a few variables: an odometry factor connects just two consecutive poses; an observation connects one pose and one landmark. No factor touches everything. So the linear system’s matrix is overwhelmingly zeros — each variable is directly linked to only its handful of neighbors.
And sparse linear systems can be solved dramatically faster than dense ones, with sparse factorization methods (sparse Cholesky, QR) that exploit the structure — near-linear in the number of variables, not cubic. This is the entire reason large-scale SLAM and bundle adjustment are tractable: the problem is huge but locally connected, and the optimizer exploits that local structure. The factor graph isn’t just a nice picture — its sparsity pattern is exactly the sparsity of the matrix you solve, so the graph structure directly determines how fast you can solve it. Good ordering of the variables (to keep the factorization sparse) is a real part of the craft.
The matrix linking variables: a nonzero block only where two variables share a factor. Mostly zeros — each pose links to its neighbors and a few landmarks. Drag the trajectory length: the matrix grows, but stays sparse (a thin band + loop-closure off-diagonals).
This is where smoothing earns its keep. As a robot drives, odometry drifts — small errors accumulate, so its estimated trajectory slowly diverges from reality. By the time it returns to its starting point, its map might be visibly warped, the start and end not lining up. Then it recognizes a place it’s seen before — a loop closure — and adds a single new factor: “pose 1000 is right next to pose 5.”
That one constraint is electric. The optimizer re-solves the whole graph, and the loop-closure error — the gap between where the loop should close and where it currently does — distributes back across the entire loop, correcting every pose in between. The warped trajectory snaps into a consistent shape; accumulated drift is wiped out in one optimization. A filter can’t do this — it discarded those intermediate poses. Because the smoother kept the whole graph, a discovery in the present rewrites the entire past into consistency. Watching a drifted SLAM map snap straight the instant a loop closes is one of the most satisfying sights in robotics, and it’s pure factor-graph optimization.
A trajectory that should be a closed loop has drifted open (start and end don’t meet). Press close loop: a loop-closure factor is added and the graph re-optimizes — the drift distributes back and the loop snaps shut.
Bundle adjustment is the factor graph specialized to vision — the engine behind structure-from-motion and visual SLAM. The variables are camera poses and 3-D points (the reconstructed scene). The factors are reprojection errors: project each 3-D point through each camera that saw it, and compare to where it actually appeared in that image. The error is the pixel distance between the projected point and the observed pixel. Minimize the sum of all these reprojection errors over all cameras and all points, jointly.
The name comes from the “bundles” of light rays from each 3-D point to each camera, all being adjusted into agreement. It’s the same nonlinear least-squares-on-a-sparse-factor-graph machinery, and it’s what turns a set of overlapping photos into a precise 3-D reconstruction (and what runs in the back-end of visual SLAM systems like ORB-SLAM). The sparsity here is beautiful: each 3-D point is seen by only some cameras, so the camera-point structure is extremely sparse — which is exactly why optimizing thousands of cameras and millions of points is feasible at all. Bundle adjustment is, in a sense, the original and most important factor-graph optimization.
A 3-D point, projected into two cameras (lines), should land where it was observed (teal marks); the current estimate (orange) is off by the reprojection error. Drag the point estimate — bundle adjustment moves cameras AND points to minimize the total pixel error.
The full pose-graph SLAM experience: a robot drives a loop, odometry drifts, and the estimated trajectory warps away from the true path. Add loop closures and run the optimizer, and watch the whole trajectory pull back into a consistent map. Crank the odometry drift and see how much a single loop closure can rescue.
Teal = true loop, orange = drifted estimate from odometry. Press optimize to add a loop closure and re-solve the factor graph — the orange estimate snaps toward the truth. Increase the drift and watch how badly raw odometry warps, and how well the optimization recovers it.
Notice the optimization doesn’t just fix the endpoint — it redistributes the correction along the whole loop, so the entire estimated path bends back toward the truth. That global consistency, from a sparse least-squares solve over the whole trajectory, is the modern SLAM back-end in a nutshell. The front-end (odometry, place recognition) feeds factors in; the factor-graph back-end keeps the map globally consistent.
Re-optimizing the entire graph from scratch every time a new measurement arrives would be wasteful — most of the solution barely changes. Incremental smoothing (iSAM, iSAM2) updates the solution efficiently as new factors stream in, re-solving only the parts of the graph that are actually affected, using a clever data structure (the Bayes tree) to track dependencies. This gives near-real-time smoothing — the accuracy of full optimization at close to filtering speed — which is why modern SLAM systems can smooth, not just filter, online.
In practice you don’t write the solver yourself; you use a library. GTSAM (factor graphs and iSAM), g2o (general graph optimization), and Ceres (Google’s nonlinear least-squares) are the workhorses — you define your variables and factors, and they handle the sparse linear algebra and iteration. And the broad lesson generalizes far beyond robotics: any problem that’s a sum of local error terms over shared variables — calibration, sensor fusion, structure-from-motion, even some learning problems — is a factor graph, and can be solved by this same sparse nonlinear-least-squares machinery. Filtering gives you the present cheaply; smoothing gives you the whole consistent history, and factor graphs are how.
Estimation error along a looping trajectory. The filter (orange) drifts and stays drifted — it can’t fix the past. The smoother (teal) drifts too, until the loop closes, then snaps the whole path back to low error. Drag through the trajectory.
| Concept | What it is |
|---|---|
| Factor graph | variables (unknowns) + factors (measurement error terms) |
| Cost | Σ of squared, uncertainty-weighted residuals = MAP estimate |
| Solver | Gauss-Newton / LM, exploiting sparsity |
| Loop closure | one factor → re-optimize → drift corrected across the whole path |
| Bundle adjustment | cameras + 3-D points, minimize reprojection error |
| iSAM | incremental smoothing for online, near-real-time use |
→ Modern SLAM — where factor-graph back-ends live
→ Classical SLAM — the filtering-based predecessor
→ Kalman Filter / Particle Filter — the filtering alternatives
→ Multi-View Geometry — the geometry behind bundle adjustment