The Complete Beginner's Path

Understand Classical SLAM

The algorithm that lets robots explore unknown environments — building a map while simultaneously figuring out where they are within it.

Prerequisites: Basic probability + Intuition for coordinates. That's it.
11 Chapters · 8+ Simulations · 0 Assumed Knowledge

Chapter 0: The Chicken-and-Egg Problem

Imagine you wake up in a dark building with a flashlight. You want to draw a floor plan (a map). But to mark anything on the map, you need to know where you are. And to know where you are, you need... a map. It's a chicken-and-egg problem.

SLAM (Simultaneous Localization And Mapping) solves both at the same time. As the robot moves, it observes landmarks, estimates its own pose, and builds a map — all in one continuous loop. It's one of the fundamental problems in robotics.

The core idea: You can't separate "where am I?" from "what does the world look like?" They're coupled. Improving your map improves your position estimate, and improving your position estimate improves your map. SLAM exploits this feedback loop.
The SLAM Problem

A robot (green) moves through unknown space, sensing landmarks (yellow). It must estimate both its own path and the landmark positions.

Check: Why is SLAM called a "chicken-and-egg" problem?

Chapter 1: What Is a Map?

A "map" in robotics isn't necessarily a picture. It's a representation of the environment that the robot can use for navigation and planning. There are several common types, each with tradeoffs.

Landmark Map
A list of distinctive point features with (x,y) coordinates. Sparse, compact, great for localization.
Occupancy Grid
A 2D grid where each cell stores the probability of being occupied. Dense, good for path planning.
Topological Map
A graph of places and connections. "Kitchen connects to hallway." Abstract, ignores geometry.
Map Representations

Toggle between three representations of the same room. Each shows the same space differently.

Classical SLAM typically uses landmark maps — a sparse set of distinctive points. This keeps the math tractable. Modern SLAM systems often produce dense maps (point clouds, meshes) as a byproduct.
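To make the occupancy-grid idea concrete, here is a minimal sketch in Python. The class name, the sensor weight of 0.85, and the update rule are illustrative assumptions, not a standard API — the point is that storing log-odds makes each Bayesian update a simple addition:

```python
import math

# Hypothetical minimal occupancy grid: each cell stores log-odds of occupancy.
# Log-odds make Bayesian updates additive: just add a constant per observation.
class OccupancyGrid:
    def __init__(self, width, height):
        self.logodds = [[0.0] * width for _ in range(height)]  # 0.0 means p = 0.5 (unknown)

    def update(self, x, y, hit, weight=0.85):
        # Add the sensor model's log-odds: log(p / (1 - p)) for a hit, negated for a miss
        delta = math.log(weight / (1 - weight))
        self.logodds[y][x] += delta if hit else -delta

    def probability(self, x, y):
        # Convert log-odds back to a probability
        return 1.0 / (1.0 + math.exp(-self.logodds[y][x]))

grid = OccupancyGrid(10, 10)
for _ in range(3):
    grid.update(4, 4, hit=True)   # three "occupied" readings for the same cell
print(round(grid.probability(4, 4), 3))  # confidence rises well above 0.5
```

Repeated consistent observations drive a cell's probability toward 0 or 1, while unvisited cells stay at 0.5 — exactly the behavior the dense map view shows.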
Check: Which map type stores a probability for each grid cell?

Chapter 2: Robot Pose & Coordinates

A robot's pose in 2D is three numbers: position (x, y) and heading angle θ. Together, they describe where the robot is and which direction it's facing. This is a rigid body transform — it tells us how to convert from the robot's frame to the world frame.

pose = (x, y, θ)     where θ is the heading angle

When the robot moves, its new pose depends on the old pose plus a motion command (drive forward, turn). But real motors are imperfect, so every motion adds noise. Over time, this odometry drift accumulates and the robot becomes lost.

Odometry Drift

Watch how odometry (red dashed) drifts away from the true path (teal). Without corrections, errors compound.

Motion noise: 0.030
Key problem: Odometry drift is why robots can't just dead-reckon forever. After 100 meters, a 1% error means you're a full meter off. SLAM corrects this drift by observing landmarks.
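The drift effect can be sketched with a simple unicycle motion model. The function name and the noise level are illustrative assumptions — noise is added to the commanded velocities, and the position error compounds step by step:

```python
import math
import random

def motion_update(pose, v, w, dt, noise_std=0.0):
    # pose = (x, y, theta); v = forward speed, w = turn rate
    x, y, theta = pose
    v += random.gauss(0, noise_std)      # imperfect motors: noisy commands
    w += random.gauss(0, noise_std)
    x += v * dt * math.cos(theta)
    y += v * dt * math.sin(theta)
    theta += w * dt
    return (x, y, theta)

random.seed(0)
true_pose = odom_pose = (0.0, 0.0, 0.0)
for _ in range(200):                     # drive straight for 200 small steps (10 m)
    true_pose = motion_update(true_pose, v=0.5, w=0.0, dt=0.1)
    odom_pose = motion_update(odom_pose, v=0.5, w=0.0, dt=0.1, noise_std=0.03)

drift = math.hypot(odom_pose[0] - true_pose[0], odom_pose[1] - true_pose[1])
print(f"drift after 10 m: {drift:.2f} m")  # nonzero, and it grows with distance
```

Note that heading noise is the worst offender: a small early error in θ rotates every subsequent step, so position error grows faster than the raw noise level suggests.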
Check: What three quantities define a robot's 2D pose?

Chapter 3: Feature Extraction (Frontend)

Before the robot can localize or map, it needs to find distinctive features in its sensor data. These are things like corners, edges, blobs — anything that looks unique and can be reliably re-detected later. This is called the frontend of a SLAM system.

Each detected feature gets a descriptor — a compact numerical summary that captures what it looks like. Think of it as a fingerprint. When the robot sees a feature again from a different viewpoint, the descriptor helps match it to the stored version.

Raw Sensor Data
Camera image, lidar scan, sonar pings
Feature Detection
Find corners (Harris), blobs (SIFT), edges (Canny)
Descriptor Extraction
Compute a fingerprint vector for each feature
Feature List
[(x1,y1,d1), (x2,y2,d2), ...]
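The pipeline above can be sketched end to end in toy form. This assumes a tiny grayscale image and uses local patch variance as a crude stand-in for a real corner score like Harris — purely illustrative, but it produces the same (x, y, descriptor) output shape:

```python
import statistics

def patch(img, r, c):
    # 3x3 neighborhood around (r, c), flattened into a 9-vector
    return [img[r + dr][c + dc] for dr in (-1, 0, 1) for dc in (-1, 0, 1)]

def detect_features(img, threshold):
    # Score each interior pixel by local patch variance (a crude stand-in for a
    # real corner detector like Harris); keep high-scoring points.
    features = []
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            p = patch(img, r, c)
            if statistics.pvariance(p) > threshold:
                features.append((r, c, p))  # (x, y, descriptor): the patch is the fingerprint
    return features

img = [[0, 0, 0, 0],
       [0, 0, 9, 9],
       [0, 0, 9, 9],
       [0, 0, 9, 9]]
feats = detect_features(img, threshold=5.0)
print(len(feats))  # the bright block's boundary produces several high-variance patches
```

A real frontend replaces the variance score with a proper detector and the raw patch with an invariant descriptor (SIFT, ORB), but the flow — detect, describe, collect — is the same.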
Corner Detection

Features are detected at corners and distinctive points (yellow). Move the threshold slider to see how detection sensitivity changes.

Detection threshold: 0.50
Good features are distinctive (not all walls look the same), repeatable (detected from different angles), and compact (quick to compare). Corners are ideal; flat surfaces are terrible.
Check: What is a feature descriptor?

Chapter 4: Data Association

The robot sees a corner. Is it the same corner it saw 30 seconds ago, or a completely different one? This matching problem is called data association, and getting it wrong is catastrophic — if you match to the wrong landmark, your map becomes inconsistent.

The simplest approach is nearest neighbor: match each observation to the closest known landmark. But this fails when landmarks are close together. More robust methods include JCBB (Joint Compatibility Branch and Bound), which considers all matches jointly.
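A minimal nearest-neighbor associator with a distance gate might look like the sketch below. The gate value and the `None`-means-new-landmark convention are illustrative assumptions:

```python
import math

def associate(observations, landmarks, gate=1.0):
    # Nearest-neighbor data association: match each observation to the closest
    # known landmark, or flag it as new if nothing is within the gate.
    matches = []
    for ox, oy in observations:
        dists = [math.hypot(ox - lx, oy - ly) for lx, ly in landmarks]
        best = min(range(len(landmarks)), key=lambda i: dists[i])
        matches.append(best if dists[best] < gate else None)  # None = new landmark
    return matches

landmarks = [(0.0, 0.0), (5.0, 0.0)]
obs = [(0.2, 0.1), (4.9, -0.1), (2.5, 2.5)]
print(associate(obs, landmarks))  # → [0, 1, None]
```

The failure mode is easy to see: if two landmarks sit closer together than the robot's position uncertainty, "closest" picks the wrong one. Mahalanobis gating replaces the Euclidean distance with one weighted by the current covariance, and JCBB checks whether the whole set of matches is mutually consistent.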

Data Association Challenge

Current observations (teal) must match to known landmarks (yellow). Lines show matches. Red lines are wrong matches. Click "Scramble" to see how ambiguity arises.

Wrong matches are poison. A single incorrect data association can corrupt the entire map. This is why robust matching algorithms and outlier rejection are critical in real SLAM systems.
Method           | Approach                        | Robustness
Nearest Neighbor | Match to closest landmark       | Low
Mahalanobis Gate | NN, but weighted by uncertainty | Medium
JCBB             | Check all matches jointly       | High
RANSAC           | Random sample consensus         | High
Check: What is the data association problem?

Chapter 5: EKF-SLAM

The classic approach. We augment the robot's state vector with every landmark position. If we have N landmarks, the state is [x, y, θ, l1x, l1y, ..., lNx, lNy] — a vector of size 3+2N. The covariance matrix is (3+2N) × (3+2N), tracking correlations between everything.

The EKF (Extended Kalman Filter) handles the nonlinear motion and observation models by linearizing them at each step. Predict: move robot, grow uncertainty. Update: observe a landmark, shrink uncertainty — for the robot and all correlated landmarks.

State: x = [robot pose, landmark 1, landmark 2, ...]
Covariance: P is (3+2N) × (3+2N) — grows quadratically!
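The state growth can be sketched directly. The initial variances are illustrative, and a real implementation would also fill in cross-covariances between the new landmark and the robot pose — here they start at zero for brevity:

```python
import numpy as np

# Sketch of EKF-SLAM state augmentation: every new landmark appends (lx, ly)
# to the state and a 2x2 block (plus, in a full filter, cross-terms) to P.
state = np.array([0.0, 0.0, 0.0])        # [x, y, theta]
P = np.eye(3) * 0.01                     # 3x3 robot covariance

def add_landmark(state, P, lx, ly, init_var=100.0):
    state = np.append(state, [lx, ly])
    n = P.shape[0]
    P_new = np.zeros((n + 2, n + 2))
    P_new[:n, :n] = P                    # keep everything learned so far
    P_new[n:, n:] = np.eye(2) * init_var # very uncertain until re-observed
    return state, P_new

for i in range(5):                        # discover 5 landmarks
    state, P = add_landmark(state, P, float(i), float(i))

print(state.shape, P.shape)  # 3 + 2N = 13 with N = 5, and P is 13 x 13
```

Each new landmark adds two rows and two columns to P, which is exactly where the quadratic memory and update cost come from.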
EKF-SLAM: Growing State Vector

Watch the state vector grow as new landmarks are discovered. Uncertainty ellipses (cyan) show correlations. Click "Step" or toggle auto-play.

Landmarks: 0 | State dim: 3
The O(N²) curse: Every update touches the entire covariance matrix. With 1000 landmarks, that's a million-entry matrix to update per step. EKF-SLAM doesn't scale to large environments.
Check: Why does EKF-SLAM struggle with large environments?

Chapter 6: Particle Filter SLAM (FastSLAM)

FastSLAM takes a clever shortcut using Rao-Blackwellization. The insight: if you knew the robot's path exactly, the landmark positions would be conditionally independent. You wouldn't need that massive joint covariance matrix.

So FastSLAM uses particles to represent the robot's pose distribution. Each particle carries its own mini-EKF for each landmark. Since landmarks are independent given the path, each mini-EKF is just 2×2 — no quadratic blowup.

Particle (path hypothesis)
Each particle is one possible robot trajectory
Per-particle landmark EKFs
Each landmark: independent 2×2 EKF. O(N) per particle.
Resampling
Keep particles that match observations. Kill unlikely ones.
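The per-particle bookkeeping, plus a standard low-variance resampler, can be sketched as follows. The weights here come from a made-up closeness score rather than a real sensor model, and the landmark EKFs are left as an empty dict — this shows the data structure, not a full filter:

```python
import math
import random

# Sketch of the FastSLAM particle: a pose hypothesis plus an independent
# 2x2 EKF (mean, covariance) per landmark.
class Particle:
    def __init__(self, pose):
        self.pose = pose          # (x, y, theta) hypothesis
        self.weight = 1.0
        self.landmarks = {}       # landmark id -> (mean [lx, ly], 2x2 covariance)

def resample(particles):
    # Low-variance resampling: draw particles in proportion to their weights,
    # so hypotheses that explain the observations survive.
    total = sum(p.weight for p in particles)
    step = total / len(particles)
    picks = []
    r = random.uniform(0, step)
    c, i = particles[0].weight, 0
    for m in range(len(particles)):
        u = r + m * step
        while u > c:
            i += 1
            c += particles[i].weight
        picks.append(Particle(particles[i].pose))
    return picks

random.seed(1)
particles = [Particle((random.gauss(0, 1), random.gauss(0, 1), 0.0)) for _ in range(100)]
for p in particles:                       # weight by closeness to a true pose at (0, 0)
    p.weight = math.exp(-(p.pose[0] ** 2 + p.pose[1] ** 2))
survivors = resample(particles)
print(len(survivors))  # still 100 particles, now clustered near the true pose
```

Resampling keeps the particle count fixed while duplicating likely hypotheses and discarding unlikely ones — the clustering you see in the simulation below.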
FastSLAM Particles

Each green dot is a particle (robot pose hypothesis). After resampling, particles cluster around the true pose. Landmarks shown in yellow.

Num particles: 50
Scaling: EKF-SLAM is O(N²) per update. FastSLAM is O(M·N) where M is the number of particles. With M=100 particles and N=1000 landmarks, that's 100K instead of 1M operations.
Check: What does Rao-Blackwellization exploit in FastSLAM?

Chapter 7: Graph-Based SLAM

The modern workhorse. Instead of maintaining a filter, we build a graph. Each node is a robot pose (at some time) or a landmark position. Each edge is a constraint — an odometry measurement between poses, or a landmark observation. This forms a factor graph.

SLAM becomes an optimization problem: find the node positions that best satisfy all constraints simultaneously. This is solved with iterative least-squares methods like Gauss-Newton. The key advantage: we can re-optimize the entire trajectory when new information arrives (like a loop closure).

x* = argmin_x  Σ_{(i,j) ∈ edges} ‖ z_ij − h(x_i, x_j) ‖²_{Σ_ij}
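A toy 1D instance makes the optimization tangible: poses on a line, odometry edges, and one loop-closure edge. Because h(x_i, x_j) = x_j − x_i is linear here, a single Gauss-Newton step solves it exactly (the specific measurements are made up):

```python
import numpy as np

# Toy 1D pose graph: 4 poses on a line. Odometry says each step is +1.0,
# but a loop-closure edge says pose 3 is only 2.7 ahead of pose 0.
# Each edge is (i, j, z): a measurement z of x_j - x_i.
edges = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0), (0, 3, 2.7)]
n = 4

x = np.array([0.0, 1.0, 2.0, 3.0])       # initial guess from raw odometry
H = np.zeros((n, n))                      # Gauss-Newton normal equations: H dx = b
b = np.zeros(n)
for i, j, z in edges:
    r = z - (x[j] - x[i])                 # residual of this constraint
    J = np.zeros(n)
    J[i], J[j] = -1.0, 1.0                # Jacobian of h = x_j - x_i
    H += np.outer(J, J)
    b += J * r
# Gauge constraint: anchor pose 0 at the origin, or H is singular
H[0, :] = 0; H[:, 0] = 0; H[0, 0] = 1; b[0] = 0
x = x + np.linalg.solve(H, b)
print(np.round(x, 3))  # drift is spread evenly: [0, 0.925, 1.85, 2.775]
```

The 0.3 m disagreement between odometry and the loop closure gets shared across all four edges instead of being dumped on the last pose — exactly the "pull everything into consistency" behavior the simulation shows. Note too that H has nonzeros only where edges connect nodes, which is the sparsity that g2o, GTSAM, and Ceres exploit.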
Factor Graph

Blue nodes = robot poses. Yellow nodes = landmarks. Edges = constraints. Click "Optimize" to see least-squares adjustment pull everything into consistency.

Factor graphs are sparse. Each constraint only connects a few nodes. This means the system matrix (Hessian) is sparse, and we can use efficient sparse solvers. Libraries like g2o, GTSAM, and Ceres make this practical.
Approach   | State               | Updates        | Scalability
EKF-SLAM   | Current only        | Recursive      | O(N²)
FastSLAM   | Current + particles | Sequential     | O(M·N)
Graph-SLAM | All history         | Batch optimize | Sparse O(N)
Check: In graph-based SLAM, what do the edges represent?

Chapter 8: Loop Closure

The robot has been exploring for 10 minutes. Odometry has drifted badly. Then it returns to a place it visited at the start — and recognizes it. This is loop closure, and it's the most impactful event in SLAM. A single loop closure can correct accumulated drift across the entire trajectory.

Detection uses place recognition: comparing the current scene's features against a database of all previously visited places. Bag-of-words (DBoW) and learned descriptors are common approaches. Once a loop is detected, a new constraint is added to the graph, and re-optimization snaps everything into alignment.
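A toy version of bag-of-words place recognition, with hypothetical place names and word histograms and cosine similarity as the score:

```python
import math

def cosine(a, b):
    # Cosine similarity between two visual-word histograms
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

# Each visited place is a histogram over a tiny "visual vocabulary" of 4 words.
database = {
    "start":   [9, 1, 0, 2],
    "hallway": [1, 8, 3, 0],
    "kitchen": [0, 2, 9, 1],
}
current = [8, 2, 1, 2]  # histogram of the robot's current view

best = max(database, key=lambda name: cosine(database[name], current))
print(best)  # → start: a loop-closure candidate, to be verified geometrically
```

Real systems like DBoW use vocabularies of thousands of words and an inverted index for speed, and always verify a candidate match geometrically before adding the loop-closure constraint — a false loop closure is as poisonous as a wrong data association.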

Loop Closure Correction

The red path shows drifted odometry forming a loop. Click "Close Loop" to see the constraint propagate corrections backward through the entire trajectory.

This is why graph-SLAM wins: When a loop closure arrives, graph-SLAM re-optimizes all poses. EKF-SLAM can only update the current state. FastSLAM struggles because particles can't "go back in time." The ability to retroactively fix the past is graph-SLAM's superpower.
Check: Why is loop closure so important?

Chapter 9: 2D SLAM Simulator

Time to see it all come together. Below is a 2D room with landmarks. A robot explores the space, observing landmarks within its sensor range. Watch the map build in real-time. Toggle between the EKF view (uncertainty ellipses) and the graph view (nodes and edges). Use arrow keys or click "Auto-explore" to drive the robot.

Live 2D SLAM
Landmarks found: 0
Sensor range: 100
Observation noise: 5

Arrow keys to drive manually. Sensor range shown as a circle around the robot.

Try this: Drive the robot in a big loop, then return to the start. Watch how re-observing old landmarks tightens the entire map. That's loop closure in action!

Chapter 10: Connections & Beyond

Classical SLAM is the foundation, but the field has exploded in the last decade. Modern systems combine visual features, inertial measurements, and even neural networks. Here's where SLAM connects to everything else.

System                         | What It Adds                        | When To Use
Visual SLAM (ORB-SLAM)         | Camera features + bundle adjustment | Camera-only systems
VIO (Visual-Inertial Odometry) | Tightly coupled camera + IMU        | Drones, AR/VR headsets
LiDAR SLAM (LOAM)              | Point cloud matching                | Self-driving cars, surveying
NeRF/3DGS SLAM                 | Neural scene representation         | Dense 3D reconstruction
Related microLessons: The EKF is the heart of EKF-SLAM (see Kalman Filter lesson). Bayes filters are the general framework that all SLAM filters instantiate. Particle filters power FastSLAM.
Open problems: Long-term SLAM (maps that update over months), dynamic environments (moving objects), multi-robot SLAM (sharing maps), and semantic SLAM (understanding what objects are, not just where).
"SLAM is not one problem, but a family of problems — and no single solution dominates all domains."
— Cadena et al., "Past, Present, and Future of SLAM"

You now understand the core concepts behind every SLAM system. From vacuum robots to self-driving cars, the chicken-and-egg problem is the same — only the sensors and scale change.