Real-time mapping via gradient descent, cycle detection, backwards correction, multi-robot, and 3D mapping.
EKF SLAM is online but scales poorly. EIF SLAM scales well but is batch. EM is powerful but requires multiple passes. Can we build maps in real-time, processing each sensor reading once as it arrives?
Incremental maximum likelihood mapping does exactly this. At each time step, it finds the pose that maximizes the local likelihood:
and updates the map by incorporating the new observation at the estimated pose.
The incremental ML algorithm maximizes the conditional belief — the probability of the current pose and map given the previous pose and map:
This is the product of two factors: (1) How well does the current observation match the current map at the proposed pose? (2) How likely is this pose given the last pose and the action?
Starting from the odometry prediction, gradient descent adjusts the pose to maximize the likelihood. A few iterations (5-20) typically suffice for convergence.
To find the pose that maximizes the likelihood, we use gradient ascent in log-likelihood space:
where κ > 0 is a small step size. The gradient has two terms: one from the sensor model (which scan positions match the map?) and one from the motion model (which poses are consistent with odometry?).
Starting point: the odometry prediction (noise-free motion model applied to the previous pose). Gradient descent then adjusts x, y, and θ to better align the current scan with the existing map.
The gradient for the likelihood field sensor model (Chapter 6) is computed per beam:
For each beam k with endpoint (xok, yok) and nearest obstacle (x̄, ȳ):
The gradient of the log-likelihood is the sum of these per-beam contributions, weighted by the Gaussian likelihood.
Incremental ML mapping is extremely effective for straight corridors and small environments where odometry drift is modest. But it has a critical weakness: it can't handle loops.
When the robot traverses a large loop (e.g., around a building), odometry error accumulates. By the time it returns to the starting point, the accumulated error might be several meters. Incremental ML mapping doesn't know it's returned — it just keeps building the map based on local scan matching, creating a map where the loop doesn't close.
To handle loops, incremental mapping algorithms must first detect when the robot has returned to a previously visited area. Two approaches:
Sensor-based detection: Compare the current sensor reading to readings taken earlier. If the current scan matches an old scan well, the robot has probably returned. This is essentially a global localization problem within the existing map.
Pose-based detection: Monitor whether the current pose estimate is close to any previously visited pose. If so, and the sensor readings match, a loop has been detected.
Once a loop is detected, the accumulated error must be distributed across all poses in the loop. Backwards correction adjusts past pose estimates to make the loop close:
1. Detect the loop: current pose matches a past pose (say, from time t0).
2. Compute the closure error: the difference between the current pose and the matched past pose.
3. Distribute the error across all poses from t0 to t, proportionally to the amount of motion. Poses near the closure point get large corrections; poses far from it get small ones.
Multiple robots can map faster by dividing the environment. The challenge: each robot builds its own local map in its own coordinate frame. To merge them into a global map, the robots must establish a common reference frame.
Rendezvous approach: Two robots meet and observe each other. This fixes the relative transformation between their coordinate frames. Their maps can then be merged by applying this transformation.
Overlap approach: Two robots independently map overlapping areas. Scan matching between the overlapping regions determines the relative transformation.
All mapping algorithms in this book extend to 3D, but at significantly higher computational cost. The robot pose becomes 6D (x, y, z, roll, pitch, yaw). Occupancy grids become voxel grids. Scan matching operates in 6D pose space.
| Aspect | 2D | 3D |
|---|---|---|
| Pose dimensions | 3 (x, y, θ) | 6 (x, y, z, roll, pitch, yaw) |
| Map representation | 2D grid (~104 cells) | Voxel grid (~107 cells) |
| Scan matching | 3D optimization | 6D optimization |
| Sensors | 2D laser, sonar | 3D laser, stereo, depth cameras |
| Concept | Key Idea |
|---|---|
| Incremental ML | Real-time mapping via local gradient descent; O(1) per step |
| Scan matching | Align current scan to existing map using likelihood gradient |
| Loop closure | Detect revisited places; distribute error backwards |
| Multi-robot | Merge maps via coordinate frame alignment |
| 3D mapping | Same principles, much higher computational cost |