The Kalman filter assumes straight lines. Real systems curve, twist, and wrap around. The EKF handles all of that by approximating curves with tangent lines at each step.
The standard Kalman filter is beautiful — but it makes a critical assumption: the system must be linear. That means the next state is a matrix times the current state, and measurements are a matrix times the state. In math: x' = Fx and z = Hx.
Real systems almost never obey this. A radar station measures range and bearing to an aircraft — that's a square root and an arctangent, not a matrix multiply. A robot turns by an angle and drives forward — that involves sines and cosines. A satellite re-entering the atmosphere experiences exponential drag. All nonlinear.
The blue Gaussian is the input. Watch how a linear function preserves the bell shape, but a nonlinear function distorts it. Toggle between them.
Examples of nonlinear systems in the wild:
| System | Nonlinearity |
|---|---|
| Radar tracking | Range = sqrt(x² + y²), bearing = atan2(y, x) |
| Robot navigation | Motion involves sin(θ), cos(θ) |
| Spacecraft re-entry | Drag is exponential in altitude |
| Pendulum | Restoring force is sin(θ), not θ |
The key idea of the EKF is simple: if the function is curved, approximate it with a straight line at the current operating point. This is exactly what a first-order Taylor expansion does. Near any point, a smooth curve looks like its tangent line.
For a 1D function f(x), the linearization around point a is: f(x) ≈ f(a) + f'(a)(x - a). The derivative f'(a) is the slope of the tangent line. For multi-variable functions, we use the Jacobian matrix instead of a simple derivative.
Drag the slider to move the linearization point along the curve. The orange line is the tangent. Notice how it matches well nearby but diverges further away.
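The same idea in code: a tiny sketch of the tangent-line approximation f(x) ≈ f(a) + f'(a)(x − a). The function and evaluation points here are chosen purely for illustration — notice the error is small near the linearization point and large far from it.

```python
import math

def linearize(f, df, a):
    """Return the tangent-line approximation of f at point a."""
    return lambda x: f(a) + df(a) * (x - a)

# Linearize sin(x) around a = 0.5
tangent = linearize(math.sin, math.cos, 0.5)

# Near the linearization point the tangent matches well...
near_err = abs(math.sin(0.6) - tangent(0.6))
# ...but it diverges further away.
far_err = abs(math.sin(2.0) - tangent(2.0))
print(near_err, far_err)
```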
When your function takes a vector in and produces a vector out, a single derivative number isn't enough. You need the Jacobian matrix: a matrix where each entry is a partial derivative. Row i, column j contains ∂f_i/∂x_j — how much the i-th output changes when you wiggle the j-th input.
The Jacobian is the multi-dimensional version of "the slope." It tells you the best linear approximation of a vector function near a point. In the EKF, we compute two Jacobians at every step: F_J, the Jacobian of the state-transition function f, and H_J, the Jacobian of the measurement function h.
For f(x) = [r · cos(θ), r · sin(θ)] (polar to Cartesian), explore how the Jacobian changes with θ.
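This Jacobian can be written out by hand and spot-checked numerically — a small sketch (the finite-difference check is our addition, not part of the original example):

```python
import math

def f(r, theta):
    """Polar to Cartesian: f(r, θ) = [r·cos(θ), r·sin(θ)]."""
    return [r * math.cos(theta), r * math.sin(theta)]

def jacobian(r, theta):
    """Analytic Jacobian: rows = outputs (x, y), columns = inputs (r, θ)."""
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta),  r * math.cos(theta)]]

# Sanity check against central finite differences at (r, θ) = (2.0, 0.7)
r, theta, eps = 2.0, 0.7, 1e-6
J = jacobian(r, theta)
for i in range(2):
    d_r = (f(r + eps, theta)[i] - f(r - eps, theta)[i]) / (2 * eps)
    d_t = (f(r, theta + eps)[i] - f(r, theta - eps)[i]) / (2 * eps)
    assert abs(J[i][0] - d_r) < 1e-6 and abs(J[i][1] - d_t) < 1e-6
```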
In the standard KF, the predict step is: x̄ = Fx. The state transition is a matrix multiply. In the EKF, we replace this with the actual nonlinear function f(x). For the covariance, however, we still need a matrix, so we use the Jacobian F_J evaluated at the current estimate.
| | Kalman Filter | Extended Kalman Filter |
|---|---|---|
| State | x̄ = F · x | x̄ = f(x) |
| Covariance | P̄ = F P Fᵀ + Q | P̄ = F_J P F_Jᵀ + Q |
| F meaning | Constant matrix | F_J = ∂f/∂x at current x |
```python
def ekf_predict(x, P, f, F_jacobian, Q):
    """EKF predict step."""
    x_pred = f(x)                 # nonlinear state transition
    F_j = F_jacobian(x)           # Jacobian at current estimate
    P_pred = F_j @ P @ F_j.T + Q  # covariance propagation
    return x_pred, P_pred
```
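As a quick sanity check, here is the predict step applied to a toy pendulum (the pendulum dynamics, time step, and constants are illustrative assumptions, not from the text; the predict function is repeated so the snippet runs standalone):

```python
import numpy as np

dt, g_over_L = 0.1, 9.81  # assumed time step and gravity/length ratio

def f(x):
    """Nonlinear transition: one Euler step of pendulum state [θ, ω]."""
    theta, omega = x
    return np.array([theta + omega * dt,
                     omega - g_over_L * np.sin(theta) * dt])

def F_jacobian(x):
    """∂f/∂x evaluated at the current estimate."""
    theta, _ = x
    return np.array([[1.0, dt],
                     [-g_over_L * np.cos(theta) * dt, 1.0]])

def ekf_predict(x, P, f, F_jacobian, Q):
    """EKF predict step (same as above)."""
    x_pred = f(x)
    F_j = F_jacobian(x)
    return x_pred, F_j @ P @ F_j.T + Q

x0 = np.array([0.3, 0.0])     # small angle, at rest
P0 = np.eye(2) * 0.01
Q = np.eye(2) * 1e-4
x1, P1 = ekf_predict(x0, P0, f, F_jacobian, Q)
print(x1, np.trace(P1))       # gravity pulls ω negative; uncertainty grows
```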
Same story for the update step. In the KF, the expected measurement is z̄ = Hx. In the EKF, we use the nonlinear measurement function h(x) to compute what we expect to see. But the Kalman gain uses the Jacobian H_J to propagate uncertainty.
| | Kalman Filter | Extended Kalman Filter |
|---|---|---|
| Innovation | y = z − H x | y = z − h(x) |
| Innov. cov. | S = H P Hᵀ + R | S = H_J P H_Jᵀ + R |
| Gain | K = P Hᵀ S⁻¹ | K = P H_Jᵀ S⁻¹ |
| H meaning | Constant matrix | H_J = ∂h/∂x at predicted x |
```python
import numpy as np

def ekf_update(x_pred, P_pred, z, h, H_jacobian, R):
    """EKF update step."""
    H_j = H_jacobian(x_pred)                 # Jacobian at predicted state
    y = z - h(x_pred)                        # innovation
    S = H_j @ P_pred @ H_j.T + R             # innovation covariance
    K = P_pred @ H_j.T @ np.linalg.inv(S)    # Kalman gain
    x = x_pred + K @ y                       # updated state
    P = (np.eye(len(x)) - K @ H_j) @ P_pred  # updated covariance
    return x, P
```
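To see the update in action, here is a hypothetical range-only sensor that measures distance from the origin (the sensor, noise values, and state are illustrative choices; the update function is repeated so the snippet runs standalone):

```python
import numpy as np

def ekf_update(x_pred, P_pred, z, h, H_jacobian, R):
    """EKF update step (same as above)."""
    H_j = H_jacobian(x_pred)
    y = z - h(x_pred)
    S = H_j @ P_pred @ H_j.T + R
    K = P_pred @ H_j.T @ np.linalg.inv(S)
    x = x_pred + K @ y
    P = (np.eye(len(x)) - K @ H_j) @ P_pred
    return x, P

def h(x):
    """Measurement: distance from the origin."""
    return np.array([np.hypot(x[0], x[1])])

def H_jacobian(x):
    """∂h/∂x: unit vector pointing from origin to the state."""
    r = np.hypot(x[0], x[1])
    return np.array([[x[0] / r, x[1] / r]])

x_pred = np.array([3.0, 4.0])   # predicted position, range 5
P_pred = np.eye(2) * 0.5
R = np.array([[0.01]])
z = np.array([5.5])             # sensor says we're farther out

x_new, P_new = ekf_update(x_pred, P_pred, z, h, H_jacobian, R)
# The update pushes the estimate outward along the measurement direction
# and shrinks the overall uncertainty.
print(x_new)
```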
Let's work through a concrete example: a robot at position (x, y) with heading θ observes a landmark at (lx, ly). The measurement is range and bearing:

h(x) = [ √((lx − x)² + (ly − y)²), atan2(ly − y, lx − x) − θ ]
To get H_J, we take partial derivatives with respect to x, y, and θ. Let dx = lx − x, dy = ly − y, and q = dx² + dy². Then:

H_J = | −dx/√q   −dy/√q    0 |
      |  dy/q    −dx/q    −1 |
Move the robot (drag the slider) and see how the Jacobian entries change. The landmark is fixed at the orange star.
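The range-bearing measurement and its Jacobian translate directly to code. This sketch spot-checks the partial derivatives against finite differences (the check itself is our addition, chosen at an arbitrary pose):

```python
import math

def h(x, y, theta, lx, ly):
    """Range and bearing from robot pose (x, y, θ) to landmark (lx, ly)."""
    dx, dy = lx - x, ly - y
    return [math.sqrt(dx**2 + dy**2), math.atan2(dy, dx) - theta]

def H_jacobian(x, y, theta, lx, ly):
    """∂h/∂(x, y, θ) with q = dx² + dy²."""
    dx, dy = lx - x, ly - y
    q = dx**2 + dy**2
    sq = math.sqrt(q)
    return [[-dx / sq, -dy / sq,  0.0],
            [ dy / q,  -dx / q,  -1.0]]

# Spot-check with central finite differences at an arbitrary pose
pose, lm, eps = (1.0, 2.0, 0.3), (4.0, 6.0), 1e-6
H = H_jacobian(*pose, *lm)
for j in range(3):
    hi = list(pose); lo = list(pose)
    hi[j] += eps; lo[j] -= eps
    for i in range(2):
        num = (h(*hi, *lm)[i] - h(*lo, *lm)[i]) / (2 * eps)
        assert abs(H[i][j] - num) < 1e-5
```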
For a differential-drive robot with state [x, y, θ] and control [v, ω]:
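One common way to write that motion model is the arc-based unicycle discretization sketched below. This is an assumption on our part — other discretizations (e.g. a simple Euler step) are also used — but it handles both turning and the straight-line limit:

```python
import math

def motion_model(x, y, theta, v, omega, dt):
    """One discrete step of an arc-based unicycle motion model
    (a common choice for differential-drive robots; assumed here)."""
    if abs(omega) < 1e-9:                      # straight-line limit
        return (x + v * dt * math.cos(theta),
                y + v * dt * math.sin(theta),
                theta)
    # Arc motion: turn at rate ω while driving at speed v
    r = v / omega                              # turning radius
    return (x + r * (math.sin(theta + omega * dt) - math.sin(theta)),
            y - r * (math.cos(theta + omega * dt) - math.cos(theta)),
            theta + omega * dt)

# Driving straight for one second moves the robot 1 m along its heading;
# a quarter-circle arc (v = ω = π/2) ends at roughly (1, 1), facing +y.
print(motion_model(0.0, 0.0, 0.0, 1.0, 0.0, 1.0))
print(motion_model(0.0, 0.0, 0.0, math.pi / 2, math.pi / 2, 1.0))
```

The F_J for this model would be its partial derivatives with respect to x, y, and θ, computed just like the measurement Jacobian above.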
This is the showcase. A robot moves in 2D space with 3 known landmarks. It can only measure the bearing (angle) to each landmark — not the distance. The EKF estimates the robot's position and heading from these bearing-only measurements.
Use the buttons to drive the robot. Watch the green uncertainty ellipse shrink as the filter incorporates measurements. The blue lines show bearing measurements to landmarks.
The EKF is an approximation. It assumes the nonlinear function is well-described by its tangent line within the region of uncertainty. When this assumption breaks, the filter can diverge: the estimated covariance shrinks (the filter thinks it's certain), but the actual error grows. The filter becomes overconfident and wrong.
Increase the nonlinearity to see how the tangent-line approximation degrades. The teal region is the true transformed distribution (sampled). The orange ellipse is the EKF's Gaussian approximation.
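The overconfidence failure is easy to reproduce numerically. In this sketch (function and parameters chosen for illustration), the tangent slope at the mean of f(x) = x² is zero, so EKF-style propagation predicts zero output variance, while Monte Carlo sampling shows the true variance is about 2:

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 0.0, 1.0
samples = rng.normal(mu, sigma, 100_000)

f = lambda x: x**2      # strongly curved around the mean
df = lambda x: 2 * x    # tangent slope

# EKF-style propagation: variance scaled by the squared tangent slope
ekf_var = df(mu)**2 * sigma**2   # = 0: the filter thinks it is certain!

# Ground truth by Monte Carlo: Var(x²) = 2σ⁴ = 2 for x ~ N(0, 1)
true_var = np.var(f(samples))

print(ekf_var, true_var)
```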
This is precisely why the Unscented Kalman Filter (UKF) was invented. Instead of linearizing, the UKF passes carefully chosen sample points through the true nonlinear function, then reconstructs a Gaussian from the results. No Jacobians required, and it captures more of the curvature.
| Aspect | EKF | UKF |
|---|---|---|
| Approach | Linearize with Jacobian | Sample with sigma points |
| Accuracy | 1st order (tangent line) | 2nd order (captures curvature) |
| Jacobians needed? | Yes | No |
| When it fails | High curvature, large uncertainty | Extreme nonlinearity, multimodal |
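To make the sigma-point idea concrete, here is a minimal 1D unscented-transform sketch (the α, κ parameters are common defaults we assume, not from the text). For f(x) = x² with x ~ N(0, 1), linearizing at the mean predicts an output mean of f(0) = 0, while the sigma points recover the true mean of 1:

```python
import math

def unscented_transform_1d(mu, var, f, alpha=1.0, kappa=2.0):
    """Minimal 1D unscented transform: propagate sigma points through f
    and reconstruct the mean (assumed standard weighting scheme)."""
    n = 1
    lam = alpha**2 * (n + kappa) - n
    spread = math.sqrt((n + lam) * var)
    points = [mu, mu + spread, mu - spread]
    weights = [lam / (n + lam)] + [1 / (2 * (n + lam))] * 2
    return sum(w * f(p) for w, p in zip(weights, points))

ut_mean = unscented_transform_1d(0.0, 1.0, lambda x: x**2)
print(ut_mean)   # ≈ 1.0: the UT captures the curvature-induced mean shift
```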
The EKF sits at the center of a family of nonlinear estimation algorithms. Understanding it gives you the vocabulary and intuition to explore the rest.
The EKF's model of nonlinear functions is wrong (tangent lines aren't curves). But it's useful enough to navigate robots, track missiles, and land spacecraft. You now understand why — and when it breaks.