Every calculation a robotics engineer needs to derive from scratch. Rotation matrices, homogeneous transforms, quaternions, EKF prediction and update, SLAM graphs, forward and inverse kinematics, Jacobians, PID control — all solvable in-browser with instant feedback.
You're programming a robot arm and need to figure out where the tip of the arm ends up after a joint rotates. Everything starts with this: how do you rotate a point in 2D?
A 2D rotation matrix R(θ) rotates a point counter-clockwise by angle θ around the origin:
Rotate the point (3, 0) counter-clockwise by 90°. What is the y-coordinate of the result?
R(90°) = [[cos90, -sin90], [sin90, cos90]] = [[0, -1], [1, 0]]. Multiply R · [3, 0]T.
The point (3, 0) on the positive x-axis moves to (0, 3) on the positive y-axis — exactly a quarter turn counter-clockwise.
Compute the (1,1) entry (top-left) of R(30°) × R(60°). Express as a decimal.
Use R(α)R(β) = R(α+β). The (1,1) entry of a rotation matrix is cos of the total angle.
Rotation composition is just angle addition. This is why rotations in 2D are so simple — they commute (R(a)R(b) = R(b)R(a)). In 3D, rotations do not commute, which is what makes 3D orientation so much harder.
Because R is orthogonal, R-1 = RT. The transpose of R(θ) is R(-θ) — rotating by the negative angle. No need for expensive matrix inversion. This property holds for ALL rotation matrices in ANY dimension.
Rotate the point (1, 1) counter-clockwise by 45°. What is the x-coordinate of the result? (cos45° = sin45° = √2/2 ≈ 0.7071)
The point (1,1) is at 45° from the x-axis at distance √2. Rotating it 45° more places it at 90° from the x-axis — directly on the y-axis at (0, √2). The distance from the origin is preserved: √(0² + (√2)²) = √2 = √(1² + 1²).
Write a function that rotates a 2D point by an angle (in radians). Return [x', y'].
javascript function rotate2D(x, y, theta) { const c = Math.cos(theta); const s = Math.sin(theta); return [c * x - s * y, s * x + c * y]; }
What is the determinant of R(37°)? (No calculator needed.)
This is true for ALL rotation matrices, regardless of angle. det = 1 means the rotation preserves area and orientation (no reflection). This is the defining property of SO(2) — Special Orthogonal group in 2D.
A robot joint rotates AND translates. Rotation matrices alone can't represent translation. How do you combine both into a single matrix multiply? The answer is homogeneous coordinates.
A robot is at position (3, 4) with no rotation (θ=0). Transform the point (1, 0) from robot frame to world frame. What is the x-coordinate in world frame?
With zero rotation, the transform is just a translation: add the robot position to the point. (1,0) in robot frame becomes (4,4) in world frame.
Robot at position (2, 1), heading θ=90°. A point at (1, 0) in the robot's frame. What is the y-coordinate in world frame?
Build T with R(90°) and t=(2,1). Multiply T · [1, 0, 1]T.
The robot faces 90° (along the y-axis). A point 1 unit "ahead" in robot frame (robot's x-axis) is actually 1 unit along the world's y-axis. So world position = (2, 1+1) = (2, 2).
A robot arm has: Tworld_base (translate by (0, 1), no rotation) and Tbase_tool (translate by (2, 0), rotate 90°). A point at (1, 0) in the tool frame. What is the x-coordinate in the world frame?
Compute Tworld_base · Tbase_tool · [1, 0, 1]T. Work inside out: first transform from tool to base, then base to world.
xworld = 2. The chain works inside out: point (1,0) in tool frame becomes (2,1) in base frame (the tool frame is rotated 90° and offset by 2), then (2,2) in world frame (base is offset by 1 in y).
The inverse of a homogeneous transform is NOT just the transpose (that only works for pure rotation matrices). The inverse is T-1 = [[RT, -RTt], [0, 1]]. You rotate the translation vector by RT and negate it. This is computationally cheap — no general matrix inversion needed.
A self-driving car sees a pedestrian. Arrange the steps to convert the pedestrian's position from LiDAR frame to GPS (world) frame.
The pipeline reads left to right but the math reads right to left: pworld = Tworld_car · Tcar_lidar · plidar. Start with the point, apply the LiDAR-to-car transform, then the car-to-world transform.
In 2D, rotations were simple — one angle, one axis. In 3D, everything gets harder. There are multiple representations (Euler angles, rotation matrices, quaternions, axis-angle), each with tradeoffs. And there's a trap called gimbal lock.
A drone rotates 90° around the z-axis (yaw). What does Rz(90°) · [1, 0, 0]T give? Enter the y-component.
The x-axis unit vector rotates to the y-axis unit vector. A 90° yaw maps "forward" to "left" (in a right-handed frame). The z-component stays 0 because Rz doesn't affect z.
At β = 90°, cosβ = 0. Plug into Rz(γ)Ry(90°)Rx(α) and you get a matrix that only depends on γ+α, not on γ and α independently. Two parameters collapsed into one — you lost a degree of freedom. This is gimbal lock. The fix: use quaternions, which have no singular configurations.
Convert the rotation "90° about the z-axis" to a unit quaternion q = [w, x, y, z]. What is the w component?
Axis = [0, 0, 1], angle = 90° = π/2. Use q = [cos(θ/2), sin(θ/2)·n].
Notice the half-angle: 90° rotation uses 45° in the quaternion. This is because quaternions double-cover SO(3) — both q and -q represent the same rotation. The norm is ||q|| = √(0.5 + 0 + 0 + 0.5) = 1, confirming it's a valid unit quaternion.
Convert "180° about the x-axis" to a quaternion. What is the z component?
z = 0 because the rotation axis is [1,0,0], which has no z component. A 180° rotation has w=0 (cos(90°)=0), meaning the quaternion is purely imaginary. This represents the "most different" orientation from identity (where q = [1,0,0,0]).
Is q = [0.5, 0.5, 0.5, 0.5] a valid unit quaternion? Compute ||q||.
Yes, it is a valid unit quaternion. It represents a 120° rotation about the axis [1,1,1]/√3. You can verify: w = cos(θ/2) = 0.5 implies θ/2 = 60°, so θ = 120°. The axis is [0.5, 0.5, 0.5]/sin(60°) = [0.5, 0.5, 0.5]/0.866 = [0.577, 0.577, 0.577] ≈ [1,1,1]/√3.
Your robot drives forward and turns. You know the velocity and angular rate from the wheels. But the model is nonlinear — position update involves cos(θ) and sin(θ). The Extended Kalman Filter (EKF) handles this by linearizing around the current estimate.
Robot at x=[0, 0, 0] (origin, facing east). Drives with v=1 m/s, ω=0 rad/s, dt=1s. What is the new x-coordinate?
Facing east (θ=0), cos(0)=1, sin(0)=0. The robot drives 1 meter along the x-axis. Simple — but the math gets interesting when θ ≠ 0.
Robot at x=[0, 0, π/4] (facing 45°). Drives with v=2 m/s, ω=0 rad/s, dt=0.5s. What is the new y-coordinate?
At 45°, the robot moves equally in x and y. It travels a distance of v×dt = 1 meter, split as (cos45°, sin45°) = (0.707, 0.707).
Robot at θ=π/2 (facing north), v=1 m/s, dt=0.5s. What is the (1,3) entry of the Jacobian F? (Row 1, column 3.)
F(1,3) = -v · sin(θ) · dt.
This entry means: "if heading changes by a small δθ, the x-position prediction changes by -0.5 × δθ." When facing north (θ=π/2), a small increase in heading rotates the velocity vector slightly to the left, reducing x — hence the negative sign.
Given P = [[1,0,0],[0,1,0],[0,0,0.1]], Q = [[0.01,0,0],[0,0.01,0],[0,0,0.001]], and θ=0, v=1, dt=1. Compute P- = FPFT + Q. What is the (1,1) entry of P-?
At θ=0: F = [[1,0,0],[0,1,1],[0,0,1]] (since -v·sin(0)·dt=0 and v·cos(0)·dt=1).
P-(1,1) = 1 + 0.01 = 1.01. The x uncertainty barely grew because at θ=0, x-position doesn't depend on θ (F(1,3)=0). But look at P-(2,2) = 1.11: the y uncertainty grew more because F(2,3)=1, meaning y-prediction couples with θ-uncertainty.
This EKF prediction has a bug. The robot always drifts right even when driving straight north. Click the buggy line.
function ekfPredict(x, y, theta, v, omega, dt) { const xNew = x + v * Math.cos(theta) * dt; const yNew = y + v * Math.cos(theta) * dt; const thetaNew = theta + omega * dt; return [xNew, yNew, thetaNew]; }
Line 3 is the bug. The y update uses Math.cos(theta) instead of Math.sin(theta). When θ=π/2 (facing north), cos(π/2)=0 so y doesn't change — the robot "drifts right" because x increases (cos is nonzero at other angles) while y stays stuck. The correct line: const yNew = y + v * Math.sin(theta) * dt;
The motion model is x' = x + Δx, y' = y + Δy, θ' = θ + Δθ. Taking ∂x'/∂x = 1 because x appears directly in x'. The Jacobian has the form F = I + (off-diagonal terms from the nonlinear corrections). This is the hallmark of an additive motion model.
The predict step pushed our estimate forward in time. Now a sensor reading arrives — say, the range and bearing to a known landmark. The update step fuses this measurement with the prediction.
Robot at (0, 0, 0), landmark at (3, 4). What is the predicted range h1(x)?
The classic 3-4-5 right triangle. The predicted range is simply the Euclidean distance from robot to landmark.
Same setup: robot at (0, 0, 0), landmark at (3, 4). What is the predicted bearing in degrees?
φ = atan2(ly-y, lx-x) - θ = atan2(4, 3) - 0.
The landmark is at 53.13° from the robot's heading direction. Since the robot faces east (θ=0), this is the same as the absolute angle to the landmark.
Robot at (0,0,0), landmark at (3,4). Compute H(1,1), the partial derivative of range with respect to x.
H(1,1) = -dx/√q where dx=3, q=25.
This is -cos(φ) where φ is the bearing angle. The negative sign means: if the robot moves in +x direction, the range to the landmark decreases (because it's moving toward the landmark's x-component). The magnitude 0.6 = cos(53.13°) reflects that the landmark isn't directly ahead.
Predicted measurement: h(x) = [5, 0.9273]. Actual measurement: z = [4.8, 0.95]. Compute the range innovation (first component of y = z - h(x)).
The measured range is 0.2m shorter than predicted. This means the landmark is closer than we thought, so the update step will pull the robot estimate toward the landmark. The innovation is the "surprise" — what the sensor told us that we didn't already know.
Implement a simplified 1D EKF update. Given scalar state x, prediction covariance P, measurement z, measurement function h(x)=x, and measurement noise R, compute the updated state.
javascript function ekfUpdate(x, P, z, R) { const K = P / (P + R); // Kalman gain const xNew = x + K * (z - x); // updated state const PNew = (1 - K) * P; // updated covariance return [xNew, PNew]; }
K = P/(P+R). As R → 0, K → P/P = 1. The updated state becomes x+ = x- + 1 × (z - x-) = z. A perfect sensor overrides the prediction entirely. Conversely, as R → ∞, K → 0 and the measurement is ignored. The Kalman gain is exactly the trust slider between prediction and measurement.
SLAM (Simultaneous Localization and Mapping) is harder than EKF alone. You don't just filter — you build a graph of all past poses and optimize them jointly. Each node is a robot pose. Each edge is a constraint (odometry measurement or loop closure). The solution minimizes total error across all constraints.
A 2D pose graph has 100 robot poses, each with state [x, y, θ]. How many total optimization variables?
In 3D SLAM, each pose has 6 DOF (x, y, z, roll, pitch, yaw), so 100 poses = 600 variables. If landmarks are also optimized (as in full SLAM), add 2 variables per 2D landmark or 3 per 3D landmark.
Pose x1 = [0, 0, 0]. Odometry says we moved [1, 0, 0] (1m forward). Actual pose x2 = [1.1, 0.05, 0.02]. What is the x-component of the error e = z12 - (x2 - x1)?
The error is negative: the actual displacement was 0.1m more than the odometry predicted. The optimizer will try to reduce this error by adjusting x1 and/or x2. Weighted by Ω, larger errors in more-certain measurements get penalized more heavily.
An odometry sensor has covariance Σ = [[0.01, 0, 0], [0, 0.01, 0], [0, 0, 0.001]]. What is the (1,1) entry of the information matrix Ω = Σ-1?
Higher information = more certainty. The x-position has Ω=100, while θ has Ω=1/0.001=1000. The angular measurement is 10x more informative, so angular errors are penalized 10x more heavily in the optimization.
The graph optimizer adjusts ALL poses simultaneously. A loop closure reveals that drift has accumulated over 50 steps. Rather than slamming x50 onto x1 (which would violate all the odometry constraints), the optimizer spreads the correction across all intermediate poses. Each pose shifts a little, and the total cost is minimized. This is the power of graph-based SLAM over filtering.
A 3D SLAM problem has 1000 poses (6 DOF each) and 500 landmarks (3 DOF each). The Gauss-Newton step solves HΔx = b where H is a square matrix. What is the dimension of H?
That's 56 million entries. But H is extremely sparse — each edge only connects two nodes, so each row has at most 6+6=12 nonzero entries. Sparse linear solvers like CHOLMOD exploit this sparsity to solve in milliseconds, not hours.
A robotic arm with two links needs to reach a target. Forward kinematics (FK) tells you: given joint angles, where is the end effector? Inverse kinematics (IK) asks the reverse: given a target position, what joint angles do we need?
L1=L2=1, θ1=0, θ2=0 (arm fully extended along x-axis). What is the x-coordinate of the end effector?
Both links point along the x-axis, so the end effector is at (2, 0). This is the maximum reach: L1 + L2 = 2.
L1=L2=1, θ1=30°, θ2=45°. What is the x-coordinate of the end effector? (Give 4 decimal places.)
cos(30°)=0.8660, sin(30°)=0.5, cos(75°)=0.2588, sin(75°)=0.9659.
The first link points at 30°, and the second link at 30°+45°=75° (angles are cumulative). The end effector is at (1.125, 1.466), about 1.85m from the origin — less than the max reach of 2m because the arm is bent.
L1=L2=1, target at (1, 1). Compute cos(θ2). What is θ2 in degrees?
The elbow bends at exactly 90° to reach (1, 1). This makes geometric sense: the target is at distance √2 from the origin, and two unit-length links forming a right angle span exactly √(1²+1²) = √2.
(3, 1) is at distance √(9+1) = √10 ≈ 3.16, which exceeds the maximum reach L1+L2 = 3. Even fully extended, the arm can only reach 3 meters. The workspace is an annulus (donut) with inner radius |L1-L2|=1 and outer radius L1+L2=3.
Write a function that computes the end-effector position of a 2-link planar arm. Angles are in radians.
javascript function forwardKinematics2Link(L1, L2, theta1, theta2) { const x = L1 * Math.cos(theta1) + L2 * Math.cos(theta1 + theta2); const y = L1 * Math.sin(theta1) + L2 * Math.sin(theta1 + theta2); return [x, y]; }
You can compute positions with FK and IK. But what about velocities? If a joint rotates at 1 rad/s, how fast does the end effector move? The Jacobian matrix maps joint velocities to end-effector velocities.
L1=L2=1, θ1=0, θ2=90°. Compute J(1,1) — top-left entry of the Jacobian.
J(1,1) = -L1sin(θ1) - L2sin(θ1+θ2). sin(0)=0, sin(90°)=1.
Rotating joint 1 moves the end effector in the (-x, +y) direction. Rotating joint 2 moves it in the -x direction only. The determinant is (-1)(0) - (-1)(1) = 1 ≠ 0, so we're not at a singularity.
Using J = [[-1, -1], [1, 0]] from the previous exercise. Joint velocities: θ̇1=1 rad/s, θ̇2=0. What is the x-velocity of the end effector?
Joint 1 rotating at 1 rad/s swings the end effector at (-1, 1) m/s. The speed is √(1+1) = √2 ≈ 1.41 m/s. The end effector moves along a circle, and the Jacobian gives the instantaneous linear velocity of that circular motion.
Compute det(J) for L1=L2=1, θ2=0 (fully extended).
Zero determinant means the Jacobian is singular — it cannot be inverted. Physically, the fully extended arm can only move perpendicular to itself (sideways). It cannot move radially (extend or contract) because both joints produce the same motion direction. This is the kinematic equivalent of two equations being linearly dependent.
The manipulability measure is w = |det(J)| = |L1L2sin(θ2)|. For L1=L2=1, at what θ2 (in degrees) is manipulability maximized?
Maximum manipulability at θ2=90° means the arm can move equally well in all directions. This is the "sweet spot" for control — the arm is most agile with a right-angle elbow bend. Robot path planners try to keep joints away from singularities (θ2=0 or 180°) to maintain good manipulability.
This Jacobian computation produces wrong velocities. Click the buggy line.
function jacobian2Link(L1, L2, t1, t2) { const s1 = Math.sin(t1), c1 = Math.cos(t1); const s12 = Math.sin(t1+t2), c12 = Math.cos(t1+t2); const J11 = -L1*s1 - L2*s12; const J12 = -L2*s12; const J21 = L1*c1 + L2*c12; const J22 = L2*s12; return [[J11,J12],[J21,J22]]; }
Line 7 is the bug. J(2,2) = ∂y/∂θ2 = L2·cos(θ1+θ2) = L2*c12, but the code uses L2*s12 (sine instead of cosine). This would give wrong end-effector velocities whenever joint 2 is moving. The correct line: const J22 = L2*c12;
You've solved kinematics — you know WHERE to go. Now: how do you actually GET there? A PID controller computes the motor command based on the error between desired and actual position. It's the workhorse of industrial control: simple, tunable, and good enough for 90% of robotic tasks.
Target = 10, actual = 7, Kp = 2, Ki = 0, Kd = 0. What is the control output u?
Pure proportional control: the output is simply the gain times the error. As the system approaches the target, the error shrinks and the correction weakens. This is why P-only control has steady-state error — the correction vanishes before reaching zero error.
Kp=1, Ki=0.1, Kd=0.5, dt=1. Errors over 3 steps: e[0]=5, e[1]=3, e[2]=1. Assume e[-1]=0 (initial). What is u[2]?
u[k] = Kp·e[k] + Ki·(sum of all e×dt) + Kd·(e[k]-e[k-1])/dt.
The P term says "error is 1, push with force 1." The I term says "errors have been accumulating (total 9), push a little extra." The D term says "error is decreasing (from 3 to 1), back off to avoid overshoot." The net command 0.9 is a gentle push — the system is converging nicely.
Rapid oscillation around the target is the hallmark of excessive Kp. The controller overcorrects: it pushes too hard, overshoots, sees a negative error, pushes back too hard, overshoots again. Ki issues cause slow drift/overshoot (not rapid oscillation). Kd issues cause jitter from noise, not sustained oscillation. Fix: reduce Kp and/or add Kd for damping.
Implement a single PID step. Takes current error, previous error, accumulated integral, gains, and dt. Returns [u, newIntegral].
javascript function pidStep(e, ePrev, integral, Kp, Ki, Kd, dt) { const newIntegral = integral + e * dt; const P = Kp * e; const I = Ki * newIntegral; const D = Kd * (e - ePrev) / dt; return [P + I + D, newIntegral]; }
Steady-state error occurs when the proportional correction exactly balances the disturbance (gravity). The system is in equilibrium: Kp×e = gravity force. The error e is nonzero because that's what generates the force to hold position. Adding Ki breaks this equilibrium: the integral accumulates over time, continuously increasing the control effort until e=0. Only then does the integral stop growing. This is why the I term exists — it's the only PID component that guarantees zero steady-state error.
Error decreasing: e[k]=2, e[k-1]=5, dt=0.1. Kd=1. What is the D term?
The D term is negative because the error is decreasing rapidly. It acts as a brake — opposing the direction of movement to prevent overshoot. The magnitude is large (-30) because the error changed by 3 in just 0.1 seconds, so the derivative (rate of change) is -30/s. This is why Kd must be tuned carefully: too much and the system sluggishly approaches the target.
A mobile robot with wheel odometry and a LiDAR that detects known landmarks. Everything from Chapters 0-8 comes together: transforms to convert between frames, the EKF to fuse odometry with landmark measurements, and PID to execute the path. Let's work through a complete navigation cycle.
Design the EKF pipeline for a robot with wheel odometry and LiDAR landmark detection. Arrange the steps in the correct order.
The EKF loop: (1) Predict the new pose using the motion model, (2) Detect a landmark via LiDAR, (3) Innovate — compare predicted vs actual measurement, (4) Update the state estimate using the Kalman gain, (5) Control — compute the next motor command based on the updated pose estimate.
Robot at x=[5, 3, π/6] (30° heading). Drives v=1 m/s, ω=0.1 rad/s for dt=1s. What is the new x-coordinate?
x' = x + v·cos(θ)·dt. cos(30°) = √3/2 ≈ 0.8660.
New state: [5.866, 3.5, 0.624]. The robot moved mostly in x (because cos30° > sin30°) and turned slightly leftward (heading increased by 0.1 rad ≈ 5.7°).
From the predicted pose [5.866, 3.5, 0.624], the robot observes a landmark at world position (10, 7). What is the predicted range?
The landmark is about 5.4 meters away. If the LiDAR measures 5.3m instead, the innovation is -0.12m — suggesting the robot is slightly closer to the landmark than predicted. The EKF update will pull the pose estimate toward the landmark.
Simplified 1D: predicted x=5.866, P=0.5. LiDAR range suggests true x should be 5.7 (after inverting the measurement model). R=0.2. What is the updated x?
This is just the scalar Kalman update: K = P/(P+R), x+ = x + K(z-x).
K = 0.71, meaning we trust the measurement about 71% (because R=0.2 is smaller than P=0.5). The updated position shifts 71% of the way from the prediction toward the measurement. The updated P+ = (1-0.71) × 0.5 = 0.143, much smaller than both P and R — fusion always improves certainty.
From the previous exercise: P=0.5, K=0.7143. What is the updated covariance P+ = (1-K)P?
The covariance dropped from 0.5 to 0.143 — the uncertainty decreased by 71%. This is the fundamental guarantee of the Kalman filter: fusing two uncertain estimates always produces a better estimate than either one alone. The updated covariance is always smaller than both the prediction covariance P and the measurement noise R.
Updated pose: x=5.747, target waypoint: x=10. Kp=0.5 for the velocity controller (u = Kp × position_error). What velocity command does the PID produce?
The P controller commands 2.1 m/s. In practice, you'd clamp this to the robot's maximum speed. As the robot approaches the waypoint, the error shrinks and the velocity naturally decreases — this is the beauty of proportional control for waypoint following. A real system would add Kd to avoid overshoot at the waypoint.
For the capstone prediction (θ=π/6, v=1, dt=1), compute the (2,3) entry of the Jacobian F. F(2,3) = v·cos(θ)·dt.
F(2,3)=0.866 says: a small change in heading increases the y-velocity component. At θ=30°, the robot is mostly moving in x; a small heading increase rotates velocity toward y. F(1,3)=-0.5 = -sin(30°) says: that same heading change reduces x-velocity, because rotating away from the x-axis.
| Topic | Lesson |
|---|---|
| Kalman filtering | Kalman Filter — From Absolute Zero |
| Extended Kalman filter | EKF — From Absolute Zero |
| Classical SLAM | Classical SLAM — From Absolute Zero |
| Classical VIO | Classical VIO — From Absolute Zero |
| MDPs & planning | MDP — From Absolute Zero |