Classical Robotics — Robot Learning: A Tutorial, Chapter 2

Chapter 0: The Planar Manipulator

Before we can understand why robot learning is revolutionary, we need to understand what it replaces. Every robotics course starts with the same object: a planar manipulator — a robot arm that moves in a flat 2D plane.

The SO-100 arm that LeRobot supports has 6 degrees of freedom (DOF). That's a lot of complexity. So we'll strip it down to the simplest possible case: a 2-DOF arm with two links of equal length, connected by two revolute joints. Think of it as your upper arm and forearm, but operating on a flat table.

The arm lives in two spaces simultaneously. In joint space, we describe its configuration as two angles: θ₁ (the shoulder angle) and θ₂ (the elbow angle). In task space (also called Cartesian space), we describe the position of the end-effector as (x, y) coordinates.

The central problem of robotics: You know where you want the hand to go (task space), but you can only command the joints (joint space). Converting between these two spaces is the entire business of kinematics.

The simulation below shows the 2-DOF arm. Drag the target (the orange circle) and watch the arm try to reach it by solving for the right joint angles. This is inverse kinematics — the hard direction. The easy direction, computing (x, y) from (θ₁, θ₂), is called forward kinematics.

Interactive 2-DOF Arm

Drag the orange target. The arm solves inverse kinematics to reach it. Gray zone = unreachable (beyond arm's total length). Link length l = 1.0 each.

θ₁ = 0.00 θ₂ = 0.00 p = (0.00, 0.00)

The arm has two joints. In joint space, its configuration is described by:

The (x, y) position of the end-effector Two angles (θ₁, θ₂) — the shoulder and elbow joint angles The forces applied at each motor

Chapter 1: Forward Kinematics

Forward kinematics is the easy direction: given the joint angles (θ₁, θ₂) and link lengths (l₁, l₂), compute where the end-effector is in (x, y) coordinates.

The first link starts at the origin and extends at angle θ₁. Its tip is at (l₁ cos θ₁, l₁ sin θ₁). The second link starts at that tip and extends at angle θ₁ + θ₂ (because θ₂ is measured relative to the first link). So the end-effector position is:

p(q) =
x = l₁ cos(θ₁) + l₂ cos(θ₁ + θ₂)
y = l₁ sin(θ₁) + l₂ sin(θ₁ + θ₂)

Notice how the second link's contribution depends on both angles. This is the first hint that joint space and task space are non-linearly related — a fact that will matter enormously when we try to go in the reverse direction.

Hand calculation: Let θ₁ = π/4, θ₂ = π/3, l = 1 for both links.

The second link angle is θ₁ + θ₂ = π/4 + π/3 = 7π/12 ≈ 1.833 rad.

x = cos(π/4) + cos(7π/12) = 0.707 + (-0.259) = 0.448
y = sin(π/4) + sin(7π/12) = 0.707 + 0.966 = 1.673

The end-effector is at (0.448, 1.673).

Let's verify this geometrically. The arm reaches up and to the right (θ₁ = 45°), then bends further counterclockwise by 60°. The x-coordinate is small because the second link points mostly upward. The y-coordinate is large because both links contribute positive y-components.

Forward Kinematics Explorer

Adjust the joint angles with the sliders. Watch how the end-effector position changes. The hand-calculated example (θ₁=45°, θ₂=60°) is pre-loaded.

θ₁ 45°

θ₂ 60°

p = (0.448, 1.673)

For a 2-DOF arm with l=1, θ₁=0, θ₂=0 (both links pointing right), what is the end-effector position?

(2, 0) — both links extend along the x-axis, total reach = l₁ + l₂ = 2 (1, 1) — each link contributes 1 in a different direction (0, 2) — both links point straight up

Chapter 2: Inverse Kinematics

Inverse kinematics (IK) is the hard direction: given a desired end-effector position p* = (x*, y*), find joint angles (θ₁, θ₂) that reach it. We're solving:

min_q ||p(q) − p*||²

For our 2-DOF arm with equal link lengths, there's an analytical solution using the law of cosines. The distance from origin to target is d = √(x*² + y*²). Then:

cos(θ₂) = (d² − l₁² − l₂²) / (2 l₁ l₂)

θ₂ = ± arccos( cos(θ₂) )

θ₁ = atan2(y*, x*) − atan2(l₂ sin(θ₂), l₁ + l₂ cos(θ₂))

Notice the ± in the θ₂ equation. This means most reachable points have two valid solutions — an "elbow-up" and an "elbow-down" configuration. The robot must choose one. This ambiguity is tame for 2 DOF but becomes a nightmare for 6-DOF arms, where infinitely many joint configurations can reach the same point.

Key insight: Analytical IK exists only for simple geometries. For a general 6-DOF arm with joint limits and obstacles, there is no closed-form solution. We must use iterative numerical optimization — which is expensive, may converge to local minima, and must be re-solved at every timestep. This is the first crack in the classical pipeline.

The simulation below shows the configuration space (C-space) — the (θ₁, θ₂) plane. Each point is a joint configuration. The green region is where the arm reaches the target. The gray region is where it doesn't. With obstacles, forbidden regions appear as holes in C-space.

Configuration Space

The left panel shows the arm. The right panel shows C-space (θ₁ vs θ₂). The orange dot marks the current configuration. Drag the target in the arm view to see the solution point move in C-space. Both elbow-up and elbow-down solutions are shown.

Drag target in left panel

Why does a 2-DOF planar arm have two IK solutions for most reachable target positions?

Because the arm has two links, and each link has its own solution Because θ₂ can be positive or negative (elbow-up or elbow-down), and both reach the same point Because numerical optimization finds two local minima

Chapter 3: The Jacobian

Forward kinematics gives us position from angles: p = p(q). But what about velocities? If we wiggle the joints at rates q̇ = (θ̇₁, θ̇₂), how fast does the end-effector move?

The answer is the Jacobian matrix J(q), the matrix of partial derivatives of p with respect to q. It maps joint velocities to end-effector velocities:

ṗ = J(q) · q̇

For our 2-DOF arm, we differentiate the forward kinematics equations:

J(q) =
[ −l₁sin(θ₁) − l₂sin(θ₁+θ₂) , −l₂sin(θ₁+θ₂) ]
[ l₁cos(θ₁) + l₂cos(θ₁+θ₂) , l₂cos(θ₁+θ₂) ]

This is a 2×2 matrix (2 task-space dimensions, 2 joint-space dimensions). The first column tells you what happens to (ẋ, ẏ) when you rotate joint 1. The second column tells you what happens when you rotate joint 2.

Hand calculation at θ₁ = π/4, θ₂ = π/3, l = 1:

θ₁ + θ₂ = 7π/12

J₁₁ = −sin(π/4) − sin(7π/12) = −0.707 − 0.966 = −1.673
J₁₂ = −sin(7π/12) = −0.966
J₂₁ = cos(π/4) + cos(7π/12) = 0.707 + (−0.259) = 0.448
J₂₂ = cos(7π/12) = −0.259

So J = [[-1.673, -0.966], [0.448, -0.259]]

The Jacobian changes with configuration — it depends on q. At some configurations, the Jacobian becomes singular (its determinant is zero). This means the arm loses a degree of freedom: no matter how you move the joints, the end-effector can't move in certain directions. These configurations are called singularities, and they're the bane of classical robotics.

When is J singular? For our 2-DOF arm, det(J) = l₁ l₂ sin(θ₂). The determinant is zero when θ₂ = 0 or θ₂ = π — when the arm is fully extended or fully folded. At these configurations, the end-effector can only move along the arm's axis, not perpendicular to it.

Jacobian Velocity Mapping

Set joint velocities with the sliders. The Jacobian maps them to end-effector velocity (shown as an arrow). Watch the arrow shrink near singularities (θ₂ ≈ 0° or 180°).

θ₁ 45°

θ₂ 60°

θ̇₁ 1.0

θ̇₂ 0.0

det(J) = 0.866

At what joint configuration does the 2-DOF arm's Jacobian become singular?

When θ₁ = 0 When both joints are at 90° When θ₂ = 0 or θ₂ = π — the arm is fully extended or folded, so det(J) = l₁l₂sin(θ₂) = 0

Chapter 4: Differential IK

Full inverse kinematics solves for absolute angles. Differential IK takes a shortcut: instead of asking "what angles reach the target?", it asks "which way should I move the joints right now to get closer?"

We want the end-effector to move with velocity ṗ*. We know ṗ = J(q) q̇. So we need:

q̇ = J(q)⁻¹ ṗ*

But J might not be square (more joints than task dimensions), or it might be singular. The robust solution uses the pseudoinverse J⁺ = J^T(J J^T)⁻¹:

q̇ = J(q)⁺ · ṗ*

This gives the joint velocity vector with minimum norm that achieves the desired end-effector velocity. For our 2×2 case, the pseudoinverse equals the regular inverse (when it exists), but for redundant arms (more joints than needed), the pseudoinverse elegantly handles the extra degrees of freedom.

Why differential IK beats full IK: Full IK is a global optimization problem — it must search over all possible joint configurations. Differential IK is a local linear solve — one matrix multiply per timestep, O(n²) for n joints. The tradeoff: differential IK only gives a direction to move, not the final answer. You must iterate, re-computing J at each step.

The algorithm is simple: at each timestep, compute the error Δp = p* − p(q), multiply by the pseudoinverse to get a joint velocity, take a small step, and repeat.

1. Compute error

Δp = p* − p(q)

↓

2. Compute Jacobian

J = J(q) at current config

↓

3. Pseudoinverse step

q̇ = J⁺ · Δp

↓

4. Update joints

q ← q + α · q̇

↻ repeat until ||Δp|| < ε

python
import numpy as np

def differential_ik(target, q, l1, l2, alpha=0.1, tol=1e-3, max_iter=200):
    """Solve IK via Jacobian pseudoinverse iteration."""
    for _ in range(max_iter):
        # Forward kinematics
        p = np.array([
            l1 * np.cos(q[0]) + l2 * np.cos(q[0] + q[1]),
            l1 * np.sin(q[0]) + l2 * np.sin(q[0] + q[1])
        ])
        # Error
        dp = target - p
        if np.linalg.norm(dp) < tol:
            return q
        # Jacobian
        s1, c1 = np.sin(q[0]), np.cos(q[0])
        s12, c12 = np.sin(q[0]+q[1]), np.cos(q[0]+q[1])
        J = np.array([
            [-l1*s1 - l2*s12, -l2*s12],
            [ l1*c1 + l2*c12,  l2*c12]
        ])
        # Pseudoinverse step
        J_pinv = np.linalg.pinv(J)
        q = q + alpha * J_pinv @ dp
    return q

What advantage does differential IK have over full inverse kinematics?

It's a local linear solve (one matrix multiply per step, O(n²)) rather than a global nonlinear optimization It always finds the globally optimal joint configuration It doesn't require computing the Jacobian

Chapter 5: Feedback Control

Differential IK tells us which direction to move the joints. But in the real world, things go wrong: the motor overshoots, the table vibrates, the gripper slips. Without feedback, these errors accumulate and the arm drifts off course.

Feedback control adds a correction term based on the current error. The simplest version is proportional (P) control:

q̇ = J(q)⁺ · (ṗ* + k_p · Δp)

Here k_p is the proportional gain. When Δp is large (the arm is far from the target), the correction is strong. When Δp is small, the correction fades. This prevents overshoot while ensuring convergence.

Think of k_p as a spring: Imagine a rubber band connecting the end-effector to the target. k_p is the stiffness of that rubber band. Too low and the arm drifts lazily toward the target. Too high and the arm oscillates wildly around it. The sweet spot depends on the system's dynamics — mass, friction, motor bandwidth.

In practice, roboticists use PID control (proportional + integral + derivative) or more sophisticated model-predictive controllers. But the principle is the same: measure the error, correct for it, and repeat at high frequency (typically 100-1000 Hz).

Feedback Control Simulator

Drag the target. Adjust k_p to see how feedback gain affects trajectory tracking. Low gain: slow convergence. High gain: oscillation. The trail shows the end-effector path.

k_p 2.0

Drag target, then press Play

What happens if the proportional gain k_p is set too high?

The arm moves too slowly toward the target The arm oscillates around the target because corrections overshoot The Jacobian becomes singular

Chapter 6: Obstacles and Constraints

So far our arm operates in an empty plane. Real robots share space with objects, humans, and other robots. The arm must reach its target while avoiding collisions — and this is where the classical pipeline starts to crack.

With a static obstacle, the problem is manageable. We can precompute the obstacle region in configuration space — the set of (θ₁, θ₂) configurations where the arm intersects the obstacle — and plan a path that avoids it.

But what about a moving obstacle? Imagine a human reaches across the table while the arm is executing a trajectory. The classical pipeline must:

1. Detect

Perceive the obstacle in real-time

↓

2. Model

Update the C-space obstacle region

↓

3. Replan

Find a new collision-free path

↓

4. Execute

Track the new trajectory with feedback control

↻ repeat at high frequency

Each step must complete within a single control cycle (~10ms). Detection requires real-time computer vision. Modeling requires updating a complex geometric representation. Replanning requires solving a constrained optimization problem. And any single failure in this chain means the arm either freezes (safe but useless) or collides (dangerous).

The moving obstacle problem: This is Figure 7 from Capuano et al. A human hand enters the workspace unpredictably. The classical controller must continuously re-solve IK with new constraints. Each re-solve takes time, and during that time the human hand keeps moving. The system is always reacting to where the obstacle was, not where it is.

Joint limits add another layer of complexity. Real motors can't rotate 360° — they have physical stops. The IK solver must find solutions within the feasible range, which may not exist even when the target is geometrically reachable. And joint velocity limits mean the arm can't just jump to any configuration instantly.

Obstacle Avoidance

The arm must reach the teal target while avoiding the red obstacle. Toggle the obstacle on/off to see how it constrains the solution. Drag the obstacle to see the arm reconfigure.

Obstacle: ON

Why is the moving obstacle problem especially challenging for classical methods?

Because the detect→model→replan→execute loop must complete within ~10ms, and the system always reacts to where the obstacle was, not where it is Because moving obstacles make the forward kinematics equations invalid Because moving obstacles change the link lengths of the arm

Chapter 7: Limitations of Classical Methods

We've now seen the full classical pipeline: forward kinematics tells us where the arm is, inverse kinematics tells us where it should be, the Jacobian connects velocities, feedback control corrects errors, and path planning avoids obstacles. Each piece is elegant. Together, they're fragile.

Capuano et al. identify four specific failure modes:

Problem 1: Brittle Multi-Component Pipelines

The classical stack has perception, planning, dynamics, and control as separate modules. Each is developed and tested independently. But they must communicate through hand-designed interfaces, and the failure of any single module cascades through the entire system. We saw this in Chapter 1 — 0.95⁶ ≈ 0.74 success rate with six 95%-reliable modules.

Problem 2: No Multimodal Data

Classical controllers consume joint angles and maybe a point cloud. They don't look at camera images. They don't read language instructions. They don't feel tactile pressure. Each modality requires a separate processing pipeline and a separate interface to the controller. A learned policy can consume all modalities in a single forward pass.

Problem 3: Simplified Physical Models

The dynamics equations we use assume rigid links, known inertias, no friction, no backlash, and no contact forces. Real robots have flexible joints, uncertain inertias, significant friction, gearbox backlash, and constantly changing contact conditions. The gap between model and reality — the sim-to-real gap — is often larger than the safety margin.

Problem 4: Ignoring Data Trends

Perhaps the most damning limitation: classical methods don't get better with more experience. A Jacobian controller that has been running for a year performs identically to one running for a day. It learns nothing from its history. Meanwhile, scaling laws in deep learning show that performance improves predictably with more data, more compute, and more parameters.

The core argument for learning: Classical methods require human engineers to anticipate every possible scenario and design a solution for it. Learned methods require data from those scenarios and automatically discover solutions. As tasks get more complex and environments get more unstructured, the engineering cost of classical methods grows exponentially, while the data cost of learned methods grows linearly.

Property	Classical	Learned
Multi-component pipeline	6+ modules, hand-designed interfaces	1 network, end-to-end
Multimodal input	Separate processing per modality	All modalities in one forward pass
Physical model	Analytical (rigid body, known params)	Implicit (learned from data)
Scaling with data	No improvement	Predictable improvement
Novel situations	Fails unless pre-engineered	Generalizes from similar training data

Which limitation explains why a classical controller that has been running for a year performs identically to one running for one day?

Brittle multi-component pipelines No multimodal data Simplified physical models Ignoring data trends — classical methods don't learn from experience, so they never improve

Chapter 8: The Case for Learning

Now we can see the full picture. Classical robotics builds a precise model of the world and computes optimal actions analytically. Robot learning skips the model and maps observations directly to actions using a neural network trained on demonstrations.

The key question is: when does each approach win?

Scenario	Classical Wins	Learning Wins
Structured factory	✓ Known geometry, fixed objects
Deformable objects		✓ Physics too complex to model
Known dynamics	✓ Optimal control possible
Novel environments		✓ Generalizes from training data
Safety-critical (surgery)	✓ Provable guarantees
Language instructions		✓ Multimodal input natural
High precision (<0.1mm)	✓ Calibrated analytical models
Contact-rich tasks		✓ Learns from demonstration

The hybrid future: In practice, the best systems combine both. Classical methods provide safety constraints and precise calibration. Learning handles the parts that are too complex to model. This is why understanding classical robotics is essential even if you plan to use only learned policies — you need to know what guarantees you're giving up.

Classical vs Learning Pipeline

Side-by-side comparison. The classical pipeline has many boxes connected by arrows. The learning pipeline has one box. Click each to highlight the data flow.

In which scenario would you prefer a classical controller over a learned policy?

A structured factory with known geometry, fixed objects, and safety-critical precision requirements Folding laundry in a home kitchen with varying lighting and clutter A task described by natural language instructions

Chapter 9: Connections

You now understand the classical robotics stack — what it can do, how it works, and why it isn't enough for unstructured environments. Here's where the journey continues:

Chapter	What You'll Learn	Connection to This Chapter
Ch 3: RL Foundations	MDPs, rewards, policy optimization	Replaces feedback control with learned optimization
Ch 4: Imitation Learning	Behavior cloning, DAgger, action chunks	Replaces the IK solver with learned demonstrations
Ch 5: Policies (ACT, DiffPol)	Specific neural network architectures	The actual networks that replace the classical pipeline
Ch 6: VLAs	Foundation models for robotics	One model that understands vision + language + action

The critical bridge: Chapter 3 (RL) replaces the controller with an optimizer that learns from trial-and-error. Chapter 4 (Imitation Learning) replaces the entire pipeline with a network that learns from watching an expert. The progression is: known dynamics → learned dynamics (RL) → no dynamics at all (imitation). Each step trades model accuracy for data.

The concepts from this chapter — forward kinematics, the Jacobian, singularities — reappear throughout robot learning. Even learned policies must ultimately command joint velocities. The Jacobian still governs how the arm moves. Understanding these fundamentals helps you debug learned policies when they fail: is the policy predicting bad actions, or is the arm in a singular configuration where no action works well?

Feynman's principle applied: "What I cannot create, I do not understand." You've now derived forward kinematics, inverse kinematics, and Jacobian control from scratch. You can implement a complete classical controller in ~50 lines of Python. That understanding doesn't go away when you switch to learned methods — it becomes your debugging toolkit.

What is the progression from classical methods to modern robot learning?

Smaller models → bigger models → biggest models Known dynamics (classical) → learned dynamics (RL) → no explicit dynamics (imitation learning), trading model accuracy for data More sensors → fewer sensors → no sensors

← Chapter 1: Introduction