Linear Programming

Chapter 0: The Problem

You run a small furniture workshop. You make two products: tables and chairs. Each table earns you $70 profit. Each chair earns you $50 profit. Naturally, you want to make as much money as possible. But there are limits.

Your workshop has three constraints. You have 240 hours of carpentry time per week: each table needs 4 hours, each chair needs 3 hours. You have 100 units of lumber: each table uses 3 units, each chair uses 5 units. And your finishing department can handle at most 200 hours: each table takes 2 hours, each chair takes 4 hours. You also cannot produce negative furniture.

How many tables and chairs should you produce each week to maximize profit?

This is a linear program (LP). You have an objective function you want to maximize (profit = 70x₁ + 50x₂) subject to linear constraints (inequalities that limit your resources). Every term is a constant times a variable — nothing squared, no logarithms, no products of variables. Everything is linear.

Why "linear" matters. Linearity is what makes LP tractable. The feasible region — the set of all (x₁, x₂) satisfying every constraint — is a convex polygon (in 2D) or a convex polytope (in higher dimensions). Convexity guarantees that any local optimum is also a global optimum. And the optimal solution always lives at a vertex (corner point) of this polytope. These two facts — convexity and vertex optimality — are the foundation of everything in this chapter.

Let us write the LP formally. Let x₁ = number of tables, x₂ = number of chairs.

maximize 70x₁ + 50x₂ # profit subject to: 4x₁ + 3x₂ ≤ 240 # carpentry hours 3x₁ + 5x₂ ≤ 100 # lumber units 2x₁ + 4x₂ ≤ 200 # finishing hours x₁, x₂ ≥ 0 # non-negativity

The Geometry: Feasible Region and Vertices

In two dimensions, every inequality constraint defines a half-plane. The constraint 4x₁ + 3x₂ ≤ 240 means "stay below or on the line 4x₁ + 3x₂ = 240." The feasible region is the intersection of all these half-planes — a convex polygon.

The objective function 70x₁ + 50x₂ = z defines a family of parallel lines (one for each value of z). As z increases, these iso-profit lines sweep across the plane. The optimal solution is the point in the feasible region that the iso-profit line touches last as it sweeps upward. Because the region is convex and the boundary is made of straight lines, this last-touch point is always a vertex.

The fundamental theorem of LP. If an LP has an optimal solution, then at least one optimal solution occurs at a vertex of the feasible polytope. This means we never need to search the interior — only vertices. In 2D there are a handful of vertices. In n dimensions there can be exponentially many, but the simplex algorithm navigates them cleverly.

The canvas below shows the feasible region for our furniture problem. The shaded area is feasible. The dashed lines are iso-profit lines for the objective 70x₁ + 50x₂. Drag the profit slider to watch the iso-profit lines sweep across the region. The optimum is the last feasible vertex the line touches.

2D Linear Program — Feasible Region & Iso-Profit Lines

Shaded = feasible region. Dashed = iso-profit lines. Drag the slider to sweep the objective value. The optimum is at the last vertex touched.

Profit z 50%

Drag the slider or click "Sweep to Optimum" to find the optimal vertex.

Why Brute Force Works (in 2D) but Not in General

For two variables, you could enumerate all vertices by finding every pair of constraint intersections, check which are feasible, and evaluate the objective at each. With 5 constraints (including non-negativity), there are at most C(5,2) = 10 intersection points. That is trivial.

But in n dimensions with m constraints, the number of vertices can be as large as C(m, n) — which is exponential in n. A problem with 100 variables and 200 constraints could have up to C(200, 100) ≈ 9 × 10⁵⁸ vertices. Enumerating them all is absurd. We need an algorithm that walks from vertex to vertex, improving the objective at each step, without visiting most vertices. That algorithm is the simplex method.

Think of it this way. Imagine standing at a corner of a many-dimensional diamond. You can see the neighboring corners. The simplex algorithm is like always walking to the highest neighbor. Because the terrain is convex, uphill steps always lead to the summit — you never get stuck on a false peak. The only question is how many steps it takes.

Three Possible Outcomes

An LP can have exactly three outcomes. Understanding all three prevents confusion during algorithm design.

1. Optimal solution exists. The feasible region is non-empty and the objective is bounded. There is at least one vertex where the optimum is achieved. This is the normal case.

2. Infeasible. The constraints are contradictory — no point satisfies all of them simultaneously. Example: x₁ + x₂ ≤ 1 and x₁ + x₂ ≥ 3 with x₁, x₂ ≥ 0. The feasible region is empty. The simplex algorithm detects this during Phase I (finding an initial feasible solution).

3. Unbounded. The feasible region extends to infinity in a direction that improves the objective. Example: maximize x₁ subject to x₁ - x₂ ≤ 5, x₁, x₂ ≥ 0. You can increase x₁ and x₂ together forever. The simplex algorithm detects this when the minimum ratio test finds no positive coefficient — meaning you can increase the entering variable without limit.

Outcome	Feasible Region	Objective	Simplex Detection
Optimal	Non-empty	Bounded	All objective coefficients ≤ 0 in final tableau
Infeasible	Empty	N/A	Phase I fails to find feasible starting point
Unbounded	Non-empty, unbounded in objective direction	+∞	Entering column has no positive entries

In practice. Well-formulated real-world LPs are almost always feasible and bounded. Infeasibility usually signals a modeling error (contradictory constraints). Unboundedness usually means a missing constraint (forgetting a budget limit). When your solver returns "infeasible" or "unbounded," check your model before checking the solver.

Before we can run simplex, we need to express the LP in a standard form that the algorithm can manipulate algebraically. That is the subject of Chapter 1.

Quick check: Why does the optimal solution to a linear program always occur at a vertex of the feasible region (assuming a bounded optimum exists)?

Because the objective function is always quadratic near the boundary Because the feasible region is convex and the objective is linear, so any interior point can be improved by moving toward a boundary, and any boundary point can be improved by moving toward a vertex Because the simplex algorithm is designed to only visit vertices

Chapter 1: Standard and Slack Forms

Before we can solve an LP algorithmically, we need a canonical representation. There are many equivalent ways to write the same LP — maximizing vs. minimizing, ≤ vs. ≥ constraints, variables that can be negative vs. non-negative. The standard form picks one convention and sticks to it.

Standard Form

An LP is in standard form if it:

Maximizes the objective (not minimizes)
All constraints are ≤ inequalities (not ≥ or =)
All variables are non-negative (x_i ≥ 0)

In matrix notation:

maximize c^Tx subject to Ax ≤ b, x ≥ 0

Where c is the n×1 objective vector, A is the m×n constraint matrix, b is the m×1 resource vector, and x is the n×1 decision variable vector. Our furniture LP is already in standard form: we maximize, all constraints are ≤, and x₁, x₂ ≥ 0.

Converting to Standard Form

Any LP can be converted to standard form using three transformations:

1. Minimization → Maximization. If the original LP minimizes c^Tx, negate the objective: maximize (-c)^Tx. The optimal x is the same; the optimal value has opposite sign.

2. Equality constraints. Replace each equality a_i^Tx = b_i with two inequalities: a_i^Tx ≤ b_i and a_i^Tx ≥ b_i. The second inequality is then flipped: -a_i^Tx ≤ -b_i.

3. Unrestricted variables. If x_j can be negative, replace it with x_j = x_j⁺ - x_j^- where both x_j⁺, x_j^- ≥ 0. The difference can represent any real number.

Think of it this way. Standard form is like a normal form in grammar — it does not change the meaning, just the representation. Every LP, no matter how it is originally written, can be expressed in standard form. The transformations are mechanical: flip signs, split equalities, split unrestricted variables. No information is lost.

Slack Form

Standard form uses inequalities. But algebraic manipulation (like the simplex algorithm) prefers equalities. We convert each inequality to an equality by introducing a slack variable.

The constraint 4x₁ + 3x₂ ≤ 240 becomes 4x₁ + 3x₂ + s₁ = 240 where s₁ ≥ 0 is the slack — it measures how much "room" we have left. If 4x₁ + 3x₂ = 200, then s₁ = 40: we have 40 unused carpentry hours.

For our furniture LP, slack form is:

maximize z = 70x₁ + 50x₂ s₁ = 240 - 4x₁ - 3x₂ # carpentry slack s₂ = 100 - 3x₁ - 5x₂ # lumber slack s₃ = 200 - 2x₁ - 4x₂ # finishing slack x₁, x₂, s₁, s₂, s₃ ≥ 0

Now we have 5 variables (2 original + 3 slack) and 3 equality constraints. The slack variables s₁, s₂, s₃ are the basic variables — they are expressed in terms of the non-basic variables x₁, x₂. Setting the non-basic variables to zero gives the basic feasible solution (BFS): x₁ = 0, x₂ = 0, s₁ = 240, s₂ = 100, s₃ = 200. This corresponds to producing nothing — all resources are slack.

Key insight: basic feasible solutions ARE vertices. Every vertex of the feasible polytope corresponds to a BFS. At a vertex, exactly n constraints are active (tight). The slack for those constraints is zero, meaning those slack variables are zero, meaning those variables left the basis. The simplex algorithm works by moving from BFS to BFS — which is the same as moving from vertex to vertex.

Worked Conversion Example

Let us convert a non-standard LP to standard form step by step. The original problem:

minimize 2x₁ - 3x₂ subject to: x₁ + x₂ = 10 # equality constraint x₁ - x₂ ≥ 2 # wrong direction inequality x₁ ≥ 0, x₂ unrestricted # x₂ can be negative

Step 1: Convert minimize to maximize. Negate the objective: maximize -2x₁ + 3x₂.

Step 2: Handle the equality constraint. Replace x₁ + x₂ = 10 with two inequalities: x₁ + x₂ ≤ 10 and -x₁ - x₂ ≤ -10.

Step 3: Flip the ≥ inequality. x₁ - x₂ ≥ 2 becomes -x₁ + x₂ ≤ -2.

Step 4: Handle unrestricted x₂. Replace x₂ = x₂⁺ - x₂^- with x₂⁺, x₂^- ≥ 0.

The result in standard form (3 variables, 3 constraints):

maximize -2x₁ + 3x₂⁺ - 3x₂^- subject to: x₁ + x₂⁺ - x₂^- ≤ 10 -x₁ - x₂⁺ + x₂^- ≤ -10 -x₁ + x₂⁺ - x₂^- ≤ -2 x₁, x₂⁺, x₂^- ≥ 0

Note the negative RHS values (-10 and -2). This means the origin x = 0 is not feasible. The simplex algorithm needs a feasible starting point, so it uses Phase I: add artificial variables, solve an auxiliary LP to find a feasible BFS, then switch to Phase II (optimize the real objective). Most textbooks and solvers handle this automatically.

Visualizing the Transformation

The canvas below shows the transformation from standard form (inequalities) to slack form (equalities with slack variables). Each constraint is shown with its slack variable. As you move the point (x₁, x₂) around the feasible region, watch the slack values change: they are zero on the boundary and positive in the interior.

Standard Form → Slack Form

Drag the point inside the feasible region. Watch the slack variables (right panel) change. At a vertex, exactly 2 slacks are zero (in 2D).

Drag the orange point to explore slack values.

Data Flow Summary

Form	Variables	Constraints	Used By
General	n original	Mix of ≤, ≥, =	Problem statement
Standard	n (all ≥ 0)	All ≤, max objective	Theory, duality proofs
Slack	n + m (all ≥ 0)	All equalities	Simplex algorithm

The transformation from general to standard to slack is mechanical and reversible. No information is lost. But the slack form is what the simplex algorithm actually operates on. It needs equalities so it can do algebra — solving for one variable in terms of others, pivoting variables between basic and non-basic sets.

Quick check: You have the constraint 3x₁ + 2x₂ ≤ 18 with x₁ = 2, x₂ = 3. What is the slack variable s_i?

s_i = 0 (constraint is tight) s_i = 6 (slack = 18 - 3(2) - 2(3) = 18 - 12 = 6) s_i = 12 (slack = 18 - 6 = 12)

Chapter 2: The Simplex Algorithm

The simplex algorithm, invented by George Dantzig in 1947, is one of the most important algorithms in all of computing. It solves linear programs by walking along the edges of the feasible polytope, moving from vertex to vertex, always improving the objective value, until it reaches the optimum.

The Idea: Vertex Hopping

Here is the high-level strategy. Start at some vertex (a basic feasible solution). Look at all adjacent vertices — those reachable by moving along one edge of the polytope. Pick the neighbor that improves the objective the most (or any improving neighbor). Move to it. Repeat until no neighbor is better. Because the feasible region is convex, this greedy uphill walk always finds the global optimum.

In algebraic terms, "moving to an adjacent vertex" means performing a pivot operation: one non-basic variable enters the basis, one basic variable leaves. The entering variable is chosen because increasing it improves the objective. The leaving variable is determined by the minimum ratio test — it is the basic variable that reaches zero first as we increase the entering variable.

Simplex in one sentence. Pick a variable that, if increased, would improve the objective (entering). Increase it as much as possible without violating any constraint (minimum ratio test determines the leaving variable). Pivot: swap the entering and leaving variables between the basis and non-basis. Repeat until no entering variable improves the objective — you are at the optimum.

Worked Example

Let us solve our furniture LP step by step. Starting slack form:

z = 70x₁ + 50x₂ # objective s₁ = 240 - 4x₁ - 3x₂ # carpentry s₂ = 100 - 3x₁ - 5x₂ # lumber s₃ = 200 - 2x₁ - 4x₂ # finishing

Initial BFS: x₁ = 0, x₂ = 0, s₁ = 240, s₂ = 100, s₃ = 200. Objective z = 0. This is the origin — produce nothing.

Iteration 1: Choose entering variable. Look at the objective z = 70x₁ + 50x₂. Both coefficients are positive, so increasing either variable improves z. We pick x₁ (coefficient 70 is largest). This is the largest-coefficient rule.

Minimum ratio test. How much can we increase x₁ (keeping x₂ = 0) before some basic variable goes negative?

s₁ = 240 - 4x₁ ≥ 0 ⇒ x₁ ≤ 60 # carpentry limits x₁ to 60 s₂ = 100 - 3x₁ ≥ 0 ⇒ x₁ ≤ 33.3 # lumber limits x₁ to 33.3 s₃ = 200 - 2x₁ ≥ 0 ⇒ x₁ ≤ 100 # finishing limits x₁ to 100

The tightest constraint is s₂ (lumber), giving x₁ ≤ 33.3. So s₂ is the leaving variable — it becomes zero. We solve the lumber equation for x₁:

s₂ = 100 - 3x₁ - 5x₂ 3x₁ = 100 - s₂ - 5x₂ x₁ = 100/3 - s₂/3 - 5x₂/3

Now substitute this expression for x₁ everywhere else:

z = 70(100/3 - s₂/3 - 5x₂/3) + 50x₂ = 7000/3 - 70s₂/3 - 350x₂/3 + 50x₂ = 7000/3 - 70s₂/3 + (-350/3 + 150/3)x₂ = 7000/3 - 70s₂/3 - 200x₂/3 # ≈ 2333.3

After pivot 1: Basic: {x₁, s₁, s₃}. Non-basic: {x₂, s₂}. BFS: x₁ = 100/3 ≈ 33.3, x₂ = 0, z ≈ 2333.3. We moved from the origin to the vertex at (33.3, 0).

Iteration 2: Choose entering variable. z = 7000/3 - 70s₂/3 - 200x₂/3. Both non-basic coefficients are negative. No variable can be increased to improve z. We are optimal!

The optimal solution is x₁ = 100/3 ≈ 33.3 tables, x₂ = 0 chairs, profit z = 7000/3 ≈ $2333.33.

Wait — fractional tables? Yes. LP allows fractional solutions. If you need integer solutions (you cannot make 1/3 of a table), you have an integer linear program (ILP), which is NP-hard. LP relaxation gives a bound; rounding or branch-and-bound gives integer solutions. We return to this in Chapter 9.

The Simplex Algorithm — Formal Steps

1. Initialize

Start with slack form. Basic = slack vars, non-basic = original vars. Initial BFS: all original vars = 0.

↓

2. Select entering variable

Find a non-basic variable with positive coefficient in the objective. If none exists, current BFS is optimal — STOP.

↓

3. Minimum ratio test

For each basic variable, compute the ratio b_i / a_ie (only for positive a_ie). The basic variable with the smallest ratio is the leaving variable.

↓

4. Pivot

Solve the leaving variable's equation for the entering variable. Substitute into all other equations and the objective.

↻ Go to step 2

Watch Simplex Walk

The canvas below shows the simplex algorithm walking from vertex to vertex on our furniture LP. Each iteration is one edge traversal. The objective value increases at every step. Click "Step" to advance one pivot, or "Run" to animate.

Simplex Algorithm — Vertex Walking

Green path = simplex trajectory. Each arrow = one pivot. Orange star = current BFS. Watch the objective improve at every step.

Click "Step" to perform one pivot, or "Run" to animate.

Understanding the Tableau

The simplex algorithm is often implemented using a tableau — a matrix that encodes the current slack form compactly. Each row corresponds to a basic variable's equation, and the last row is the objective function. Pivoting is just Gaussian elimination on this matrix.

The initial tableau for our furniture LP looks like this:

	x₁	x₂	s₁	s₂	s₃	RHS
s₁	4	3	1	0	0	240
s₂	3	5	0	1	0	100
s₃	2	4	0	0	1	200
z	-70	-50	0	0	0	0

The objective row uses negated coefficients (for the implementation convention of maximizing). When all entries in the objective row are non-negative, we are optimal.

After pivot 1 (x₁ enters, s₂ leaves): we divide the s₂ row by the pivot element (3) and eliminate x₁ from all other rows. The result:

	x₁	x₂	s₁	s₂	s₃	RHS
s₁	0	-7/3	1	-4/3	0	106.7
x₁	1	5/3	0	1/3	0	33.3
s₃	0	2/3	0	-2/3	1	133.3
z	0	200/3	0	70/3	0	2333.3

All entries in the objective row are non-negative (200/3 ≈ 66.7 and 70/3 ≈ 23.3). We are optimal. The solution: x₁ = 33.3, x₂ = 0, z = 2333.3.

Notice the dual solution hiding in the tableau. The objective row entries under the slack variables (s₂ column = 70/3 ≈ 23.3) are the dual variables. The dual variable y₂ = 23.3 is the shadow price of lumber — one more unit of lumber is worth $23.3 in additional profit. The simplex algorithm produces both primal and dual solutions simultaneously.

Reading the final tableau. In the optimal tableau: (1) The RHS column gives basic variable values. (2) The objective row's RHS gives the optimal z*. (3) The objective row entries under slack variables give dual variable values (shadow prices). (4) A non-basic variable's reduced cost (objective row entry) tells you how much the objective would decrease per unit if you forced that variable into the basis. All of this information from one matrix.

Python Implementation

python
import numpy as np

def simplex(c, A, b):
    """Solve: maximize c^T x subject to Ax <= b, x >= 0.
    c: (n,) objective coefficients
    A: (m, n) constraint matrix
    b: (m,) resource vector (must be >= 0)
    Returns: optimal x, optimal value z
    """
    m, n = A.shape
    # Build initial tableau: [A | I | b] with objective row [-c | 0 | 0]
    tab = np.zeros((m + 1, n + m + 1))
    tab[:m, :n] = A                            # constraint coefficients
    tab[:m, n:n+m] = np.eye(m)                 # slack variables
    tab[:m, -1] = b                             # RHS
    tab[-1, :n] = -c                             # objective (negated for min)
    basis = list(range(n, n + m))             # initial basis = slack vars

    while True:
        # Step 2: find entering variable (most negative in obj row)
        obj_row = tab[-1, :-1]
        if np.all(obj_row >= -1e-10):
            break                              # optimal!
        enter = np.argmin(obj_row)

        # Step 3: minimum ratio test
        col = tab[:m, enter]
        rhs = tab[:m, -1]
        ratios = np.full(m, np.inf)
        pos = col > 1e-10
        ratios[pos] = rhs[pos] / col[pos]
        if np.all(ratios == np.inf):
            raise ValueError("LP is unbounded")
        leave = np.argmin(ratios)

        # Step 4: pivot
        pivot = tab[leave, enter]
        tab[leave] /= pivot                   # normalize pivot row
        for i in range(m + 1):
            if i != leave:
                tab[i] -= tab[i, enter] * tab[leave]
        basis[leave] = enter

    # Extract solution
    x = np.zeros(n)
    for i, var in enumerate(basis):
        if var < n:
            x[var] = tab[i, -1]
    return x, tab[-1, -1]

# Furniture problem
c = np.array([70, 50])
A = np.array([[4, 3], [3, 5], [2, 4]])
b = np.array([240, 100, 200])
x_opt, z_opt = simplex(c, A, b)
print(f"Optimal: x = {x_opt}, z = {z_opt:.2f}")
# Optimal: x = [33.33  0.  ], z = 2333.33

Quick check: In the simplex algorithm, the minimum ratio test determines the leaving variable. Why do we only consider positive coefficients a_ie in the ratio b_i/a_ie?

Because if a_ie ≤ 0, increasing the entering variable does not decrease the basic variable b_i — it stays non-negative or increases. Only positive a_ie causes the basic variable to decrease toward zero. Because negative coefficients mean the constraint is inactive Because we want the smallest positive ratio for numerical stability

Chapter 3: Degeneracy and Cycling

In Chapter 2, the simplex algorithm improved z at every iteration and terminated quickly. But there is a subtle pitfall: what happens when a pivot does NOT improve the objective? This occurs at degenerate vertices, and it can cause the algorithm to cycle forever.

What Is Degeneracy?

A basic feasible solution is degenerate if one or more basic variables has value zero. Geometrically, this means more than n constraints are active (tight) at the same vertex. In 2D, a non-degenerate vertex has exactly 2 active constraints. A degenerate vertex has 3 or more constraints meeting at the same point.

Why does this cause problems? When the minimum ratio test yields a ratio of zero (because some basic variable is already at zero), the pivot moves to a new basis BUT the BFS stays at the same point in space. The objective does not improve. We changed the algebraic representation of the vertex without actually moving. If this happens repeatedly, the algorithm can cycle through a sequence of bases, all representing the same vertex, never making progress.

Cycling is real but rare. It is possible to construct examples where the simplex algorithm cycles. In 1977, Beale constructed such an example. But cycling is vanishingly rare in practice. In decades of computational use on millions of LPs, cycling has almost never been observed on naturally occurring problems. The theoretical fix — Bland's rule — guarantees termination, but most implementations never need it.

Bland's Rule

Bland's rule (1977) is the simplest anti-cycling rule: among all eligible entering variables (those with positive objective coefficient), always choose the one with the smallest index. Among all eligible leaving variables (those with the smallest ratio), always choose the one with the smallest index. That is it. This simple tie-breaking rule provably prevents cycling.

The proof is by contradiction: if cycling occurs, there is a smallest set of variables involved in the cycle. Bland's rule creates a contradiction within this set by showing the last variable to leave the basis could not have re-entered under the smallest-index rule.

Worst-Case Complexity

The simplex algorithm is not polynomial in the worst case. In 1972, Klee and Minty constructed a family of LPs on which the simplex method with the largest-coefficient rule visits every vertex — an exponential number. The Klee-Minty cube in d dimensions has 2^d vertices, and the algorithm visits all of them.

Dimensions	Vertices	Simplex Pivots (worst case)
2	4	4
3	8	8
10	1,024	1,024
20	1,048,576	1,048,576
d	2^d	2^d

But this is a pathological worst case. In practice, the simplex method is astonishingly fast.

Smoothed Analysis: Why Simplex Works in Practice

In 2001, Daniel Spielman and Shang-Hua Teng introduced smoothed analysis, explaining why simplex works so well despite its exponential worst case. The idea: the Klee-Minty cube is razor-sharp. Even tiny random perturbations to the constraint coefficients destroy the pathological structure. Real-world problems are never perfectly aligned — noise, rounding, and imprecise measurements ensure that the constraints are slightly "jiggled."

Under smoothed analysis, the simplex method runs in polynomial time: O(n³ / σ⁴) where σ is the standard deviation of random perturbation. Spielman and Teng won the Godel Prize for this result. It explains a 50-year mystery: why does an exponential-time algorithm solve real problems so fast?

The Klee-Minty moral. Worst-case complexity is not always the right measure. An algorithm can be exponential in theory and polynomial in practice if the pathological cases are fragile — easily destroyed by any real-world noise. Smoothed analysis captures this formally. It is the right framework for understanding simplex.

A Concrete Cycling Example

To understand cycling, consider this LP:

maximize x₁ + x₂ subject to: x₁ + x₂ ≤ 4 # constraint 1 2x₁ + x₂ ≤ 6 # constraint 2 x₁ + 2x₂ ≤ 6 # constraint 3 x₁, x₂ ≥ 0

All three constraints are active at (2, 2): 2+2=4, 4+2=6, 2+4=6. This is a degenerate vertex. The simplex tableau at this vertex has a basic variable equal to zero. During the pivot, the minimum ratio test produces a tie (ratio = 0 for multiple basic variables). The choice of leaving variable is ambiguous.

Without Bland's rule, the algorithm might choose the wrong leaving variable, arrive at a different basis representing the same point (2,2), and repeat. The algebraic representation changes — which variables are basic vs. non-basic — but the actual solution does not move. After several such "degenerate pivots," the algorithm might return to the original basis, creating a cycle.

With Bland's rule, we always break ties by choosing the smallest-index variable. This systematic tie-breaking ensures that the algorithm eventually escapes the degenerate vertex and moves to a genuinely better vertex. The proof uses a clever potential function argument: Bland's rule ensures a lexicographic improvement even when the objective value stays flat.

Practical perspective. Modern simplex implementations rarely use Bland's rule (it can slow down non-degenerate pivots). Instead, they use perturbation methods: add tiny random values to the RHS vector b, which eliminates degeneracy with probability 1. The perturbed LP has the same optimal solution (for small enough perturbations) but never has degenerate vertices. After solving, round back to the original. This is the computational equivalent of Spielman-Teng's smoothed analysis — noise cures degeneracy.

Degeneracy Canvas

The canvas below illustrates a degenerate vertex. Three constraint lines meet at the same point. The simplex algorithm can cycle between different bases at this vertex, but Bland's rule prevents it. Toggle Bland's rule on/off to see the difference.

Degeneracy — Three Constraints at One Vertex

The orange vertex is degenerate: 3 constraints active. Without Bland's rule, the algorithm can cycle. With it, it terminates.

Click "Run Simplex" to watch the algorithm handle the degenerate vertex.

Quick check: The Klee-Minty cube shows that simplex can take exponential time. Smoothed analysis shows it is polynomial under perturbation. Which best explains why real LPs are fast?

Real LPs have fewer constraints than the Klee-Minty cube Real LPs are always solved by interior point methods, not simplex Real-world data is never perfectly aligned — tiny perturbations in coefficients destroy the pathological structure that causes exponential behavior

Chapter 4: Duality

Every linear program has a twin. If the original LP (the primal) is a maximization problem, its twin (the dual) is a minimization problem. They are mirror images, and their optimal values are always equal. This is the strong duality theorem, and it is one of the most beautiful results in optimization.

Constructing the Dual

Start with the primal in standard form:

Primal: maximize c^Tx s.t. Ax ≤ b, x ≥ 0

The dual is constructed by a systematic transposition:

Dual: minimize b^Ty s.t. A^Ty ≥ c, y ≥ 0

The recipe: (1) The primal's resource vector b becomes the dual's objective. (2) The primal's objective c becomes the dual's constraint RHS. (3) The constraint matrix transposes. (4) ≤ becomes ≥. (5) Maximize becomes minimize. (6) Each primal constraint gets a dual variable y_i.

For our furniture LP, the dual is:

minimize 240y₁ + 100y₂ + 200y₃ # total resource "price" subject to: 4y₁ + 3y₂ + 2y₃ ≥ 70 # table constraint 3y₁ + 5y₂ + 4y₃ ≥ 50 # chair constraint y₁, y₂, y₃ ≥ 0

Think of it this way. Imagine a buyer wants to purchase all your resources (carpentry hours, lumber, finishing hours) at prices y₁, y₂, y₃ per unit. The buyer wants to minimize total cost: 240y₁ + 100y₂ + 200y₃. But the buyer must offer enough that you prefer selling resources over making furniture. For tables: the resources used (4y₁ + 3y₂ + 2y₃) must be worth at least as much as the table's profit ($70). Same for chairs. The dual is the buyer's problem: price resources as cheaply as possible while ensuring the seller prefers to sell.

Weak Duality

Weak duality theorem: For any feasible primal solution x and any feasible dual solution y:

c^Tx ≤ b^Ty

Every feasible dual solution provides an upper bound on the primal optimum. Every feasible primal solution provides a lower bound on the dual optimum. The proof is one line:

c^Tx ≤ (A^Ty)^Tx = y^TAx ≤ y^Tb = b^Ty # first ≤: dual constraint A^Ty ≥ c # second ≤: primal constraint Ax ≤ b, y ≥ 0

Strong Duality

Strong duality theorem: If the primal has an optimal solution x*, then the dual also has an optimal solution y*, and:

c^Tx* = b^Ty*

At optimality, the primal and dual values are exactly equal. The gap closes completely. This is remarkable: two different optimization problems, one maximizing and one minimizing, arrive at exactly the same value.

The proof comes from the simplex algorithm itself. The final tableau of the simplex method provides both the optimal primal solution (the basic variables' values) and the optimal dual solution (the objective row's slack variable coefficients). Simplex solves both problems simultaneously.

Complementary Slackness

Strong duality implies complementary slackness: at optimality, for each constraint, either the constraint is tight (active) or the corresponding dual variable is zero. In symbols:

x_i* (A^Ty* - c)_i = 0 for all i # primal variable × dual slack = 0 y_j* (b - Ax*)_j = 0 for all j # dual variable × primal slack = 0

If a resource is not fully used (primal slack > 0), its price must be zero (dual variable = 0). If a resource has a positive price, it must be fully consumed. This makes economic sense: scarce resources (fully consumed) have positive prices; abundant resources (slack left over) are free.

Duality as a certificate. If someone claims "the maximum profit is $2333," how do you verify without re-solving? Find the dual solution. If b^Ty = 2333, weak duality tells you that no feasible x can do better. The dual solution is a proof that the primal is optimal. This "proof by certificate" idea underlies much of complexity theory — a dual solution is a short, verifiable witness of optimality.

Economic Interpretation of Duality

Duality has a beautiful economic interpretation. The primal problem asks: "Given limited resources, how do I maximize profit?" The dual asks the reverse: "What is the minimum price I should charge for my resources so that no production activity is profitable?"

Consider our furniture LP. The primal solution says: make 33.3 tables, 0 chairs, profit $2333. The dual solution assigns prices to resources: y₁ = 0 (carpentry is abundant), y₂ ≈ 23.3 (lumber is scarce, worth $23.3/unit), y₃ = 0 (finishing is abundant). The total resource value: 240(0) + 100(23.3) + 200(0) = $2333. Exactly the profit.

Why is y₁ = 0? Because the carpentry constraint is not tight — we use only 4(33.3) = 133 of 240 available hours. Abundant resources have zero price. Why is y₂ = 23.3? Because lumber is the binding constraint. Every additional unit of lumber allows us to make 1/3 more table (since each table uses 3 lumber), gaining 70/3 ≈ $23.3 more profit.

The dual as a market equilibrium. At the optimal dual prices, the resource market is in equilibrium: (1) No production activity is profitable — the cost of resources used exceeds the revenue for any product not being produced (chairs: 3(0)+5(23.3)+4(0) = 116.7 > 50 revenue, so chairs are unprofitable at these prices). (2) Products being produced break exactly even — tables: 4(0)+3(23.3)+2(0) = 70 = revenue of $70. This is complementary slackness: if you are producing something, its resource cost equals its revenue. If its resource cost exceeds its revenue, you produce zero of it.

Duality in Game Theory

Von Neumann's minimax theorem (1928) for two-person zero-sum games is equivalent to LP duality. Player 1 maximizes their minimum payoff (primal). Player 2 minimizes their maximum loss (dual). The minimax theorem says these values are equal — which is exactly strong duality. The mixed-strategy Nash equilibrium of a zero-sum game can be found by solving an LP.

This connection is not coincidental. Dantzig developed the simplex method partly inspired by von Neumann's game theory work. Von Neumann immediately recognized that his minimax theorem was a special case of LP duality when Dantzig showed him the simplex method in 1947. The two theories developed in parallel and are deeply intertwined.

Primal-Dual Canvas

The canvas below shows the primal and dual problems side by side. As you drag the primal solution toward optimality, the dual bound decreases to meet it. At the optimal solution, the two values are equal — the duality gap is zero.

Weak and Strong Duality

Left: primal feasible region with objective value. Right: dual bound. Watch them converge at the optimum. Drag the slider to move along the simplex path.

Simplex step 0%

Drag the slider to watch primal and dual values converge.

Quick check: You find a feasible primal solution with value 100 and a feasible dual solution with value 100. What can you conclude?

Both solutions are optimal — by weak duality, c^Tx ≤ b^Ty, so equality implies both are optimal Only the primal is optimal; the dual might improve Nothing useful — the values might be coincidentally equal without being optimal

Chapter 5: LP Showcase

This is the payoff. Below is a fully interactive LP solver. You define the problem — adjust constraint lines, change the objective — and watch the simplex algorithm solve it in real time. You will see the feasible region reshape, the simplex path change, and the dual solution appear alongside.

The Interactive LP Lab

The LP has two variables (x₁, x₂). You can:

Drag constraint lines — click and drag any colored constraint to change its RHS (the right-hand side b_i).
Change the objective — use the c₁, c₂ sliders to change the profit coefficients.
Load presets — try the Diet Problem, Production Planning, or Network Flow formulations.
Run simplex — watch the algorithm walk from vertex to vertex, with the tableau shown below.

LP Lab — Build, Solve, Explore

Drag constraints to reshape the feasible region. Adjust c₁, c₂ to change the objective. Run simplex to solve.

c₁ 70

c₂ 50

b₁ 240

b₂ 100

b₃ 200

Adjust sliders and click "Run Simplex" to solve. Try presets for classic problems.

Preset Problems

Diet Problem (1947). The original LP application. George Stigler asked: what is the cheapest diet that meets all nutritional requirements? Minimize cost subject to minimum nutrient constraints. Here, we maximize "nutrition score" subject to budget and calorie limits — a simplified version.

Production Planning. Our furniture workshop with different parameters. Adjust b₁, b₂, b₃ to see how changing resource availability shifts the optimum.

Transport. A simplified transportation problem: ship goods from two factories to meet demand, minimizing shipping cost. Formulated as an LP with supply and demand constraints.

Experiment! Try making b₂ very small (tight lumber constraint). Watch the feasible region shrink. The optimal vertex changes. The dual variable for lumber increases — scarce resources become expensive in the dual. Now try making c₁ = 0 (tables are worthless). The objective tilts, and the optimum shifts to the chair-heavy vertex.

Sensitivity Analysis: What Happens When Things Change

One of the most powerful uses of LP is sensitivity analysis — understanding how the optimal solution changes when the problem data changes. In practice, the constraint coefficients and RHS values are never perfectly known. Material costs fluctuate. Demand shifts. Resources become scarce or abundant. A good decision-maker needs to know: how sensitive is my plan to these changes?

The dual variables provide this directly. The dual variable y_i* for constraint i is the shadow price — it tells you how much the optimal objective would improve if you increased b_i by one unit. If the lumber constraint has shadow price y₂* = 23.3, then one more unit of lumber is worth $23.3 in additional profit. This is exactly what the dual variables compute.

Try it in the simulation above. Increase b₂ (lumber) from 100 to 105 and note the change in z*. The change should be approximately 5 × y₂*. This relationship holds exactly within a sensitivity range — the interval of b_i values over which the current basis remains optimal.

Shadow prices in business. Shadow prices answer the question every manager asks: "Should I buy more of resource X?" If the shadow price of machine time is $50/hour and you can rent extra machine time for $30/hour, rent it — you gain $20 per hour. If the shadow price is zero (the resource has slack), do not buy more — you already have more than you need. LP duality converts abstract math into actionable business insight.

Degeneracy in the Showcase

As you drag b₁, b₂, b₃, you may encounter configurations where three constraint lines pass through the same point. This is degeneracy — the vertex where the optimum sits has more active constraints than strictly necessary. The optimal solution is still correct, but the shadow prices may not be unique: multiple dual solutions exist, each providing different (but equally valid) shadow prices. In practice, commercial solvers handle this gracefully.

Try setting b₁ = 240, b₂ = 100, b₃ = 200. The optimum is at x₁ ≈ 33.3, x₂ = 0, where only the lumber constraint is tight. Now slowly decrease b₁ until it also becomes tight. At that moment, you have a degenerate vertex — two constraints active at the optimum. The feasible region changes shape, but the optimum may not move until one constraint "takes over" from another.

Chapter 6: Reductions to LP

Linear programming is not just one optimization problem. It is a universal language for optimization. An enormous number of seemingly different problems — shortest paths, maximum flow, minimum-cost flow, bipartite matching — can all be expressed as linear programs. If you can formulate a problem as an LP, you can solve it with any LP solver.

Shortest Paths as LP

The single-source shortest path problem: find the shortest path from source s to every other vertex in a weighted graph. This is Bellman-Ford (Chapter 24), but it can also be written as an LP.

For each vertex v, let d_v be the distance from s to v. The LP is:

maximize ∑_v d_v # we MAXIMIZE distances... subject to: d_v ≤ d_u + w(u,v) for every edge (u,v) # triangle inequality d_s = 0 # source distance is zero

Wait — maximize? That seems backward. The trick: the triangle inequality constraints force d_v ≤ shortest path length. Maximizing pushes each d_v up to the tightest constraint, which is exactly the shortest path. It is a beautiful application of LP duality: the "maximize subject to upper bounds" formulation naturally computes the tightest (shortest) distances.

Maximum Flow as LP

The max-flow problem (CLRS Chapter 26): maximize flow from source s to sink t through a network with edge capacities. Each edge (u,v) has capacity c(u,v). Flow f(u,v) must satisfy:

maximize ∑_v f(s,v) # total flow out of source subject to: 0 ≤ f(u,v) ≤ c(u,v) for all (u,v) # capacity constraints ∑_u f(u,v) = ∑_w f(v,w) for v ≠ s,t # flow conservation

This is a linear program. The variables are edge flows f(u,v). The constraints are linear. Ford-Fulkerson, Edmonds-Karp, push-relabel — all specialized algorithms for this LP. They are faster than general simplex because they exploit the network structure. But any LP solver can solve max-flow.

LP duality and max-flow/min-cut. The dual of the max-flow LP is the min-cut LP. The max-flow min-cut theorem (Chapter 26) is a special case of strong LP duality! The maximum flow equals the minimum cut because the primal optimum equals the dual optimum. This is one of the deepest connections in combinatorial optimization.

Min-Cost Flow

Min-cost flow generalizes both shortest paths and max-flow. Each edge has a capacity AND a per-unit cost. The LP: minimize total cost while shipping a required amount of flow from s to t. Both shortest path and max-flow are special cases — shortest path has unit flow, max-flow has zero costs.

Bipartite Matching

Given a bipartite graph, find the maximum matching (largest set of edges with no shared vertices). The LP relaxation:

maximize ∑_(u,v)∈E x_uv # number of edges matched subject to: ∑_v:(u,v)∈E x_uv ≤ 1 for all u # each left vertex matched at most once ∑_u:(u,v)∈E x_uv ≤ 1 for all v # each right vertex matched at most once x_uv ≥ 0 # non-negativity

Normally, matching requires x_uv ∈ {0, 1} (integer), making it an ILP. But for bipartite graphs, a miracle occurs: the LP relaxation always has an integer optimal solution. The constraint matrix is totally unimodular — every square submatrix has determinant 0, +1, or -1. Total unimodularity guarantees integer vertices. This is why bipartite matching is polynomial: it reduces to LP, and the LP happens to have integer solutions.

The LP universality principle. LP can express: shortest paths, max flow, min cut, min-cost flow, bipartite matching, assignment, transportation, diet, production planning, and countless other problems. It is the "assembly language" of optimization — low-level, universal, and efficiently solvable. Recognizing when a problem is an LP in disguise is one of the most valuable skills in algorithm design.

Reductions Canvas

The canvas below shows how max-flow can be expressed as an LP. A small network is shown with edge capacities. The LP variables (one per edge) and constraints (capacity + conservation) are displayed alongside. Run the LP solver to find the max flow.

Max Flow as LP

Left: network with capacities. Right: LP formulation. Green edges = flow. Run to solve.

Click "Solve LP" to find the max flow using LP formulation.

LP as a Modeling Language

The power of LP reductions is not just theoretical. In practice, operations researchers formulate problems in LP modeling languages like AMPL, GAMS, or PuLP (Python). You describe the problem declaratively — variables, objective, constraints — and the solver handles the rest. The formulation is separated from the algorithm. This separation is LP's greatest practical strength.

python
# Max flow as LP using PuLP (Python)
from pulp import *

# Network: s->a (cap 5), s->b (cap 4), a->b (cap 2), a->t (cap 3), b->t (cap 6)
prob = LpProblem("MaxFlow", LpMaximize)

# Flow variables (one per edge)
f_sa = LpVariable("f_sa", 0, 5)  # 0 <= f <= capacity
f_sb = LpVariable("f_sb", 0, 4)
f_ab = LpVariable("f_ab", 0, 2)
f_at = LpVariable("f_at", 0, 3)
f_bt = LpVariable("f_bt", 0, 6)

# Objective: maximize total flow out of source
prob += f_sa + f_sb

# Flow conservation at a: in = out
prob += f_sa == f_ab + f_at  # flow into a = flow out of a
# Flow conservation at b: in = out
prob += f_sb + f_ab == f_bt  # flow into b = flow out of b

prob.solve(PULP_CBC_CMD(msg=0))
print(f"Max flow = {value(prob.objective)}")
print(f"f(s,a)={f_sa.varValue}, f(s,b)={f_sb.varValue}")
# Max flow = 7.0
# f(s,a)=3.0, f(s,b)=4.0

Notice how clean the formulation is. You declare variables with bounds (capacity constraints). You write the objective (maximize flow from source). You write conservation constraints (flow in = flow out at internal nodes). Then you call solve(). The solver — which internally may use simplex, interior point, or a hybrid — handles everything. You never write a single line of algorithm code.

The LP stack. In practice, LP lives in a three-layer stack: (1) Modeling language — you describe the problem (PuLP, AMPL, CVXPY). (2) Solver — converts the model to standard form and solves it (CPLEX, Gurobi, HiGHS, GLPK). (3) Algorithm — simplex or interior point. Layer 1 is the human's job. Layers 2-3 are the computer's job. This division of labor is why LP scales to problems with millions of variables.

Quick check: Why can bipartite matching be solved in polynomial time by LP relaxation, when general matching requires integer variables?

Because bipartite graphs have fewer edges than general graphs Because the constraint matrix of bipartite matching is totally unimodular, which guarantees that all vertices of the LP polytope have integer coordinates Because LP solvers round fractional solutions to the nearest integer

Chapter 7: Interior Point Methods

The simplex algorithm walks along the boundary of the feasible polytope, hopping from vertex to vertex. Interior point methods take a radically different approach: they go straight through the middle. Instead of staying on the edges, they carve a path through the interior, converging to the optimum from the inside.

The Barrier Idea

The key insight: instead of treating constraints as hard walls (simplex bounces off them), treat them as soft barriers that repel you more strongly as you approach. Add a barrier function to the objective that penalizes getting close to any constraint boundary:

maximize c^Tx + μ ∑_i ln(b_i - a_i^Tx)

The logarithmic barrier μ ∑ ln(slack_i) goes to -∞ as any slack approaches zero, creating a "force field" that keeps the solution away from the boundary. The parameter μ controls the barrier strength. As μ → 0, the barrier weakens and the solution approaches the true LP optimum.

The algorithm works by solving a sequence of barrier problems with decreasing μ. Each barrier problem is a smooth, unconstrained optimization (because the barrier prevents constraint violation) that can be solved by Newton's method. The sequence of solutions traces a curve called the central path, which converges to the LP optimum.

Think of it this way. Imagine the feasible region is a room. Simplex walks along the walls and corners. Interior point floats through the center of the room, guided by a magnetic field (the objective) while being repelled from the walls (the barrier). The room gets "smaller" as μ decreases, squeezing the path toward the optimal corner. But the algorithm always stays strictly inside — never touching a wall until the very end.

Karmarkar's Algorithm (1984)

Narendra Karmarkar published the first practical interior point method in 1984. His algorithm was the first to be both (1) polynomial-time and (2) competitive with simplex in practice. It runs in O(n^3.5L) time where L is the bit length of the input. This was a breakthrough: for the first time, there was a provably polynomial algorithm for LP that was also fast in practice.

Before Karmarkar, the only known polynomial LP algorithm was the ellipsoid method (Khachiyan, 1979). The ellipsoid method was a theoretical triumph (it proved LP is in P) but was impractical — much slower than simplex on real problems. Karmarkar's algorithm changed the game by being both theoretically polynomial and practically competitive.

Method	Year	Worst-case	Practice	Path
Simplex	1947	Exponential	Very fast	Boundary (vertices)
Ellipsoid	1979	Polynomial	Very slow	Shrinking ellipsoids
Interior point	1984	Polynomial	Fast	Central path (interior)

Simplex vs. Interior Point: When to Use Which

Modern LP solvers (CPLEX, Gurobi, HiGHS) implement both methods. Rules of thumb:

Simplex wins on small to medium problems, and when you need to solve many similar problems (warm-starting from a nearby vertex is efficient).
Interior point wins on very large, sparse problems (100,000+ variables). The per-iteration cost depends on matrix factorization, and sparse LPs factor efficiently.
Hybrid: Most solvers run both in parallel and use whichever finishes first. Interior point can also produce a "near-optimal" interior solution that simplex then "crossover" converts to a vertex.

LP is solved. Between simplex and interior point, LP is an extremely well-solved problem. Problems with millions of variables and constraints are routine. The practical bottleneck is almost always formulation (modeling the real-world problem as an LP), not solution (solving the LP once formulated). This is why LP is the workhorse of operations research, logistics, finance, and engineering.

Interior Point Canvas

The canvas below contrasts simplex (green, walks along edges) with interior point (blue, curves through the interior). Both arrive at the same optimum, but by completely different paths.

Simplex vs. Interior Point

Green = simplex path (boundary). Blue = interior point path (central path through interior). Both reach the same optimum.

μ (barrier) 80

Click "Run Both" to compare simplex and interior point paths.

The Central Path in Detail

The central path is the curve traced by the unique minimizer of the barrier problem as μ varies from ∞ to 0. At μ = ∞, the barrier dominates and the solution is the analytic center of the polytope — the point that maximizes the product of all slacks, the "most interior" point. As μ → 0, the solution slides along the central path toward the optimal vertex.

Each point on the central path satisfies the KKT conditions (Karush-Kuhn-Tucker) of the barrier problem. These conditions require:

c = A^Ty + s # dual feasibility (gradient condition) Ax = b # primal feasibility x_i s_i = μ for all i # complementarity (the barrier condition) x, s ≥ 0 # non-negativity

The third condition x_is_i = μ is the key. As μ → 0, either x_i → 0 or s_i → 0 (or both) — which is exactly complementary slackness. The central path smoothly transitions from "everything is interior" (μ large) to "complementary slackness holds" (μ = 0). Each iteration of the interior point method takes a Newton step toward the current μ-target, then decreases μ.

The convergence is remarkably fast: the number of iterations is O(√n · log(1/ε)) where n is the number of variables and ε is the desired accuracy. This is polynomial and, crucially, almost independent of the constraint coefficients. Compare this to simplex, whose iteration count depends on the geometry of the polytope.

Interior point convergence. A typical interior point solve of an LP with 10,000 variables takes 20-40 iterations. An LP with 1,000,000 variables might take 50-80 iterations. The iteration count grows as O(√n), not linearly. Each iteration is expensive (solving a linear system), but the total iteration count is remarkably small. This is why interior point methods dominate on very large problems.

Quick check: Interior point methods add a logarithmic barrier function to the objective. What happens as the barrier parameter μ approaches zero?

The barrier becomes infinitely strong, pushing the solution to the center of the polytope The barrier weakens, and the interior point solution converges to the true LP optimum at a vertex The algorithm diverges because division by zero occurs

Chapter 8: Interview Arsenal

LP rarely appears directly in coding interviews, but it appears constantly in system design, optimization, and ML interviews. The key skill is recognizing when a problem is an LP in disguise — and knowing how to formulate it. Here is your cheat sheet.

LP Formulation Patterns

Pattern	Objective	Constraints	Example
Resource allocation	Maximize profit	Resource limits	Production planning, portfolio optimization
Diet / covering	Minimize cost	Minimum requirements	Cheapest diet meeting nutrition, minimum staff coverage
Network flow	Max flow or min cost	Capacity + conservation	Shipping, routing, matching
Regression / fitting	Minimize error	None (or regularization)	L1 regression, Chebyshev approximation
Game theory	Maximize min payoff	Strategy probabilities sum to 1	Mixed-strategy Nash equilibria

Formulation Recipe

1. Identify decision variables

"What am I choosing?" — quantities, flows, allocations, probabilities. Each gets a variable x_i.

↓

2. Write the objective

"What am I optimizing?" — profit, cost, time, flow. Must be linear in x_i.

↓

3. Write constraints

"What are the limits?" — resource budgets, minimum requirements, non-negativity. Each must be linear.

↓

4. Verify linearity

No x_i·x_j products, no x_i², no |x_i| (but |x| can be linearized!), no if/then logic (use binary variables for ILP).

Coding Drill: scipy.optimize.linprog

python
from scipy.optimize import linprog

# linprog MINIMIZES c^T x subject to A_ub @ x <= b_ub
# To maximize, negate c.

# Furniture problem: max 70x1 + 50x2
c = [-70, -50]  # negate for minimization
A_ub = [
    [4, 3],   # 4x1 + 3x2 <= 240
    [3, 5],   # 3x1 + 5x2 <= 100
    [2, 4],   # 2x1 + 4x2 <= 200
]
b_ub = [240, 100, 200]
bounds = [(0, None), (0, None)]  # x1, x2 >= 0

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
print(f"x1 = {res.x[0]:.2f}, x2 = {res.x[1]:.2f}")
print(f"Optimal profit = {-res.fun:.2f}")  # negate back
# x1 = 33.33, x2 = 0.00
# Optimal profit = 2333.33

Linearization Tricks

Many "non-linear" objectives can be converted to LP:

Absolute value: Minimize |x| can be written as: minimize t, subject to t ≥ x, t ≥ -x, t ≥ 0. The variable t captures the absolute value.

Minimax: Minimize max(x₁, x₂, ..., x_n) becomes: minimize t, subject to t ≥ x_i for all i. The extra variable t acts as the "worst case."

L1 regression: Minimize ∑ |y_i - a_i^Tx| becomes an LP with 2m extra variables (one positive deviation and one negative deviation for each residual). This is why L1 regression produces sparse solutions — the LP optimal is at a vertex.

Interview Question Bank

Q1: "Design a system to optimize ad placement across web pages."

This is an LP. Variables: x_ij = fraction of time ad i appears on page j. Objective: maximize total click-through revenue. Constraints: each page has limited ad slots, each advertiser has a budget, CTR estimates provide the coefficients. In practice, solved as an LP or ILP billions of times per day by Google, Meta, etc.

Q2: "How would you formulate network routing as an optimization problem?"

Min-cost multi-commodity flow. Variables: flow of each commodity on each edge. Constraints: capacity per edge (shared), flow conservation per commodity. Objective: minimize total cost. This is an LP with network structure — specialized solvers handle billions of variables.

Q3: "What is the relationship between LP and machine learning?"

SVM (support vector machine) training is a quadratic program (QP), which is a generalization of LP. L1-regularized regression (Lasso) can be formulated as an LP. The simplex algorithm inspired the perceptron algorithm. LP duality is the foundation of Lagrangian relaxation, which appears everywhere in constrained ML. Understanding LP gives you the conceptual toolkit for all convex optimization in ML.

Q4: "An airline needs to assign crews to flights. How would you model this?"

This is a set covering / set partitioning problem, naturally modeled as an ILP: binary variable x_j = 1 if we use crew schedule j. Minimize total cost subject to: every flight is covered by at least one crew. The LP relaxation provides a lower bound on cost. In practice, airlines use column generation: start with a small set of schedules, solve the LP, use the dual prices to generate promising new schedules, and iterate. This technique solves problems with millions of potential crew schedules.

Q5: "What is the difference between LP and convex optimization?"

LP is a special case of convex optimization where both the objective and constraints are linear (technically, affine). General convex optimization allows convex objective functions (e.g., quadratic, logarithmic) and convex constraint sets (e.g., second-order cones, semidefinite cones). LP's special structure enables the simplex method (vertex-hopping), which is not applicable to general convex programs. But interior point methods work for both LP and general convex optimization — the barrier approach generalizes naturally.

Complexity Cheat Sheet

Problem	Complexity	Algorithm
LP	Polynomial (strongly)	Simplex (practice), Interior point (theory+practice)
ILP	NP-hard	Branch-and-bound, cutting planes
0/1 Knapsack	NP-hard (weakly)	DP, LP relaxation + rounding
TSP	NP-hard (strongly)	LP relaxation + cutting planes (Concorde solver)
Max Flow	Polynomial	Ford-Fulkerson, or LP
Min-Cost Flow	Polynomial	Network simplex, or LP
SVM Training	Polynomial (convex QP)	SMO, interior point

Interview readiness. For LP in interviews, you need: (1) the ability to formulate a word problem as an LP in 5 minutes, (2) knowledge that LP is polynomial-time solvable, (3) awareness of duality as a certificate, (4) the difference between LP (polynomial) and ILP (NP-hard), and (5) familiarity with scipy.optimize.linprog or a similar solver API.

Quick check: You want to minimize the maximum latency across 5 servers. Each server's latency is a linear function of the load assigned. How do you express "minimize the maximum" as an LP?

Introduce a variable t, minimize t, add constraints t ≥ latency_i for each server i. The variable t captures the worst-case latency. Minimize the sum of all latencies and divide by 5 This cannot be expressed as an LP because "max" is non-linear

Chapter 9: Connections

Linear programming sits at the crossroads of algorithms, complexity theory, and applied mathematics. Here is how it connects to the rest of the CLRS universe and beyond.

LP in the CLRS Landscape

Chapter	Connection
Ch 16: Greedy	Greedy algorithms work on problems where the LP relaxation has integer vertices (matroids, bipartite matching). The greedy-choice property is related to total unimodularity.
Ch 22-26: Graph Algorithms	Shortest paths, max flow, min cut, bipartite matching — all are special cases of LP. Specialized algorithms (Dijkstra, Ford-Fulkerson) exploit structure for speed.
Ch 34: NP-Completeness	Integer LP is NP-hard. LP relaxation + rounding is a key approximation technique. The LP relaxation gives a bound; rounding gives a feasible integer solution within a provable factor of optimal.
Ch 35: Approximation	LP-based approximation: solve the LP relaxation, round the fractional solution. Achieves constant-factor approximations for set cover, vertex cover, and many other NP-hard problems.

LP vs. Extensions

Problem	Objective	Constraints	Complexity
LP	Linear	Linear	Polynomial
ILP	Linear	Linear + integer	NP-hard
QP	Quadratic	Linear	Polynomial (convex)
Convex optimization	Convex	Convex	Polynomial
General optimization	Arbitrary	Arbitrary	NP-hard or worse

LP is the simplest member of the convex optimization family. Its special structure (linear objective + linear constraints) enables the simplex method's vertex-hopping strategy. Moving to quadratic objectives (QP) or general convex constraints adds complexity but maintains polynomial solvability. The moment you add integrality constraints (ILP) or non-convexity, the problem becomes NP-hard.

Related Lessons

CLRS Ch 16: Greedy Algorithms — When LP relaxation has integer vertices, greedy works
CLRS Ch 24: Shortest Paths — Bellman-Ford is a dual of the shortest-path LP
CLRS Ch 26: Maximum Flow — Max-flow min-cut theorem = LP strong duality

The Integer Programming Cliff

LP is polynomial. But add the constraint "all variables must be integers" and the problem becomes NP-hard. This is the integer linear programming (ILP) problem. The gap between LP and ILP is one of the sharpest complexity cliffs in all of computer science.

Why is integrality so hard? Because the feasible region is no longer a convex polytope — it is a set of isolated lattice points inside the polytope. You cannot slide smoothly from one feasible solution to another. The LP relaxation (drop the integrality constraint) gives a lower bound, but the gap between the LP relaxation and the true ILP optimum can be arbitrarily large.

In practice, ILP is solved by branch-and-bound: solve the LP relaxation, pick a fractional variable, create two sub-problems (variable ≤ floor and variable ≥ ceil), and recurse. The LP relaxation at each node provides bounds for pruning. Commercial solvers (Gurobi, CPLEX) add cutting planes — additional constraints that cut off fractional solutions without eliminating integer solutions — to tighten the relaxation. The combination of branch-and-bound with cutting planes is called branch-and-cut, and it is the dominant paradigm for solving ILP.

LP relaxation in approximation. Even when you cannot solve the ILP exactly, the LP relaxation is invaluable. Solve the LP. Round the fractional solution to integers using a clever rounding scheme. The LP optimum provides a bound on the true optimum. The rounding introduces error, but for many problems (vertex cover, set cover, facility location) the error is bounded by a constant factor. This is the basis of LP-based approximation algorithms — one of the most powerful tools in Chapter 35 of CLRS.

The Bigger Picture

LP was born in 1947 when George Dantzig invented the simplex method for military logistics during World War II. Since then, it has become one of the most practically important algorithms in existence. Every airline schedules flights with LP. Every supply chain optimizes logistics with LP. Every financial firm optimizes portfolios with LP (or its quadratic extension, QP). The US military, which funded Dantzig's original work, still uses LP-based planning tools.

The theoretical importance is equally profound. LP duality underpins the max-flow min-cut theorem, the theory of two-person zero-sum games (von Neumann's minimax theorem is equivalent to LP duality), and the entire field of combinatorial optimization. The polynomial-time solvability of LP (Khachiyan 1979, Karmarkar 1984) was a landmark in complexity theory. And the open question of whether the simplex method has a polynomial-time pivot rule remains one of the great unsolved problems in theoretical computer science.

Open Problems

Despite 75+ years of study, fundamental questions about LP remain open:

Is there a strongly polynomial simplex algorithm? The simplex method's running time depends on the magnitudes of the input numbers (it is weakly polynomial at best, exponential at worst). No one has found a pivot rule that guarantees a polynomial number of pivots independent of the input magnitudes. The Hirsch conjecture — that the diameter of any d-dimensional polytope with n facets is at most n - d — was disproved by Santos in 2010, but the polynomial Hirsch conjecture (diameter polynomial in n and d) remains open.

Is LP in NC? Can LP be solved in polylogarithmic time with polynomially many processors? LP is known to be P-complete (under logspace reductions), which suggests it cannot be efficiently parallelized. But this has not been proven.

Smoothed complexity of interior point methods. Spielman-Teng analyzed simplex under smoothed analysis. The analogous question for interior point methods — does perturbation improve their convergence rate? — is less well understood.

The LP worldview. Linear programming teaches a way of thinking: decompose a problem into an objective (what you want), variables (what you control), and constraints (what limits you). This framework applies far beyond mathematics — to engineering design, business strategy, policy analysis, and life decisions. The formal structure just makes the thinking precise and the solutions computable.