Simplex, duality, standard forms — the universal language of optimization.
You run a small furniture workshop. You make two products: tables and chairs. Each table earns you $70 profit. Each chair earns you $50 profit. Naturally, you want to make as much money as possible. But there are limits.
Your workshop has three constraints. You have 240 hours of carpentry time per week: each table needs 4 hours, each chair needs 3 hours. You have 100 units of lumber: each table uses 3 units, each chair uses 5 units. And your finishing department can handle at most 200 hours: each table takes 2 hours, each chair takes 4 hours. You also cannot produce negative furniture.
How many tables and chairs should you produce each week to maximize profit?
This is a linear program (LP). You have an objective function you want to maximize (profit = 70x1 + 50x2) subject to linear constraints (inequalities that limit your resources). Every term is a constant times a variable — nothing squared, no logarithms, no products of variables. Everything is linear.
Let us write the LP formally. Let x1 = number of tables, x2 = number of chairs.
In two dimensions, every inequality constraint defines a half-plane. The constraint 4x1 + 3x2 ≤ 240 means "stay below or on the line 4x1 + 3x2 = 240." The feasible region is the intersection of all these half-planes — a convex polygon.
The objective function 70x1 + 50x2 = z defines a family of parallel lines (one for each value of z). As z increases, these iso-profit lines sweep across the plane. The optimal solution is the point in the feasible region that the iso-profit line touches last as it sweeps upward. Because the region is convex and the boundary is made of straight lines, this last-touch point is always a vertex.
The canvas below shows the feasible region for our furniture problem. The shaded area is feasible. The dashed lines are iso-profit lines for the objective 70x1 + 50x2. Drag the profit slider to watch the iso-profit lines sweep across the region. The optimum is the last feasible vertex the line touches.
Shaded = feasible region. Dashed = iso-profit lines. Drag the slider to sweep the objective value. The optimum is at the last vertex touched.
For two variables, you could enumerate all vertices by finding every pair of constraint intersections, check which are feasible, and evaluate the objective at each. With 5 constraints (including non-negativity), there are at most C(5,2) = 10 intersection points. That is trivial.
But in n dimensions with m constraints, the number of vertices can be as large as C(m, n) — which is exponential in n. A problem with 100 variables and 200 constraints could have up to C(200, 100) ≈ 9 × 1058 vertices. Enumerating them all is absurd. We need an algorithm that walks from vertex to vertex, improving the objective at each step, without visiting most vertices. That algorithm is the simplex method.
An LP can have exactly three outcomes. Understanding all three prevents confusion during algorithm design.
1. Optimal solution exists. The feasible region is non-empty and the objective is bounded. There is at least one vertex where the optimum is achieved. This is the normal case.
2. Infeasible. The constraints are contradictory — no point satisfies all of them simultaneously. Example: x1 + x2 ≤ 1 and x1 + x2 ≥ 3 with x1, x2 ≥ 0. The feasible region is empty. The simplex algorithm detects this during Phase I (finding an initial feasible solution).
3. Unbounded. The feasible region extends to infinity in a direction that improves the objective. Example: maximize x1 subject to x1 - x2 ≤ 5, x1, x2 ≥ 0. You can increase x1 and x2 together forever. The simplex algorithm detects this when the minimum ratio test finds no positive coefficient — meaning you can increase the entering variable without limit.
| Outcome | Feasible Region | Objective | Simplex Detection |
|---|---|---|---|
| Optimal | Non-empty | Bounded | All objective coefficients ≤ 0 in final tableau |
| Infeasible | Empty | N/A | Phase I fails to find feasible starting point |
| Unbounded | Non-empty, unbounded in objective direction | +∞ | Entering column has no positive entries |
Before we can run simplex, we need to express the LP in a standard form that the algorithm can manipulate algebraically. That is the subject of Chapter 1.
Before we can solve an LP algorithmically, we need a canonical representation. There are many equivalent ways to write the same LP — maximizing vs. minimizing, ≤ vs. ≥ constraints, variables that can be negative vs. non-negative. The standard form picks one convention and sticks to it.
An LP is in standard form if it:
In matrix notation:
Where c is the n×1 objective vector, A is the m×n constraint matrix, b is the m×1 resource vector, and x is the n×1 decision variable vector. Our furniture LP is already in standard form: we maximize, all constraints are ≤, and x1, x2 ≥ 0.
Any LP can be converted to standard form using three transformations:
1. Minimization → Maximization. If the original LP minimizes cTx, negate the objective: maximize (-c)Tx. The optimal x is the same; the optimal value has opposite sign.
2. Equality constraints. Replace each equality aiTx = bi with two inequalities: aiTx ≤ bi and aiTx ≥ bi. The second inequality is then flipped: -aiTx ≤ -bi.
3. Unrestricted variables. If xj can be negative, replace it with xj = xj+ - xj- where both xj+, xj- ≥ 0. The difference can represent any real number.
Standard form uses inequalities. But algebraic manipulation (like the simplex algorithm) prefers equalities. We convert each inequality to an equality by introducing a slack variable.
The constraint 4x1 + 3x2 ≤ 240 becomes 4x1 + 3x2 + s1 = 240 where s1 ≥ 0 is the slack — it measures how much "room" we have left. If 4x1 + 3x2 = 200, then s1 = 40: we have 40 unused carpentry hours.
For our furniture LP, slack form is:
Now we have 5 variables (2 original + 3 slack) and 3 equality constraints. The slack variables s1, s2, s3 are the basic variables — they are expressed in terms of the non-basic variables x1, x2. Setting the non-basic variables to zero gives the basic feasible solution (BFS): x1 = 0, x2 = 0, s1 = 240, s2 = 100, s3 = 200. This corresponds to producing nothing — all resources are slack.
Let us convert a non-standard LP to standard form step by step. The original problem:
Step 1: Convert minimize to maximize. Negate the objective: maximize -2x1 + 3x2.
Step 2: Handle the equality constraint. Replace x1 + x2 = 10 with two inequalities: x1 + x2 ≤ 10 and -x1 - x2 ≤ -10.
Step 3: Flip the ≥ inequality. x1 - x2 ≥ 2 becomes -x1 + x2 ≤ -2.
Step 4: Handle unrestricted x2. Replace x2 = x2+ - x2- with x2+, x2- ≥ 0.
The result in standard form (3 variables, 3 constraints):
Note the negative RHS values (-10 and -2). This means the origin x = 0 is not feasible. The simplex algorithm needs a feasible starting point, so it uses Phase I: add artificial variables, solve an auxiliary LP to find a feasible BFS, then switch to Phase II (optimize the real objective). Most textbooks and solvers handle this automatically.
The canvas below shows the transformation from standard form (inequalities) to slack form (equalities with slack variables). Each constraint is shown with its slack variable. As you move the point (x1, x2) around the feasible region, watch the slack values change: they are zero on the boundary and positive in the interior.
Drag the point inside the feasible region. Watch the slack variables (right panel) change. At a vertex, exactly 2 slacks are zero (in 2D).
| Form | Variables | Constraints | Used By |
|---|---|---|---|
| General | n original | Mix of ≤, ≥, = | Problem statement |
| Standard | n (all ≥ 0) | All ≤, max objective | Theory, duality proofs |
| Slack | n + m (all ≥ 0) | All equalities | Simplex algorithm |
The transformation from general to standard to slack is mechanical and reversible. No information is lost. But the slack form is what the simplex algorithm actually operates on. It needs equalities so it can do algebra — solving for one variable in terms of others, pivoting variables between basic and non-basic sets.
The simplex algorithm, invented by George Dantzig in 1947, is one of the most important algorithms in all of computing. It solves linear programs by walking along the edges of the feasible polytope, moving from vertex to vertex, always improving the objective value, until it reaches the optimum.
Here is the high-level strategy. Start at some vertex (a basic feasible solution). Look at all adjacent vertices — those reachable by moving along one edge of the polytope. Pick the neighbor that improves the objective the most (or any improving neighbor). Move to it. Repeat until no neighbor is better. Because the feasible region is convex, this greedy uphill walk always finds the global optimum.
In algebraic terms, "moving to an adjacent vertex" means performing a pivot operation: one non-basic variable enters the basis, one basic variable leaves. The entering variable is chosen because increasing it improves the objective. The leaving variable is determined by the minimum ratio test — it is the basic variable that reaches zero first as we increase the entering variable.
Let us solve our furniture LP step by step. Starting slack form:
Initial BFS: x1 = 0, x2 = 0, s1 = 240, s2 = 100, s3 = 200. Objective z = 0. This is the origin — produce nothing.
Iteration 1: Choose entering variable. Look at the objective z = 70x1 + 50x2. Both coefficients are positive, so increasing either variable improves z. We pick x1 (coefficient 70 is largest). This is the largest-coefficient rule.
Minimum ratio test. How much can we increase x1 (keeping x2 = 0) before some basic variable goes negative?
The tightest constraint is s2 (lumber), giving x1 ≤ 33.3. So s2 is the leaving variable — it becomes zero. We solve the lumber equation for x1:
Now substitute this expression for x1 everywhere else:
After pivot 1: Basic: {x1, s1, s3}. Non-basic: {x2, s2}. BFS: x1 = 100/3 ≈ 33.3, x2 = 0, z ≈ 2333.3. We moved from the origin to the vertex at (33.3, 0).
Iteration 2: Choose entering variable. z = 7000/3 - 70s2/3 - 200x2/3. Both non-basic coefficients are negative. No variable can be increased to improve z. We are optimal!
The optimal solution is x1 = 100/3 ≈ 33.3 tables, x2 = 0 chairs, profit z = 7000/3 ≈ $2333.33.
The canvas below shows the simplex algorithm walking from vertex to vertex on our furniture LP. Each iteration is one edge traversal. The objective value increases at every step. Click "Step" to advance one pivot, or "Run" to animate.
Green path = simplex trajectory. Each arrow = one pivot. Orange star = current BFS. Watch the objective improve at every step.
The simplex algorithm is often implemented using a tableau — a matrix that encodes the current slack form compactly. Each row corresponds to a basic variable's equation, and the last row is the objective function. Pivoting is just Gaussian elimination on this matrix.
The initial tableau for our furniture LP looks like this:
| x1 | x2 | s1 | s2 | s3 | RHS | |
|---|---|---|---|---|---|---|
| s1 | 4 | 3 | 1 | 0 | 0 | 240 |
| s2 | 3 | 5 | 0 | 1 | 0 | 100 |
| s3 | 2 | 4 | 0 | 0 | 1 | 200 |
| z | -70 | -50 | 0 | 0 | 0 | 0 |
The objective row uses negated coefficients (for the implementation convention of maximizing). When all entries in the objective row are non-negative, we are optimal.
After pivot 1 (x1 enters, s2 leaves): we divide the s2 row by the pivot element (3) and eliminate x1 from all other rows. The result:
| x1 | x2 | s1 | s2 | s3 | RHS | |
|---|---|---|---|---|---|---|
| s1 | 0 | -7/3 | 1 | -4/3 | 0 | 106.7 |
| x1 | 1 | 5/3 | 0 | 1/3 | 0 | 33.3 |
| s3 | 0 | 2/3 | 0 | -2/3 | 1 | 133.3 |
| z | 0 | 200/3 | 0 | 70/3 | 0 | 2333.3 |
All entries in the objective row are non-negative (200/3 ≈ 66.7 and 70/3 ≈ 23.3). We are optimal. The solution: x1 = 33.3, x2 = 0, z = 2333.3.
Notice the dual solution hiding in the tableau. The objective row entries under the slack variables (s2 column = 70/3 ≈ 23.3) are the dual variables. The dual variable y2 = 23.3 is the shadow price of lumber — one more unit of lumber is worth $23.3 in additional profit. The simplex algorithm produces both primal and dual solutions simultaneously.
python import numpy as np def simplex(c, A, b): """Solve: maximize c^T x subject to Ax <= b, x >= 0. c: (n,) objective coefficients A: (m, n) constraint matrix b: (m,) resource vector (must be >= 0) Returns: optimal x, optimal value z """ m, n = A.shape # Build initial tableau: [A | I | b] with objective row [-c | 0 | 0] tab = np.zeros((m + 1, n + m + 1)) tab[:m, :n] = A # constraint coefficients tab[:m, n:n+m] = np.eye(m) # slack variables tab[:m, -1] = b # RHS tab[-1, :n] = -c # objective (negated for min) basis = list(range(n, n + m)) # initial basis = slack vars while True: # Step 2: find entering variable (most negative in obj row) obj_row = tab[-1, :-1] if np.all(obj_row >= -1e-10): break # optimal! enter = np.argmin(obj_row) # Step 3: minimum ratio test col = tab[:m, enter] rhs = tab[:m, -1] ratios = np.full(m, np.inf) pos = col > 1e-10 ratios[pos] = rhs[pos] / col[pos] if np.all(ratios == np.inf): raise ValueError("LP is unbounded") leave = np.argmin(ratios) # Step 4: pivot pivot = tab[leave, enter] tab[leave] /= pivot # normalize pivot row for i in range(m + 1): if i != leave: tab[i] -= tab[i, enter] * tab[leave] basis[leave] = enter # Extract solution x = np.zeros(n) for i, var in enumerate(basis): if var < n: x[var] = tab[i, -1] return x, tab[-1, -1] # Furniture problem c = np.array([70, 50]) A = np.array([[4, 3], [3, 5], [2, 4]]) b = np.array([240, 100, 200]) x_opt, z_opt = simplex(c, A, b) print(f"Optimal: x = {x_opt}, z = {z_opt:.2f}") # Optimal: x = [33.33 0. ], z = 2333.33
In Chapter 2, the simplex algorithm improved z at every iteration and terminated quickly. But there is a subtle pitfall: what happens when a pivot does NOT improve the objective? This occurs at degenerate vertices, and it can cause the algorithm to cycle forever.
A basic feasible solution is degenerate if one or more basic variables has value zero. Geometrically, this means more than n constraints are active (tight) at the same vertex. In 2D, a non-degenerate vertex has exactly 2 active constraints. A degenerate vertex has 3 or more constraints meeting at the same point.
Why does this cause problems? When the minimum ratio test yields a ratio of zero (because some basic variable is already at zero), the pivot moves to a new basis BUT the BFS stays at the same point in space. The objective does not improve. We changed the algebraic representation of the vertex without actually moving. If this happens repeatedly, the algorithm can cycle through a sequence of bases, all representing the same vertex, never making progress.
Bland's rule (1977) is the simplest anti-cycling rule: among all eligible entering variables (those with positive objective coefficient), always choose the one with the smallest index. Among all eligible leaving variables (those with the smallest ratio), always choose the one with the smallest index. That is it. This simple tie-breaking rule provably prevents cycling.
The proof is by contradiction: if cycling occurs, there is a smallest set of variables involved in the cycle. Bland's rule creates a contradiction within this set by showing the last variable to leave the basis could not have re-entered under the smallest-index rule.
The simplex algorithm is not polynomial in the worst case. In 1972, Klee and Minty constructed a family of LPs on which the simplex method with the largest-coefficient rule visits every vertex — an exponential number. The Klee-Minty cube in d dimensions has 2d vertices, and the algorithm visits all of them.
| Dimensions | Vertices | Simplex Pivots (worst case) |
|---|---|---|
| 2 | 4 | 4 |
| 3 | 8 | 8 |
| 10 | 1,024 | 1,024 |
| 20 | 1,048,576 | 1,048,576 |
| d | 2d | 2d |
But this is a pathological worst case. In practice, the simplex method is astonishingly fast.
In 2001, Daniel Spielman and Shang-Hua Teng introduced smoothed analysis, explaining why simplex works so well despite its exponential worst case. The idea: the Klee-Minty cube is razor-sharp. Even tiny random perturbations to the constraint coefficients destroy the pathological structure. Real-world problems are never perfectly aligned — noise, rounding, and imprecise measurements ensure that the constraints are slightly "jiggled."
Under smoothed analysis, the simplex method runs in polynomial time: O(n3 / σ4) where σ is the standard deviation of random perturbation. Spielman and Teng won the Godel Prize for this result. It explains a 50-year mystery: why does an exponential-time algorithm solve real problems so fast?
To understand cycling, consider this LP:
All three constraints are active at (2, 2): 2+2=4, 4+2=6, 2+4=6. This is a degenerate vertex. The simplex tableau at this vertex has a basic variable equal to zero. During the pivot, the minimum ratio test produces a tie (ratio = 0 for multiple basic variables). The choice of leaving variable is ambiguous.
Without Bland's rule, the algorithm might choose the wrong leaving variable, arrive at a different basis representing the same point (2,2), and repeat. The algebraic representation changes — which variables are basic vs. non-basic — but the actual solution does not move. After several such "degenerate pivots," the algorithm might return to the original basis, creating a cycle.
With Bland's rule, we always break ties by choosing the smallest-index variable. This systematic tie-breaking ensures that the algorithm eventually escapes the degenerate vertex and moves to a genuinely better vertex. The proof uses a clever potential function argument: Bland's rule ensures a lexicographic improvement even when the objective value stays flat.
The canvas below illustrates a degenerate vertex. Three constraint lines meet at the same point. The simplex algorithm can cycle between different bases at this vertex, but Bland's rule prevents it. Toggle Bland's rule on/off to see the difference.
The orange vertex is degenerate: 3 constraints active. Without Bland's rule, the algorithm can cycle. With it, it terminates.
Every linear program has a twin. If the original LP (the primal) is a maximization problem, its twin (the dual) is a minimization problem. They are mirror images, and their optimal values are always equal. This is the strong duality theorem, and it is one of the most beautiful results in optimization.
Start with the primal in standard form:
The dual is constructed by a systematic transposition:
The recipe: (1) The primal's resource vector b becomes the dual's objective. (2) The primal's objective c becomes the dual's constraint RHS. (3) The constraint matrix transposes. (4) ≤ becomes ≥. (5) Maximize becomes minimize. (6) Each primal constraint gets a dual variable yi.
For our furniture LP, the dual is:
Weak duality theorem: For any feasible primal solution x and any feasible dual solution y:
Every feasible dual solution provides an upper bound on the primal optimum. Every feasible primal solution provides a lower bound on the dual optimum. The proof is one line:
Strong duality theorem: If the primal has an optimal solution x*, then the dual also has an optimal solution y*, and:
At optimality, the primal and dual values are exactly equal. The gap closes completely. This is remarkable: two different optimization problems, one maximizing and one minimizing, arrive at exactly the same value.
The proof comes from the simplex algorithm itself. The final tableau of the simplex method provides both the optimal primal solution (the basic variables' values) and the optimal dual solution (the objective row's slack variable coefficients). Simplex solves both problems simultaneously.
Strong duality implies complementary slackness: at optimality, for each constraint, either the constraint is tight (active) or the corresponding dual variable is zero. In symbols:
If a resource is not fully used (primal slack > 0), its price must be zero (dual variable = 0). If a resource has a positive price, it must be fully consumed. This makes economic sense: scarce resources (fully consumed) have positive prices; abundant resources (slack left over) are free.
Duality has a beautiful economic interpretation. The primal problem asks: "Given limited resources, how do I maximize profit?" The dual asks the reverse: "What is the minimum price I should charge for my resources so that no production activity is profitable?"
Consider our furniture LP. The primal solution says: make 33.3 tables, 0 chairs, profit $2333. The dual solution assigns prices to resources: y1 = 0 (carpentry is abundant), y2 ≈ 23.3 (lumber is scarce, worth $23.3/unit), y3 = 0 (finishing is abundant). The total resource value: 240(0) + 100(23.3) + 200(0) = $2333. Exactly the profit.
Why is y1 = 0? Because the carpentry constraint is not tight — we use only 4(33.3) = 133 of 240 available hours. Abundant resources have zero price. Why is y2 = 23.3? Because lumber is the binding constraint. Every additional unit of lumber allows us to make 1/3 more table (since each table uses 3 lumber), gaining 70/3 ≈ $23.3 more profit.
Von Neumann's minimax theorem (1928) for two-person zero-sum games is equivalent to LP duality. Player 1 maximizes their minimum payoff (primal). Player 2 minimizes their maximum loss (dual). The minimax theorem says these values are equal — which is exactly strong duality. The mixed-strategy Nash equilibrium of a zero-sum game can be found by solving an LP.
This connection is not coincidental. Dantzig developed the simplex method partly inspired by von Neumann's game theory work. Von Neumann immediately recognized that his minimax theorem was a special case of LP duality when Dantzig showed him the simplex method in 1947. The two theories developed in parallel and are deeply intertwined.
The canvas below shows the primal and dual problems side by side. As you drag the primal solution toward optimality, the dual bound decreases to meet it. At the optimal solution, the two values are equal — the duality gap is zero.
Left: primal feasible region with objective value. Right: dual bound. Watch them converge at the optimum. Drag the slider to move along the simplex path.
This is the payoff. Below is a fully interactive LP solver. You define the problem — adjust constraint lines, change the objective — and watch the simplex algorithm solve it in real time. You will see the feasible region reshape, the simplex path change, and the dual solution appear alongside.
The LP has two variables (x1, x2). You can:
Drag constraints to reshape the feasible region. Adjust c1, c2 to change the objective. Run simplex to solve.
Diet Problem (1947). The original LP application. George Stigler asked: what is the cheapest diet that meets all nutritional requirements? Minimize cost subject to minimum nutrient constraints. Here, we maximize "nutrition score" subject to budget and calorie limits — a simplified version.
Production Planning. Our furniture workshop with different parameters. Adjust b1, b2, b3 to see how changing resource availability shifts the optimum.
Transport. A simplified transportation problem: ship goods from two factories to meet demand, minimizing shipping cost. Formulated as an LP with supply and demand constraints.
One of the most powerful uses of LP is sensitivity analysis — understanding how the optimal solution changes when the problem data changes. In practice, the constraint coefficients and RHS values are never perfectly known. Material costs fluctuate. Demand shifts. Resources become scarce or abundant. A good decision-maker needs to know: how sensitive is my plan to these changes?
The dual variables provide this directly. The dual variable yi* for constraint i is the shadow price — it tells you how much the optimal objective would improve if you increased bi by one unit. If the lumber constraint has shadow price y2* = 23.3, then one more unit of lumber is worth $23.3 in additional profit. This is exactly what the dual variables compute.
Try it in the simulation above. Increase b2 (lumber) from 100 to 105 and note the change in z*. The change should be approximately 5 × y2*. This relationship holds exactly within a sensitivity range — the interval of bi values over which the current basis remains optimal.
As you drag b1, b2, b3, you may encounter configurations where three constraint lines pass through the same point. This is degeneracy — the vertex where the optimum sits has more active constraints than strictly necessary. The optimal solution is still correct, but the shadow prices may not be unique: multiple dual solutions exist, each providing different (but equally valid) shadow prices. In practice, commercial solvers handle this gracefully.
Try setting b1 = 240, b2 = 100, b3 = 200. The optimum is at x1 ≈ 33.3, x2 = 0, where only the lumber constraint is tight. Now slowly decrease b1 until it also becomes tight. At that moment, you have a degenerate vertex — two constraints active at the optimum. The feasible region changes shape, but the optimum may not move until one constraint "takes over" from another.
Linear programming is not just one optimization problem. It is a universal language for optimization. An enormous number of seemingly different problems — shortest paths, maximum flow, minimum-cost flow, bipartite matching — can all be expressed as linear programs. If you can formulate a problem as an LP, you can solve it with any LP solver.
The single-source shortest path problem: find the shortest path from source s to every other vertex in a weighted graph. This is Bellman-Ford (Chapter 24), but it can also be written as an LP.
For each vertex v, let dv be the distance from s to v. The LP is:
Wait — maximize? That seems backward. The trick: the triangle inequality constraints force dv ≤ shortest path length. Maximizing pushes each dv up to the tightest constraint, which is exactly the shortest path. It is a beautiful application of LP duality: the "maximize subject to upper bounds" formulation naturally computes the tightest (shortest) distances.
The max-flow problem (CLRS Chapter 26): maximize flow from source s to sink t through a network with edge capacities. Each edge (u,v) has capacity c(u,v). Flow f(u,v) must satisfy:
This is a linear program. The variables are edge flows f(u,v). The constraints are linear. Ford-Fulkerson, Edmonds-Karp, push-relabel — all specialized algorithms for this LP. They are faster than general simplex because they exploit the network structure. But any LP solver can solve max-flow.
Min-cost flow generalizes both shortest paths and max-flow. Each edge has a capacity AND a per-unit cost. The LP: minimize total cost while shipping a required amount of flow from s to t. Both shortest path and max-flow are special cases — shortest path has unit flow, max-flow has zero costs.
Given a bipartite graph, find the maximum matching (largest set of edges with no shared vertices). The LP relaxation:
Normally, matching requires xuv ∈ {0, 1} (integer), making it an ILP. But for bipartite graphs, a miracle occurs: the LP relaxation always has an integer optimal solution. The constraint matrix is totally unimodular — every square submatrix has determinant 0, +1, or -1. Total unimodularity guarantees integer vertices. This is why bipartite matching is polynomial: it reduces to LP, and the LP happens to have integer solutions.
The canvas below shows how max-flow can be expressed as an LP. A small network is shown with edge capacities. The LP variables (one per edge) and constraints (capacity + conservation) are displayed alongside. Run the LP solver to find the max flow.
Left: network with capacities. Right: LP formulation. Green edges = flow. Run to solve.
The power of LP reductions is not just theoretical. In practice, operations researchers formulate problems in LP modeling languages like AMPL, GAMS, or PuLP (Python). You describe the problem declaratively — variables, objective, constraints — and the solver handles the rest. The formulation is separated from the algorithm. This separation is LP's greatest practical strength.
python # Max flow as LP using PuLP (Python) from pulp import * # Network: s->a (cap 5), s->b (cap 4), a->b (cap 2), a->t (cap 3), b->t (cap 6) prob = LpProblem("MaxFlow", LpMaximize) # Flow variables (one per edge) f_sa = LpVariable("f_sa", 0, 5) # 0 <= f <= capacity f_sb = LpVariable("f_sb", 0, 4) f_ab = LpVariable("f_ab", 0, 2) f_at = LpVariable("f_at", 0, 3) f_bt = LpVariable("f_bt", 0, 6) # Objective: maximize total flow out of source prob += f_sa + f_sb # Flow conservation at a: in = out prob += f_sa == f_ab + f_at # flow into a = flow out of a # Flow conservation at b: in = out prob += f_sb + f_ab == f_bt # flow into b = flow out of b prob.solve(PULP_CBC_CMD(msg=0)) print(f"Max flow = {value(prob.objective)}") print(f"f(s,a)={f_sa.varValue}, f(s,b)={f_sb.varValue}") # Max flow = 7.0 # f(s,a)=3.0, f(s,b)=4.0
Notice how clean the formulation is. You declare variables with bounds (capacity constraints). You write the objective (maximize flow from source). You write conservation constraints (flow in = flow out at internal nodes). Then you call solve(). The solver — which internally may use simplex, interior point, or a hybrid — handles everything. You never write a single line of algorithm code.
The simplex algorithm walks along the boundary of the feasible polytope, hopping from vertex to vertex. Interior point methods take a radically different approach: they go straight through the middle. Instead of staying on the edges, they carve a path through the interior, converging to the optimum from the inside.
The key insight: instead of treating constraints as hard walls (simplex bounces off them), treat them as soft barriers that repel you more strongly as you approach. Add a barrier function to the objective that penalizes getting close to any constraint boundary:
The logarithmic barrier μ ∑ ln(slacki) goes to -∞ as any slack approaches zero, creating a "force field" that keeps the solution away from the boundary. The parameter μ controls the barrier strength. As μ → 0, the barrier weakens and the solution approaches the true LP optimum.
The algorithm works by solving a sequence of barrier problems with decreasing μ. Each barrier problem is a smooth, unconstrained optimization (because the barrier prevents constraint violation) that can be solved by Newton's method. The sequence of solutions traces a curve called the central path, which converges to the LP optimum.
Narendra Karmarkar published the first practical interior point method in 1984. His algorithm was the first to be both (1) polynomial-time and (2) competitive with simplex in practice. It runs in O(n3.5L) time where L is the bit length of the input. This was a breakthrough: for the first time, there was a provably polynomial algorithm for LP that was also fast in practice.
Before Karmarkar, the only known polynomial LP algorithm was the ellipsoid method (Khachiyan, 1979). The ellipsoid method was a theoretical triumph (it proved LP is in P) but was impractical — much slower than simplex on real problems. Karmarkar's algorithm changed the game by being both theoretically polynomial and practically competitive.
| Method | Year | Worst-case | Practice | Path |
|---|---|---|---|---|
| Simplex | 1947 | Exponential | Very fast | Boundary (vertices) |
| Ellipsoid | 1979 | Polynomial | Very slow | Shrinking ellipsoids |
| Interior point | 1984 | Polynomial | Fast | Central path (interior) |
Modern LP solvers (CPLEX, Gurobi, HiGHS) implement both methods. Rules of thumb:
The canvas below contrasts simplex (green, walks along edges) with interior point (blue, curves through the interior). Both arrive at the same optimum, but by completely different paths.
Green = simplex path (boundary). Blue = interior point path (central path through interior). Both reach the same optimum.
The central path is the curve traced by the unique minimizer of the barrier problem as μ varies from ∞ to 0. At μ = ∞, the barrier dominates and the solution is the analytic center of the polytope — the point that maximizes the product of all slacks, the "most interior" point. As μ → 0, the solution slides along the central path toward the optimal vertex.
Each point on the central path satisfies the KKT conditions (Karush-Kuhn-Tucker) of the barrier problem. These conditions require:
The third condition xisi = μ is the key. As μ → 0, either xi → 0 or si → 0 (or both) — which is exactly complementary slackness. The central path smoothly transitions from "everything is interior" (μ large) to "complementary slackness holds" (μ = 0). Each iteration of the interior point method takes a Newton step toward the current μ-target, then decreases μ.
The convergence is remarkably fast: the number of iterations is O(√n · log(1/ε)) where n is the number of variables and ε is the desired accuracy. This is polynomial and, crucially, almost independent of the constraint coefficients. Compare this to simplex, whose iteration count depends on the geometry of the polytope.
LP rarely appears directly in coding interviews, but it appears constantly in system design, optimization, and ML interviews. The key skill is recognizing when a problem is an LP in disguise — and knowing how to formulate it. Here is your cheat sheet.
| Pattern | Objective | Constraints | Example |
|---|---|---|---|
| Resource allocation | Maximize profit | Resource limits | Production planning, portfolio optimization |
| Diet / covering | Minimize cost | Minimum requirements | Cheapest diet meeting nutrition, minimum staff coverage |
| Network flow | Max flow or min cost | Capacity + conservation | Shipping, routing, matching |
| Regression / fitting | Minimize error | None (or regularization) | L1 regression, Chebyshev approximation |
| Game theory | Maximize min payoff | Strategy probabilities sum to 1 | Mixed-strategy Nash equilibria |
python from scipy.optimize import linprog # linprog MINIMIZES c^T x subject to A_ub @ x <= b_ub # To maximize, negate c. # Furniture problem: max 70x1 + 50x2 c = [-70, -50] # negate for minimization A_ub = [ [4, 3], # 4x1 + 3x2 <= 240 [3, 5], # 3x1 + 5x2 <= 100 [2, 4], # 2x1 + 4x2 <= 200 ] b_ub = [240, 100, 200] bounds = [(0, None), (0, None)] # x1, x2 >= 0 res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds) print(f"x1 = {res.x[0]:.2f}, x2 = {res.x[1]:.2f}") print(f"Optimal profit = {-res.fun:.2f}") # negate back # x1 = 33.33, x2 = 0.00 # Optimal profit = 2333.33
Many "non-linear" objectives can be converted to LP:
Absolute value: Minimize |x| can be written as: minimize t, subject to t ≥ x, t ≥ -x, t ≥ 0. The variable t captures the absolute value.
Minimax: Minimize max(x1, x2, ..., xn) becomes: minimize t, subject to t ≥ xi for all i. The extra variable t acts as the "worst case."
L1 regression: Minimize ∑ |yi - aiTx| becomes an LP with 2m extra variables (one positive deviation and one negative deviation for each residual). This is why L1 regression produces sparse solutions — the LP optimal is at a vertex.
Q1: "Design a system to optimize ad placement across web pages."
This is an LP. Variables: xij = fraction of time ad i appears on page j. Objective: maximize total click-through revenue. Constraints: each page has limited ad slots, each advertiser has a budget, CTR estimates provide the coefficients. In practice, solved as an LP or ILP billions of times per day by Google, Meta, etc.
Q2: "How would you formulate network routing as an optimization problem?"
Min-cost multi-commodity flow. Variables: flow of each commodity on each edge. Constraints: capacity per edge (shared), flow conservation per commodity. Objective: minimize total cost. This is an LP with network structure — specialized solvers handle billions of variables.
Q3: "What is the relationship between LP and machine learning?"
SVM (support vector machine) training is a quadratic program (QP), which is a generalization of LP. L1-regularized regression (Lasso) can be formulated as an LP. The simplex algorithm inspired the perceptron algorithm. LP duality is the foundation of Lagrangian relaxation, which appears everywhere in constrained ML. Understanding LP gives you the conceptual toolkit for all convex optimization in ML.
Q4: "An airline needs to assign crews to flights. How would you model this?"
This is a set covering / set partitioning problem, naturally modeled as an ILP: binary variable xj = 1 if we use crew schedule j. Minimize total cost subject to: every flight is covered by at least one crew. The LP relaxation provides a lower bound on cost. In practice, airlines use column generation: start with a small set of schedules, solve the LP, use the dual prices to generate promising new schedules, and iterate. This technique solves problems with millions of potential crew schedules.
Q5: "What is the difference between LP and convex optimization?"
LP is a special case of convex optimization where both the objective and constraints are linear (technically, affine). General convex optimization allows convex objective functions (e.g., quadratic, logarithmic) and convex constraint sets (e.g., second-order cones, semidefinite cones). LP's special structure enables the simplex method (vertex-hopping), which is not applicable to general convex programs. But interior point methods work for both LP and general convex optimization — the barrier approach generalizes naturally.
| Problem | Complexity | Algorithm |
|---|---|---|
| LP | Polynomial (strongly) | Simplex (practice), Interior point (theory+practice) |
| ILP | NP-hard | Branch-and-bound, cutting planes |
| 0/1 Knapsack | NP-hard (weakly) | DP, LP relaxation + rounding |
| TSP | NP-hard (strongly) | LP relaxation + cutting planes (Concorde solver) |
| Max Flow | Polynomial | Ford-Fulkerson, or LP |
| Min-Cost Flow | Polynomial | Network simplex, or LP |
| SVM Training | Polynomial (convex QP) | SMO, interior point |
Linear programming sits at the crossroads of algorithms, complexity theory, and applied mathematics. Here is how it connects to the rest of the CLRS universe and beyond.
| Chapter | Connection |
|---|---|
| Ch 16: Greedy | Greedy algorithms work on problems where the LP relaxation has integer vertices (matroids, bipartite matching). The greedy-choice property is related to total unimodularity. |
| Ch 22-26: Graph Algorithms | Shortest paths, max flow, min cut, bipartite matching — all are special cases of LP. Specialized algorithms (Dijkstra, Ford-Fulkerson) exploit structure for speed. |
| Ch 34: NP-Completeness | Integer LP is NP-hard. LP relaxation + rounding is a key approximation technique. The LP relaxation gives a bound; rounding gives a feasible integer solution within a provable factor of optimal. |
| Ch 35: Approximation | LP-based approximation: solve the LP relaxation, round the fractional solution. Achieves constant-factor approximations for set cover, vertex cover, and many other NP-hard problems. |
| Problem | Objective | Constraints | Complexity |
|---|---|---|---|
| LP | Linear | Linear | Polynomial |
| ILP | Linear | Linear + integer | NP-hard |
| QP | Quadratic | Linear | Polynomial (convex) |
| Convex optimization | Convex | Convex | Polynomial |
| General optimization | Arbitrary | Arbitrary | NP-hard or worse |
LP is the simplest member of the convex optimization family. Its special structure (linear objective + linear constraints) enables the simplex method's vertex-hopping strategy. Moving to quadratic objectives (QP) or general convex constraints adds complexity but maintains polynomial solvability. The moment you add integrality constraints (ILP) or non-convexity, the problem becomes NP-hard.
LP is polynomial. But add the constraint "all variables must be integers" and the problem becomes NP-hard. This is the integer linear programming (ILP) problem. The gap between LP and ILP is one of the sharpest complexity cliffs in all of computer science.
Why is integrality so hard? Because the feasible region is no longer a convex polytope — it is a set of isolated lattice points inside the polytope. You cannot slide smoothly from one feasible solution to another. The LP relaxation (drop the integrality constraint) gives a lower bound, but the gap between the LP relaxation and the true ILP optimum can be arbitrarily large.
In practice, ILP is solved by branch-and-bound: solve the LP relaxation, pick a fractional variable, create two sub-problems (variable ≤ floor and variable ≥ ceil), and recurse. The LP relaxation at each node provides bounds for pruning. Commercial solvers (Gurobi, CPLEX) add cutting planes — additional constraints that cut off fractional solutions without eliminating integer solutions — to tighten the relaxation. The combination of branch-and-bound with cutting planes is called branch-and-cut, and it is the dominant paradigm for solving ILP.
LP was born in 1947 when George Dantzig invented the simplex method for military logistics during World War II. Since then, it has become one of the most practically important algorithms in existence. Every airline schedules flights with LP. Every supply chain optimizes logistics with LP. Every financial firm optimizes portfolios with LP (or its quadratic extension, QP). The US military, which funded Dantzig's original work, still uses LP-based planning tools.
The theoretical importance is equally profound. LP duality underpins the max-flow min-cut theorem, the theory of two-person zero-sum games (von Neumann's minimax theorem is equivalent to LP duality), and the entire field of combinatorial optimization. The polynomial-time solvability of LP (Khachiyan 1979, Karmarkar 1984) was a landmark in complexity theory. And the open question of whether the simplex method has a polynomial-time pivot rule remains one of the great unsolved problems in theoretical computer science.
Despite 75+ years of study, fundamental questions about LP remain open:
Is there a strongly polynomial simplex algorithm? The simplex method's running time depends on the magnitudes of the input numbers (it is weakly polynomial at best, exponential at worst). No one has found a pivot rule that guarantees a polynomial number of pivots independent of the input magnitudes. The Hirsch conjecture — that the diameter of any d-dimensional polytope with n facets is at most n - d — was disproved by Santos in 2010, but the polynomial Hirsch conjecture (diameter polynomial in n and d) remains open.
Is LP in NC? Can LP be solved in polylogarithmic time with polynomially many processors? LP is known to be P-complete (under logspace reductions), which suggests it cannot be efficiently parallelized. But this has not been proven.
Smoothed complexity of interior point methods. Spielman-Teng analyzed simplex under smoothed analysis. The analogous question for interior point methods — does perturbation improve their convergence rate? — is less well understood.
"The world is convex." — Stephen Boyd, 2004