Backtracking, pruning, Sudoku, and heuristic methods. How to systematically explore vast solution spaces and how to give up gracefully with simulated annealing.
A modern computer performs billions of operations per second. That sounds like a lot, until you count how many configurations exist for even modest problems.
| Problem | n = 10 | n = 20 | n = 30 |
|---|---|---|---|
| All subsets (2n) | 1,024 | ~1 million | ~1 billion |
| All permutations (n!) | 3.6 million | ~1018 | ~1032 |
One million items means all arrangements of about 10-11 objects, but not more. One million subsets means all combinations of about 20 items, but not more. For anything significantly larger, we need to be smart about what we search.
The key insight: you almost never need to look at every configuration. With careful pruning, you can eliminate vast swaths of the search space at once, reducing impossible searches to seconds.
Backtracking is a systematic method for iterating through all possible configurations of a search space. It builds solutions incrementally, one component at a time, and abandons a partial solution as soon as it determines that it cannot lead to a valid complete solution.
We model solutions as vectors a = (a1, a2, ..., an), where each ai is chosen from a set Si of candidates. At each step, we extend the partial solution by one element. If we reach a complete solution, we process it. If the partial solution cannot possibly lead to a valid solution, we backtrack -- undo the last choice and try the next candidate.
python def backtrack(a, k, input): if is_a_solution(a, k, input): process_solution(a, k, input) else: k += 1 candidates = construct_candidates(a, k, input) for c in candidates: a[k] = c make_move(a, k, input) backtrack(a, k, input) unmake_move(a, k, input) # clean up
The framework is general. Five application-specific functions plug in to customize it: is_a_solution (are we done?), construct_candidates (what choices exist?), process_solution (output the result), make_move and unmake_move (update/restore data structures).
How do we generate all 2n subsets of {1, ..., n}? Set up a vector of n cells. The value of ai (true or false) says whether item i is in the subset. Each element doubles the possibilities, giving exactly 2n subsets.
python def all_subsets(n): def bt(a, k): if k == n: print([i+1 for i in range(n) if a[i]]) else: for val in [True, False]: a[k] = val bt(a, k + 1) bt([None] * n, 0)
Since True always comes before False, the full set is generated first and the empty set last. For {1, 2, 3}:
{1,2,3}, {1,2}, {1,3}, {1}, {2,3}, {2}, {3}, {}
Click "Step" to advance through the subset tree for {1,2,3}. Left branch = include, right = exclude. Warm = current path.
This pattern generalizes beyond subsets. The backtracking framework generates any combinatorial object by changing only the construct_candidates function. For subsets: 2 candidates (true/false). For permutations: all unused elements. For graph paths: all unvisited neighbors. Same framework, different candidates.
How many permutations of {1, ..., n}? There are n choices for the first position, n-1 for the second, n-2 for the third, and so on. Total: n! = n × (n-1) × ... × 1.
The candidates for position k are the elements not yet used in positions 1 through k-1. We track used elements with a boolean array for O(1) lookup per candidate.
python def all_permutations(n): def bt(a, k, used): if k == n: print(a[:]) else: for i in range(1, n + 1): if not used[i]: a[k] = i used[i] = True bt(a, k + 1, used) used[i] = False bt([0] * n, 0, [False] * (n + 1))
This generates permutations in lexicographic order: 123, 132, 213, 231, 312, 321. The tree has n! leaves and depth n.
Backtracking guarantees correctness by enumerating all possibilities. But most of those possibilities are garbage. Pruning means cutting off branches of the search tree that cannot lead to valid or optimal solutions.
Consider finding the optimal traveling salesman tour. We could generate all n! permutations and check each tour's cost. But suppose our partial tour starts with vertices (v1, v2) and the edge (v1, v2) doesn't even exist in the graph. The next (n-2)! permutations starting with (v1, v2) are all wasted. Much better to prune immediately.
Two pruning strategies that work broadly:
| Strategy | Idea | Effect |
|---|---|---|
| Feasibility pruning | Abandon partial solutions that violate constraints | Avoids obviously bad paths |
| Optimality pruning | Abandon partial solutions that can't beat the best known answer | Branch-and-bound style cutoff |
| Symmetry breaking | Only explore one representative from each set of symmetric configurations | Can reduce search by factor of n! or more |
| Look-ahead | Check if the remaining subproblem has any feasible extension at all | Detects dead ends early |
Sudoku is a perfect showcase for backtracking with pruning. The rules: fill a 9 × 9 grid so every row, column, and 3 × 3 box contains the digits 1-9 with no repeats.
The state space: each open square must be filled with a digit. The candidates for square (i, j) are the digits 1-9 that don't appear in row i, column j, or the box containing (i, j). We backtrack when a square has zero candidates.
Skiena tested four backtracking variants on Sudoku puzzles of varying difficulty:
| Next Square | Look-ahead? | Easy | Medium | Hard |
|---|---|---|---|---|
| Arbitrary | No | 1,904,832 | 863,305 | never finished |
| Arbitrary | Yes | 127 | 142 | 12,507,212 |
| Most constrained | No | 48 | 84 | 1,243,838 |
| Most constrained | Yes | 48 | 65 | 10,374 |
Look-ahead checks whether any other open square has zero candidates. If so, there is no way to complete the puzzle from the current state, so we backtrack immediately instead of wasting effort filling in more squares before discovering the dead end.
When the search space is too large for exhaustive search, we need heuristic methods. The simplest is random sampling (Monte Carlo): repeatedly generate random solutions, evaluate them, and keep the best.
python def random_sampling(problem, n_samples): best = random_solution(problem) best_cost = cost(best) for _ in range(n_samples): s = random_solution(problem) c = cost(s) if c < best_cost: best, best_cost = s, c return best
When does random sampling work well?
When does it fail? For TSP, the number of reasonable tours is vanishingly small compared to all permutations. Random sampling will almost never find a good tour, no matter how many samples you take.
Simulated annealing is inspired by the physical process of cooling metal. A hot material has atoms bouncing wildly; as it cools, they settle into a low-energy crystal structure. The algorithm starts "hot" (accepting random moves, even bad ones) and gradually cools (becoming increasingly picky).
python def simulated_annealing(problem, T=1.0, cooling=0.999, min_T=0.001): s = random_solution(problem) best = s while T > min_T: s_new = random_neighbor(s) delta = cost(s_new) - cost(s) if delta < 0 or random() < exp(-delta / T): s = s_new if cost(s) < cost(best): best = s T *= cooling return best
The three critical design choices: (1) the cooling schedule (how fast T decreases), (2) the neighborhood structure (what counts as a "small" move), and (3) the initial temperature (hot enough to escape any local minimum).
For TSP, a natural neighborhood is swapping two cities in the tour (a "2-opt" move). The gradient-descent version (always accept improvements, never accept worse) is called hill climbing. It is fast but gets trapped in local optima. Simulated annealing is hill climbing with strategic occasional mistakes.
| Method | Accepts Worse? | Guarantee | Speed |
|---|---|---|---|
| Random sampling | N/A (no neighbors) | None | Fast per sample, needs many |
| Hill climbing | Never | Local optimum only | Very fast |
| Simulated annealing | Sometimes (decreasing) | Converges to global (in theory) | Moderate |
| Genetic algorithms | Via crossover/mutation | None | Slow per generation |
The classic N-Queens problem: place N queens on an N × N chessboard so that no two queens attack each other (no shared row, column, or diagonal). This is a perfect backtracking problem with beautiful pruning opportunities.
Click "Solve" to watch backtracking find all solutions. Queens are placed row by row. Red = conflict detected (backtrack!).
Combinatorial search bridges the gap between exact algorithms and the hard problems of the next two chapters:
| Concept | Where It Leads |
|---|---|
| Backtracking | Chapter 8: DP caches overlapping subproblems that backtracking recomputes |
| Pruning | Branch-and-bound extends pruning with optimality bounds |
| Heuristics | Chapter 9: NP-complete problems have no exact efficient algorithm, making heuristics essential |
| Simulated annealing | Widely used for VLSI layout, protein folding, scheduling |
| Search tree = DFS | Chapter 5: Backtracking is DFS on an implicit search tree |