Workbook — Introduction to Algorithms (CLRS)

CLRS Algorithms

Asymptotic analysis, recurrences, sorting, hashing, BSTs, DP, greedy, graphs, shortest paths, MSTs — 56 exercises with instant feedback covering every algorithm you need for technical interviews.

Prerequisites: Basic algebra + Logarithms + Arrays & loops. That's it.
10
Chapters
56
Exercises
5
Exercise Types
Mastery
0 / 56 exercises (0%)
0
Day Streak
Best: 0

Chapter 0: Asymptotic Analysis

You have two algorithms for the same problem. Algorithm A runs in 3n² + 5n + 100 operations. Algorithm B runs in 0.01 · 2n operations. For n=10, A does 450 operations and B does about 10. B looks faster. But for n=50, A does 7,850 operations while B does over 10 trillion. Asymptotic analysis strips away the constants and lower-order terms so you can see which algorithm survives at scale.

Big-O (upper bound): f(n) = O(g(n)) means ∃ c > 0, n0 ≥ 0 such that f(n) ≤ c · g(n) for all n ≥ n0
Big-Ω (lower bound): f(n) = Ω(g(n)) means ∃ c > 0, n0 ≥ 0 such that f(n) ≥ c · g(n) for all n ≥ n0
Big-Θ (tight bound): f(n) = Θ(g(n)) means f(n) = O(g(n)) AND f(n) = Ω(g(n))
The intuition. O(g(n)) means "grows no faster than g(n)." Ω(g(n)) means "grows at least as fast as g(n)." Θ(g(n)) means "grows at exactly the same rate as g(n)." When someone says an algorithm is "O(n log n)," they mean the runtime grows at most as fast as n log n. If they say Θ(n log n), they mean it grows at exactly that rate — not faster, not slower.
Exercise 0.1: Is 3n² + 5n = O(n²)? Trace
To prove 3n² + 5n = O(n²), you need constants c and n0 such that 3n² + 5n ≤ c · n² for all n ≥ n0. Which of these works?
Show reasoning

For n ≥ 5, we have 5n ≤ n² (since 5 ≤ n). Therefore 3n² + 5n ≤ 3n² + n² = 4n². So c = 4, n0 = 5 is a valid witness. Option (a) fails because c = 3 is too small to absorb the 5n term. Option (c) uses n0 = 0, but we need the inequality to hold for all n ≥ n0, not just at n0 = 0.

Exercise 0.2: Rank by Growth Rate Design

Arrange these functions from slowest growth (left) to fastest growth (right).

?
?
?
?
?
?
log n n n log n 2n n!
Show ordering
log n < n < n log n < n² < 2n < n!

This is the fundamental hierarchy. Logarithmic < linear < linearithmic < quadratic < exponential < factorial. Every algorithm designer needs this ordering burned into memory.

Exercise 0.3: Which Is Faster for n = 1000? Derive

Algorithm A: 100n log2 n operations. Algorithm B: 2n² operations. For n = 1000, which does fewer operations? Enter the operation count for algorithm A (in thousands).

thousand operations (A)
Show derivation
A: 100 × 1000 × log2(1000) = 100,000 × 9.97 ≈ 997,000
B: 2 × 1000² = 2,000,000

A ≈ 997K operations, B = 2000K operations. A is about 2× faster despite the 100× constant factor, because n log n grows slower than n². The crossover point is around n ≈ 54: for n < 54, B is faster (smaller constant), and for n > 54, A wins.

Exercise 0.4: Tight Bound Trace
What is the tightest asymptotic bound for f(n) = 5n³ + 2n² + 7n + 13?
Show reasoning

The dominant term is 5n³. Upper bound: 5n³ + 2n² + 7n + 13 ≤ 5n³ + 2n³ + 7n³ + 13n³ = 27n³ for n ≥ 1, so it is O(n³). Lower bound: 5n³ + 2n² + 7n + 13 ≥ 5n³ for all n ≥ 0, so it is Ω(n³). Together: Θ(n³).

Exercise 0.5: Implement isBigO() Build

Write a function that empirically tests whether f(n) = O(g(n)) by checking if f(n)/g(n) stays bounded as n grows. Evaluate f and g at n = 100, 1000, 10000, 100000. If the ratio f(n)/g(n) is decreasing or bounded (max ratio < 1000), return true.

Return a boolean. The ratio should stay bounded, not grow without limit.
Show solution
javascript
function isBigO(f, g) {
  const ns = [100, 1000, 10000, 100000];
  let maxRatio = 0;
  for (const n of ns) {
    const ratio = f(n) / g(n);
    if (ratio > maxRatio) maxRatio = ratio;
  }
  return maxRatio < 1000;
}
Exercise 0.6: Max Operations in 1 Second Derive

A computer does 109 operations per second. What is the largest n you can handle in 1 second with an O(n²) algorithm?

max n (approximate)
Show derivation
n² ≤ 109
n ≤ √(109) = 104.5 ≈ 31,623

For O(n log n): n ≈ 4 × 107. For O(n³): n ≈ 1000. For O(2n): n ≈ 30. This is why complexity class matters: the difference between n² and n³ is the difference between handling 31K elements vs 1K elements in a second.

Chapter 1: Recurrences

Divide-and-conquer algorithms split problems into subproblems, solve each recursively, then combine the results. The runtime of such algorithms follows a recurrence relation. The Master Theorem gives you the answer instantly for a wide class of recurrences — no guessing, no induction proofs needed.

Master Theorem: For T(n) = aT(n/b) + Θ(nd) where a ≥ 1, b > 1, d ≥ 0:

Case 1: If d < logb a  →  T(n) = Θ(nlogb a)   // leaves dominate
Case 2: If d = logb a  →  T(n) = Θ(nd log n)   // balanced
Case 3: If d > logb a  →  T(n) = Θ(nd)   // root dominates
The intuition. At each level, you do nd work and create a subproblems of size n/b. The number of leaves in the recursion tree is alogb n = nlogb a. The Master Theorem compares the work at the root (nd) to the number of leaves (nlogb a). Whichever grows faster dominates.
Exercise 1.1: Merge Sort Recurrence Trace
T(n) = 2T(n/2) + Θ(n). Here a = 2, b = 2, d = 1. What is logb a, and which case applies?
Show reasoning

a = 2, b = 2, d = 1. log2 2 = 1 = d. This is Case 2, so T(n) = Θ(n1 log n) = Θ(n log n). The tree has log2 n levels, each doing Θ(n) total merge work, for a total of Θ(n log n). This is the classic merge sort runtime.

Exercise 1.2: T(n) = 4T(n/2) + n² Derive

Apply the Master Theorem: a = 4, b = 2, d = 2. Compute logb a and determine T(n).

log2 4 = ?
Show derivation
a = 4, b = 2, d = 2
log2 4 = 2 = d

Since d = logb a, this is Case 2: T(n) = Θ(n² log n).

At each level, the total work quadruples (4 subproblems) but each subproblem is half the size (squared input = quarter the work). These effects exactly balance, giving log n levels of Θ(n²) work each.

Exercise 1.3: T(n) = 9T(n/3) + n Trace
a = 9, b = 3, d = 1. log3 9 = 2 > 1 = d. Which case and what is T(n)?
Show reasoning

log3 9 = 2 > 1 = d, so Case 1 applies. The leaves dominate. There are 9log3 n = n2 leaves, each doing Θ(1) work. The combine work at each level (nd = n) is overwhelmed by the explosive branching. T(n) = Θ(n²).

Exercise 1.4: Merge Sort Levels for n = 16 Derive

Merge sort splits n = 16 elements in half at each level. How many levels of recursion are there, and what is the total merge work across all levels?

total merge comparisons (approx)
Show derivation
Levels = log2(16) = 4
Level 0: merge 1 pair of size 16 → 16 comparisons
Level 1: merge 2 pairs of size 8 → 16 comparisons
Level 2: merge 4 pairs of size 4 → 16 comparisons
Level 3: merge 8 pairs of size 2 → 16 comparisons
Total = 4 × 16 = 64 ≈ n log2 n = 16 × 4 = 64

Each level does exactly n comparisons total (each element participates in one merge). With log2 n levels, the total is n log2 n. This is the cleanest proof of merge sort's Θ(n log n) complexity.

Exercise 1.5: Binary Search Recurrence Derive

Binary search: T(n) = T(n/2) + Θ(1). What are a, b, d? Apply the Master Theorem.

log2 a = ?
Show derivation
a = 1, b = 2, d = 0
log2 1 = 0 = d

Case 2: T(n) = Θ(n0 log n) = Θ(log n). One subproblem, half the size, constant work per level. The tree has log2 n levels, each with Θ(1) work.

Exercise 1.6: Strassen's Matrix Multiply Trace
Strassen: T(n) = 7T(n/2) + Θ(n²). a = 7, b = 2, d = 2. log2 7 ≈ 2.81. Which case?
Show reasoning

d = 2 < log2 7 ≈ 2.807, so Case 1 applies. T(n) = Θ(nlog2 7) ≈ Θ(n2.81). Strassen reduces the exponent from 3 (naive) to 2.81 by doing 7 recursive multiplications instead of 8. The savings become significant for large matrices.

Chapter 2: Sorting

Sorting is the most studied problem in computer science. Comparison-based sorts cannot do better than Θ(n log n), but non-comparison sorts like counting sort can achieve Θ(n + k) by exploiting the structure of the input. These exercises test whether you can trace each algorithm by hand and understand why each step happens.

Exercise 2.1: Quicksort Partition Derive

Trace Lomuto partition on [3, 8, 2, 5, 1, 4, 7, 6] with pivot = last element (6). The partition places all elements ≤ pivot before it and all elements > pivot after. After partitioning, what is the index (0-based) of the pivot?

pivot final index
Show trace

Lomuto partition: i starts at lo - 1 = -1, j scans from lo to hi - 1. If A[j] ≤ pivot, increment i and swap A[i] with A[j].

Initial: [3, 8, 2, 5, 1, 4, 7, 6]   pivot = A[7] = 6, i = -1
j=0: A[0]=3 ≤ 6 → i=0, swap(0,0) → [3, 8, 2, 5, 1, 4, 7, 6]
j=1: A[1]=8 > 6 → skip
j=2: A[2]=2 ≤ 6 → i=1, swap(1,2) → [3, 2, 8, 5, 1, 4, 7, 6]
j=3: A[3]=5 ≤ 6 → i=2, swap(2,3) → [3, 2, 5, 8, 1, 4, 7, 6]
j=4: A[4]=1 ≤ 6 → i=3, swap(3,4) → [3, 2, 5, 1, 8, 4, 7, 6]
j=5: A[5]=4 ≤ 6 → i=4, swap(4,5) → [3, 2, 5, 1, 4, 8, 7, 6]
j=6: A[6]=7 > 6 → skip
Final: swap A[i+1]=A[5] with A[7] (pivot) → [3, 2, 5, 1, 4, 6, 7, 8]
Pivot index = i + 1 = 5

Five elements [3,2,5,1,4] are all ≤ 6, placed at indices 0-4. Pivot 6 goes to index 5. Elements [7,8] are at indices 6-7. The partition made 7 comparisons (one per element except the pivot).

Exercise 2.2: Quicksort Comparisons Derive

In the partition above (n = 8 elements), the pivot is compared to every other element exactly once. How many comparisons does the partition step perform?

comparisons
Show derivation
Comparisons = n - 1 = 8 - 1 = 7

Lomuto partition compares each element (except the pivot itself) to the pivot exactly once. For n elements, that is n - 1 comparisons per partition call. In the best case (balanced splits), quicksort does Θ(n log n) total. In the worst case (already sorted + last-element pivot), the splits are maximally unbalanced: n-1, n-2, ..., 1 comparisons = n(n-1)/2 = Θ(n²).

Exercise 2.3: Counting Sort Derive

Run counting sort on A = [4, 2, 2, 8, 3, 3, 1]. Values range from 0 to 8, so the count array has 9 slots. After building the count array (before prefix sums), what is count[3]?

count[3]
Show trace
A = [4, 2, 2, 8, 3, 3, 1]
count[0]=0, count[1]=1, count[2]=2, count[3]=2, count[4]=1, count[5]=0, count[6]=0, count[7]=0, count[8]=1

count[3] = 2 because the value 3 appears twice in A. After computing prefix sums, count[i] tells the position where element i should go in the output. Counting sort runs in Θ(n + k) where k is the range of values. It is NOT comparison-based, so it bypasses the Ω(n log n) lower bound.

Exercise 2.4: Implement partition() Build

Implement Lomuto partition. Use the last element as pivot. Return the final index of the pivot.

Modify arr in-place and return the pivot index.
Show solution
javascript
function partition(arr, lo, hi) {
  const pivot = arr[hi];
  let i = lo - 1;
  for (let j = lo; j < hi; j++) {
    if (arr[j] <= pivot) {
      i++;
      [arr[i], arr[j]] = [arr[j], arr[i]];
    }
  }
  [arr[i + 1], arr[hi]] = [arr[hi], arr[i + 1]];
  return i + 1;
}
Exercise 2.5: Find the Bug in Merge Sort Debug

This merge function produces wrong results on some inputs. Click the buggy line.

function merge(arr, lo, mid, hi) {
  let left = arr.slice(lo, mid + 1);
  let right = arr.slice(mid + 1, hi + 1);
  let i = 0, j = 0, k = lo;
  while (i < left.length && j < right.length) {
    if (left[i] < right[j]) {
      arr[k++] = left[i++];
    } else {
      arr[k++] = right[j++];
    }
  }
  while (i < left.length) arr[k++] = left[i++];
}
Show explanation

Line 13 is the bug — there is no loop to copy remaining elements from the right subarray. When all left elements are placed but right elements remain, those right elements are lost. The fix: add while (j < right.length) arr[k++] = right[j++]; before the closing brace.

Exercise 2.6: Comparison-Based Lower Bound Trace
Any comparison-based sort must make at least ⌈log2(n!)⌉ comparisons in the worst case. For n = 8, what is this lower bound?
Show derivation
8! = 40,320
log2(40,320) ≈ 15.30
⌈15.30⌉ = 16

The decision tree for sorting n elements has n! leaves (one per permutation). A binary tree with n! leaves has height at least ⌈log2(n!)⌉. By Stirling's approximation: log2(n!) ≈ n log2 n - n log2 e ≈ Θ(n log n). This proves the Ω(n log n) lower bound for comparison sorts.

Chapter 3: Hash Tables

Hash tables give O(1) average-case lookup, insert, and delete. But that "average" hides important details: collision resolution, load factor, and the gap between theory and practice. These exercises test whether you can trace hash operations by hand and predict performance under load.

Exercise 3.1: Division Method Hashing Derive

Hash function h(k) = k mod 13. Insert keys in order: [5, 28, 19, 15, 20, 33, 12, 17] into a table of size 13 (indices 0-12) with chaining. How many keys hash to slot 7?

keys in slot 7
Show trace
h(5) = 5 mod 13 = 5
h(28) = 28 mod 13 = 2
h(19) = 19 mod 13 = 6
h(15) = 15 mod 13 = 2   // collision with 28
h(20) = 20 mod 13 = 7
h(33) = 33 mod 13 = 7   // collision with 20
h(12) = 12 mod 13 = 12
h(17) = 17 mod 13 = 4

Slot 7 contains {20, 33} — 2 keys. Slot 2 also has 2 keys {28, 15}. With chaining, collisions just extend the linked list at that slot. The load factor α = 8/13 ≈ 0.62.

Exercise 3.2: Linear Probing Derive

Same keys [5, 28, 19, 15, 20, 33, 12, 17], h(k) = k mod 13, but now use open addressing with linear probing (if slot is full, try next slot). After inserting all 8 keys, which slot does key 33 occupy?

slot index (0-12)
Show trace
Insert 5 → slot 5 (empty) ✓
Insert 28 → slot 2 (empty) ✓
Insert 19 → slot 6 (empty) ✓
Insert 15 → slot 2 (full) → 3 (empty) ✓
Insert 20 → slot 7 (empty) ✓
Insert 33 → slot 7 (full) → 8 (empty) ✓
Insert 12 → slot 12 (empty) ✓
Insert 17 → slot 4 (empty) ✓

Key 33 hashes to 7, which is occupied by 20. Linear probing tries 8, which is empty. So 33 sits at slot 8. Notice the cluster forming at slots 2-3 and 5-8. This clustering is the Achilles' heel of linear probing — clusters grow and merge, degrading performance.

Exercise 3.3: Expected Probes Derive

For open addressing with uniform hashing, the expected number of probes for an unsuccessful search is 1/(1 - α), where α is the load factor. For α = 0.7, how many probes are expected?

expected probes
Show derivation
Expected probes = 1 / (1 - α) = 1 / (1 - 0.7) = 1 / 0.3 ≈ 3.33

At 70% load, you expect about 3.3 probes per lookup. At 90% load: 1/0.1 = 10 probes. At 95%: 20 probes. This is why practical hash tables resize when α exceeds 0.7-0.8. The formula assumes uniform hashing (each slot equally likely), which linear probing does NOT satisfy — linear probing clusters, making real performance worse than this formula predicts.

Exercise 3.4: Implement hashInsert() Build

Implement insert for open addressing with linear probing. The table is an array of size m; empty slots are null.

Return the index where the key ends up.
Show solution
javascript
function hashInsert(table, key) {
  const m = table.length;
  let idx = key % m;
  for (let i = 0; i < m; i++) {
    const probe = (idx + i) % m;
    if (table[probe] === null) {
      table[probe] = key;
      return probe;
    }
  }
  return -1;
}
Exercise 3.5: Load Factor and Resizing Trace
A hash table has 16 slots and 12 elements (α = 0.75). You insert one more, triggering a resize to 32 slots. What is the new load factor?
Show reasoning

After inserting, there are 13 elements. The table resizes to 32 slots. New α = 13/32 = 0.40625. The resize roughly halves the load factor (from ~0.81 to ~0.41), which restores fast O(1) operations. The amortized cost of all the resizes over a sequence of n insertions is O(n) total, or O(1) per insertion.

Chapter 4: Binary Search Trees

A BST stores keys so that for every node x: all keys in the left subtree are ≤ x, and all keys in the right subtree are ≥ x. This invariant enables O(h) search, insert, and delete, where h is the tree height. The exercises below test your ability to build and traverse BSTs by hand.

Exercise 4.1: BST Insertion Derive

Insert the keys [5, 3, 7, 1, 4, 6, 8] into an initially empty BST (in that order). What is the height of the resulting tree? (Height = longest path from root to leaf, counting edges.)

height
Show trace
Insert 5: root = 5
Insert 3: 3 < 5 → left of 5
Insert 7: 7 > 5 → right of 5
Insert 1: 1 < 5, 1 < 3 → left of 3
Insert 4: 4 < 5, 4 > 3 → right of 3
Insert 6: 6 > 5, 6 < 7 → left of 7
Insert 8: 8 > 5, 8 > 7 → right of 7

The tree is perfectly balanced with height 2. Root 5 has children {3, 7}. Node 3 has children {1, 4}. Node 7 has children {6, 8}. This is the best case — inserting the median first produces a balanced tree.

Exercise 4.2: Inorder Traversal Trace
An inorder traversal of the BST from Exercise 4.1 visits nodes in which order?
Show reasoning

Inorder traversal visits: left subtree, root, right subtree. For a BST, this always produces keys in sorted order. This is a direct consequence of the BST property — everything left is smaller, everything right is larger. You can use inorder traversal to sort n elements in O(n) time if you already have a balanced BST.

Exercise 4.3: Worst-Case BST Height Derive

If you insert [1, 2, 3, 4, 5, 6, 7] into an empty BST (in that order), what is the height?

height
Show reasoning

Inserting sorted data into a BST creates a degenerate tree — a linked list. 1 is the root, 2 is its right child, 3 is the right child of 2, and so on. Height = n - 1 = 6. Every operation becomes O(n) instead of O(log n). This is why balanced BSTs (AVL, red-black) exist: they guarantee h = O(log n) through rotations after each insert/delete.

Exercise 4.4: Red-Black Tree Rotation Trace
After inserting 4 into a red-black tree containing {1, 2, 3}, which operation restores the red-black properties?
Show reasoning

Red-black trees insert new nodes as red. If the parent is also red, we have a "double red" violation (property 4: no red node has a red child). The fix depends on the uncle's color: if the uncle is red, recolor. If the uncle is black, rotate and recolor. Rotations take O(1) and recoloring propagates up at most O(log n) levels. This guarantees h ≤ 2 log2(n+1), making all operations O(log n).

Exercise 4.5: BST Successor Trace
In the BST [5, 3, 7, 1, 4, 6, 8], what is the inorder successor of node 5?
Show reasoning

The inorder successor is the smallest key larger than 5. Since 5 has a right subtree (rooted at 7), the successor is the leftmost node in that subtree: go right to 7, then left to 6. Node 6 has no left child, so 6 is the successor. This is the standard algorithm for BST deletion: when deleting a node with two children, replace it with its inorder successor.

Chapter 5: Dynamic Programming

DP transforms exponential brute force into polynomial time by storing solutions to subproblems. These exercises require you to fill actual DP tables by hand — not just recognize the pattern, but compute every cell.

Exercise 5.1: LCS Table — Fill It Derive

Compute the LCS (Longest Common Subsequence) of X = "ABCBDAB" and Y = "BDCAB". Fill the DP table. What is the length of the LCS?

Recurrence:
c[i][j] = 0 if i = 0 or j = 0
c[i][j] = c[i-1][j-1] + 1 if X[i] = Y[j]
c[i][j] = max(c[i-1][j], c[i][j-1]) if X[i] ≠ Y[j]
LCS length
Show table
      "" B D C A B
""   0 0 0 0 0 0
A    0 0 0 0 1 1
B    0 1 1 1 1 2
C    0 1 1 2 2 2
B    0 1 1 2 2 3
D    0 1 2 2 2 3
A    0 1 2 2 3 3
B    0 1 2 2 3 4

LCS length = c[7][5] = 4. The LCS itself is "BCAB" (trace back from c[7][5] following the arrows: diagonal when characters match, up/left otherwise). Time: Θ(mn) where m = 7, n = 5, so 35 cells. Space: Θ(mn) for the full table, but Θ(min(m,n)) if you only need the length (keep two rows).

Exercise 5.2: Rod Cutting Derive

Prices: p = [1, 5, 8, 9, 10, 17, 17, 20] (indices 1-8). What is the maximum revenue for a rod of length 4?

r[j] = max1 ≤ i ≤ j (p[i] + r[j - i]),   r[0] = 0
max revenue ($)
Show table
r[0] = 0
r[1] = p[1] + r[0] = 1 + 0 = 1
r[2] = max(p[1]+r[1], p[2]+r[0]) = max(1+1, 5+0) = max(2, 5) = 5
r[3] = max(p[1]+r[2], p[2]+r[1], p[3]+r[0]) = max(1+5, 5+1, 8+0) = max(6, 6, 8) = 8
r[4] = max(p[1]+r[3], p[2]+r[2], p[3]+r[1], p[4]+r[0]) = max(1+8, 5+5, 8+1, 9+0) = max(9, 10, 9, 9) = 10

Maximum revenue for rod length 4 is $10, achieved by cutting into two pieces of length 2 (each worth $5). Note that p[4] = $9 is less than the optimal $10, so leaving the rod uncut is suboptimal.

Exercise 5.3: Implement lcs() Build

Write a function that computes the length of the Longest Common Subsequence of two strings.

Use bottom-up DP with a 2D table.
Show solution
javascript
function lcs(a, b) {
  const m = a.length, n = b.length;
  const dp = Array.from({length: m + 1}, () => Array(n + 1).fill(0));
  for (let i = 1; i <= m; i++) {
    for (let j = 1; j <= n; j++) {
      if (a[i - 1] === b[j - 1]) {
        dp[i][j] = dp[i - 1][j - 1] + 1;
      } else {
        dp[i][j] = Math.max(dp[i - 1][j], dp[i][j - 1]);
      }
    }
  }
  return dp[m][n];
}
Exercise 5.4: 0/1 Knapsack Derive

Items: weights = [2, 3, 4, 5], values = [3, 4, 5, 6]. Knapsack capacity W = 5. What is the maximum value you can carry?

max value
Show table
K[i][w] = max value using items 1..i with capacity w

      w=0 1 2 3 4 5
i=0   0 0 0 0 0 0
i=1   0 0 3 3 3 3 // item 1: w=2, v=3
i=2   0 0 3 4 4 7 // item 2: w=3, v=4 → K[2][5]=max(K[1][5], 4+K[1][2])=max(3,7)=7
i=3   0 0 3 4 5 7 // item 3: w=4, v=5
i=4   0 0 3 4 5 7 // item 4: w=5, v=6 → K[4][5]=max(K[3][5], 6+K[3][0])=max(7,6)=7

Maximum value = 7, achieved by taking items 1 and 2 (weights 2 + 3 = 5, values 3 + 4 = 7). Item 4 alone gives value 6 for weight 5, which is worse. The 0/1 knapsack has Θ(nW) time, which is pseudo-polynomial — it depends on the value of W, not its bit length.

Exercise 5.5: DP vs Greedy Trace
A greedy approach to the knapsack above would pick the item with the best value/weight ratio first. Item 1: 3/2 = 1.5, Item 2: 4/3 = 1.33, Item 3: 5/4 = 1.25, Item 4: 6/5 = 1.2. Greedy picks item 1 (w=2, v=3), then item 2 (w=3, v=4), total = 7. Does greedy always find the optimal 0/1 knapsack solution?
Exercise 5.6: Find the Bug in LCS Debug

This LCS function returns wrong values. Click the buggy line.

function lcs(a, b) {
  const m = a.length, n = b.length;
  const dp = Array.from({length: m+1}, () => Array(n+1).fill(0));
  for (let i = 1; i <= m; i++) {
    for (let j = 1; j <= n; j++) {
      if (a[i-1] === b[j-1]) {
        dp[i][j] = dp[i-1][j-1] + 1;
      } else {
        dp[i][j] = dp[i-1][j] + dp[i][j-1];
      }
    }
  }
  return dp[m][n];
}
Show explanation

Line 9 is the bug. When characters do not match, the recurrence should be Math.max(dp[i-1][j], dp[i][j-1]), not dp[i-1][j] + dp[i][j-1]. Adding the two values instead of taking the max double-counts shared subproblems and produces inflated results.

Chapter 6: Greedy Algorithms

Greedy algorithms make the locally optimal choice at each step and never reconsider. When they work, they are simpler and faster than DP. The trick is proving they actually produce a globally optimal solution. These exercises test both the mechanics and the reasoning.

Exercise 6.1: Activity Selection Derive

Activities with (start, finish) times: a1(1,4), a2(3,5), a3(0,6), a4(5,7), a5(3,9), a6(5,9), a7(6,10), a8(8,11), a9(8,12), a10(2,14), a11(12,16). Sort by finish time and greedily select non-overlapping activities. How many can you select?

activities selected
Show trace

Sorted by finish time: a1(1,4), a2(3,5), a3(0,6), a4(5,7), a5(3,9), a6(5,9), a7(6,10), a8(8,11), a9(8,12), a10(2,14), a11(12,16).

Select a1(1,4) ✓   last_finish = 4
a2(3,5): start 3 < 4 → skip
a3(0,6): start 0 < 4 → skip
Select a4(5,7) ✓   last_finish = 7
a5(3,9): start 3 < 7 → skip
a6(5,9): start 5 < 7 → skip
a7(6,10): start 6 < 7 → skip
Select a8(8,11) ✓   last_finish = 11
a9(8,12): start 8 < 11 → skip
a10(2,14): start 2 < 11 → skip
Select a11(12,16) ✓   last_finish = 16

4 activities selected: {a1, a4, a8, a11}. The greedy choice of earliest-finish-time is provably optimal via an exchange argument: swapping any selected activity for a later-finishing one can only reduce (never increase) the number of subsequent compatible activities.

Exercise 6.2: Huffman Coding — Build Tree Derive

Character frequencies: a:5, b:9, c:12, d:13, e:16, f:45. Build the Huffman tree by repeatedly merging the two lowest-frequency nodes. What is the total number of bits to encode a message where each character appears exactly as many times as its frequency?

total bits
Show tree construction
Step 1: Merge a(5) + b(9) = 14 → [c:12, d:13, ab:14, e:16, f:45]
Step 2: Merge c(12) + d(13) = 25 → [ab:14, e:16, cd:25, f:45]
Step 3: Merge ab(14) + e(16) = 30 → [cd:25, abe:30, f:45]
Step 4: Merge cd(25) + abe(30) = 56 → [f:45, cdabe:56]
Step 5: Merge f(45) + cdabe(56) = 100 → root

Code lengths: f=1, c=3, d=3, a=4, b=4, e=3.

Total bits = 45×1 + 12×3 + 13×3 + 5×4 + 9×4 + 16×3 = 45 + 36 + 39 + 20 + 36 + 48 = 224

Fixed-length codes would need ⌈log2 6⌉ = 3 bits per character, totaling 100 × 3 = 300 bits. Huffman saves 76 bits (25%) by giving shorter codes to more frequent characters.

Exercise 6.3: Huffman Code Length Trace
In the Huffman tree above, what is the code length (depth in the tree) for character 'f' with frequency 45?
Exercise 6.4: Arrange Greedy Algorithm Steps Design

Put these steps of the activity selection algorithm in the correct order.

?
?
?
?
Sort by finish time Select first activity Scan: skip overlapping, select compatible Return selected set
Exercise 6.5: Greedy Correctness Trace
Why does sorting by START time (instead of finish time) NOT yield an optimal solution for activity selection?

Chapter 7: Graph Algorithms

BFS and DFS are the two fundamental graph traversals. BFS explores level by level (finding shortest paths in unweighted graphs). DFS explores as deep as possible (enabling cycle detection, topological sort, and SCC decomposition). These exercises require you to trace both algorithms on concrete graphs.

Graph for exercises 7.1-7.3. Directed graph with vertices {A, B, C, D, E, F} and edges: A→B, A→C, B→D, B→E, C→E, D→F, E→F.
Exercise 7.1: BFS Distances Derive

Run BFS from source A on the graph above. What is the shortest distance (number of edges) from A to F?

edges
Show BFS trace
Queue: [A]   dist: A=0
Dequeue A, enqueue B(1), C(1)   Queue: [B, C]
Dequeue B, enqueue D(2), E(2)   Queue: [C, D, E]
Dequeue C, E already discovered   Queue: [D, E]
Dequeue D, enqueue F(3)   Queue: [E, F]
Dequeue E, F already discovered   Queue: [F]
Dequeue F   Queue: []
Distances: A=0, B=1, C=1, D=2, E=2, F=3

Shortest path A→F has 3 edges. Two paths achieve this: A→B→D→F and A→B→E→F. BFS guarantees shortest paths in unweighted graphs because it discovers all distance-k vertices before any distance-(k+1) vertex.

Exercise 7.2: DFS Edge Classification Trace
Run DFS from A (visiting neighbors in alphabetical order). The edge C→E leads to E, which was already discovered during B's exploration. What type of edge is C→E?
Show DFS trace
DFS(A): discover A, visit B
  DFS(B): discover B, visit D
    DFS(D): discover D, visit F
      DFS(F): discover F, finish F
    finish D
  visit E
    DFS(E): discover E, visit F (already finished) → forward edge
    finish E
  finish B
visit C
  DFS(C): discover C, visit E (already finished) → cross edge
  finish C
finish A

C→E is a cross edge: E is in a different branch of the DFS tree (B's subtree) and is already finished when C examines it. Back edges point to ancestors (still gray/in-progress). A directed graph has a cycle if and only if DFS finds a back edge.

Exercise 7.3: Topological Sort Derive

The graph above is a DAG (no cycles). A topological sort orders vertices so that for every edge u→v, u appears before v. Using DFS finish times (reverse order), list a valid topological ordering. How many valid topological orderings exist for this graph?

valid orderings
Show orderings

One valid topological order (by reverse DFS finish time): A, C, B, E, D, F. The 4 valid orderings are:

1. A, B, C, D, E, F
2. A, B, C, E, D, F
3. A, B, D, C, E, F
4. A, C, B, D, E, F

A must come first (no incoming edges). F must come last (only incoming edges). B and C can be swapped when both follow A. D and E can be reordered under constraints. The total of 4 can be verified by enumeration. Topological sort runs in Θ(V + E).

Exercise 7.4: Implement bfs() Build

Write BFS that returns the shortest distance from a source to every reachable vertex. The graph is given as an adjacency list (object mapping vertex to array of neighbors).

Use a queue (array with shift/push is fine for this size).
Show solution
javascript
function bfs(graph, source) {
  const dist = {};
  dist[source] = 0;
  const queue = [source];
  while (queue.length > 0) {
    const u = queue.shift();
    for (const v of (graph[u] || [])) {
      if (dist[v] === undefined) {
        dist[v] = dist[u] + 1;
        queue.push(v);
      }
    }
  }
  return dist;
}
Exercise 7.5: Cycle Detection Trace
You add the edge F→A to the graph. DFS from A now discovers that A is still gray (in-progress) when F examines it. What does this mean?

Chapter 8: Shortest Paths & MST

Weighted graphs add a new dimension: finding the cheapest path, not just the shortest by hops. Dijkstra handles non-negative weights. Bellman-Ford handles negative weights and detects negative cycles. For minimum spanning trees, Kruskal's sorts edges and uses union-find, while Prim's grows a tree from a source.

Graph for exercises 8.1-8.2. Vertices: {A, B, C, D, E}. Edges with weights: A→B(10), A→C(3), B→C(1), B→D(2), C→B(4), C→D(8), C→E(2), D→E(7), E→D(9).
Exercise 8.1: Dijkstra Trace Derive

Run Dijkstra's algorithm from source A on the graph above. What is the shortest distance from A to D?

shortest distance A→D
Show Dijkstra trace
Init: dist = {A:0, B:∞, C:∞, D:∞, E:∞}

Extract A (d=0):
  Relax A→B: 0+10=10 < ∞ → dist[B]=10
  Relax A→C: 0+3=3 < ∞ → dist[C]=3

Extract C (d=3):
  Relax C→B: 3+4=7 < 10 → dist[B]=7
  Relax C→D: 3+8=11 < ∞ → dist[D]=11
  Relax C→E: 3+2=5 < ∞ → dist[E]=5

Extract E (d=5):
  Relax E→D: 5+9=14 > 11 → no update

Extract B (d=7):
  Relax B→C: 7+1=8 > 3 → no update
  Relax B→D: 7+2=9 < 11 → dist[D]=9

Extract D (d=9): done
Final: A=0, C=3, E=5, B=7, D=9

Shortest path to D: A→C→B→D with cost 3+4+2 = 9. The direct A→B→D costs 10+2 = 12. Going through C first is cheaper despite the detour. Dijkstra with a binary heap runs in O((V+E) log V).

Exercise 8.2: Bellman-Ford Negative Cycle Trace
You add edge D→C with weight -15 to the graph. After running V-1 = 4 rounds of Bellman-Ford, you perform one more relaxation pass. If any distance decreases, there is a negative cycle. Does this graph have a negative cycle?
Show reasoning

The cycle C→B→D→C has total weight 4 + 2 + (-15) = -9 < 0. This is a negative-weight cycle. Any path that includes this cycle can be made arbitrarily cheap by going around it more times. Bellman-Ford detects this: after V-1 relaxation passes, a (V)th pass still finds improvements, which is the signal. Dijkstra CANNOT handle negative edges and would give wrong answers on this graph.

Exercise 8.3: Kruskal's MST Derive

Undirected graph: vertices {A,B,C,D,E}, edges with weights: AB(4), AC(2), BC(1), BD(5), CD(8), CE(10), DE(2). Sort edges by weight and apply Kruskal's algorithm. What is the total weight of the MST?

MST total weight
Show Kruskal trace
Sorted edges: BC(1), AC(2), DE(2), AB(4), BD(5), CD(8), CE(10)

BC(1): B,C in different sets → add ✓   MST = {BC}
AC(2): A,C in different sets → add ✓   MST = {BC, AC}
DE(2): D,E in different sets → add ✓   MST = {BC, AC, DE}
AB(4): A,B in same set (via A-C-B) → skip — would create cycle
BD(5): B,D in different sets → add ✓   MST = {BC, AC, DE, BD}
4 edges connect 5 vertices → MST complete. Total = 1 + 2 + 2 + 4 = 9.

MST total weight = 1 + 2 + 2 + 4 = 9. Kruskal's greedily adds the cheapest edge that does not create a cycle. Union-find enables cycle detection in nearly O(1) amortized time. The total runtime is O(E log E) dominated by sorting.

Exercise 8.4: Prim's vs Kruskal's Trace
Which statement about Prim's and Kruskal's algorithms is correct?
Exercise 8.5: Dijkstra — Find the Bug Debug

This Dijkstra implementation sometimes gives wrong shortest distances. Click the buggy line.

function dijkstra(graph, source) {
  let dist = {}, visited = new Set();
  for (let v in graph) dist[v] = Infinity;
  dist[source] = 0;
  while (visited.size < Object.keys(graph).length) {
    let u = null, minD = Infinity;
    for (let v in dist) {
      if (!visited.has(v) && dist[v] < minD) { u = v; minD = dist[v]; }
    }
    if (u === null) break;
    visited.add(u);
    for (let [v, w] of graph[u]) {
      if (dist[u] + w < dist[v]) dist[v] = dist[u] + w;
    }
  }
  return dist;
}
Show explanation

Line 13 is the bug. The relaxation updates dist[v] even when v has already been visited. While this code looks like it might work (it does for non-negative weights), the correct Dijkstra skips already-visited vertices in the neighbor loop: if (!visited.has(v) && dist[u] + w < dist[v]). Without this check, previously finalized vertices can have their distances incorrectly re-updated if the graph has certain edge patterns, and with negative weights this can loop infinitely.

Exercise 8.6: Dijkstra Runtime Derive

A graph has V = 1000 vertices and E = 5000 edges. What is the runtime of Dijkstra with a binary heap, expressed as a number of operations? Use the formula (V + E) log V.

approximate operations (thousands)
Show derivation
(V + E) × log2 V = (1000 + 5000) × log2(1000) = 6000 × 9.97 ≈ 59,800

≈ 60K operations. Compare to the naive O(V²) implementation: 1,000,000 operations. The binary heap saves ~17× on this sparse graph. For dense graphs (E ≈ V²), the naive version is better because the heap overhead is not worth it.

Chapter 9: Capstone

These final exercises combine concepts from across the workbook. You must identify the right algorithm, analyze its complexity, implement it, and reason about tradeoffs. This is what interviews test.

Exercise 9.1: Algorithm Selection Trace
You need to find the shortest path between two cities in a road network with 10M intersections and 25M road segments (all non-negative weights). Which algorithm do you use?
Show analysis

BFS only works for unweighted graphs. Floyd-Warshall is O(V³) = 1021 — absurd. Bellman-Ford is O(VE) ≈ 2.5 × 1014 — would take days. Dijkstra with a Fibonacci heap achieves O(V log V + E) ≈ 260M — runs in under a second. Even a binary heap gives O((V+E) log V) ≈ 350M × 23 ≈ 800M, still feasible in ~1 second.

Exercise 9.2: DP vs Greedy — Course Schedule Trace
You have n courses with prerequisites forming a DAG. You want to find the minimum number of semesters to complete all courses (each semester you can take any courses whose prerequisites are met, unlimited courses per semester). What approach works?
Show reasoning

The minimum semesters equals the length of the longest path in the prerequisite DAG (critical path). Topological sort gives a valid processing order, and a single DP pass computes the longest path: for each vertex v in topological order, dp[v] = 1 + max{dp[u] : u is a prerequisite of v}. The answer is max(dp[v]) over all v. BFS level-by-level also works (Kahn's algorithm variant): each "level" is one semester, and the number of levels is the answer.

Exercise 9.3: Complexity Analysis Derive

An algorithm has two nested loops: the outer runs n times, the inner runs from 1 to i (where i is the outer loop variable). What is the total number of iterations?

iterations for n = 100
Show derivation
Total = ∑i=1n i = n(n+1)/2 = 100 × 101 / 2 = 5050

This is Θ(n²). The pattern appears in selection sort, insertion sort (worst case), and Bellman-Ford's inner loop. Gauss famously computed this sum as a child: pair the first and last terms (1+n), second and second-to-last (2+n-1), etc., getting n/2 pairs each summing to n+1.

Exercise 9.4: Design an Algorithm Design

Problem: given a weighted directed graph, find if there is a path from s to t with total weight ≤ W. Arrange the algorithm pipeline.

?
?
?
?
Build adjacency list Run Dijkstra from s Extract dist[t] Return dist[t] ≤ W
Exercise 9.5: Space-Time Tradeoff Trace
You are computing LCS of two strings of length 1000 each. The standard DP uses a 1001 × 1001 table. How much space (in entries) can you save if you only need the LCS length (not the actual subsequence)?
Show reasoning

The recurrence c[i][j] depends only on c[i-1][j-1], c[i-1][j], and c[i][j-1] — all from the current or previous row. So you only need two rows at any time: the previous row and the current row being filled. Space goes from Θ(mn) = ~1M entries to Θ(min(m,n)) = ~2K entries. The tradeoff: you lose the ability to reconstruct the actual LCS (which requires backtracking through the full table).

Exercise 9.6: Full Pipeline — Implementation + Analysis Build

Implement Dijkstra's algorithm. The graph is given as an adjacency list where each entry maps a vertex to an array of [neighbor, weight] pairs. Return the shortest distances object.

Use a simple extract-min over unvisited vertices (no priority queue needed for these small tests).
Show solution
javascript
function dijkstra(graph, source) {
  const dist = {}, visited = new Set();
  for (const v in graph) dist[v] = Infinity;
  dist[source] = 0;
  const n = Object.keys(graph).length;
  for (let iter = 0; iter < n; iter++) {
    let u = null, minD = Infinity;
    for (const v in dist) {
      if (!visited.has(v) && dist[v] < minD) {
        u = v; minD = dist[v];
      }
    }
    if (u === null) break;
    visited.add(u);
    for (const [v, w] of graph[u]) {
      if (!visited.has(v) && dist[u] + w < dist[v]) {
        dist[v] = dist[u] + w;
      }
    }
  }
  return dist;
}
Proof of mastery. If you completed every exercise in this workbook from scratch — computed asymptotics, applied the Master Theorem, traced sorting and hashing by hand, filled DP tables, built Huffman trees, ran BFS/DFS/Dijkstra step by step, and implemented the algorithms — you have the algorithmic foundation that every technical interview assumes. These are the exact computations that CLRS teaches and that interviewers expect you to perform under pressure. "An algorithm must be seen to be believed." — Donald Knuth

Related Lessons

TopicLesson
Dynamic Programming (CLRS Ch 15)Dynamic Programming — From Absolute Zero
Graph Algorithms (CLRS Ch 22)Graph Algorithms — From Absolute Zero
Shortest Paths (CLRS Ch 24)Shortest Paths — From Absolute Zero
Transformer Math WorkbookTransformer Math Workbook