Elementary Graph Algorithms (CLRS Ch 22) — From Absolute Zero to Mastery

Chapter 0: The Problem

You have six friends. Alice knows Bob. Bob knows Carol. Carol knows Dave. Dave knows Eve. Eve knows Frank. Frank knows Alice. Can you figure out who is reachable from who? Easy enough with six people. Now imagine 2 billion Facebook users, each with hundreds of connections. How do you find the shortest chain of friends between any two people?

This is a graph problem. Graphs are the universal language for representing relationships: friendships, web links, road networks, course prerequisites, function call chains, circuit wiring, molecular bonds. Every time you have things and connections between things, you have a graph.

A graph G = (V, E) consists of a set of vertices (also called nodes) V and a set of edges E that connect pairs of vertices. That is the entire definition. Everything else — BFS, DFS, topological sort, SCCs — is just clever ways to walk through this structure.

Graphs are the second most important data structure in computer science (after arrays). Trees, linked lists, heaps — these are all special cases of graphs. Hash tables use graphs internally (chaining = linked list at each bucket). Even arrays can be viewed as graphs (element i has an edge to element i+1). When you learn graphs, you learn the general structure that subsumes all the others.

In this lesson, we cover the four fundamental graph algorithms from CLRS Chapter 22. By the end, you will be able to traverse any graph (BFS, DFS), detect structure (cycles, DAGs, SCCs), and order vertices (topological sort). These four algorithms, combined with the ability to recognize graph problems in disguise, solve 80% of all graph-related interview questions.

Directed vs Undirected

In an undirected graph, edges have no direction. If Alice knows Bob, then Bob knows Alice. The edge {A, B} is the same as {B, A}. Think: Facebook friendships, road networks (two-way streets).

In a directed graph (digraph), edges have direction. If Alice follows Bob on Twitter, that does not mean Bob follows Alice. The edge (A, B) is different from (B, A). Think: Twitter follows, web links, course prerequisites.

Why does direction matter? In an undirected graph, "reachable from A" is symmetric — if you can reach B from A, you can reach A from B. In a directed graph, this symmetry breaks. You can follow a chain of links from Wikipedia's homepage to any article, but you cannot always get back. This asymmetry is what makes directed graphs harder and what gives rise to strongly connected components (Chapter 6).

How to Store a Graph in Memory

There are two standard representations. The choice between them affects the runtime of every algorithm in this lesson.

Adjacency list: For each vertex u, store a list of all vertices v such that (u, v) is an edge. Space: Θ(V + E). Looking up "does edge (u, v) exist?" takes O(degree(u)) time. Iterating over all neighbors of u takes O(degree(u)) time. This is the default choice for sparse graphs (most real-world graphs).

Adjacency matrix: A V × V matrix M where M[u][v] = 1 if edge (u, v) exists, 0 otherwise. Space: Θ(V²). Looking up "does edge (u, v) exist?" takes O(1) time. Iterating over all neighbors takes O(V) time regardless of how many neighbors exist. This is better for dense graphs or when you need constant-time edge lookup.

Operation	Adjacency List	Adjacency Matrix
Space	Θ(V + E)	Θ(V²)
Edge lookup	O(degree(u))	O(1)
All neighbors of u	O(degree(u))	O(V)
Add edge	O(1)	O(1)
Best for	Sparse (E « V²)	Dense (E ≈ V²)

The sparse graph assumption. Most graphs you encounter in practice — social networks, web graphs, road networks, dependency graphs — are sparse. A person has hundreds of friends, not billions. A web page links to dozens of other pages, not all of them. For sparse graphs, adjacency lists win on space and neighbor iteration. Every algorithm in this chapter (BFS, DFS, topological sort, SCCs) runs in O(V + E) time with adjacency lists. With an adjacency matrix, they would run in O(V²). That is why CLRS assumes adjacency lists throughout Chapter 22.

When to use adjacency matrices: (1) The graph is dense (E close to V²). (2) You need O(1) edge existence queries (e.g., "are u and v neighbors?"). (3) You are implementing Floyd-Warshall (all-pairs shortest paths, Ch 25), which naturally uses a V×V matrix. (4) The graph is small enough that V² space is acceptable. In interviews, default to adjacency lists unless one of these conditions holds.

Key Terminology

Before we dive into algorithms, let us nail down the vocabulary. Every term below appears in CLRS and in interviews. Bold terms are the ones you must know cold.

Term	Definition	Example
Path	A sequence of vertices v₀, v₁, ..., v_k where each consecutive pair has an edge	A → B → D is a path of length 2
Cycle	A path from a vertex back to itself (length ≥ 1)	A → B → C → A is a cycle of length 3
Connected	(Undirected) Every pair of vertices has a path between them	Social network giant component
Strongly connected	(Directed) Every pair u,v has paths u↝v AND v↝u	A cycle is strongly connected
DAG	Directed Acyclic Graph — directed graph with no cycles	Course prerequisite graph
Tree	Connected acyclic undirected graph. \|E\| = \|V\| - 1.	File system directory structure
Forest	Acyclic undirected graph (collection of trees)	DFS forest, spanning forest
Sparse	\|E\| is much less than \|V\|²	Road networks, social graphs
Dense	\|E\| is close to \|V\|²	Complete graphs, tournament brackets

The handshaking lemma. In any undirected graph, the sum of all vertex degrees equals 2|E|. Each edge contributes 1 to the degree of each of its two endpoints. Consequence: the number of vertices with odd degree is always even. This is surprisingly useful in proofs and interview warmups.

Build a Graph

The canvas below lets you create a directed graph by clicking to place nodes and clicking two nodes in sequence to create an edge. Toggle between adjacency list and adjacency matrix views to see how the same graph looks in memory.

Interactive Graph Builder

Click empty space to add a node. Click a node, then click another node to add a directed edge. Toggle the view below.

// Add nodes and edges to see the representation

The representation matters for runtime. BFS and DFS both visit every vertex and every edge exactly once. With adjacency lists, "visit all edges" takes O(E) total across all vertices. With an adjacency matrix, "visit all edges" requires scanning an entire row of length V for each vertex, giving O(V²) even if most entries are zero. For a social network with 1 billion users and 150 average friends, that is the difference between 150 billion operations and 10¹⁸ operations. Adjacency lists are not optional — they are survival.

Implementation: Both Representations

python
# Adjacency List (dictionary of lists)
graph_list = {
    'A': ['B', 'C'],
    'B': ['D'],
    'C': ['D', 'E'],
    'D': ['F'],
    'E': [],
    'F': []
}
# Space: O(V + E) = O(6 + 6) = O(12)
# "Who can A reach?" → graph_list['A'] → ['B', 'C'] in O(1)
# "Does edge A→D exist?" → 'D' in graph_list['A'] → O(degree(A))

# Adjacency Matrix (2D array)
#        A  B  C  D  E  F
matrix = [
    [0, 1, 1, 0, 0, 0],  # A → B, C
    [0, 0, 0, 1, 0, 0],  # B → D
    [0, 0, 0, 1, 1, 0],  # C → D, E
    [0, 0, 0, 0, 0, 1],  # D → F
    [0, 0, 0, 0, 0, 0],  # E → nothing
    [0, 0, 0, 0, 0, 0],  # F → nothing
]
# Space: O(V²) = O(36) — already 3x larger for this tiny graph
# "Does edge A→D exist?" → matrix[0][3] → O(1)
# "Who can A reach?" → scan row 0 → O(V) regardless of degree

Degree, In-Degree, Out-Degree

The degree of a vertex in an undirected graph is the number of edges incident to it. In a directed graph, we distinguish between in-degree (number of incoming edges) and out-degree (number of outgoing edges). These concepts are critical for Kahn's topological sort algorithm (Chapter 5), which peels off zero-in-degree vertices.

For undirected graph: ∑_v∈V degree(v) = 2|E| (handshaking lemma — each edge contributes 2 to total degree)

For directed graph: ∑_v∈V in-degree(v) = ∑_v∈V out-degree(v) = |E|

Weighted vs Unweighted

Every algorithm in this chapter assumes unweighted graphs — all edges are equal. When edges have weights (road distances, network latencies, costs), you need the algorithms from CLRS Chapters 23-25: Dijkstra, Bellman-Ford, Prim's, Kruskal's. Think of this chapter as the foundation: master unweighted traversal first, then adding weights is a natural extension.

Self-Loops and Multi-Edges

A self-loop is an edge from a vertex to itself: (v, v). Self-loops appear in state machines (a state that transitions to itself) and in adjacency matrices (the diagonal). Most algorithms in this chapter handle self-loops gracefully — BFS and DFS simply skip the edge if v is already visited.

A multi-edge (or parallel edge) means multiple edges between the same pair of vertices. For adjacency lists, this is natural (just add v to u's list multiple times). For adjacency matrices, you would store a count instead of 0/1. Most algorithms treat multi-edges the same as single edges for traversal purposes.

A simple graph has no self-loops and no multi-edges. CLRS generally assumes simple graphs unless stated otherwise. In interviews, always clarify: "Are there self-loops or parallel edges?" This can affect edge cases in cycle detection and bipartiteness.

Graph Density: Sparse vs Dense

For a graph with V vertices, the maximum number of edges is:

Undirected: |E|_max = V(V-1)/2 = O(V²)
Directed: |E|_max = V(V-1) = O(V²) (self-loops excluded)

A graph is sparse when E = O(V) or E = O(V log V). A graph is dense when E = Θ(V²). Most real-world graphs are sparse:

Graph	V	E	E/V	Density
Facebook social graph	~3B	~400B	~130	Sparse
Web graph	~50B pages	~1T links	~20	Very sparse
US road network	~24M intersections	~58M roads	~2.4	Extremely sparse
Airline routes	~3,500 airports	~37,000 routes	~10	Sparse
Complete graph K₁₀₀	100	4,950	49.5	Dense

The density determines your representation. For the US road network (E/V ≈ 2.4), an adjacency matrix would waste 24M × 24M × 4 bytes ≈ 2.3 petabytes. An adjacency list uses 24M + 58M entries ≈ 650 MB. For a complete graph of 100 nodes, the matrix is 10,000 entries (40 KB) — perfectly fine, and the O(1) edge lookup can be useful.

Question: You have a graph with 10,000 vertices and 20,000 edges. You need to run BFS. Which representation should you use, and what is the runtime?

Adjacency matrix, O(V²) = O(10⁸). The constant-time edge lookup makes BFS faster. Adjacency list, O(V + E) = O(30,000). The graph is sparse (E = 2V), so adjacency lists save both space and time. BFS visits each vertex once and each edge once, giving O(V + E). Either works equally well since BFS is always O(V + E).

Chapter 1: Breadth-First Search

Imagine you drop a stone in a pond. Ripples spread outward in concentric rings — first the point of impact, then one meter out, then two meters, then three. Breadth-First Search (BFS) explores a graph exactly like those ripples. Starting from a source vertex s, it discovers all vertices at distance 1 first, then all at distance 2, then distance 3, and so on. It never jumps ahead. It is the most patient algorithm in graph theory.

The Coloring Scheme

CLRS uses three colors to track vertex states during BFS. This is not just notation — the colors correspond to a fundamental invariant that makes BFS correct.

White: undiscovered. The vertex has not been seen at all. Every vertex starts white.

Gray: discovered but not finished. The vertex is in the queue. We know it exists, but we have not yet explored all of its neighbors. Gray vertices form the "frontier" — the wavefront of the ripple.

Black: finished. We have dequeued this vertex and explored all of its neighbors. Every neighbor of a black vertex has been discovered (is gray or black). Black vertices are behind the wavefront.

The BFS invariant. At every moment during BFS, the queue contains exactly the gray vertices. All vertices behind the wavefront are black. All vertices ahead of the wavefront are white. This means BFS processes vertices in strict distance order from the source — first distance 0 (just s), then distance 1 (s's neighbors), then distance 2, and so on. This is why BFS finds shortest paths in unweighted graphs.

The Algorithm

// BFS(G, s): explore graph G starting from source vertex s
// Initialize: all vertices white, distance ∞, no parent
for each vertex u ∈ V - {s}:
  color[u] = WHITE, d[u] = ∞, π[u] = NIL
color[s] = GRAY, d[s] = 0, π[s] = NIL
Q = empty queue
ENQUEUE(Q, s)

while Q is not empty:
  u = DEQUEUE(Q)
  for each v ∈ Adj[u]: // scan u's adjacency list
    if color[v] == WHITE: // v is undiscovered
      color[v] = GRAY
      d[v] = d[u] + 1 // distance from source
      π[v] = u // u is v's parent in BFS tree
      ENQUEUE(Q, v)
  color[u] = BLACK // all of u's neighbors discovered

Every vertex is enqueued at most once (when it turns gray) and dequeued at most once (when it turns black). Every edge is examined at most once (when its source vertex is dequeued and we scan its adjacency list). The V initialization loop contributes O(V). The total work across all dequeue operations is ∑ degree(u) = 2|E| for undirected (|E| for directed). Total runtime: O(V + E).

Space: O(V) for the queue, color array, distance array, and parent array. In the worst case (star graph — source has V-1 neighbors), all V-1 vertices enter the queue simultaneously. But we never need more than O(V) space total.

Worked Example

Consider a graph with vertices {A, B, C, D, E, F} and edges A→B, A→C, B→D, C→D, C→E, D→F. Let the source be A.

Step	Dequeue	Queue (after)	Discovered	Distances
Init	—	[A]	A(gray)	A:0
1	A	[B, C]	B, C(gray)	B:1, C:1
2	B	[C, D]	D(gray)	D:2
3	C	[D, E]	E(gray), D already gray	E:2
4	D	[E, F]	F(gray)	F:3
5	E	[F]	—	—
6	F	[]	—	—

BFS tree (parent pointers): B←A, C←A, D←B, E←C, F←D. The shortest path from A to F is A → B → D → F (length 3).

The BFS Tree

The parent pointers π[v] define a BFS tree (or BFS forest if the graph is disconnected). The BFS tree has a remarkable property: every tree edge connects a vertex at distance d to a vertex at distance d+1. There are no "shortcuts" — the tree faithfully represents the shortest-path structure.

For an undirected graph, every non-tree edge connects vertices at the same distance or adjacent distances (|d[u] - d[v]| ≤ 1). This is because if |d[u] - d[v]| ≥ 2, then BFS would have discovered the closer vertex through the farther one, making this a tree edge.

PRINT-PATH from CLRS (p. 601). To print the shortest path from source s to vertex v, use parent pointers recursively. This is one of the most frequently asked BFS follow-up questions in interviews.

PRINT-PATH(G, s, v):
  if v == s:
    print s
  elif π[v] == NIL:
    print "no path from s to v exists"
  else:
    PRINT-PATH(G, s, π[v])
    print v

Step Through BFS

The simulation below runs BFS on a sample graph. Each step dequeues one vertex, explores its neighbors, and colors them. Watch the queue (bottom) and the discovery distances update in real time.

BFS Step-Through Animation

Press Step to advance one BFS iteration. Reset to start over. The queue is shown below the graph.

Implementation

python
from collections import deque

def bfs(graph, source):
    """BFS on adjacency-list graph. Returns (dist, parent) dicts."""
    color = {v: 'white' for v in graph}
    dist  = {v: float('inf') for v in graph}
    parent = {v: None for v in graph}

    color[source] = 'gray'
    dist[source]  = 0
    queue = deque([source])

    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if color[v] == 'white':
                color[v]  = 'gray'
                dist[v]   = dist[u] + 1
                parent[v] = u
                queue.append(v)
        color[u] = 'black'

    return dist, parent

# Example
G = {
    'A': ['B', 'C'],
    'B': ['D'],
    'C': ['D', 'E'],
    'D': ['F'],
    'E': [],
    'F': []
}
dist, par = bfs(G, 'A')
# dist = {'A':0, 'B':1, 'C':1, 'D':2, 'E':2, 'F':3}

BFS gives shortest paths in unweighted graphs. Because BFS discovers vertices in strict distance order, d[v] is always the shortest path distance from s to v. This is Theorem 22.5 in CLRS. For weighted graphs, you need Dijkstra (Chapter 24). But for unweighted graphs — friend-of-friend distance, minimum hops in a network, fewest moves in a puzzle — BFS is optimal and O(V + E).

Why BFS Is Correct: The Proof Sketch

Let δ(s, v) denote the true shortest-path distance from s to v. We want to prove that d[v] = δ(s, v) for all v. The argument has three steps:

Step 1 (Upper bound): d[v] ≥ δ(s, v) always. BFS can only set d[v] = d[u] + 1 for some u, and by induction d[u] ≥ δ(s, u). Since δ(s, v) ≤ δ(s, u) + 1 (triangle inequality for unweighted graphs), we get d[v] ≥ δ(s, v).

Step 2 (Lower bound by contradiction): Suppose d[v] > δ(s, v) for some v. Pick the first such v in BFS order (smallest δ). Let u be v's predecessor on a true shortest path, so δ(s, v) = δ(s, u) + 1. Since δ(s, u) < δ(s, v), vertex u was correct: d[u] = δ(s, u). When u was dequeued, BFS examined edge (u, v). If v was white, BFS set d[v] = d[u] + 1 = δ(s, u) + 1 = δ(s, v). Contradiction. If v was already gray/black, then d[v] ≤ d[u] + 1 = δ(s, v). Also contradiction.

Step 3: Therefore d[v] = δ(s, v) for all reachable v. For unreachable v, d[v] remains ∞ = δ(s, v). QED.

The FIFO property is load-bearing. This proof breaks if you replace the queue with a stack (DFS) or a random bag. The FIFO queue ensures that when we process vertex u at distance k, all vertices at distance k-1 have already been processed. Replace FIFO with LIFO and you get DFS, which does NOT find shortest paths.

BFS Properties Summarized (CLRS Theorems)

Property	Statement	CLRS Reference
Correctness	d[v] = δ(s, v) for all v ∈ V	Theorem 22.5
Monotonicity	If u is enqueued before v, then d[u] ≤ d[v]	Lemma 22.3
BFS tree	Parent pointers define a shortest-path tree rooted at s	Corollary 22.4
Non-tree edges	In undirected graphs: \|d[u] - d[v]\| ≤ 1 for every edge {u,v}	Exercise 22.2-4
Diameter	BFS from any vertex finds the eccentricity of that vertex	Implied by correctness

The eccentricity of a vertex s is max_v δ(s, v) — the maximum shortest-path distance from s to any other vertex. The diameter of a graph is the maximum eccentricity over all vertices: max_s max_v δ(s, v). Computing the diameter of a general graph requires BFS from every vertex: O(V(V + E)). For trees, two BFS passes suffice: BFS from any vertex, then BFS from the farthest vertex found. The farthest vertex from that second BFS is the diameter endpoint.

python
def tree_diameter(adj):
    """Diameter of an unweighted tree in 2 BFS passes."""
    def bfs_farthest(start):
        dist = {start: 0}
        queue = deque([start])
        farthest = start
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
                    if dist[v] > dist[farthest]:
                        farthest = v
        return farthest, dist[farthest]

    # Pass 1: find farthest from any vertex
    far1, _ = bfs_farthest(next(iter(adj)))
    # Pass 2: find farthest from that vertex = diameter
    far2, diameter = bfs_farthest(far1)
    return diameter

Multi-Source BFS

A powerful BFS variant: instead of starting from one source, start from multiple sources simultaneously. Add all sources to the queue at distance 0. BFS then finds the shortest distance from each vertex to its nearest source.

python
def multi_source_bfs(graph, sources):
    """BFS from multiple sources simultaneously.
    Returns dist[v] = min distance from v to any source."""
    dist = {v: float('inf') for v in graph}
    queue = deque()
    for s in sources:
        dist[s] = 0
        queue.append(s)
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if dist[v] == float('inf'):
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

# Use case: given a grid with multiple fire sources,
# find how long until fire reaches each cell.
# Use case: "01 Matrix" (LC 542) — distance to nearest 0.

When to use multi-source BFS. Any problem asking "what is the nearest X to each cell/node?" is multi-source BFS. Classic examples: nearest exit in a maze from multiple starting positions, "Walls and Gates" (LC 286), "01 Matrix" (LC 542), "Rotting Oranges" (LC 994). The trick is always the same: enqueue all sources at distance 0, then BFS normally.

Question: During BFS, vertex u is dequeued and we examine edge (u, v). We find that v is already gray. What does this tell us about v's distance from the source?

v was discovered by some other vertex at the same level or at most one level closer than u. Since v is gray, it is in the queue but not yet processed. Its distance satisfies d[v] ≤ d[u] + 1 (it was discovered at the same or earlier BFS level). v's distance is exactly d[u] + 1, because we just discovered it. v's distance is unknown; gray only means it was enqueued.

Chapter 2: BFS Applications

BFS does more than find shortest paths. Its level-by-level exploration pattern makes it the right tool for several fundamental graph problems. Let us look at three.

Application 1: Connected Components (Undirected)

In an undirected graph, a connected component is a maximal set of vertices such that every pair is reachable from every other. "Maximal" means you cannot add any more vertices without breaking the reachability property.

To find all connected components, run BFS from an unvisited vertex. Everything BFS discovers is one component. Then find the next unvisited vertex and BFS again. Repeat until all vertices are visited.

// Find all connected components in undirected graph G
component_id = 0
for each vertex u ∈ V:
  if u is unvisited:
    BFS(G, u) // marks all reachable vertices as visited
    label all discovered vertices with component_id
    component_id += 1

Runtime: O(V + E) total — each vertex and edge is processed exactly once across all BFS calls.

Application 2: Bipartiteness Testing

A graph is bipartite if you can color every vertex with one of two colors (say red and blue) such that no edge connects two vertices of the same color. Think: assigning students to two exam rooms so that no pair of friends is in the same room. Or: can you divide a class into two teams for a debate, such that no pair of enemies is on the same team?

BFS naturally tests bipartiteness. During BFS, assign the source color 0. For each newly discovered vertex, assign the opposite color of its parent. If you ever encounter an already-colored neighbor with the same color as the current vertex, the graph is not bipartite.

Equivalently, a graph is bipartite if and only if it contains no odd-length cycle. This is because in a 2-coloring, traversing any edge switches the color. After an even number of edges, you return to the original color. After an odd number of edges, you are at the opposite color — and if this brings you back to the start vertex, you need both colors for the same vertex. Contradiction.

Worked Example: Bipartite Testing by Hand

Graph: A-B, A-D, B-C, C-D. Start BFS from A, color A=0.

Dequeue	Neighbor	Action
A (color=0)	B (uncolored)	Color B=1. Enqueue.
A (color=0)	D (uncolored)	Color D=1. Enqueue.
B (color=1)	A (color=0)	Different colors. OK.
B (color=1)	C (uncolored)	Color C=0. Enqueue.
D (color=1)	A (color=0)	Different colors. OK.
D (color=1)	C (color=0)	Different colors. OK.
C (color=0)	B (color=1)	Different colors. OK.
C (color=0)	D (color=1)	Different colors. OK.

Result: bipartite! Partition: {A, C} (color 0) and {B, D} (color 1). Cycle A-B-C-D-A has length 4 (even). No odd cycles.

Now add edge B-D: B (color=1) examines D (color=1) — same color! Not bipartite. The odd cycle is A-B-D-A (length 3).

Why BFS works for bipartiteness. A graph is bipartite if and only if it contains no odd-length cycles (CLRS Theorem 22.2 exercise). BFS assigns "layers" to vertices: layer 0, layer 1, layer 2, etc. In a bipartite graph, all edges go between adjacent layers (even↔odd). If an edge connects two vertices in the same layer, that creates an odd cycle, and BFS catches it immediately because both vertices have the same color.

Bipartite Checker

Load a sample graph and watch BFS try to 2-color it. If it finds a conflict (same-color neighbors), the graph is not bipartite.

python
from collections import deque

def is_bipartite(graph):
    """Returns True if undirected graph is bipartite."""
    color = {}
    for start in graph:
        if start in color:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in graph[u]:
                if v not in color:
                    color[v] = 1 - color[u]  # opposite color
                    queue.append(v)
                elif color[v] == color[u]:
                    return False  # same-color neighbor!
    return True

Application 3: Connected Components (Full Implementation)

Finding connected components is one of the most common graph problems. Here is the complete implementation with component labeling.

python
def connected_components(graph):
    """Find all connected components in undirected graph.
    Returns: dict mapping vertex → component_id, and list of components."""
    comp_id = {}
    components = []
    current_id = 0

    for start in graph:
        if start in comp_id:
            continue
        # BFS from this unvisited vertex
        component = []
        queue = deque([start])
        comp_id[start] = current_id
        while queue:
            u = queue.popleft()
            component.append(u)
            for v in graph[u]:
                if v not in comp_id:
                    comp_id[v] = current_id
                    queue.append(v)
        components.append(component)
        current_id += 1

    return comp_id, components

# Example: friendship graph
G = {
    'Alice': ['Bob'], 'Bob': ['Alice', 'Carol'],
    'Carol': ['Bob'], 'Dave': ['Eve'],
    'Eve': ['Dave'], 'Frank': []
}
ids, comps = connected_components(G)
# comps = [['Alice','Bob','Carol'], ['Dave','Eve'], ['Frank']]
# 3 components: friend group, couple, loner

Connected components tell you about graph structure. A graph with 1 component is connected. A graph with V components has no edges (each vertex is isolated). A tree has 1 component and V-1 edges. A forest with k trees has k components and V-k edges. The number of components is V - (number of edges in a spanning forest). A disconnected graph has 2+ components. Adding one edge can decrease the component count by at most 1 (it connects two components or stays within one).

Multi-Source BFS: A Powerful Variant

Consider this problem: in a grid, some cells are "fire" sources. Fire spreads to adjacent cells each minute. How long until fire reaches every cell? This is multi-source BFS. Initialize the queue with ALL fire cells at distance 0, then run standard BFS. The wavefront expands from all sources simultaneously.

This is exactly "Rotting Oranges" (LC 994), "01 Matrix" (LC 542), and "Walls and Gates" (LC 286). The key insight: multi-source BFS is identical to regular BFS with a "virtual source" connected to all real sources. Instead of creating this virtual node explicitly, just enqueue all real sources at the start.

Application 4: BFS on Trees (Level-Order Traversal)

A tree is a special graph: connected, acyclic, V-1 edges. BFS on a tree from the root gives level-order traversal — visit all nodes at depth 0 (root), then depth 1, then depth 2. This is the basis of many tree problems: maximum depth, level averages, zigzag traversal, right-side view.

python
def level_order(root):
    """BFS level-order traversal of a binary tree."""
    if not root:
        return []
    result = []
    queue = deque([root])
    while queue:
        level = []
        for _ in range(len(queue)):  # process one level at a time
            node = queue.popleft()
            level.append(node.val)
            if node.left:  queue.append(node.left)
            if node.right: queue.append(node.right)
        result.append(level)
    return result
# The "process one level at a time" pattern (recording queue length
# before processing) is used in MANY BFS problems.

Application 4: Shortest Path Recovery

BFS computes distances, but often you need the actual path. The parent pointers π[v] give you this. To reconstruct the shortest path from s to v, follow parent pointers backwards from v to s, then reverse.

python
def shortest_path(parent, source, target):
    """Reconstruct path from BFS parent pointers."""
    if parent[target] is None and target != source:
        return None  # not reachable
    path = []
    v = target
    while v is not None:
        path.append(v)
        v = parent[v]
    return path[::-1]  # reverse to get s-to-t order

Application 5: Shortest Path in a Chess Board (Knight Moves)

What is the minimum number of knight moves to reach from position (r1, c1) to (r2, c2) on an 8x8 chess board? This is BFS on an implicit graph. Each board position is a vertex. Each legal knight move creates an edge. The graph has at most 64 vertices and 64 × 8 = 512 edges.

python
def min_knight_moves(r1, c1, r2, c2, n=8):
    """Min knight moves on nxn board from (r1,c1) to (r2,c2)."""
    moves = [(-2,-1),(-2,1),(-1,-2),(-1,2),
             (1,-2),(1,2),(2,-1),(2,1)]
    visited = set()
    visited.add((r1, c1))
    queue = deque([(r1, c1, 0)])

    while queue:
        r, c, d = queue.popleft()
        if r == r2 and c == c2:
            return d
        for dr, dc in moves:
            nr, nc = r + dr, c + dc
            if 0 <= nr < n and 0 <= nc < n and (nr, nc) not in visited:
                visited.add((nr, nc))
                queue.append((nr, nc, d + 1))
    return -1

# min_knight_moves(0, 0, 7, 7) → 6 moves
# min_knight_moves(0, 0, 1, 1) → 4 moves (not 2 — knight can't go diagonal)

BFS on implicit graphs is a superpower. You never build the adjacency list explicitly. Instead, generate neighbors on the fly: for grid problems, use direction arrays. For string problems, try all single-character mutations. For puzzle problems, enumerate all legal moves from the current state. The graph exists only in your BFS queue.

Application 7: Flood Fill (LC 733)

Given a 2D image (grid of pixel colors) and a starting pixel, change the color of all connected same-color pixels. This is DFS/BFS connected component traversal on a grid, changing colors as you go.

python
def floodFill(image, sr, sc, newColor):
    """LC 733: Flood fill from (sr, sc) with newColor."""
    rows, cols = len(image), len(image[0])
    origColor = image[sr][sc]
    if origColor == newColor:
        return image  # no-op avoids infinite loop

    def dfs(r, c):
        if r < 0 or r >= rows or c < 0 or c >= cols:
            return
        if image[r][c] != origColor:
            return
        image[r][c] = newColor
        dfs(r+1, c); dfs(r-1, c)
        dfs(r, c+1); dfs(r, c-1)

    dfs(sr, sc)
    return image
# The origColor == newColor check is CRITICAL.
# Without it, DFS revisits cells endlessly
# (newColor == origColor means cells keep matching).

Complexity Analysis Deep Dive

BFS visits each vertex at most once (it is enqueued when discovered, dequeued when processed). Each edge is examined at most once (when its source vertex is dequeued, we scan its adjacency list). With adjacency lists:

T(BFS) = O(V + E) where V = |vertices|, E = |edges|

The V term comes from initializing every vertex (color, distance, parent). The E term comes from scanning all adjacency lists. Together: O(V) + O(E) = O(V + E).

Space: O(V) for the color/distance/parent arrays, plus O(V) for the queue (in the worst case, all vertices are in the queue simultaneously — imagine a star graph with one center node connected to V-1 leaves). Total space: O(V).

Question: You run BFS-based bipartite checking on a graph and encounter an edge (u, v) where both u and v have color 0. What can you conclude?

The edge (u, v) is invalid and should be removed. The graph contains an odd-length cycle and is NOT bipartite. The edge (u, v) connects two vertices at the same BFS level, meaning the cycle through u, v, and their common ancestor has odd length. We need to recolor v to color 1 and continue.

Chapter 3: Depth-First Search

BFS spreads out like ripples. Depth-First Search (DFS) does the opposite: it dives as deep as possible down one path before backtracking. Think of exploring a maze: you follow one corridor all the way to a dead end, then backtrack to the last junction and try the next corridor. You never stop mid-corridor to explore a different one.

Timestamps: Discovery and Finish

DFS assigns two timestamps to every vertex. These timestamps are the secret weapon that makes DFS so powerful for structural analysis.

d[v] (discovery time): the moment DFS first encounters v and colors it gray. The vertex enters the "call stack."

f[v] (finish time): the moment DFS finishes processing all of v's descendants and colors it black. The vertex leaves the call stack.

A global clock ticks upward from 1. Each discovery and each finish increments the clock by 1. For a graph with V vertices, timestamps range from 1 to 2V (every vertex gets exactly one discovery and one finish). The timestamps tell you the structural history of the DFS exploration — when each vertex was entered and when it was fully processed.

Why do we need TWO timestamps? Discovery time alone tells you when DFS first reached a vertex, but not when it finished exploring that vertex's subtree. The finish time captures the moment when all descendants have been fully explored. The interval [d[v], f[v]] represents the time during which v was "active" — on the recursion stack. Two vertices' intervals either nest (ancestor-descendant) or are disjoint (no relationship). This nesting property is the Parenthesis Theorem.

The Algorithm

// DFS(G): explore the entire graph
time = 0
for each vertex u ∈ V:
  color[u] = WHITE, π[u] = NIL

for each vertex u ∈ V:
  if color[u] == WHITE:
    DFS-VISIT(G, u)

DFS-VISIT(G, u):
  time = time + 1
  d[u] = time // discover u
  color[u] = GRAY
  for each v ∈ Adj[u]:
    if color[v] == WHITE:
      π[v] = u
      DFS-VISIT(G, v) // recurse deeper
  color[u] = BLACK
  time = time + 1
  f[u] = time // finish u

Runtime: Θ(V + E). Each vertex is visited exactly once (transitions white → gray → black irreversibly). Each edge is examined exactly once (when DFS-VISIT processes its source vertex). The outer loop in DFS(G) costs O(V) for checking all vertices. Each call to DFS-VISIT on vertex u costs O(1 + |Adj[u]|). Summing over all vertices: ∑ (1 + |Adj[u]|) = V + E. Total: Θ(V + E).

Space: O(V) for the recursion stack (at most V nested calls in a path graph), plus O(V) for the color, timestamp, and parent arrays. The recursion stack depth equals the length of the longest path in the DFS tree. For balanced graphs this is O(log V); for path graphs this is O(V).

The Parenthesis Theorem

The timestamps obey a beautiful nesting property. For any two vertices u and v, exactly one of three cases holds:

Case 1: [d[u], f[u]] and [d[v], f[v]] are entirely disjoint. Neither is a descendant of the other in the DFS tree.

Case 2: [d[u], f[u]] is entirely contained within [d[v], f[v]]. Then u is a descendant of v in the DFS tree.

Case 3: [d[v], f[v]] is entirely contained within [d[u], f[u]]. Then v is a descendant of u in the DFS tree.

In other words, the intervals either nest or are disjoint — they never partially overlap. This is why it is called the "parenthesis theorem": if you write an open-paren at discovery and a close-paren at finish, the result is always a valid parenthesization.

Visualize it. For vertices A(d=1,f=8), B(d=2,f=5), C(d=3,f=4), D(d=6,f=7): the parenthesization is (A (B (C) ) (D) ). C nests inside B, B and D nest inside A, but B and D are disjoint from each other. This structure tells you everything about the DFS tree: A is the root, B and D are A's children, C is B's child.

Worked Example: Timestamps By Hand

Consider a directed graph with vertices {A, B, C, D, E, F} and edges A→B, A→D, B→C, C→E, D→B, E→F, F→C. Let us trace DFS starting from A (alphabetical order for neighbor processing).

Action	Time	Stack (gray vertices)	Event
Discover A	1	[A]	d[A]=1
Discover B (via A→B)	2	[A,B]	d[B]=2, tree edge
Discover C (via B→C)	3	[A,B,C]	d[C]=3, tree edge
Discover E (via C→E)	4	[A,B,C,E]	d[E]=4, tree edge
Discover F (via E→F)	5	[A,B,C,E,F]	d[F]=5, tree edge
F→C: C is gray	—	[A,B,C,E,F]	Back edge! (cycle: C→E→F→C)
Finish F	6	[A,B,C,E]	f[F]=6
Finish E	7	[A,B,C]	f[E]=7
Finish C	8	[A,B]	f[C]=8
Finish B	9	[A]	f[B]=9
A→D: D is white	10	[A,D]	d[D]=10, tree edge
D→B: B is black, d[D]>d[B]	—	[A,D]	Cross edge
Finish D	11	[A]	f[D]=11
Finish A	12	[]	f[A]=12

Final timestamps: A(1/12), B(2/9), C(3/8), D(10/11), E(4/7), F(5/6). Parenthesization: (A (B (C (E (F) ) ) ) (D) ). Notice: B's interval [2,9] contains C's [3,8], which contains E's [4,7], which contains F's [5,6]. All properly nested. D's interval [10,11] is disjoint from B's [2,9] — they are siblings.

Reading timestamps like a pro. Given any two vertices u and v, you can determine their DFS relationship in O(1): (1) If d[u] < d[v] < f[v] < f[u], then v is a descendant of u. (2) If d[v] < f[v] < d[u] < f[u], then v was fully processed before u was discovered — cross edge territory. This O(1) ancestor check is used in many advanced algorithms (LCA, dominator trees, etc.).

Edge Classification

When DFS examines an edge (u, v), the color of v tells us what kind of edge it is:

Edge Type	Color of v	Meaning	Timestamps
Tree edge	White	v is being discovered for the first time via u. These edges form the DFS forest.	d[u] < d[v]
Back edge	Gray	v is an ancestor of u in the DFS tree. u descended from v and is now looking back up.	d[v] < d[u] < f[u] < f[v]
Forward edge	Black	v is a descendant of u, but was already discovered via a different path. d[u] < d[v].	d[u] < d[v] < f[v] < f[u]
Cross edge	Black	v is in a different branch or a different DFS tree. d[v] < d[u] (v was finished before u was discovered).	d[v] < f[v] < d[u]

The key distinction: back edge vs the rest. When you encounter a black vertex, you need timestamps to tell forward from cross edges. But when you encounter a gray vertex, it is always a back edge — no ambiguity. This is critical because back edges are cycle indicators. A directed graph has a cycle if and only if DFS finds a back edge.

The White-Path Theorem

Vertex v is a descendant of u in the DFS forest if and only if at time d[u] (when u is discovered), there exists a path from u to v consisting entirely of white vertices. This is CLRS Theorem 22.9.

Why is this useful? It tells you that the structure of the DFS tree depends on the order in which DFS processes vertices. Different processing orders produce different DFS trees (different tree/back/forward/cross edge classifications) — but the parenthesis theorem and white-path theorem hold for ALL of them.

Undirected Graph Edge Classification

In undirected graphs, the edge classification is simpler. When DFS examines edge {u, v}:

Tree edge: v is white. We discover v through u.

Back edge: v is gray (an ancestor of u). This is the ONLY other possibility in undirected graphs.

There are no forward or cross edges in undirected DFS. Why? If v is black and v is not u's parent, then v must have processed u as a neighbor when v was gray — so the edge was already classified as a tree or back edge from v's perspective. Every edge in an undirected graph is either a tree edge or a back edge.

Undirected DFS guarantee. In an undirected graph, DFS produces only tree edges and back edges. No forward edges. No cross edges. This simplifies cycle detection (any non-parent visited neighbor is a back edge = cycle) and is why undirected cycle detection does not need the three-color scheme — just a visited set and parent tracking suffice.

Recursive vs Iterative DFS

The recursive DFS shown above is elegant but has a fatal flaw: on large graphs (V > 10,000), it can overflow Python's default recursion limit (1000). The iterative version uses an explicit stack and handles arbitrarily large graphs.

python
def dfs_iterative_timestamps(graph):
    """Iterative DFS with timestamps (handles large graphs)."""
    d, f = {}, {}
    color = {v: 'white' for v in graph}
    time = [0]
    edge_types = []

    for start in graph:
        if color[start] != 'white':
            continue
        # Stack entries: (vertex, neighbor_iterator)
        stack = [(start, iter(graph[start]))]
        time[0] += 1
        d[start] = time[0]
        color[start] = 'gray'

        while stack:
            u, neighbors = stack[-1]
            try:
                v = next(neighbors)
                if color[v] == 'white':
                    edge_types.append((u, v, 'tree'))
                    time[0] += 1
                    d[v] = time[0]
                    color[v] = 'gray'
                    stack.append((v, iter(graph[v])))
                elif color[v] == 'gray':
                    edge_types.append((u, v, 'back'))
                else:
                    etype = 'forward' if d[u] < d[v] else 'cross'
                    edge_types.append((u, v, etype))
            except StopIteration:
                color[u] = 'black'
                time[0] += 1
                f[u] = time[0]
                stack.pop()

    return d, f, edge_types

Python recursion limit. The default limit is 1000 frames. For competitive programming, you can do sys.setrecursionlimit(200000), but this is fragile (OS-dependent stack size). For production code, always use iterative DFS. The iterative version is slightly harder to write but handles any graph size. In interviews, mention this trade-off — it shows systems awareness.

DFS Step-Through

DFS Step-Through with Timestamps

Press Step to advance DFS. Watch discovery (d) and finish (f) timestamps update. Edge types are color-coded: tree (teal), back (red), forward (blue), cross (yellow).

python
def dfs(graph):
    """Full DFS with timestamps and edge classification."""
    color  = {v: 'white' for v in graph}
    d      = {}  # discovery times
    f      = {}  # finish times
    parent = {v: None for v in graph}
    edges  = []  # (u, v, type) tuples
    time   = [0]  # mutable counter

    def visit(u):
        time[0] += 1
        d[u] = time[0]
        color[u] = 'gray'
        for v in graph[u]:
            if color[v] == 'white':
                edges.append((u, v, 'tree'))
                parent[v] = u
                visit(v)
            elif color[v] == 'gray':
                edges.append((u, v, 'back'))
            elif d[u] < d[v]:
                edges.append((u, v, 'forward'))
            else:
                edges.append((u, v, 'cross'))
        color[u] = 'black'
        time[0] += 1
        f[u] = time[0]

    for u in graph:
        if color[u] == 'white':
            visit(u)

    return d, f, parent, edges

Question: During DFS, you are processing vertex u (u is gray) and examine edge (u, v). You find v is gray. What type of edge is (u, v), and what does it imply about the graph?

It is a BACK edge. v is gray means v is still on the recursion stack — it is an ancestor of u. The path from v to u in the DFS tree plus the edge (u, v) forms a cycle. The graph contains a cycle. It is a cross edge. v is in a different DFS branch being processed concurrently. It is a forward edge. v is a descendant of u that DFS already started exploring.

Chapter 4: DFS Applications

DFS timestamps and edge classification unlock two critical algorithms: cycle detection and topological sorting. Both rely on the same insight — back edges reveal cycles, and the absence of back edges means the graph can be linearly ordered.

Cycle Detection

A directed graph has a cycle if and only if DFS discovers a back edge. This follows directly from the edge classification: a back edge (u, v) means v is an ancestor of u on the current DFS path. The path from v down to u through the DFS tree, plus the back edge from u back to v, forms a cycle.

For undirected graphs, the rule is simpler: any edge to an already-visited vertex (that is not the parent) indicates a cycle. We do not need timestamps — just parent tracking.

// Cycle detection in directed graph
def has_cycle(G):
Run DFS on G
return True if any back edge is found

python
def has_cycle_directed(graph):
    """Returns True if directed graph has a cycle."""
    color = {v: 'white' for v in graph}

    def visit(u):
        color[u] = 'gray'
        for v in graph[u]:
            if color[v] == 'gray':
                return True   # back edge = cycle!
            if color[v] == 'white' and visit(v):
                return True
        color[u] = 'black'
        return False

    for u in graph:
        if color[u] == 'white' and visit(u):
            return True
    return False

Theorem: Directed Graph Has Cycle ⇔ DFS Finds Back Edge

This is CLRS Lemma 22.11. Let us prove both directions.

(⇒) If DFS finds back edge (u, v), then G has a cycle. A back edge means v is gray when u examines it — v is an ancestor of u on the current DFS path. The path from v down the DFS tree to u, plus the back edge from u to v, forms a cycle. Done.

(⇐) If G has a cycle, then DFS finds a back edge. Let C = v₁ → v₂ → ... → v_k → v₁ be a cycle. Let v_i be the first vertex in C discovered by DFS. At time d[v_i], by the White-Path Theorem, there is a white path from v_i to v_i-1 (going around the cycle). So v_i-1 becomes a descendant of v_i. When DFS processes v_i-1, it examines the edge v_i-1 → v_i. At this point, v_i is still gray (it started before v_i-1 and finishes after). So this edge is classified as a back edge. Done.

The cycle-back-edge equivalence is bidirectional and constructive. Not only does the back edge prove a cycle exists, but it tells you exactly where the cycle is: follow parent pointers from u up to v, then add the back edge (u, v). This is why DFS is the standard tool for cycle detection — it finds cycles AND identifies them.

Interactive Cycle Detector

Cycle Detector

Load a graph with or without cycles. DFS runs step by step. When a back edge is found, it is highlighted red and the cycle is identified.

Cycle Detection in Undirected Graphs

For undirected graphs, cycle detection is slightly different. Every edge appears twice in the adjacency list (once for each endpoint). When DFS from u visits neighbor v, and v is already visited but v is NOT u's parent, we have found a cycle. We need the parent check because in an undirected graph, the edge {u, v} naturally appears when we process u from v — that is not a cycle, just the same edge traversed backwards.

python
def has_cycle_undirected(graph):
    """Cycle detection in undirected graph."""
    visited = set()

    def dfs(u, parent):
        visited.add(u)
        for v in graph[u]:
            if v not in visited:
                if dfs(v, u):
                    return True
            elif v != parent:
                return True  # visited and not parent = cycle!
        return False

    for u in graph:
        if u not in visited:
            if dfs(u, None):
                return True
    return False

Directed vs undirected cycle detection — the critical difference. In a directed graph, finding a visited neighbor is NOT enough to claim a cycle. The neighbor might be in a different DFS branch (cross edge) or a descendant (forward edge). Only a gray neighbor (back edge) means cycle. In an undirected graph, the only "non-cycle" visited neighbor is the parent. Every other visited neighbor implies a cycle. This distinction trips up many interview candidates.

Extracting the Actual Cycle

Finding that a cycle exists is step one. Often you need the cycle itself. When DFS detects back edge (u, v), the cycle is: v → ... → u → v. Walk up the DFS parent chain from u to v to extract it.

python
def find_cycle_directed(graph):
    """Returns the cycle as a list, or None."""
    color  = {v: 'white' for v in graph}
    parent = {v: None for v in graph}

    def visit(u):
        color[u] = 'gray'
        for v in graph[u]:
            if color[v] == 'gray':
                # Back edge (u, v). Extract cycle.
                cycle = [v, u]
                w = u
                while parent[w] != v:
                    w = parent[w]
                    cycle.append(w)
                cycle.append(v)
                return cycle[::-1]
            if color[v] == 'white':
                parent[v] = u
                result = visit(v)
                if result:
                    return result
        color[u] = 'black'
        return None

    for u in graph:
        if color[u] == 'white':
            result = visit(u)
            if result:
                return result
    return None

From Cycles to Ordering

If DFS finds NO back edges, the graph is a Directed Acyclic Graph (DAG). DAGs are special because their vertices can be topologically sorted — arranged in a linear order such that for every edge (u, v), u appears before v. Think: you can take all courses in an order that respects every prerequisite.

The finish-time trick. Here is the remarkable connection: if you sort vertices by decreasing finish time, you get a valid topological order. Why? If edge (u, v) exists and the graph is a DAG (no back edges), then f[u] > f[v]. DFS finishes v before u because either v is a descendant of u (so v finishes first during u's recursion) or v was finished in a previous DFS call (so f[v] < d[u] < f[u]). Either way, u appears before v in reverse-finish order. This is exactly topological order.

CLRS Theorem 22.11: DAG ⇔ No Back Edges (Formal)

A directed graph G is a DAG if and only if a DFS of G yields no back edges. This is the formal version of what we proved for cycles. The forward direction: if G is a DAG, then no cycle exists, so no back edge exists (since every back edge creates a cycle). The reverse: if DFS produces no back edges, then every edge goes from a vertex to a descendant (tree/forward) or a previously finished vertex (cross). In all these cases f[u] > f[v] for edge (u, v), meaning there is a topological order, meaning no cycle exists, meaning G is a DAG.

This theorem connects three equivalent statements: (1) G is a DAG, (2) DFS on G has no back edges, (3) G has a topological ordering. Any one implies the other two. This triangle of equivalences is the foundation of Chapters 4 and 5.

The DAG-BackEdge-TopoSort triangle.
G is a DAG ⇔ DFS finds no back edges ⇔ G has a topological ordering.
Learn this equivalence cold. In interviews, stating it immediately shows deep understanding.

Question: You run DFS on a directed graph and find these finish times: A=12, B=10, C=8, D=6, E=4, F=2. No back edges were found. What is the topological order?

A, B, C, D, E, F — sorted by decreasing finish time. Since no back edges were found, the graph is a DAG, and reverse finish order is a valid topological order. F, E, D, C, B, A — sorted by increasing finish time. We cannot determine topological order from finish times alone.

Chapter 5: Topological Sort

You are registering for university courses. CS201 requires CS101. CS301 requires CS201 and MATH201. MATH201 requires MATH101. In what order should you take these courses so that every prerequisite is satisfied before you take the course that needs it?

This is topological sorting: given a DAG, produce a linear ordering of vertices such that for every edge (u, v), vertex u appears before vertex v. Topological sort only works on DAGs — if there is a cycle, no valid ordering exists (you would need to take CS201 before CS101, which requires CS201).

Topological sort is one of the most frequently used graph algorithms in practice. Build systems, package managers, task schedulers, spreadsheet formula evaluation, event processing, data pipeline orchestration — all rely on topological sort. If you learn one algorithm from this chapter, make it topological sort.

Two ways to think about topological sort. Method 1 (DFS): explore the graph deeply, record when you finish each vertex, then reverse the order. Intuition: a vertex finishes only after all its descendants finish, so reversing puts parents before children. Method 2 (Kahn's): repeatedly remove vertices with no incoming edges. Intuition: a vertex with no prerequisites can go first; removing it might unlock more vertices. Both give O(V + E) and produce valid (but potentially different) orderings.

Method 1: DFS-Based (Reverse Finish Order)

We derived this in the previous chapter. Run DFS on the entire graph. As each vertex finishes (turns black), prepend it to a list. The resulting list is a valid topological order.

python
def topological_sort_dfs(graph):
    """Topological sort via DFS (reverse finish order)."""
    color = {v: 'white' for v in graph}
    order = []

    def visit(u):
        color[u] = 'gray'
        for v in graph[u]:
            if color[v] == 'gray':
                raise ValueError("Graph has a cycle!")
            if color[v] == 'white':
                visit(v)
        color[u] = 'black'
        order.append(u)  # append on finish

    for u in graph:
        if color[u] == 'white':
            visit(u)

    return order[::-1]  # reverse for topological order

Method 2: Kahn's Algorithm (BFS with In-Degree)

There is a second way that does not use DFS at all. Kahn's algorithm works by repeatedly removing vertices with no incoming edges (in-degree 0). The intuition: a vertex with no prerequisites can be placed first. Remove it, decrement the in-degree of its neighbors. Some neighbors now have in-degree 0 — they can go next. Repeat until the graph is empty.

// Kahn's Algorithm
Compute in-degree[v] for all v ∈ V
Q = queue of all v with in-degree[v] == 0
order = []

while Q is not empty:
  u = DEQUEUE(Q)
  order.append(u)
  for each v ∈ Adj[u]:
    in-degree[v] -= 1
    if in-degree[v] == 0:
      ENQUEUE(Q, v)

if len(order) ≠ |V|:
  Graph has a cycle! // not all vertices were processable

python
from collections import deque

def topological_sort_kahn(graph):
    """Topological sort via Kahn's algorithm (BFS + in-degree)."""
    in_deg = {v: 0 for v in graph}
    for u in graph:
        for v in graph[u]:
            in_deg[v] += 1

    queue = deque(v for v in graph if in_deg[v] == 0)
    order = []

    while queue:
        u = queue.popleft()
        order.append(u)
        for v in graph[u]:
            in_deg[v] -= 1
            if in_deg[v] == 0:
                queue.append(v)

    if len(order) != len(graph):
        raise ValueError("Graph has a cycle!")
    return order

DFS-based vs Kahn's. Both run in O(V + E). The DFS method is elegant and often shorter to code. Kahn's method is more intuitive ("peel off the zero-in-degree vertices") and naturally detects cycles (if the output has fewer than V vertices, there is a cycle). In interviews, know both. Kahn's is easier to explain under pressure.

Uniqueness of Topological Order

A DAG can have multiple valid topological orderings. Consider a DAG with edges A→C, B→C. Both [A, B, C] and [B, A, C] are valid. The topological order is unique if and only if there is a Hamiltonian path (a path visiting every vertex) in the DAG — equivalently, at every step of Kahn's algorithm, the queue has exactly one element.

Interview insight: detecting unique topological order. Run Kahn's algorithm. If at any point the queue has more than one element, multiple orderings exist. This is LeetCode 444 "Sequence Reconstruction" — check if there is only one valid topological order that matches a given sequence.

Topological Sort on a DAG with Parallel Levels

A variant asks: what is the minimum number of time steps if tasks at the same "level" can be parallelized? This is the longest path in the DAG, which equals the number of rounds in Kahn's algorithm. Each round of Kahn's processes all current zero-in-degree vertices simultaneously.

python
def parallel_stages(graph):
    """Number of parallel stages = longest path + 1."""
    in_deg = {v: 0 for v in graph}
    for u in graph:
        for v in graph[u]:
            in_deg[v] += 1

    queue = deque(v for v in graph if in_deg[v] == 0)
    stages = 0

    while queue:
        stages += 1
        next_queue = deque()
        for _ in range(len(queue)):
            u = queue.popleft()
            for v in graph[u]:
                in_deg[v] -= 1
                if in_deg[v] == 0:
                    next_queue.append(v)
        queue = next_queue

    return stages
# CS101, MATH101 can be taken in parallel (stage 1)
# CS201, MATH201 in parallel (stage 2) ... etc.

Worked Example: Course Prerequisites

Course	Prerequisites
CS101	None
MATH101	None
CS201	CS101
MATH201	MATH101
CS301	CS201, MATH201
CS401	CS301

Kahn's algorithm trace:

Step	Dequeue	Queue (after decrement)	In-degrees updated
Init	—	[CS101, MATH101]	CS101:0, MATH101:0, CS201:1, MATH201:1, CS301:2, CS401:1
1	CS101	[MATH101, CS201]	CS201: 1→0
2	MATH101	[CS201, MATH201]	MATH201: 1→0
3	CS201	[MATH201]	CS301: 2→1
4	MATH201	[CS301]	CS301: 1→0
5	CS301	[CS401]	CS401: 1→0
6	CS401	[]	—

Final order: CS101, MATH101, CS201, MATH201, CS301, CS401. All 6 courses processed, so no cycle. This is a valid graduation plan.

DFS-based trace: DFS from CS101: CS101→CS201→CS301→CS401. Finish: CS401(f=1), CS301(f=2), CS201(f=3), CS101(f=4). Then DFS from MATH101: MATH101→MATH201→CS301(already visited). Finish: MATH201(f=5), MATH101(f=6). Reverse finish order: MATH101, MATH201, CS101, CS201, CS301, CS401. Also valid!

Multiple valid topological orders. Both methods give valid orders but different ones. Kahn's: CS101, MATH101, CS201, MATH201, CS301, CS401. DFS: MATH101, MATH201, CS101, CS201, CS301, CS401. Both respect all prerequisites. In a real university, either plan works — you just take the courses in a different sequence. The non-uniqueness comes from the parallelism: CS101 and MATH101 are independent and can go in either order.

Kahn's Algorithm: Why It Works

The correctness of Kahn's algorithm relies on a simple inductive argument. At each step, we remove a vertex with in-degree 0. This vertex has no unprocessed predecessors, so it can safely come next in the topological order. After removing it, we decrement the in-degree of its successors — which might create new zero-in-degree vertices.

Correctness proof sketch: Suppose Kahn's outputs order [v₁, v₂, ..., v_n]. Consider any edge (v_i, v_j). When v_i was dequeued, v_j's in-degree was decremented. v_j was enqueued only after all its predecessors (including v_i) were processed. So i < j. Every edge goes from earlier to later in the order. This is exactly the topological ordering property.

Cycle detection: If the graph has a cycle C, no vertex in C ever reaches in-degree 0 (each vertex in C always has at least one unprocessed predecessor in C). So Kahn's cannot dequeue any vertex in C, and |output| < |V|. The "leftover" vertices are exactly those participating in cycles.

Interview Problem: Course Schedule II (LC 210)

Given numCourses and prerequisite pairs, return a valid course order (topological order). If impossible (cycle), return empty array. This is the quintessential topological sort interview problem.

python
def findOrder(numCourses, prerequisites):
    """LC 210: Return a valid topological order or []."""
    graph = [[] for _ in range(numCourses)]
    in_deg = [0] * numCourses
    for course, prereq in prerequisites:
        graph[prereq].append(course)
        in_deg[course] += 1

    queue = deque(i for i in range(numCourses) if in_deg[i] == 0)
    order = []

    while queue:
        u = queue.popleft()
        order.append(u)
        for v in graph[u]:
            in_deg[v] -= 1
            if in_deg[v] == 0:
                queue.append(v)

    return order if len(order) == numCourses else []

# Example:
# findOrder(4, [[1,0],[2,0],[3,1],[3,2]])
# Returns [0, 1, 2, 3] or [0, 2, 1, 3] (both valid)

Interview Problem: Alien Dictionary (LC 269)

Given a sorted list of words in an alien language, determine the character ordering. This is topological sort in disguise. Compare consecutive words to extract ordering constraints: if word[i] differs from word[i+1] at position j, then word[i][j] comes before word[i+1][j] in the alien alphabet. Build a graph of these constraints and topologically sort.

python
def alienOrder(words):
    """LC 269: Topological sort on character ordering."""
    # Build graph from adjacent word comparisons
    graph = {c: set() for w in words for c in w}
    in_deg = {c: 0 for c in graph}

    for i in range(len(words) - 1):
        w1, w2 = words[i], words[i + 1]
        # Edge case: "abc" before "ab" is invalid
        if len(w1) > len(w2) and w1[:len(w2)] == w2:
            return ""
        for j in range(min(len(w1), len(w2))):
            if w1[j] != w2[j]:
                if w2[j] not in graph[w1[j]]:
                    graph[w1[j]].add(w2[j])
                    in_deg[w2[j]] += 1
                break

    # Kahn's topological sort
    queue = deque(c for c in in_deg if in_deg[c] == 0)
    result = []
    while queue:
        c = queue.popleft()
        result.append(c)
        for n in graph[c]:
            in_deg[n] -= 1
            if in_deg[n] == 0:
                queue.append(n)

    return "".join(result) if len(result) == len(graph) else ""

The alien dictionary is the ultimate topological sort interview question. It tests three skills simultaneously: (1) recognizing that character ordering = graph, (2) correctly extracting edges from word comparisons, (3) handling edge cases (prefix conflict, cycle = invalid dictionary). If you can solve this in 15 minutes, you own topological sort.

Interactive Topological Sort

Topological Sort — DFS vs Kahn's

Watch both algorithms produce a topological ordering on the same DAG. Kahn's peels off zero-in-degree nodes (green). DFS records reverse finish order (teal). Both outputs are valid topological orders.

Question: Kahn's algorithm processes 8 out of 10 vertices before the queue becomes empty. What happened?

Two vertices are isolated (no edges) and were never added to the queue. The remaining 2 vertices are in a cycle. Their in-degrees never reached 0 because they depend on each other. Kahn's detects cycles by checking if |order| < |V|. We need to restart Kahn's with one of the remaining vertices as the seed.

Chapter 6: Strongly Connected Components

In a directed graph, vertex u can reach vertex v, but can v reach u? When both directions are possible — every vertex can reach every other vertex — the group forms a strongly connected component (SCC). SCCs are the "clusters of mutual reachability" in a directed graph.

Formally, an SCC is a maximal set of vertices C ⊆ V such that for every pair u, v ∈ C, both u ↝ v and v ↝ u (there are directed paths in both directions). Maximal means you cannot add any more vertices to C without breaking the mutual reachability property.

How many SCCs can a graph have? At most V (every vertex is its own SCC — no edges). At least 1 (the whole graph is one SCC — like a single big cycle). A DAG has exactly V SCCs (each vertex is its own, since no vertex can reach itself through a non-trivial cycle).

The SCC decomposition is unique. Unlike DFS trees (which depend on processing order), the SCCs of a graph are uniquely determined by the graph structure. Different DFS orderings will discover the same SCCs, just in different order. This is because SCCs are defined by reachability, not by any particular traversal.

Why SCCs matter. SCCs decompose any directed graph into a DAG of components. Once you find the SCCs, you can "collapse" each one into a single super-node. The resulting component graph is always a DAG (if two SCCs could reach each other bidirectionally, they would be one SCC). This DAG structure lets you apply topological sort, shortest paths, and dynamic programming to the component graph. Every hard directed-graph problem starts with SCC decomposition.

How Many SCCs Does a Graph Have?

The number of SCCs is between 1 (the entire graph is one SCC — a single strongly connected component, like a big cycle that visits every vertex) and V (every vertex is its own SCC — the graph is a DAG with no non-trivial cycles).

A graph has exactly 1 SCC if and only if it is strongly connected: for every pair of vertices u, v, there exist paths u↝v and v↝u. A graph has V SCCs if and only if it is a DAG. In between, the number of SCCs tells you "how far from being a DAG" the graph is — more SCCs means more DAG-like structure, fewer SCCs means more cyclic structure.

SCC decomposition = the ultimate directed graph preprocessing step. Once you decompose into SCCs and build the component DAG, you can: (1) topologically sort the DAG of components, (2) do DP on the component DAG, (3) answer reachability queries in O(1) per SCC (two vertices in the same SCC are mutually reachable), (4) find the minimum number of additional edges to make the graph strongly connected. It is the Swiss army knife of directed graph analysis.

Kosaraju's Algorithm

Kosaraju's algorithm finds all SCCs in two DFS passes. It is conceptually simple and uses a beautiful symmetry argument.

Step 1: DFS on G

Run DFS on the original graph G. Record the finish order (last vertex to finish goes on top of a stack). This gives us vertices ordered by decreasing finish time.

↓

Step 2: Transpose G

Create G^T — the graph with every edge reversed. If (u, v) ∈ E, then (v, u) ∈ E^T. This takes O(V + E) time.

↓

Step 3: DFS on G^T in reverse finish order

Process vertices in the order from Step 1's stack (top to bottom = decreasing finish time). Each DFS tree in this second pass is exactly one SCC.

Why does this work? The argument hinges on a key lemma (CLRS Lemma 22.14): if there is an edge from SCC C to SCC C' in the component graph, then the maximum finish time in C is greater than the maximum finish time in C'. This is because C can reach C' (via the edge), so when DFS enters C, it will discover all of C' before finishing all of C.

Now in Step 3, we process vertices in decreasing finish time. The first vertex we process belongs to the SCC with the highest finish time — call it C_max. In G^T, all edges are reversed. C_max had the highest finish time, meaning in the original graph, it had edges TO other SCCs. In G^T, those edges point FROM other SCCs INTO C_max. So DFS from C_max in G^T can only reach vertices within C_max — the reversed edges prevent it from escaping into other SCCs. The first DFS tree is exactly C_max.

After removing C_max, the next unvisited vertex with highest finish time belongs to the next SCC in the component DAG ordering. The same argument applies: DFS in G^T from this vertex discovers exactly its SCC and nothing more. This continues until all SCCs are found.

python
def kosaraju(graph):
    """Find SCCs using Kosaraju's algorithm."""
    # Step 1: DFS on G, record finish order
    visited = set()
    finish_stack = []

    def dfs1(u):
        visited.add(u)
        for v in graph[u]:
            if v not in visited:
                dfs1(v)
        finish_stack.append(u)  # push on finish

    for u in graph:
        if u not in visited:
            dfs1(u)

    # Step 2: Build transpose graph
    g_t = {v: [] for v in graph}
    for u in graph:
        for v in graph[u]:
            g_t[v].append(u)

    # Step 3: DFS on G^T in reverse finish order
    visited2 = set()
    sccs = []

    def dfs2(u, component):
        visited2.add(u)
        component.append(u)
        for v in g_t[u]:
            if v not in visited2:
                dfs2(v, component)

    while finish_stack:
        u = finish_stack.pop()  # highest finish time first
        if u not in visited2:
            comp = []
            dfs2(u, comp)
            sccs.append(comp)

    return sccs

Tarjan's Algorithm

Tarjan's algorithm finds SCCs in a single DFS pass using a clever trick: low-link values. The low-link of a vertex u is the smallest discovery time reachable from u by following tree edges downward and at most one back edge upward. If low[u] == d[u], then u is the root of an SCC — no vertex in its subtree can "escape" to an ancestor above u.

// Tarjan's Algorithm (single-pass SCC)
index = 0, stack S = empty

STRONGCONNECT(u):
  d[u] = low[u] = index++
  push u onto S, mark u as on-stack

  for each v ∈ Adj[u]:
    if v is not visited:
      STRONGCONNECT(v)
      low[u] = min(low[u], low[v])
    elif v is on-stack:
      low[u] = min(low[u], d[v])

  if low[u] == d[u]: // u is root of SCC
    SCC = {}
    do:
      w = pop from S
      remove w from on-stack
      add w to SCC
    while w ≠ u
    output SCC

python
def tarjan(graph):
    """Find SCCs using Tarjan's algorithm (single DFS)."""
    index_counter = [0]
    stack = []
    on_stack = set()
    d = {}
    low = {}
    sccs = []

    def strongconnect(u):
        d[u] = low[u] = index_counter[0]
        index_counter[0] += 1
        stack.append(u)
        on_stack.add(u)

        for v in graph[u]:
            if v not in d:  # not visited
                strongconnect(v)
                low[u] = min(low[u], low[v])
            elif v in on_stack:
                low[u] = min(low[u], d[v])

        if low[u] == d[u]:  # root of SCC
            component = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                component.append(w)
                if w == u:
                    break
            sccs.append(component)

    for u in graph:
        if u not in d:
            strongconnect(u)

    return sccs

Kosaraju's vs Tarjan's. Both run in O(V + E). Kosaraju's requires two DFS passes and building the transpose graph (extra O(V + E) space and time). Tarjan's does it in one pass with a stack. In practice, Tarjan's is faster by a constant factor (one pass, better cache locality). In interviews, Kosaraju's is easier to explain. Know both — interviewers often ask "is there a single-pass version?"

Worked Example: SCCs By Hand

Consider this directed graph: A→B, B→C, C→A (cycle 1), C→D, D→E, E→F, F→D (cycle 2), E→G (G is a sink).

Kosaraju's Step 1 — DFS on G: Start from A. Visit A(d=1)→B(d=2)→C(d=3)→D(d=4)→E(d=5)→F(d=6). F→D but D is gray (back edge). Finish F(f=7), finish E... wait, E also has E→G. Visit G(d=8), finish G(f=9). Finish E(f=10), D(f=11). C→A but A is gray (back edge). Finish C(f=12), B(f=13), A(f=14).

Finish stack (top first): A(14), B(13), C(12), D(11), E(10), G(9), F(7).

Kosaraju's Step 2 — Transpose G^T: Reverse every edge: B→A, C→B, A→C, D→C, E→D, F→E, D→F, G→E.

Kosaraju's Step 3 — DFS on G^T in finish order:

Process (from stack)	DFS on G^T discovers	SCC
A (f=14)	A → C → B (all reach each other in G^T)	{A, B, C}
B — already visited	skip	—
C — already visited	skip	—
D (f=11)	D → F → E (all reach each other in G^T)	{D, E, F}
E — already visited	skip	—
G (f=9)	G alone (G→E but E visited)	{G}
F — already visited	skip	—

Result: 3 SCCs: {A,B,C}, {D,E,F}, {G}. The component DAG: {A,B,C} → {D,E,F} → {G}.

The Component DAG

Once you have the SCCs, you can build the component graph (also called the condensation). Replace each SCC with a single vertex. Add an edge between two super-vertices if any vertex in one SCC has an edge to any vertex in the other SCC. The component graph is always a DAG (Lemma 22.13 in CLRS).

python
def build_component_dag(graph, sccs):
    """Build DAG of SCCs (condensation)."""
    # Map each vertex to its SCC index
    scc_id = {}
    for i, scc in enumerate(sccs):
        for v in scc:
            scc_id[v] = i

    # Build condensation DAG
    dag = {i: set() for i in range(len(sccs))}
    for u in graph:
        for v in graph[u]:
            if scc_id[u] != scc_id[v]:
                dag[scc_id[u]].add(scc_id[v])

    return {k: list(v) for k, v in dag.items()}

Why the component graph is always a DAG. Suppose there were a cycle in the component graph: SCC₁ → SCC₂ → ... → SCC_k → SCC₁. Then every vertex in any of these SCCs can reach every vertex in every other one. That means they should all be in the same SCC, contradicting the fact that they are separate. Therefore no cycle exists.

Tarjan's Worked Example

Same graph: A→B, B→C, C→A, C→D, D→E, E→F, F→D, E→G. Let us trace Tarjan's algorithm.

Action	d[v]	low[v]	Stack	Event
Visit A	d[A]=0	low[A]=0	[A]
Visit B (A→B)	d[B]=1	low[B]=1	[A,B]
Visit C (B→C)	d[C]=2	low[C]=2	[A,B,C]
C→A: A on stack		low[C]=min(2,0)=0	[A,B,C]	C can reach A (ancestor)
Visit D (C→D)	d[D]=3	low[D]=3	[A,B,C,D]
Visit E (D→E)	d[E]=4	low[E]=4	[A,B,C,D,E]
Visit F (E→F)	d[F]=5	low[F]=5	[A,B,C,D,E,F]
F→D: D on stack		low[F]=min(5,3)=3	[A,B,C,D,E,F]	F can reach D
Return to E		low[E]=min(4,3)=3		E inherits F's low
Visit G (E→G)	d[G]=6	low[G]=6	[A,B,C,D,E,F,G]
G has no neighbors		low[G]=6=d[G]		G is SCC root! Pop G. SCC={G}
Return to E		low[E]=3	[A,B,C,D,E,F]
Return to D		low[D]=min(3,3)=3=d[D]		D is SCC root! Pop F,E,D. SCC={D,E,F}
Return to C		low[C]=0	[A,B,C]
Return to B		low[B]=min(1,0)=0	[A,B]	B inherits C's low
Return to A		low[A]=0=d[A]		A is SCC root! Pop C,B,A. SCC={A,B,C}

Result: SCCs are {G}, {D,E,F}, {A,B,C} — same answer as Kosaraju's, found in a single DFS pass. The key insight: when we return from recursion and find low[u] == d[u], everything on the stack above u (including u) forms an SCC. They are all mutually reachable because the low-link values propagated back edges upward, proving circular reachability.

When to Use SCCs in Practice

SCC decomposition is a preprocessing step for many problems on directed graphs:

2-SAT: Build an implication graph, find SCCs. A variable and its negation in the same SCC = unsatisfiable.
Minimum vertices to reach all vertices: Find SCCs with zero in-degree in the component DAG.
Redundant connections: Edges within the same SCC are redundant for reachability (the SCC is already mutually reachable).
Course equivalence: If course A requires B and B requires A, they form an SCC — you must take them together.
Minimum edges to make graph strongly connected: Find the component DAG. Count sources (in-degree 0) and sinks (out-degree 0) in the component DAG. Answer = max(sources, sinks) if graph has >1 SCC, else 0.

SCC Interview Problem: Critical Connections (LC 1192)

Find all bridges (critical connections) in a graph — edges whose removal disconnects the graph. This uses the same DFS + low-link technique as Tarjan's SCC, adapted for undirected graphs. An edge (u, v) is a bridge if and only if low[v] > d[u] — meaning v's subtree has no back edge that reaches u or above.

python
def criticalConnections(n, connections):
    """LC 1192: Find all bridges in undirected graph."""
    graph = [[] for _ in range(n)]
    for u, v in connections:
        graph[u].append(v)
        graph[v].append(u)

    d = [0] * n
    low = [0] * n
    visited = [False] * n
    bridges = []
    timer = [0]

    def dfs(u, parent):
        visited[u] = True
        timer[0] += 1
        d[u] = low[u] = timer[0]
        for v in graph[u]:
            if v == parent:
                continue
            if not visited[v]:
                dfs(v, u)
                low[u] = min(low[u], low[v])
                if low[v] > d[u]:  # v can't reach u without (u,v)
                    bridges.append([u, v])
            else:
                low[u] = min(low[u], d[v])

    dfs(0, -1)
    return bridges

Bridges vs Articulation Points. A bridge is an edge whose removal disconnects the graph. An articulation point is a vertex whose removal disconnects the graph. Both are found using DFS + low-link in O(V + E). The conditions are slightly different: edge (u,v) is a bridge if low[v] > d[u]. Vertex u is an articulation point if (a) u is the root and has 2+ children, or (b) u is not the root and has a child v with low[v] ≥ d[u]. These are frequently asked in system reliability interviews: "which router/link is a single point of failure?"

The SCC Showcase Simulation

This is the payoff. The simulation below shows a directed graph with multiple SCCs. Choose Kosaraju's or Tarjan's algorithm, then step through (or auto-play). Watch SCCs light up as they are discovered. The component DAG forms at the bottom.

SCC Discovery — Kosaraju's & Tarjan's

Step through the algorithm. Each SCC gets a unique color when discovered. The component DAG forms below the graph.

Question: In Tarjan's algorithm, vertex u has discovery time d[u] = 5 and after processing all descendants, low[u] = 5. What does this mean?

u is the ROOT of a strongly connected component. low[u] == d[u] means no vertex in u's subtree can reach an ancestor above u via a back edge. So u and all vertices on the stack above it form a complete SCC. Pop them off. u has no neighbors and is an isolated SCC of size 1. u is part of a larger SCC whose root has not been found yet.

Chapter 7: Graph Problems in Practice

Everything in Chapters 1-6 was theory. Now let us see where these algorithms appear in real systems. Every example below is a production system you use daily.

Social Networks: Connected Components

Facebook has ~3 billion users. "People you may know" suggestions start with BFS from your profile. Finding all friend clusters (connected components) uses repeated BFS/DFS. The vast majority of Facebook users are in one giant connected component — it is called the "giant component" phenomenon in network science.

Build Systems: Topological Sort

When you type make or gradle build, the build system constructs a dependency graph: source files, object files, libraries. It topologically sorts this graph to determine compilation order. If a cycle exists (A depends on B, B depends on A), the build system reports a circular dependency error — that is Kahn's algorithm detecting |order| < |V|.

Web Crawling: BFS

Google's web crawler starts from seed URLs and explores the web using BFS. Each URL is a vertex. Each hyperlink is a directed edge. BFS ensures that pages close to the seeds (high-authority pages) are crawled first. The "depth" of BFS corresponds to how many clicks away a page is from the seed.

Deadlock Detection: Cycle Detection

In operating systems, processes wait for resources held by other processes. This creates a wait-for graph: vertices are processes, edges represent "process A waits for process B." A cycle in this graph is a deadlock — a group of processes waiting for each other forever. The OS kernel runs periodic DFS cycle detection on this graph.

Package Managers: SCCs + Topological Sort

When npm install or pip install resolves dependencies, it builds a dependency graph. Circular dependencies form SCCs. The package manager collapses each SCC into a group that must be installed together, then topologically sorts the resulting DAG to determine installation order.

Compilers: SCCs for Optimization

In compiler optimization, the call graph (which functions call which) is analyzed using SCCs. Each SCC is a group of mutually recursive functions. The compiler can inline or optimize within an SCC as a unit, then process SCCs in reverse topological order of the component DAG.

Google PageRank: BFS + Graph Structure

Google's original PageRank algorithm models the web as a directed graph (pages are vertices, hyperlinks are edges). The algorithm performs a random walk: at each step, follow a random outgoing link (with probability 0.85) or jump to a random page (probability 0.15). The stationary distribution of this random walk gives the "importance" of each page. Finding this distribution requires iterating over the graph structure — and understanding SCCs is crucial because pages in the same SCC reinforce each other's importance.

Neural Network Computation Graphs

Deep learning frameworks (PyTorch, TensorFlow) represent computations as DAGs. Each operation (matmul, ReLU, softmax) is a vertex. Data dependencies are edges. Forward pass = topological sort of this DAG. Backward pass = reverse topological sort. Understanding Chapter 22 literally explains how autograd works.

Internet Routing: BFS on the Network

The Internet is a graph of routers connected by physical links. Routing protocols like OSPF (Open Shortest Path First) use Dijkstra's algorithm — which is essentially BFS with a priority queue. Each router builds a "link-state database" (the graph), runs Dijkstra, and builds a shortest-path tree. Every packet you send follows BFS-optimal paths through the network.

Version Control: DAGs Everywhere

Git's commit history is a DAG. Each commit points to its parent(s). Merge commits have two parents. Branches are pointers to commits. When you run git log --graph, you are looking at a DAG. Topological sort of this DAG gives you a valid linear history. Finding the common ancestor of two branches (for merge resolution) is a graph reachability problem — specifically, lowest common ancestor in a DAG.

Garbage Collection: DFS Marking

The mark-and-sweep garbage collector in languages like Java and Python uses DFS to find reachable objects. Starting from "root" references (global variables, stack variables), DFS traverses the object reference graph. Every reachable object is "marked." After DFS completes, unmarked objects are unreachable (garbage) and can be freed. This is literally DFS on the heap — the most performance-critical graph algorithm in most programs.

Database Query Optimization

When you write a SQL query joining 5 tables, the query optimizer builds a join graph (tables = vertices, join conditions = edges). It finds the optimal join order using graph analysis. Cycle detection identifies redundant join conditions. Topological sort (for DAG-shaped joins) determines execution order. Connected components identify independent subqueries that can be parallelized.

Real-World Scale: Where These Algorithms Run

Company	System	Graph Size	Algorithm	Frequency
Google	Web crawler	50B+ pages, 1T+ links	BFS (breadth-first crawl)	Continuous
Facebook	Friend suggestions	3B users, 400B edges	BFS (2-hop exploration)	Per user request
npm/pip	Package install	~2M packages, ~50M deps	Topological sort + SCC	Per install
Linux kernel	Module loading	~6K modules, ~30K deps	Topological sort	At boot
Git	Merge base detection	Variable (commit DAG)	BFS/DFS on commit graph	Per merge
PostgreSQL	Query optimization	Join graph (tables)	DFS for join ordering	Per query
GCC	Function inlining	Call graph (functions)	SCC + reverse topo sort	Per compilation
JVM	Garbage collection	Heap object graph	DFS (mark phase)	Periodic

These are not toy examples. Every entry in this table is a production system running the exact algorithms from this chapter — BFS, DFS, topological sort, SCCs — at scale, multiple times per second. The algorithms are the same; only the graph size differs. Master them at the textbook scale, and you can deploy them at Google scale.

Full Example: Analyzing a Codebase

Imagine you are analyzing a codebase with module dependencies. Here is the full workflow using the algorithms from this chapter:

1. Build the Graph

Parse import statements. Each module = vertex. Each import = directed edge. Build adjacency list. O(V + E) where V = modules, E = imports.

↓

2. Check for Circular Dependencies

Run DFS cycle detection. If back edge found, report the cycle (modules that import each other). Fix the cycle before proceeding.

↓

3. Find Build Order

If no cycles (DAG), run topological sort. This gives a valid compilation order — every module is compiled after its dependencies.

↓

4. Find Module Groups

If cycles exist, run SCC decomposition. Each SCC = a group of mutually dependent modules that must be compiled together. The component DAG gives the compilation order of groups.

↓

5. Analyze Impact

To find which modules are affected by a change to module X: run BFS/DFS from X on the REVERSE graph (who imports X, who imports those, etc.). Everything reachable = everything that needs retesting.

System	Graph Type	Algorithm	What it Finds
Social network	Undirected	BFS	Shortest friend chains, components
Build system	Directed (DAG)	Topological sort	Compilation order
Web crawler	Directed	BFS	Page discovery order
OS kernel	Directed	DFS cycle detection	Deadlocks
Package manager	Directed	SCC + topo sort	Install order, circular deps
Compiler	Directed	SCC	Mutually recursive functions
Maps/navigation	Directed weighted	BFS (unweighted) / Dijkstra	Shortest route
Garbage collector	Directed	DFS (mark phase)	Reachable objects

The meta-pattern. Almost every real-world graph problem reduces to one of four primitives: (1) BFS for shortest paths and level-order, (2) DFS for structure discovery (cycles, components, ordering), (3) topological sort for dependency resolution, (4) SCC decomposition for mutual-reachability clustering. Master these four and you can solve any graph problem by recognizing which primitive applies.

Build System Simulator

Build System Dependency Resolver

A build graph of source files and libraries. Watch topological sort determine compilation order. Try adding a circular dependency to see it detected.

Question: A package manager encounters a dependency graph where packages A, B, and C form a cycle (A→B→C→A), and package D depends on A. How should the package manager handle this?

Refuse to install anything — circular dependencies are always an error. Install A first because D depends on it, then B, then C. Run SCC decomposition. {A, B, C} form one SCC — they must be installed together as a group. D depends on this group. Topologically sort the component DAG: install {A, B, C} group first, then D. Within the SCC, use heuristics (alphabetical or version-based) to break the cycle.

Chapter 8: BFS vs DFS

Both BFS and DFS run in O(V + E). Both visit every vertex and every edge exactly once. But they explore the graph in fundamentally different orders, which makes each one the right tool for different problems. This chapter is your decision guide.

The Core Difference

BFS uses a queue (FIFO). It explores all neighbors at distance k before any at distance k+1. It produces a shortest-path tree from the source. Think: expanding wavefront, concentric circles, "how far away is everything?"

DFS uses a stack (LIFO, or recursion). It plunges deep into one path before backtracking. It produces timestamps (discovery/finish) and a DFS forest. Think: maze exploration, "what is the structure of this graph?"

Property	BFS	DFS
Data structure	Queue (FIFO)	Stack (LIFO) / recursion
Exploration order	Level by level	Deep before wide
Produces	Shortest-path tree, distances	DFS forest, timestamps, edge types
Space	O(V) queue (can be O(width))	O(V) stack (can be O(height))
Shortest paths?	Yes (unweighted)	No
Cycle detection?	Not directly (needs parent check for undirected)	Yes (back edge = cycle)
Topological sort?	Yes (Kahn's algorithm)	Yes (reverse finish order)
SCCs?	No	Yes (Kosaraju's, Tarjan's)
Bipartiteness?	Yes (natural 2-coloring)	Yes (but BFS is more natural)
Connected components?	Yes	Yes

When to Use Which

Use BFS when:

You need shortest paths in an unweighted graph
You need level-order traversal (e.g., tree levels, BFS-level grouping)
You need to test bipartiteness
The graph is wide and shallow (BFS queue stays small)
You are doing multi-source BFS (e.g., "nearest exit" from every cell in a maze)

Use DFS when:

You need cycle detection in directed graphs
You need topological sort
You need strongly connected components
You need articulation points or bridges
You are solving backtracking problems (permutations, combinations, Sudoku)
The graph is narrow and deep (DFS stack stays small)

Space Comparison on Extreme Graphs

Both BFS and DFS use O(V) space in the worst case, but on specific graph shapes, one can be dramatically better than the other.

Wide tree (branching factor b, depth d): BFS queue holds all nodes at one level — O(b^d) at the widest level. DFS stack holds one root-to-leaf path — O(d). DFS wins.

Long chain (V nodes in a line): BFS queue holds at most 1-2 nodes at a time — O(1). DFS stack holds the entire chain — O(V). BFS wins.

Bidirectional BFS

When searching for a path between two specific vertices s and t, bidirectional BFS runs two BFS simultaneously — one from s, one from t. When the two frontiers meet, we have found the shortest path. This reduces the search space dramatically.

In a graph where each vertex has degree b and the shortest path has length d, regular BFS explores O(b^d) vertices. Bidirectional BFS explores O(2 × b^d/2) = O(b^d/2) — exponentially less. For the "six degrees of separation" problem on a social network (b=150, d=6), regular BFS might explore 150⁶ ≈ 10¹³ nodes, while bidirectional BFS explores 2 × 150³ ≈ 7 × 10⁶.

python
def bidirectional_bfs(graph, s, t):
    """Returns shortest path length from s to t, or -1."""
    if s == t:
        return 0
    front_s = {s}    # frontier expanding from s
    front_t = {t}    # frontier expanding from t
    visited_s = {s}  # all visited from s side
    visited_t = {t}  # all visited from t side
    dist = 0

    while front_s and front_t:
        # Always expand the smaller frontier (optimization)
        if len(front_s) > len(front_t):
            front_s, front_t = front_t, front_s
            visited_s, visited_t = visited_t, visited_s
        dist += 1
        next_front = set()
        for u in front_s:
            for v in graph[u]:
                if v in visited_t:
                    return dist  # frontiers met!
                if v not in visited_s:
                    visited_s.add(v)
                    next_front.add(v)
        front_s = next_front
    return -1

Iterative Deepening DFS (IDDFS)

What if you want BFS's shortest-path guarantee with DFS's memory efficiency? Iterative Deepening DFS runs DFS with depth limit 0, then 1, then 2, etc. It finds the shallowest goal like BFS but uses O(d) memory like DFS (where d is the solution depth). The "wasted" re-exploration at each depth is only a constant factor: the last iteration dominates.

IDDFS overhead is small. If each vertex has b neighbors and the solution is at depth d, BFS visits O(b^d) nodes. IDDFS visits b⁰ + b¹ + ... + b^d = O(b^d) — the same asymptotic cost. The last level b^d dominates the geometric series. You repeat work at shallow depths, but shallow depths are tiny. IDDFS is optimal for uninformed search in trees with unknown depth.

0-1 BFS: When Edges Have Weights 0 or 1

A special case worth knowing: if every edge weight is either 0 or 1, you can find shortest paths in O(V + E) using a deque instead of a priority queue. When processing edge (u, v): if weight = 0, push v to the front of the deque. If weight = 1, push v to the back. This maintains the invariant that the deque is sorted by distance.

python
def bfs_01(graph, source):
    """Shortest paths when edge weights are 0 or 1.
    graph[u] = [(v, weight), ...] where weight ∈ {0, 1}."""
    dist = {v: float('inf') for v in graph}
    dist[source] = 0
    dq = deque([source])

    while dq:
        u = dq.popleft()
        for v, w in graph[u]:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                if w == 0:
                    dq.appendleft(v)  # front — process soon
                else:
                    dq.append(v)       # back — process later

    return dist
# Useful for: grids where some moves are free (e.g., walking
# on ice = weight 0, walking on ground = weight 1)

The BFS spectrum. BFS with a plain queue handles unweighted graphs (all edges weight 1). BFS with a deque handles 0-1 weighted graphs. BFS with a priority queue becomes Dijkstra's algorithm (arbitrary non-negative weights). Each step up the complexity ladder adds a log V factor: O(V+E), O(V+E), O((V+E) log V). Knowing which variant to use for your specific weight set is a key competitive programming skill.

The Decision Tree

Use this flowchart when facing any graph problem in an interview. Start at the top and follow the branches.

Is it a graph problem?

Look for: connections, neighbors, reachability, ordering, grouping, paths. If yes → continue. If no → wrong chapter.

↓

Directed or undirected?

Prerequisites, one-way streets, follows → directed. Friendships, roads, connections → undirected. Grid → undirected.

↓

What are you looking for?

Shortest path → BFS (unweighted) or Dijkstra (weighted).
Connectivity/components → BFS, DFS, or Union-Find.
Cycle → DFS (directed: 3-color; undirected: parent check).
Ordering → Topological sort (Kahn's or DFS).
Clusters → SCC (directed) or connected components (undirected).
All paths → DFS backtracking.
Bipartite → BFS 2-coloring.

Performance on Different Graph Shapes

The theoretical runtime of both BFS and DFS is O(V + E), but constant factors and memory access patterns differ by graph shape.

Graph Shape	BFS behavior	DFS behavior
Linear chain (V nodes in a line)	Queue size: O(1). Memory-friendly.	Stack depth: O(V). Risk of stack overflow.
Star graph (one center, V-1 leaves)	Queue size: O(V) after processing center.	Stack depth: O(1). Memory-friendly.
Complete graph K_V	Queue: all V after first step. O(V²) edges.	Stack: O(V). O(V²) edges.
Binary tree (depth d)	Queue: up to 2^d at last level. Exponential space.	Stack: O(d). Logarithmic space. DFS wins.
Grid (n × n)	Queue: O(n) wavefront. Good cache locality.	Stack: O(n²) worst case. OK in practice.

In practice, BFS is usually better for shortest-path queries on grids. BFS explores in concentric rings, which has good spatial locality (neighboring cells in memory). DFS can zigzag across the grid, thrashing the cache. For connectivity queries, either works — but DFS is slightly simpler to implement recursively.

BFS vs DFS Side-by-Side

BFS vs DFS Exploration Comparison

Watch BFS (left, teal) and DFS (right, orange) explore the same graph simultaneously. Note the different vertex visit orders and the different trees they produce.

Summary: The BFS vs DFS Decision Matrix

When in doubt, use this rapid decision matrix:

// "I need..." → Algorithm
shortest path (unweighted) → BFS
shortest path (weighted) → Dijkstra (Ch 24)
cycle detection (directed) → DFS (three-color)
cycle detection (undirected) → DFS (parent check) or Union-Find
topological ordering → Kahn's (intuitive) or DFS reverse-finish
strongly connected components → Tarjan's (single pass) or Kosaraju's
connected components → BFS, DFS, or Union-Find
bipartiteness testing → BFS (2-coloring)
all paths (backtracking) → DFS
level-order traversal → BFS
bridges / articulation pts → DFS (low-link)

Question: You need to find the shortest path (fewest edges) from node S to node T in an unweighted graph with 1 million nodes. Which algorithm do you choose?

BFS from S. BFS discovers nodes in distance order, so the first time it reaches T is via the shortest path. DFS might find T quickly but via a long detour — it does not guarantee shortest paths. DFS from S — it uses less memory on large graphs. Either works; both find shortest paths in O(V + E).

Chapter 9: Interview Arsenal

Graph problems are among the most common in coding interviews. This chapter gives you the patterns, the LeetCode classics, and the debugging strategies.

The Graph Problem Identification Checklist

When you see a coding problem, ask these questions to determine if it is a graph problem and which algorithm to use:

Question	If YES, then...
Are there "things" with "connections"?	It is a graph problem. Build adjacency list.
Does the problem ask for "minimum number of steps/moves"?	BFS (shortest path in unweighted graph)
Does the problem ask "can you reach X from Y"?	BFS or DFS (either works)
Does the problem ask for "number of connected groups"?	DFS/BFS connected components
Does the problem ask for "ordering with dependencies"?	Topological sort
Does the problem ask "is there a cycle"?	DFS (back edge detection)
Does the problem ask "can you 2-color/bipartite"?	BFS bipartiteness check
Is the graph a grid (2D array)?	Each cell = vertex, 4-directional neighbors = edges

Classic LeetCode Patterns

Pattern 1: Number of Islands (LC 200) — Connected Components

Given a 2D grid of '1' (land) and '0' (water), count islands. Each island is a connected component of '1' cells. For each unvisited '1', BFS/DFS to mark all connected land cells as visited. Count the number of BFS/DFS calls.

python
def numIslands(grid):
    rows, cols = len(grid), len(grid[0])
    count = 0

    def dfs(r, c):
        if r < 0 or r >= rows or c < 0 or c >= cols:
            return
        if grid[r][c] != '1':
            return
        grid[r][c] = '0'  # mark visited
        dfs(r+1, c); dfs(r-1, c)
        dfs(r, c+1); dfs(r, c-1)

    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == '1':
                count += 1
                dfs(r, c)
    return count

Pattern 2: Course Schedule (LC 207) — Cycle Detection / Topological Sort

Given n courses and prerequisite pairs, determine if you can finish all courses. This is: "does the prerequisite graph have a cycle?" Use DFS cycle detection or Kahn's algorithm.

python
def canFinish(numCourses, prerequisites):
    graph = [[] for _ in range(numCourses)]
    in_deg = [0] * numCourses
    for a, b in prerequisites:
        graph[b].append(a)  # b is prereq of a
        in_deg[a] += 1

    # Kahn's algorithm
    queue = deque(i for i in range(numCourses) if in_deg[i] == 0)
    count = 0
    while queue:
        u = queue.popleft()
        count += 1
        for v in graph[u]:
            in_deg[v] -= 1
            if in_deg[v] == 0:
                queue.append(v)
    return count == numCourses  # all courses processable = no cycle

Pattern 3: Word Ladder (LC 127) — BFS Shortest Path

Transform "hit" to "cog" by changing one letter at a time, each intermediate word must be in the dictionary. This is BFS on an implicit graph: each word is a vertex, two words are connected if they differ by one letter. BFS finds the shortest transformation sequence.

python
def ladderLength(beginWord, endWord, wordList):
    wordSet = set(wordList)
    if endWord not in wordSet:
        return 0
    queue = deque([(beginWord, 1)])
    visited = {beginWord}
    while queue:
        word, steps = queue.popleft()
        for i in range(len(word)):
            for c in 'abcdefghijklmnopqrstuvwxyz':
                newWord = word[:i] + c + word[i+1:]
                if newWord == endWord:
                    return steps + 1
                if newWord in wordSet and newWord not in visited:
                    visited.add(newWord)
                    queue.append((newWord, steps + 1))
    return 0
# Key insight: the graph is IMPLICIT — we never build adjacency
# lists. For each word, we try all 26*len single-char replacements.

Pattern 4: Clone Graph (LC 133) — BFS/DFS with Hash Map

Deep-copy a graph. BFS/DFS through the original, maintaining a hash map from original node to cloned node. When you visit a neighbor, check if it is already cloned; if not, clone it and enqueue/recurse.

Pattern 5: Rotting Oranges (LC 994) — Multi-Source BFS

All rotten oranges spread simultaneously. This is multi-source BFS: enqueue all rotten oranges at time 0, then BFS. Each level of BFS represents one minute. The answer is the maximum distance (or -1 if any fresh orange is unreachable).

Pattern 6: Graph Valid Tree (LC 261) — Undirected Cycle Detection

A graph is a valid tree if and only if it is connected AND has no cycles. Check: (1) E == V-1 (necessary for tree), (2) BFS/DFS from any vertex visits all V vertices (connected), (3) No cycle detected. Any two of these conditions imply the third.

The Implicit Graph Pattern

Many problems have graphs that are not given as adjacency lists. Instead, the graph is implicit in the problem structure. Recognizing the implicit graph is the hardest part.

Problem	Vertices	Edges	Algorithm
2D grid (islands, maze)	Each cell	4 adjacent cells	BFS/DFS
Word ladder	Each word	Words differing by 1 char	BFS
Puzzle (sliding, Rubik's)	Each state	Legal moves	BFS
Knight moves on chess board	Each square	8 L-shaped moves	BFS
Lock combinations (LC 752)	Each 4-digit combo	Turn one wheel ±1	BFS
Gene mutation (LC 433)	Each gene string	Single character mutations	BFS

The universal template for grid BFS/DFS. Almost every grid problem uses this pattern: (1) Identify the vertices (cells), (2) Define neighbors (4-directional, 8-directional, or custom), (3) Define "visited" (either modify the grid in-place or use a visited set), (4) Run BFS for shortest path or DFS for connectivity/backtracking. Once you recognize the grid as a graph, the algorithm is mechanical.

python
# Universal grid BFS template
def grid_bfs(grid, start_r, start_c):
    rows, cols = len(grid), len(grid[0])
    visited = [[False] * cols for _ in range(rows)]
    visited[start_r][start_c] = True
    queue = deque([(start_r, start_c, 0)])  # (row, col, dist)

    while queue:
        r, c, dist = queue.popleft()
        for dr, dc in [(-1,0),(1,0),(0,-1),(0,1)]:
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                if not visited[nr][nc] and grid[nr][nc] != '#':
                    visited[nr][nc] = True
                    queue.append((nr, nc, dist + 1))
    # Adapt return value to the specific problem

Debug Scenarios

Interview debugging questions test whether you can systematically diagnose graph algorithm failures. Here are the most common.

Debug 1: "My BFS returns wrong shortest path distances"

Likely cause: Marking vertices as visited when dequeued instead of when enqueued. If you mark on dequeue, a vertex can be enqueued multiple times (once from each neighbor that discovers it). The first enqueue has the correct distance, but later enqueues overwrite it with longer distances. Fix: Always mark visited (set color to gray) at enqueue time, not at dequeue time.

Debug 2: "My DFS cycle detection has false positives on directed graphs"

Likely cause: Using a simple "visited" set instead of the three-color scheme. With just visited/unvisited, you cannot distinguish back edges (cycle) from cross/forward edges (no cycle). When you visit a neighbor that is already visited, it might be in a completely different DFS branch. Fix: Use three colors (white/gray/black). Only gray neighbors indicate cycles.

Debug 3: "My topological sort sometimes produces invalid orders"

Likely cause: Forgetting to reverse the DFS finish order. DFS appends vertices on finish, giving reverse topological order. If you return this list directly without reversing, dependencies point the wrong way. Fix: Return order[::-1] or prepend to a deque.

Debug 4: "Kosaraju's gives wrong SCCs"

Likely cause: Running the second DFS on G instead of G^T, or processing vertices in the wrong order (increasing finish time instead of decreasing). Fix: Step 2 must use the transpose graph. Step 3 must process vertices in decreasing finish time (pop from the finish stack).

Debug 5: "My graph algorithm TLEs on large inputs"

Likely causes: (1) Using adjacency matrix O(V²) instead of adjacency list O(V+E) for a sparse graph. (2) Not marking visited early enough (BFS enqueues same vertex multiple times). (3) Rebuilding the graph inside a loop. (4) Using slow set operations in Python (try deque instead of list for BFS queue). Fix: Profile to find the bottleneck. Usually it is representation or visited-set timing.

Common Interview Mistakes

Mistake	Why it is wrong	Fix
Forgetting to mark visited	Infinite loop on cyclic graphs	Mark visited BEFORE enqueuing (BFS) or recursing (DFS)
Using DFS for shortest path	DFS finds A path, not the shortest	Use BFS for unweighted shortest paths
O(V²) adjacency matrix for sparse graph	TLE on large inputs	Use adjacency list
Recursion limit on large DFS	Python default limit is 1000	Use iterative DFS with explicit stack, or sys.setrecursionlimit
Not handling disconnected graphs	BFS/DFS from one vertex misses other components	Loop over all vertices, start new BFS/DFS for unvisited ones

Union-Find: An Alternative for Connected Components

For the specific problem of connected components in undirected graphs, there is a competing data structure: Union-Find (Disjoint Set Union, DSU). Instead of BFS/DFS, you process edges one at a time, merging components. With path compression and union by rank, each operation takes nearly O(1) amortized (technically O(α(V)) where α is the inverse Ackermann function — practically a constant).

python
class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n
        self.count = n  # number of components

    def find(self, x):
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])  # path compression
        return self.parent[x]

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return False  # already connected
        if self.rank[rx] < self.rank[ry]:
            rx, ry = ry, rx
        self.parent[ry] = rx  # union by rank
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1
        self.count -= 1
        return True

# Usage: process edges to find connected components
uf = UnionFind(5)
uf.union(0, 1); uf.union(2, 3); uf.union(1, 3)
# Components: {0,1,2,3} and {4}. uf.count = 2

BFS/DFS vs Union-Find for connected components. BFS/DFS processes the entire graph at once: O(V + E). Union-Find processes edges incrementally: O(E α(V)). For static graphs, BFS/DFS is simpler. For dynamic graphs where edges are added one at a time (and you need the component count after each addition), Union-Find is superior — you do not need to re-run BFS from scratch. Classic example: "Number of Connected Components in an Undirected Graph" (LC 323) and "Redundant Connection" (LC 684).

Iterative DFS (Avoid Stack Overflow)

python
def dfs_iterative(graph, source):
    """Iterative DFS using explicit stack."""
    visited = set()
    stack = [source]
    order = []
    while stack:
        u = stack.pop()
        if u in visited:
            continue
        visited.add(u)
        order.append(u)
        for v in reversed(graph[u]):  # reversed to match recursive order
            if v not in visited:
                stack.append(v)
    return order

Interview Template: How to Approach Any Graph Problem

Here is the step-by-step template for graph problems in interviews. Follow this and you will never freeze.

Step 1: Identify the Graph

What are the vertices? What are the edges? Is it directed or undirected? Weighted or unweighted? Explicit (adjacency list given) or implicit (grid, states)?

↓

Step 2: Build the Representation

If explicit: parse input into adjacency list (dictionary of lists). If implicit: define a neighbors() function. For grids: direction array [(0,1),(0,-1),(1,0),(-1,0)].

↓

Step 3: Choose the Algorithm

Shortest path → BFS. Cycle detection → DFS. Ordering → topological sort. Clustering → SCCs or connected components. Backtracking → DFS.

↓

Step 4: Handle Edge Cases

Empty graph. Disconnected graph (loop over all vertices). Self-loops. Duplicate edges. Start == end.

↓

Step 5: State Complexity

Time: O(V + E) for BFS/DFS. Space: O(V) for visited set + queue/stack. For grids: V = rows × cols, E = 4V.

The "Graph in Disguise" Pattern

The hardest graph problems do not look like graph problems. The key insight is recognizing the hidden graph structure. Here are the disguises:

Disguise	Example Problem	Hidden Graph
String transformation	Word Ladder (LC 127)	Vertices = words, edges = 1-char difference
Matrix/grid	Number of Islands (LC 200)	Vertices = cells, edges = 4-adjacent
State machine	Open the Lock (LC 752)	Vertices = lock states, edges = single-wheel turns
Sequence ordering	Alien Dictionary (LC 269)	Vertices = characters, edges = ordering constraints
Prerequisite chain	Course Schedule (LC 207)	Vertices = courses, edges = prerequisites
Social network	Friend Circles (LC 547)	Vertices = people, edges = friendships
Equations	Evaluate Division (LC 399)	Vertices = variables, edges = ratios (weighted)

The recognition skill is what separates good from great. Any candidate can implement BFS. The differentiator is seeing that "minimum number of gene mutations" is BFS on a word graph, or that "evaluate a/b given a/c and c/d" is DFS on a weighted graph. Practice the recognition — the implementation is mechanical once you see the graph.

Coding Drills

Practice these patterns until they are muscle memory. Each one is a 5-minute coding drill that reinforces a key graph concept.

Drill 1: Build adjacency list from edge list

python
def build_adj_list(n, edges, directed=True):
    """Build adjacency list from edge list."""
    graph = [[] for _ in range(n)]
    for u, v in edges:
        graph[u].append(v)
        if not directed:
            graph[v].append(u)
    return graph

Drill 2: Count connected components

python
def count_components(n, edges):
    """Number of connected components in undirected graph."""
    graph = build_adj_list(n, edges, directed=False)
    visited = [False] * n
    count = 0
    for i in range(n):
        if not visited[i]:
            count += 1
            queue = deque([i])
            visited[i] = True
            while queue:
                u = queue.popleft()
                for v in graph[u]:
                    if not visited[v]:
                        visited[v] = True
                        queue.append(v)
    return count

Drill 3: Find all paths between two nodes (DFS backtracking)

python
def all_paths(graph, source, target):
    """All paths from source to target (LC 797)."""
    result = []
    def dfs(u, path):
        if u == target:
            result.append(path[:])
            return
        for v in graph[u]:
            path.append(v)
            dfs(v, path)
            path.pop()  # backtrack
    dfs(source, [source])
    return result
# Note: this is O(2^V * V) worst case — exponential!
# Only use when the graph is small or a DAG.

Drill 4: Check if path exists (simplest graph query)

python
def has_path(graph, source, target):
    """Does a path from source to target exist?"""
    visited = set([source])
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if u == target:
            return True
        for v in graph[u]:
            if v not in visited:
                visited.add(v)
                queue.append(v)
    return False

Drill 5: Evaluate Division (LC 399, weighted graph DFS)

python
def calcEquation(equations, values, queries):
    """Given a/b=k, find a/c by DFS on weighted graph."""
    graph = {}
    for (a, b), v in zip(equations, values):
        graph.setdefault(a, []).append((b, v))
        graph.setdefault(b, []).append((a, 1.0 / v))

    def dfs(src, dst, visited):
        if src not in graph or dst not in graph:
            return -1.0
        if src == dst:
            return 1.0
        visited.add(src)
        for neighbor, weight in graph[src]:
            if neighbor not in visited:
                result = dfs(neighbor, dst, visited)
                if result != -1.0:
                    return weight * result
        return -1.0

    return [dfs(a, b, set()) for a, b in queries]
# This is DFS on a weighted graph where edge weights are ratios.
# a/b = 2.0, b/c = 3.0 → a/c = 2.0 * 3.0 = 6.0
# The path a → b → c has accumulated weight 2.0 * 3.0.

Drill 6: Topological sort one-liner (for quick coding)

python
def topo_sort(graph):
    """Compact DFS topological sort."""
    visited, order = set(), []
    def dfs(u):
        visited.add(u)
        for v in graph.get(u, []):
            if v not in visited:
                dfs(v)
        order.append(u)
    for u in graph:
        if u not in visited:
            dfs(u)
    return order[::-1]
# 10 lines, O(V+E), handles disconnected DAGs.
# Missing: cycle detection. Add gray-set for that.

Complexity Cheat Sheet

Algorithm	Time	Space	Key Property
BFS	O(V+E)	O(V)	Shortest path (unweighted)
DFS	O(V+E)	O(V)	Timestamps, edge classification
Topological Sort (DFS)	O(V+E)	O(V)	Reverse finish order
Topological Sort (Kahn's)	O(V+E)	O(V)	In-degree peeling, cycle detect
Kosaraju's SCC	O(V+E)	O(V+E)	Two DFS + transpose
Tarjan's SCC	O(V+E)	O(V)	Single DFS + low-link

Question: You are given a 2D grid representing a maze. You need to find the minimum number of steps from the start cell to the exit cell. Each step moves to an adjacent cell (up/down/left/right). Which algorithm and why?

BFS from the start cell. The grid is an implicit unweighted graph (each cell = vertex, each valid move = edge with weight 1). BFS finds shortest paths in unweighted graphs. The first time BFS reaches the exit, that is the minimum steps. DFS would find A path but not necessarily the shortest. DFS — it uses less memory on grids since the recursion depth is bounded. Dijkstra's algorithm — grids are weighted graphs.

Chapter 10: Connections

You now have the complete toolkit for elementary graph traversal. BFS for shortest paths, DFS for structural analysis, topological sort for ordering, SCCs for clustering. These are the foundation. Everything in the next three CLRS chapters builds directly on what you learned here.

What Comes Next

CLRS Chapter	Topic	Builds On
Chapter 23	Minimum Spanning Trees (Kruskal's, Prim's)	Graph representation, BFS-like exploration (Prim's). Kruskal's uses a different strategy (greedy on sorted edges) but the graph structure is the same.
Chapter 24	Single-Source Shortest Paths (Dijkstra, Bellman-Ford)	BFS is the unweighted special case of Dijkstra. When edges have weights, you need a priority queue instead of a FIFO queue. Bellman-Ford handles negative weights using V-1 relaxation passes.
Chapter 25	All-Pairs Shortest Paths (Floyd-Warshall, Johnson's)	Floyd-Warshall uses the adjacency matrix representation. Johnson's uses Bellman-Ford + Dijkstra. Both assume mastery of graph representation.
Chapter 26	Maximum Flow (Ford-Fulkerson, Edmonds-Karp)	Edmonds-Karp is literally BFS applied to find augmenting paths. Residual graphs use the adjacency list structure. Understanding BFS pathfinding is essential.

The Hierarchy of Graph Problems

Traversal (This Chapter)

BFS, DFS, topological sort, SCCs. The atoms of graph algorithms. O(V + E). Unweighted.

↓

Shortest Paths (Ch 24-25)

Dijkstra, Bellman-Ford, Floyd-Warshall. Add edge weights. O(V + E) log V to O(V³).

↓

Spanning Trees (Ch 23)

Kruskal's, Prim's. Find minimum-cost trees. O(E log V).

↓

Network Flow (Ch 26)

Ford-Fulkerson, Edmonds-Karp. Maximum flow, minimum cut. O(VE²).

↓

NP-Hard Graph Problems

Traveling salesman, graph coloring, maximum clique. No known polynomial algorithms. Approximation and heuristics.

Key Takeaways

What you can now do. (1) Represent any graph efficiently (adjacency list vs matrix, know when each is appropriate). (2) Find shortest paths in unweighted graphs with BFS (and prove it correct). (3) Detect cycles, classify edges, and compute timestamps with DFS (and explain the parenthesis theorem). (4) Topologically sort DAGs using both DFS reverse-finish and Kahn's BFS (and detect cycles when topological sort fails). (5) Decompose directed graphs into SCCs using both Kosaraju's and Tarjan's (and build the component DAG). (6) Recognize which algorithm to apply to any graph problem (the most valuable skill). These six skills unlock Chapters 23-26 and solve 80%+ of graph interview questions.

A Final Word on Mastery

Knowing the algorithms is necessary but not sufficient. Mastery means three things:

Recognition: Given a problem, you immediately see the graph structure and know which algorithm to apply. This comes from solving 30-50 graph problems across different disguises (grids, strings, states, dependencies).

Implementation speed: You can write BFS, DFS, topological sort, and SCC detection from memory in under 5 minutes each. This comes from repeated practice — write each one 10 times until it is muscle memory.

Debugging intuition: When your graph code gives wrong answers, you know the five most likely bugs (wrong visited timing, missing disconnected component handling, wrong edge direction, recursion limit, representation mismatch). This comes from making each mistake once and remembering it.

CLRS Chapter 22 Exercises Worth Doing

These exercises from the textbook are frequently adapted into interview questions. Work through them by hand before looking at solutions.

Exercise	Topic	What it tests
22.1-5	Square of a graph	G² has edge (u,v) if G has a path of length ≤ 2 from u to v. Build G² from adjacency list in O(VE).
22.1-6	Universal sink	A vertex with in-degree V-1 and out-degree 0. Find it in O(V) from adjacency matrix.
22.2-7	Bipartiteness	Prove: G is bipartite ⇔ no odd-length cycle. Use BFS coloring argument.
22.2-8	Tree diameter	Find the diameter of an unweighted tree using two BFS passes. Prove correctness.
22.3-12	Singly connected	A directed graph is singly connected if u↝v implies there is at most one simple path from u to v. Use DFS to test this.
22.4-5	Unique topological order	Determine if a DAG has a unique topological order (Hamiltonian path in DAG).
22.5-7	SCC semiconnected	A directed graph is semiconnected if for every pair u,v, either u↝v or v↝u. Test using SCC decomposition + component DAG.

Time Complexity Summary — The One Table

Every algorithm in this chapter runs in O(V + E) with adjacency lists. This is optimal — you must examine every vertex and every edge at least once. Here is the complete summary:

Algorithm	Time (adj list)	Time (adj matrix)	Space	Key data structure
BFS	O(V+E)	O(V²)	O(V)	Queue (FIFO)
DFS (recursive)	O(V+E)	O(V²)	O(V) stack	Call stack
DFS (iterative)	O(V+E)	O(V²)	O(V)	Explicit stack
Topological sort (DFS)	O(V+E)	O(V²)	O(V)	Stack + finish list
Topological sort (Kahn's)	O(V+E)	O(V²)	O(V)	Queue + in-degree array
SCC (Kosaraju's)	O(V+E)	O(V²)	O(V+E)	Two stacks + transpose graph
SCC (Tarjan's)	O(V+E)	O(V²)	O(V)	Stack + low-link array
Bipartite check	O(V+E)	O(V²)	O(V)	Queue + color array
Connected components	O(V+E)	O(V²)	O(V)	Queue/stack + visited

The O(V + E) pattern. Notice that EVERY algorithm in this chapter is O(V + E) with adjacency lists. This is not a coincidence. All of them are based on visiting each vertex and each edge exactly once. The only difference is the order of visitation (BFS: level-order, DFS: depth-order) and the bookkeeping (timestamps, colors, parent pointers, low-link values). Understand this pattern and you understand all of Chapter 22.

Beyond Chapter 22: Advanced Graph Topics

Topic	What it Does	Key Idea
Articulation points	Vertices whose removal disconnects the graph	DFS + low-link (similar to Tarjan's SCC)
Bridges	Edges whose removal disconnects the graph	DFS + low-link
Eulerian paths	Path visiting every EDGE exactly once	Exists iff 0 or 2 odd-degree vertices
Biconnected components	Maximal 2-connected subgraphs	DFS with articulation point detection
2-SAT	Boolean satisfiability with 2 literals per clause	Build implication graph, find SCCs
Network flow	Maximum flow through a network	BFS for augmenting paths (Edmonds-Karp)

The Relationship Map

Here is how every concept in this lesson connects to every other concept:

Concept	Depends On	Enables
Graph representation	Nothing (foundation)	Every algorithm
BFS	Queue, graph representation	Shortest paths, bipartiteness, connected components, Kahn's, Edmonds-Karp
DFS	Stack/recursion, graph representation	Timestamps, edge classification, cycle detection, topological sort, SCCs, articulation points
Timestamps	DFS	Parenthesis theorem, edge classification, topological sort
Edge classification	DFS, timestamps	Cycle detection (back edges), topological sort (no back edges = DAG)
Cycle detection	DFS (directed), DFS/BFS (undirected)	DAG verification, deadlock detection
Topological sort	DFS or BFS+in-degree, DAG verification	Dependency resolution, dynamic programming on DAGs, SCC component DAG processing
SCCs	DFS, transpose graph (Kosaraju) or low-link (Tarjan)	Component DAG, 2-SAT, mutual reachability queries

The dependency DAG of this lesson is itself a DAG. You could topologically sort the chapters: [0: Representation] → [1: BFS, 3: DFS] → [2: BFS Apps, 4: DFS Apps] → [5: Topo Sort] → [6: SCCs] → [7: Practice, 8: BFS vs DFS] → [9: Arsenal, 10: Connections]. This is not a coincidence — the chapter order IS a topological ordering of the concept dependency graph.

Study Plan for Interview Prep

Week	Focus	LeetCode Problems
Week 1	BFS basics: grids, shortest path	200 (Islands), 994 (Rotting Oranges), 542 (01 Matrix), 1091 (Shortest Path Binary Matrix)
Week 2	DFS basics: connected components, cycle detection	547 (Number of Provinces), 261 (Graph Valid Tree), 323 (Connected Components)
Week 3	Topological sort, course scheduling	207 (Course Schedule), 210 (Course Schedule II), 269 (Alien Dictionary)
Week 4	Advanced: bipartite, word ladder, clone graph	785 (Is Graph Bipartite), 127 (Word Ladder), 133 (Clone Graph), 399 (Evaluate Division)

The Big Picture: Why Graph Algorithms Matter

Graphs are the most versatile data structure in computer science. Almost every real-world system can be modeled as a graph — and almost every interesting question about that system reduces to a graph algorithm you learned in this lesson. When you see "connections," "dependencies," "reachability," or "ordering" in a problem statement, your first thought should be: what is the graph, and which traversal do I need?

Parting Advice

The four algorithms in this chapter — BFS, DFS, topological sort, and SCC decomposition — are the building blocks of all graph algorithms. Every more advanced algorithm (Dijkstra, Bellman-Ford, Prim's, Kruskal's, Ford-Fulkerson, A*) is built on top of one or more of these primitives. Master them thoroughly before moving to Chapter 23.

In interviews, the most common failure mode is not knowing the algorithm — it is not recognizing the problem as a graph problem. Train your "graph detector": whenever you see relationships, dependencies, reachability, or ordering, think graph. Then the algorithm choice is mechanical.

"The world is a graph. Most problems are about traversal."
— Jon Kleinberg

Final question: You need to determine if there is a path from server A to server B in a network of 50,000 servers with 200,000 bidirectional connections, AND you need the minimum number of hops. Which algorithm?

BFS from A. It answers both questions simultaneously: (1) if BFS reaches B, a path exists; (2) the distance d[B] is the minimum number of hops. The graph is sparse (200K edges vs 2.5B possible), so use adjacency lists. Runtime: O(V + E) = O(250,000). Any other approach is overkill. DFS from A, counting the path length. Floyd-Warshall for all-pairs shortest paths, then look up (A, B).