Union-Find — nearly O(1) connected component queries with path compression.
You are building a social network. Users join one at a time, and friendships form between pairs of users. At any moment, someone asks: "Are Alice and Bob in the same friend group?" meaning — is there a chain of friendships connecting them, however indirect?
The naive approach: every time someone asks, run a BFS or DFS from Alice. If you reach Bob, they are connected. Each query takes O(V + E) time — scanning through every user and every friendship. When your network has millions of users and billions of friendship edges, this is catastrophic. A single "are they connected?" query could take seconds.
But notice something: we do not need shortest paths. We do not need the actual chain of friends. We just need a yes/no answer: same group or different group? This is a much simpler question. Can we exploit that simplicity?
Yes. The Union-Find data structure (also called a disjoint-set or disjoint-set union, DSU) answers connectivity queries in nearly O(1) time — effectively constant for any practical input size. It supports two operations: Union (merge two groups when a friendship forms) and Find (which group does this person belong to?). Checking if Alice and Bob are in the same group is just comparing Find(Alice) == Find(Bob).
With two clever tricks — union by rank and path compression — each operation takes amortized O(α(n)) time, where α is the inverse Ackermann function. We will devote an entire chapter (Chapter 6) to understanding this function, but the punchline is: for all practical values of n (even n = 1080, the number of atoms in the observable universe), α(n) ≤ 4. Four pointer hops. That is effectively free.
The genius of Union-Find is that it achieves this bound without any complex machinery. No balanced trees, no hash tables, no heap operations. Just an array of parent pointers and two simple rules. The entire implementation is about 20 lines of code in any language. It is one of the rare algorithms where simplicity and optimality coincide perfectly.
The simulation below shows the difference. On the left, edges arrive one at a time, forming connected components. On the right, connectivity queries are answered. Watch how naive BFS/DFS restarts from scratch each query, while Union-Find answers instantly.
Click "Add Edge" to connect random nodes. Click "Query" to test if two random nodes are connected. Watch the step counts.
Notice the pattern: as the graph grows, BFS/DFS queries get slower (more edges to traverse). Union-Find stays flat — just a few pointer hops, regardless of graph size. For dynamic connectivity problems where edges arrive over time, Union-Find is the right tool.
| Operation | BFS/DFS | Union-Find (optimized) |
|---|---|---|
| Add edge | O(1) (just append) | O(α(n)) ≈ O(1) |
| Query: same component? | O(V + E) | O(α(n)) ≈ O(1) |
| Total for m operations | O(m · (V+E)) | O(m · α(n)) |
For n = 1,000,000 nodes and m = 10,000,000 operations, BFS/DFS might take 1013 steps. Union-Find takes about 4 × 107. That is a six-order-of-magnitude speedup. This is why Union-Find appears in Kruskal's MST algorithm, network connectivity, image segmentation, and dozens of interview problems.
Here is the roadmap: we will build disjoint sets from scratch. First the abstract interface (Chapter 1), then a linked-list implementation with its painful O(n) unions (Chapter 1-2), then the forest representation that changes everything (Chapter 3), union by rank (Chapter 4), path compression (Chapter 5), the mind-bending inverse Ackermann analysis (Chapter 6), real-world applications including a percolation simulator (Chapter 7), and interview patterns (Chapter 8-9).
Let us be concrete about what flows through Union-Find. The input is a stream of operations, each one of three types:
The key invariant: at any point, two elements x and y are in the same set if and only if Find(x) == Find(y). This is the only thing Union-Find guarantees. It does not tell you the path between x and y, or the size of the component (without augmentation), or which elements are in a given component. It answers one question: same or different?
This simplicity is its power. By narrowing the interface to a single yes/no question, we can answer it in nearly constant time. A data structure that tried to do more (maintain actual component lists, support deletion, answer distance queries) would necessarily be slower.
While the basic interface is minimal, you can augment Union-Find to track additional information. Common augmentations:
| Augmentation | How | Use case |
|---|---|---|
| Component size | Maintain size[] array, update in Union | "How big is this group?" |
| Component count | Maintain count variable, decrement in Union | "How many groups?" |
| Min/Max element | Track per-component min/max at root | "Smallest ID in this group?" |
| Component sum | Sum values at root, update in Union | "Total weight of group?" |
All augmentations maintain O(α(n)) per operation because they only add O(1) bookkeeping to Union. The key constraint: the augmented information must be mergeable in O(1). Size: sum two sizes. Count: decrement by 1. Min: take the smaller. These are all O(1) merge operations. If merging the auxiliary data takes O(k), then Union becomes O(k + α(n)).
Before we build anything, let us define exactly what we need. A disjoint-set data structure manages a collection S = {S1, S2, ..., Sk} of disjoint sets. "Disjoint" means no element belongs to more than one set: Si ∩ Sj = ∅ for i ≠ j. Each set has a representative — a distinguished member that identifies the set. Which member is the representative does not matter, as long as asking twice for the same (unmodified) set gives the same answer.
Three operations define the interface:
That is the entire interface. Three operations. The power comes from implementing them efficiently.
Before we build anything, let us trace through a concrete example to make the interface tangible. Suppose we are tracking friendships among 6 people (0 through 5):
Notice the transitive property: we never explicitly connected 0 and 2, but after Union(0,1), Union(2,3), and Union(0,3), they end up in the same set. Union-Find handles transitivity automatically — that is the whole point.
Now let us build it. We start with the simplest possible implementation: linked lists.
Each set is a linked list. Every node stores: (1) the element value, (2) a pointer to the next node, and (3) a pointer back to the head of the list (the representative). The list also has a tail pointer for appending.
MakeSet(x): create a single-node list. x points to itself as the head. O(1).
Find(x): follow x's head pointer. Return the head. O(1).
Union(x, y): append one list to the other. Say we append y's list to x's list. We set x's tail to point to y's head, then update every node in y's list to point to x's head as the new representative. This is the problem: updating all those head pointers takes O(|y's list|) time.
python class LinkedListNode: def __init__(self, val): self.val = val self.next = None self.head = self # representative pointer class LinkedListDisjointSet: def __init__(self): self.nodes = {} # val -> node self.tails = {} # head_val -> tail node self.sizes = {} # head_val -> set size def make_set(self, x): node = LinkedListNode(x) self.nodes[x] = node self.tails[x] = node # tail is itself self.sizes[x] = 1 def find(self, x): return self.nodes[x].head.val # O(1) — follow head pointer def union(self, x, y): hx, hy = self.find(x), self.find(y) if hx == hy: return # already same set # Append y's list to x's list self.tails[hx].next = self.nodes[hy] # Update ALL head pointers in y's list — THIS IS SLOW cur = self.nodes[hy] updates = 0 while cur: cur.head = self.nodes[hx] cur = cur.next updates += 1 # O(|y's list|) pointer updates self.tails[hx] = self.tails[hy] self.sizes[hx] += self.sizes[hy]
The worst case is devastating. Suppose we create n singleton sets and then union them sequentially: Union(1,2), Union(2,3), Union(3,4), ..., Union(n-1,n). If we always append the longer list to the shorter one (always updating the single-element set's pointer), each union is O(1). But that is the best case! Consider instead: Union(1,2), Union(1,3), Union(1,4), ..., Union(1,n) where we always append x's list (the big one) to y's list (the small one).
In this adversarial sequence, after Union(1,2) we have list [2,1] (updated 1 pointer). After Union(1,3) we append [2,1] to [3], updating 2 pointers. After Union(1,4) we append [3,2,1] to [4], updating 3 pointers. The i-th union updates i pointers. The total cost is 1 + 2 + 3 + ... + (n-1) = Θ(n²).
Let us compute this concretely. For n = 1000: the naive approach might perform up to 1 + 2 + ... + 999 = 499,500 pointer updates. For n = 1,000,000: up to ~5 × 1011 pointer updates. At 1 nanosecond per update, that is 500 seconds — almost 10 minutes for a million elements. Unacceptable.
Let us count precisely for the adversarial sequence with n = 8. Create sets {0}..{7}, then union in the worst order:
For n elements, the worst case total is n(n-1)/2 = Θ(n²). The average cost per union is Θ(n). This is why we need the weighted-union heuristic (Chapter 2) — it ensures we always pay the smaller cost by appending the shorter list to the longer one.
The simulation below shows the linked-list union in action. Watch how appending a long list forces a walk through every node to update representatives.
Click "Union" to merge two sets. Watch the head-pointer updates cascade through the appended list.
The naive linked-list union is O(n) per operation because we might append a long list to a short one, updating many pointers for no good reason. The fix is simple: always append the shorter list to the longer one. This is called the weighted-union heuristic.
Why does this help? Think about it from a single element's perspective. Each time your head pointer gets updated, it means your list was the shorter one and got appended to a longer list. After the merge, your new list is at least twice as long as your old list (because the other list was at least as long). So your list length at least doubles with every pointer update.
Starting from a list of length 1, how many times can it double before reaching length n? At most log2 n times. So each element's head pointer is updated at most log2 n times across all operations. For n elements, the total number of pointer updates across all unions is at most n · log2 n.
Let us trace through a concrete example. Start with 8 singleton sets {1}, {2}, ..., {8}. Perform these unions:
With weighted union: 12 updates. Naive sequential: up to 28. The savings grow dramatically with n. For n = 1,000,000: naive worst case is ~5 × 1011, weighted union is ~2 × 107. That is a 25,000x improvement.
The proof is a beautiful counting argument. Instead of analyzing individual operations, we ask: how many times total can head pointers be updated? Each time element x's pointer is updated, x's set at least doubles in size (because it was the shorter set being appended to a longer one). Starting from size 1, a set can double at most log2 n times before reaching size n. So each of the n elements has its pointer updated at most log2 n times. Total pointer updates ≤ n log2 n. Adding m constant-time Find operations gives O(m + n log n).
python def union_weighted(self, x, y): hx, hy = self.find(x), self.find(y) if hx == hy: return # Always append SHORTER to LONGER if self.sizes[hx] < self.sizes[hy]: hx, hy = hy, hx # swap so hx is the larger set # Append hy's list to hx's list self.tails[hx].next = self.nodes[hy] cur = self.nodes[hy] while cur: cur.head = self.nodes[hx] # update head pointer cur = cur.next self.tails[hx] = self.tails[hy] self.sizes[hx] += self.sizes[hy]
O(m + n log n) is good — dramatically better than the naive Θ(n²). But we can do much better still. The linked-list representation is fundamentally limited by a structural constraint: every node must store a direct pointer to the list head. This makes Find O(1) but forces Union to walk the entire appended list to update those pointers. There is no way around it with this representation.
What if we flipped the trade-off entirely? Instead of every node pointing directly to the representative (making Find trivial but Union expensive), what if nodes pointed to their parent in a tree? Union would just change one parent pointer. Find would walk up the tree to the root. The root is the representative.
This reversal of the cost structure is the key insight. It sounds like we are trading one problem for another — and initially, we are. But the tree representation is amenable to optimizations (union by rank and path compression) that the linked-list representation simply cannot support. These optimizations bring the amortized cost down from O(log n) to O(α(n)). That is the forest representation, coming in Chapter 3.
Watch the cumulative pointer updates as 16 elements are merged. Weighted union (teal) vs naive (orange).
Linked lists have an annoying asymmetry: Find is O(1) because every node stores a direct pointer to the head, but Union is expensive because we must update all those pointers. What if we used a different representation that made Union trivial?
Here is the idea: represent each set as a rooted tree. Each node stores a parent pointer (not a head pointer). The root of the tree is the representative — its parent pointer points to itself. A collection of such trees is called a disjoint-set forest.
MakeSet(x): create a tree with a single node x. x is its own parent. O(1).
Find(x): follow parent pointers from x up to the root. Return the root. O(depth of x).
Union(x, y): find the roots of x and y. Make one root the child of the other. O(1) after the two Finds.
python class NaiveForest: def __init__(self, n): self.parent = list(range(n)) # parent[i] = i means i is a root def make_set(self, x): self.parent[x] = x # x is its own parent (root) def find(self, x): while self.parent[x] != x: # walk up to root x = self.parent[x] return x # O(depth) def union(self, x, y): rx, ry = self.find(x), self.find(y) if rx != ry: self.parent[ry] = rx # make ry child of rx — O(1)!
Union is now O(1) (just change one pointer) — a dramatic improvement over linked lists. But we have shifted the cost: Find is now O(depth) instead of O(1). In the worst case, trees can degenerate into long chains. If we union 1→2, then 2→3, then 3→4, ..., we get a chain of depth n-1. Find on the bottom element walks the entire chain: O(n).
The most striking thing about the forest representation is how little memory it uses. The entire data structure is a single array:
python parent = list(range(n)) # parent[i] = i means i is a root # That's the entire data structure. One array. n integers. # Find: chase parent pointers. Union: set one root's parent.
No pointers, no nodes, no dynamic memory allocation, no linked-list traversal, no garbage collection. Just an array of integers that fits in L1 cache for reasonable n. This is why Union-Find is so fast in practice — the memory access pattern is cache-friendly. Each Find chases parent pointers, which are all in the same contiguous array.
Compare this to the linked-list implementation: each node is a separate object on the heap, connected by pointers that can scatter across memory. Cache misses dominate the runtime. The forest representation, stored in a flat array, avoids this entirely.
On modern CPUs, an L1 cache miss costs ~4 ns while a cache hit costs ~1 ns. For a tree of depth 20, the linked-list version might cause 20 cache misses (80 ns), while the array version likely hits in L1 for the entire traversal (20 ns). This 4x constant-factor improvement compounds over millions of operations.
This is also why Union-Find is a favorite in competitive programming: the entire implementation is 15-20 lines of code, uses two arrays, and beats any alternative for dynamic connectivity problems.
Let us trace all three operations on a concrete example with n = 6 elements, showing exactly what happens to the parent array:
Notice how path compression on Find(3) saved work for the subsequent Union(3,5) — Find(3) returned 0 immediately instead of traversing 3→2→0. This is the self-optimizing property: each Find makes future operations cheaper.
The simulation below lets you build a disjoint-set forest interactively. Create sets, union them, and run Find to see the tree structure. Notice how naive unions can create deep chains.
Click nodes to select, then "Union" to merge, or "Find" to highlight the path to the root. Watch the tree structure evolve.
Play with the simulation. Create 8 nodes, then union them in a chain: union(0,1), union(1,2), union(2,3), etc. — always attaching the new node at the bottom. You will see the tree degenerate into a linked list. Now Find on node 7 walks 7 edges. This motivates union by rank: keep the trees shallow.
The forest representation looks worse at first glance: Find is O(depth) instead of O(1). But it has two critical advantages that the linked-list representation lacks.
First, Union is trivially O(1) — just change one pointer. No walking through a list. The linked list must walk the entire appended list to update head pointers; the forest just redirects one root.
Second, and more importantly, the forest representation is amenable to optimization. We can add two heuristics — union by rank and path compression — that together bring the amortized cost down to O(α(n)) per operation. This is impossible with linked lists because their structure is too rigid. In a linked list, every node must point to the head; there is no room for cleverness. In a tree, the path from node to root is flexible — we can reshape it.
The entire data structure fits in a single array. For n elements, we need exactly one array of size n: parent[]. Element i's parent is parent[i]. If parent[i] == i, then i is a root. That is it. No pointers, no linked list nodes, no dynamic memory allocation. Just an array of integers. This is why Union-Find is so cache-friendly and fast in practice.
Chapter 3 showed that naive forest unions can create chains of depth n-1. The problem is tree depth: if we carelessly make one root the child of the other, we might attach a tall tree under a short tree, increasing maximum depth unnecessarily. Imagine a tree of height 10 being attached under a tree of height 3 — the result has height 11. But if we had attached the height-3 tree under the height-10 tree, the result would still have height 10. Same merge, but the max depth difference is 1 vs 11.
The fix: always attach the shorter tree under the root of the taller tree. This is called union by rank.
We store a rank for each node — an upper bound on the height of the subtree rooted at that node. Initially, every node has rank 0. When we union two trees:
python class UnionByRank: def __init__(self, n): self.parent = list(range(n)) self.rank = [0] * n # upper bound on height def find(self, x): while self.parent[x] != x: x = self.parent[x] return x def union(self, x, y): rx, ry = self.find(x), self.find(y) if rx == ry: return if self.rank[rx] < self.rank[ry]: self.parent[rx] = ry # attach shorter under taller elif self.rank[rx] > self.rank[ry]: self.parent[ry] = rx else: self.parent[ry] = rx self.rank[rx] += 1 # only case where rank increases
With union by rank, a tree with root rank r has at least 2r nodes. Why? Prove it by induction. Base case: rank 0 tree has 1 = 20 nodes. Inductive step: rank r is only created by merging two rank r-1 trees. Each has at least 2r-1 nodes. Together: 2r-1 + 2r-1 = 2r nodes.
Since a tree with n nodes has at most 2r ≤ n, the maximum rank is log2 n. Therefore Find takes O(log n) — the tree height is at most logarithmic. This alone brings the total time for m operations down to O(m log n).
Let us verify with concrete numbers. For n = 1,000,000: max rank ≤ log2 106 ≈ 20. Every Find visits at most 20 nodes. For 10 million operations: at most 200 million node visits. At ~1 ns per node visit: about 0.2 seconds. Compare to naive forest: up to 1013 node visits, or about 10,000 seconds. Union by rank alone gives a 50,000x speedup.
But wait — it gets even better. With rank, the trees are not just shallow but also wide. A rank-4 tree with 16 nodes has at most 4 levels, so on average each level has 4 nodes. This means most elements are close to the root already. Path compression (next chapter) will exploit this by moving even the few deep nodes directly to the root.
The simulation below shows the dramatic difference between naive union (which can degenerate to a chain) and union by rank (which stays balanced). Both start with 16 nodes and perform the same unions.
Left: naive union creates chains. Right: union by rank keeps trees shallow. Watch the max depth.
After all 15 unions on 16 elements, the naive tree can reach depth 15 (a chain). The union-by-rank tree has depth at most 4 (since log2 16 = 4). That is the difference between O(n) and O(log n) per Find.
| Strategy | Find | Union | m operations total |
|---|---|---|---|
| Naive forest | O(n) | O(1) | O(m · n) |
| Union by rank | O(log n) | O(log n)* | O(m log n) |
*Union calls Find twice, so its cost is dominated by Find.
O(m log n) is a huge improvement over O(m · n). But we are not done. Path compression will bring this down to O(m · α(n)), which is effectively O(m).
Several properties of ranks are worth memorizing — they appear in the full amortized analysis and in interview discussions:
Property 1: Ranks only increase along paths to the root. If y is an ancestor of x, then rank[y] > rank[x]. This is because a node's rank only increases when it is a root during a Union operation, and once a node becomes a non-root, its rank never changes again.
Property 2: A node with rank r has at least 2r descendants. We proved this above by induction. Consequence: at most n/2r nodes can have rank r. For rank r = log n, there is at most one such node.
Property 3: The maximum rank is at most ⌊log2 n⌋. Since a rank-r tree has at least 2r nodes and we have only n nodes total, r ≤ log2 n.
Property 4: Ranks never decrease. Once assigned, a node's rank only increases (and only while it is a root). Path compression changes parent pointers but never touches ranks. This is a deliberate design choice — maintaining accurate heights after compression would cost more than it saves.
Instead of rank, you can track the size (number of elements) of each tree. Attach the smaller tree under the larger tree's root. This gives the same O(log n) height bound. Some implementations prefer size because it is useful for answering "how many elements are in this set?" without extra traversal.
python class UnionBySize: def __init__(self, n): self.parent = list(range(n)) self.size = [1] * n # number of elements in tree def find(self, x): if self.parent[x] != x: self.parent[x] = self.find(self.parent[x]) return self.parent[x] def union(self, x, y): rx, ry = self.find(x), self.find(y) if rx == ry: return False if self.size[rx] < self.size[ry]: rx, ry = ry, rx # rx is now the larger tree self.parent[ry] = rx self.size[rx] += self.size[ry] return True def get_size(self, x): return self.size[self.find(x)] # size of x's component
In interviews, union by size is slightly more practical because you get component sizes for free. In CLRS, the formal analysis uses rank. Both achieve O(m · α(n)) when combined with path compression.
Let us trace the construction of a tree with the maximum possible rank for 16 elements. We need to merge equal-rank trees at every step:
This is the worst case for union by rank — every merge is between equal-rank trees. In practice, many merges are between unequal-rank trees (which do not increase the rank), so actual depths are usually much less than log n.
We have union by rank keeping trees at O(log n) height. Good, but not great. For n = 1,000,000, log2 n ≈ 20. Every Find walks up to 20 hops. Over 10 million operations, that is 200 million pointer traversals. Can we do better?
Yes. Path compression makes trees flatter over time. The idea is beautifully simple: during Find(x), after walking up to the root, make every node along the path point directly to the root. Future Finds on any of these nodes will take O(1).
Think of it like asking for directions. You start at node x and ask: "Who is my representative?" You walk through intermediate nodes, each saying "I don't know, ask my parent." You finally reach the root. On your way back, you tell every node you visited: "The root is directly above you. Skip the middlemen." Next time anyone asks, they jump straight to the root.
python class UnionFind: def __init__(self, n): self.parent = list(range(n)) self.rank = [0] * n def find(self, x): # Path compression: make every node point directly to root if self.parent[x] != x: self.parent[x] = self.find(self.parent[x]) # recursive return self.parent[x] def union(self, x, y): rx, ry = self.find(x), self.find(y) if rx == ry: return if self.rank[rx] < self.rank[ry]: self.parent[rx] = ry elif self.rank[rx] > self.rank[ry]: self.parent[ry] = rx else: self.parent[ry] = rx self.rank[rx] += 1
The recursive Find is elegant: it walks to the root, then on the way back (unwinding recursion), sets every node's parent to the root. Here is the iterative version for those who prefer it (and for avoiding stack overflow on very deep trees):
python def find_iterative(self, x): # Phase 1: find root root = x while self.parent[root] != root: root = self.parent[root] # Phase 2: compress — point everyone to root while self.parent[x] != root: next_x = self.parent[x] self.parent[x] = root x = next_x return root
Let us trace through a detailed example. Start with this tree (built by union by rank):
That first Find(7) cost 7 steps but restructured the tree so that ALL subsequent Finds on ANY of those nodes cost just 1 step. The upfront work pays for itself many times over. This is the essence of amortized analysis: individual operations may be expensive, but the average over many operations is cheap.
The simulation below is the showcase. Build a tree, then run Find on deep nodes. Watch path compression flatten the tree in real time. Every node on the path jumps directly to the root. Subsequent Finds on those nodes are instant.
Build a deep tree, then click "Find" on a deep node. Watch all nodes on the path jump to point directly at the root. Run Find again — it is instant.
Full path compression (every node points to root) is the strongest. There are two weaker variants that are simpler but nearly as effective:
Path splitting: during Find, make every node point to its grandparent. Each node on the path moves up one level. No second pass needed — done in a single upward walk.
Path halving: during Find, make every other node point to its grandparent. Half the nodes skip a level. Even simpler, still achieves O(m · α(n)).
python def find_path_splitting(self, x): while self.parent[x] != x: self.parent[x] = self.parent[self.parent[x]] # point to grandparent x = self.parent[x] # advance to (old) grandparent return x def find_path_halving(self, x): while self.parent[x] != x: self.parent[x] = self.parent[self.parent[x]] # point to grandparent x = self.parent[x] # skip ahead (halving) return x
Wait — path splitting and path halving look identical in code? Yes! Both do "point to grandparent, then advance." The conceptual difference is: path splitting updates every node (making each point to its grandparent), while path halving updates every other node. But since "advance to the new parent" after redirecting to grandparent is the same in both cases, the code is literally the same loop. The formal analysis differs slightly, but both achieve O(m · α(n)).
In competitive programming and interviews, use the recursive version with full path compression. It is the shortest to write and the easiest to get right:
Here is the subtle point: union by rank alone gives O(log n) per operation. Path compression alone also gives O(log n) amortized per operation. So why does combining them give O(α(n)) instead of O(log n)?
The intuition: union by rank ensures that trees never get too deep in the first place (O(log n) height). Path compression then flattens these already-shallow trees even further. After a few Find operations, most paths are nearly flat. The two heuristics are synergistic: rank keeps trees manageable so that compression has less work to do, and compression prevents the modest depth from being re-traversed repeatedly.
Tarjan and van Leeuwen proved (1984) that the amortized cost per operation is Θ(α(n)). Not O(α(n)), but Θ(α(n)) — tight both above and below. This is one of the most celebrated results in the analysis of algorithms.
Let us summarize the entire progression from naive to optimal with a concrete scenario: 1 million elements, 10 million operations.
| Strategy | Worst-case total operations | Time at 1ns/op |
|---|---|---|
| Naive linked list | ~5 × 1011 | ~500 seconds |
| Weighted union (linked list) | ~2 × 108 | ~0.2 seconds |
| Naive forest | ~1013 | ~10,000 seconds |
| Union by rank | ~2 × 108 | ~0.2 seconds |
| Rank + path compression | ~4 × 107 | ~0.04 seconds |
From 10,000 seconds down to 0.04 seconds. A factor of 250,000. That is the power of the right data structure with the right heuristics. And the implementation is 20 lines of Python.
| Strategy | Find (amortized) | Total for m ops |
|---|---|---|
| Naive forest | O(n) | O(m · n) |
| Union by rank only | O(log n) | O(m log n) |
| Path compression only | O(log n) amortized | O(m log n) |
| Both combined | O(α(n)) | O(m · α(n)) |
We keep saying O(α(n)) is "effectively O(1)." Let us make this precise. What is α, and why is it so absurdly slow-growing?
Start with the Ackermann function A(i, j), defined recursively:
Level 0 is just "add 1." Level 1 turns out to be "add 2" (roughly). Level 2 is "multiply by 2" (roughly). Level 3 is "exponentiate" (roughly). Level 4 is "tower of powers" (tetration). Each level applies the previous level's operation a number of times, creating a hierarchy of increasingly explosive growth.
Let us compute some values by hand:
A(4, 2) already has about 19,729 digits. A(4, 4) is so large that writing it in decimal would require more digits than there are atoms in the observable universe — by an incomprehensible margin. The Ackermann function does not merely grow fast. It defines a boundary beyond which "fast-growing" needs new words.
Let us be very precise about the formal definition. In CLRS, the inverse Ackermann function is defined as:
where Ak is the k-th row of the Ackermann-like function defined with specific initial conditions. Different papers use slightly different definitions (some use A(k,k) instead of Ak(1)), but they all agree on the essential property: α grows so slowly that it is effectively constant.
Here is α(n) for various values of n:
| n | α(n) | Context |
|---|---|---|
| 1 | 0 | A single element |
| 2-3 | 1 | A pair or triple |
| 4-7 | 2 | A handful |
| 8-2047 | 3 | Up to 2K elements |
| 2048 to A(4,4) | 4 | Everything you will ever see |
| > A(4,4) | 5 | Will never happen in practice |
The gap between α(n) = 4 and α(n) = 5 is absurd. α(n) = 4 covers every number up to A(4,4), which is a tower of 2s that is 65,536 levels tall. To give a sense of scale: a tower of 2s just 5 levels tall is 22222 = 265536, a number with about 19,729 digits. A tower of 2s 65,536 levels tall is... there are no words. The number of digits in the number of digits in the number would itself have more digits than atoms in the observable universe.
Your computer, your data center, every computer that will ever exist — they all live in the α(n) = 4 regime. The difference between O(m · α(n)) and O(m) is, practically speaking, nothing. This is why in most contexts (textbooks, interviews, practical discussions) we say Union-Find is "O(1) amortized" and leave the α(n) as a footnote for the theoretically curious.
The simulation below visualizes the growth of the Ackermann function. Watch how quickly it escapes any reasonable bound.
Each level grows incomprehensibly faster than the last. Heights capped for visualization — actual values overflow the universe.
Let us build intuition for why Ackermann grows so fast by thinking about levels of recursion:
Each level applies the previous level's operation a number of times. Level 0 is addition. Iterating addition gives multiplication (level 1). Iterating multiplication gives exponentiation (level 2). Iterating exponentiation gives tetration (level 3). Iterating tetration gives... something so fast that it has no standard name outside of Knuth's up-arrow notation.
The key realization: the Ackermann function exhausts the entire hierarchy of standard operations. No polynomial, exponential, or tower function can keep up with it. And yet its inverse grows so slowly that it barely moves — α(n) = 4 covers more values than you will ever encounter.
This makes the Union-Find bound remarkable: the data structure is nearly optimal (proven by Tarjan's lower bound), and the gap between "optimal" and "O(1)" is so small that it disappears for all practical purposes.
This bound is not just an upper bound — it is tight. Tarjan proved in 1975 that any sequence of m Union-Find operations on n elements requires Ω(m · α(n)) time in the worst case using pointer-based data structures. The proof constructs an adversarial sequence of operations that forces any pointer-based scheme to do α(n) work per operation on average. You cannot do better within this model. The O(α(n)) per operation is optimal.
For comparison, here is the full hierarchy of how "close to O(1)" various Union-Find strategies get:
| Strategy | Amortized per-op | m ops on n elements |
|---|---|---|
| Naive linked list | O(n) | O(m · n) |
| Weighted union (linked list) | O(log n) | O(m + n log n) |
| Naive forest | O(n) | O(m · n) |
| Union by rank only | O(log n) | O(m log n) |
| Path compression only | O(log n) amortized | O(m log n) |
| Both combined | O(α(n)) | O(m · α(n)) |
| Lower bound (Tarjan) | Ω(α(n)) | Ω(m · α(n)) |
Union-Find appears everywhere in computer science. Let us survey the major applications, then build the showcase: a percolation simulator.
Kruskal's algorithm finds the Minimum Spanning Tree of a weighted graph. Sort edges by weight, then process each edge: if it connects two different components, add it to the MST. If it connects nodes already in the same component, skip it (it would form a cycle). The "same component" check is exactly Union-Find.
python def kruskal(n, edges): edges.sort(key=lambda e: e[2]) # sort by weight uf = UnionFind(n) mst = [] for u, v, w in edges: if uf.find(u) != uf.find(v): # different components? uf.union(u, v) # merge them mst.append((u, v, w)) # add edge to MST return mst # Time: O(E log E) for sorting + O(E · α(V)) for union-find # Total: O(E log E) since sorting dominates
Without Union-Find, the cycle check would require a DFS per edge: O(V + E) per edge, O(E(V+E)) total. With Union-Find: O(E · α(V)) for all cycle checks combined — essentially free compared to the sort.
Let us trace Kruskal's on a small example with 5 nodes and 7 edges:
Each cycle check (the "SAME" cases) was detected instantly by Union-Find. Without it, we would need to run BFS/DFS to check if adding edge (0,2) creates a cycle — traversing through 0→1→3→2 to discover they are already connected.
Edges sorted by weight. Green = MST edge. Red dashed = rejected (cycle). Yellow dashed = next candidate.
Any problem where elements are grouped over time and you need to query group membership: social networks (friend groups), computer networks (connected machines), image processing (connected pixel regions). Union-Find handles the incremental version perfectly: edges arrive but never leave.
Concrete example: you are monitoring a network of 10,000 servers. Connections come online over time. At any moment, an operator asks: "Can server A reach server B?" With Union-Find, the answer takes O(α(10000)) ≤ 4 pointer hops. Without it, you would need to run BFS/DFS through potentially thousands of network links.
For the fully dynamic version (edges can be added AND deleted), Union-Find does not suffice — you need link-cut trees (Sleator-Tarjan, 1983) for O(log n) per operation, or offline algorithms that process deletions in reverse as insertions.
Given an image of W × H pixels, group adjacent pixels with similar colors into regions. The algorithm:
python def segment_image(pixels, W, H, threshold): uf = UnionFind(W * H) for y in range(H): for x in range(W): for dx, dy in [(1,0),(0,1)]: nx, ny = x + dx, y + dy if nx < W and ny < H: diff = color_distance(pixels[y][x], pixels[ny][nx]) if diff < threshold: uf.union(y*W+x, ny*W+nx) return uf # each component = one segment
For a 1920×1080 image, this processes ~4 million pixel pairs. With Union-Find: ~4M × O(α(2M)) operations — effectively 16 million simple pointer operations, done in milliseconds. Without it (using BFS per pair): potentially hours. This is a simplified version of Felzenszwalb's algorithm (2004), which additionally considers the internal variation within segments to set adaptive thresholds.
Given pairs of equivalent items (e.g., "accounts with the same email belong to the same person"), Union-Find groups all transitively equivalent items. If Alice has emails {a, b} and Bob has emails {b, c}, they share email b, so they are the same person — and all three emails belong to one account. This is the "Accounts Merge" problem on LeetCode (LC 721) — one of the most common interview applications of Union-Find.
The pattern: create a Union-Find over the items (emails, usernames, etc.). For each equivalence pair, call Union. Then group by Find to collect equivalence classes. The transitive closure is handled automatically — if a~b and b~c, then Find(a) == Find(b) == Find(c) after the unions.
Least Common Ancestor (Offline). Tarjan's offline LCA algorithm uses Union-Find to answer all LCA queries in O(n + q · α(n)) time, where q is the number of queries. During a DFS traversal, when you finish processing a subtree rooted at v, you union v with its parent. To answer "LCA(u, v)?", wait until both u and v have been visited. At that point, Find(u) gives the LCA. This is one of the most elegant applications of Union-Find — using it during a tree traversal to track ancestry relationships.
Minimum Spanning Forest. Kruskal's algorithm generalizes naturally to disconnected graphs. Process all edges sorted by weight, and use Union-Find for cycle detection. The result is a minimum spanning forest — a minimum spanning tree for each connected component. The algorithm does not need to know how many components there are; Union-Find handles it automatically.
Maze Generation. Start with a grid where every cell is walled off from its neighbors. Randomly knock down walls between adjacent cells. Before knocking down a wall, check if the two cells are already connected (using Union-Find). If they are, skip this wall (knocking it down would create a cycle). If they are not, knock it down and union the cells. The result is a random maze with exactly one path between any two cells — a spanning tree of the grid graph. The Union-Find version is equivalent to randomized Kruskal's.
The showcase application. Imagine a grid of cells, initially all blocked. You randomly open cells one at a time. Question: does an open path exist from the top row to the bottom row? This is percolation — a fundamental problem in physics (fluid flow through porous material), material science, and epidemiology (disease spread).
Model it with Union-Find: each open cell is an element. When a cell opens, union it with any open neighbors. Add two virtual nodes: one connected to all open cells in the top row, one connected to all open cells in the bottom row. The system percolates when the two virtual nodes are in the same set.
The simulation below lets you watch percolation happen. Cells open randomly. Blue = open, dark = blocked. When a path forms from top to bottom, the system percolates and the connected path lights up green.
Open cells randomly until the grid percolates (top-to-bottom path). The threshold is around 59.3%.
Try running "Run Until Percolation" several times with a 20x20 grid. You will notice it consistently percolates when about 55-65% of cells are open — clustering tightly around the theoretical threshold of 59.3%. This is a beautiful example of a phase transition: a tiny change in the fraction of open cells flips the system from "almost certainly blocked" to "almost certainly percolating."
The percolation simulation uses a clever optimization: virtual top and bottom nodes. Instead of checking whether any top-row cell is connected to any bottom-row cell (which would require O(n) Find calls), we add two virtual nodes:
When a cell in the top row opens, union it with virtual_top. When a cell in the bottom row opens, union it with virtual_bottom. To check percolation: just check if virtual_top and virtual_bottom are connected. One single Find pair instead of scanning all top-row and bottom-row cells. This is a common Union-Find pattern: add sentinel nodes to simplify boundary conditions.
The percolation problem is central to computational physics. Scientists need to simulate millions of random configurations to estimate the percolation threshold. Each configuration opens ~60% of an n × n grid and checks connectivity. For n = 1000, that is 600,000 cell openings per trial, each requiring neighbor unions. Without Union-Find, each connectivity check would require BFS over potentially hundreds of thousands of cells. With Union-Find, the entire trial completes in milliseconds. This is how Monte Carlo simulations of phase transitions become computationally feasible.
python import random def estimate_threshold(n, trials=1000): """Monte Carlo estimation of percolation threshold on n×n grid.""" total_frac = 0.0 for _ in range(trials): uf = UnionFind(n * n + 2) # +2 for virtual top/bottom top, bot = n * n, n * n + 1 grid = [False] * (n * n) order = list(range(n * n)) random.shuffle(order) for count, cell in enumerate(order): grid[cell] = True r, c = divmod(cell, n) for dr, dc in [(-1,0),(1,0),(0,-1),(0,1)]: nr, nc = r + dr, c + dc if 0 <= nr < n and 0 <= nc < n: if grid[nr * n + nc]: uf.union(cell, nr * n + nc) if r == 0: uf.union(cell, top) if r == n - 1: uf.union(cell, bot) if uf.connected(top, bot): total_frac += (count + 1) / (n * n) break return total_frac / trials # estimate_threshold(100, 1000) ≈ 0.593
Running 1000 trials on a 100×100 grid takes about 2 seconds with Union-Find. Without it (using BFS per step), the same simulation would take about 20 minutes. That is the difference between a quick experiment and an overnight job.
Union-Find is one of the most interview-friendly data structures because: (1) it is simple to implement from scratch, (2) recognizing when to use it is a skill that impresses interviewers, and (3) many seemingly complex graph problems reduce to union-find in disguise.
This is the production-grade UnionFind class you should have memorized. It handles all edge cases and is ready for any interview or contest.
python class UnionFind: def __init__(self, n): self.parent = list(range(n)) self.rank = [0] * n self.count = n # number of disjoint sets def find(self, x): if self.parent[x] != x: self.parent[x] = self.find(self.parent[x]) return self.parent[x] def union(self, x, y): rx, ry = self.find(x), self.find(y) if rx == ry: return False # already connected if self.rank[rx] < self.rank[ry]: rx, ry = ry, rx self.parent[ry] = rx if self.rank[rx] == self.rank[ry]: self.rank[rx] += 1 self.count -= 1 return True # successfully merged def connected(self, x, y): return self.find(x) == self.find(y)
Let us walk through this template line by line, because getting the details wrong in an interview is easy:
Line 1: __init__. parent[i] = i makes every element its own root. rank[i] = 0 because a single node has height 0. count = n because we start with n singleton sets.
Lines 5-7: find with path compression. The recursion is the critical trick. The call self.find(self.parent[x]) recursively finds the root, and then self.parent[x] = ... assigns that root as x's direct parent. Every node on the path gets compressed. In Python, the recursion depth limit is 1000 by default — for very large inputs, use the iterative version or increase sys.setrecursionlimit.
Lines 10-14: union by rank. We swap so rx always has the higher rank. Then ry becomes rx's child. Rank only increases when equal — this is the case where the tree actually gets taller. The return value (True/False) is essential for cycle detection problems.
Key implementation details to remember:
1. Number of Connected Components. Given n nodes and a list of edges, how many components? Answer: make n sets, union along each edge, return uf.count.
python def count_components(n, edges): uf = UnionFind(n) for u, v in edges: uf.union(u, v) return uf.count
2. Number of Islands (2D grid). Given a grid of '1' (land) and '0' (water), count islands. Classic BFS/DFS problem, but Union-Find works too: treat each land cell as a node, union with adjacent land cells.
python def num_islands(grid): if not grid: return 0 rows, cols = len(grid), len(grid[0]) uf = UnionFind(rows * cols) water = 0 for r in range(rows): for c in range(cols): if grid[r][c] == '0': water += 1 continue for dr, dc in [(1,0),(0,1)]: # right and down only nr, nc = r + dr, c + dc if nr < rows and nc < cols and grid[nr][nc] == '1': uf.union(r * cols + c, nr * cols + nc) return uf.count - water
The key insight: we index the 2D grid as a 1D array. Cell (r, c) maps to index r * cols + c. We only union rightward and downward to avoid double-processing. We subtract water cells from the total count. This runs in O(rows × cols × α(rows × cols)) — effectively linear in the number of cells.
3. Redundant Connection (LC 684). Given a tree with one extra edge (creating exactly one cycle), find the extra edge. Process edges in order; the first edge that connects two already-connected nodes is the answer.
python def find_redundant(edges): n = len(edges) uf = UnionFind(n + 1) # 1-indexed nodes for u, v in edges: if not uf.union(u, v): # already connected = cycle! return [u, v]
This is a beautiful example of cycle detection: a tree with n nodes has exactly n-1 edges. If you are given n edges (one too many), one edge must create a cycle. Union-Find detects it on the spot. The key insight is that union returns False precisely when the edge is redundant — both endpoints already belong to the same connected component, meaning a path between them already exists.
Variant: Redundant Connection II (LC 685) is the directed version — much harder because you must handle two cases (node with two parents, or a cycle). The solution uses Union-Find with careful case analysis. This is a Google favorite interview question.
4. Accounts Merge (LC 721). Given accounts where each account is [name, email1, email2, ...], merge accounts that share any email. The trick: create a Union-Find over emails (not accounts). For each account, union all its emails together. Then group emails by their root representative.
python from collections import defaultdict def accounts_merge(accounts): # Map each email to an integer ID email_to_id = {} email_to_name = {} idx = 0 for name, *emails in accounts: for email in emails: if email not in email_to_id: email_to_id[email] = idx idx += 1 email_to_name[email] = name uf = UnionFind(idx) for name, *emails in accounts: for i in range(1, len(emails)): uf.union(email_to_id[emails[0]], email_to_id[emails[i]]) # Group by root groups = defaultdict(list) for email, eid in email_to_id.items(): groups[uf.find(eid)].append(email) return [[email_to_name[g[0]]] + sorted(g) for g in groups.values()]
5. Earliest Time Everyone Knows Each Other (LC 1101). Given friendship logs [(timestamp, u, v), ...], find the earliest time when all n people are in one connected component. Process friendships in chronological order, union on each. When uf.count == 1, return the timestamp.
python def earliest_all_connected(logs, n): logs.sort() # sort by timestamp uf = UnionFind(n) for t, u, v in logs: uf.union(u, v) if uf.count == 1: return t # everyone connected! return -1 # never fully connected
| Operation | Time (amortized) | Notes |
|---|---|---|
| MakeSet | O(1) | Initialize parent[x]=x, rank[x]=0 |
| Find | O(α(n)) | With path compression |
| Union | O(α(n)) | Two Finds + O(1) pointer update |
| Connected | O(α(n)) | Two Finds + comparison |
| Space | O(n) | Two arrays: parent[] and rank[] |
For m operations on n elements: O(n + m · α(n)) total. Since α(n) ≤ 4 for all practical n, this is O(n + 4m) = O(n + m). Linear in the number of operations. You cannot ask for more.
cpp struct DSU { vector<int> parent, rank_; int components; DSU(int n) : parent(n), rank_(n, 0), components(n) { iota(parent.begin(), parent.end(), 0); // parent[i] = i } int find(int x) { if (parent[x] != x) parent[x] = find(parent[x]); // path compression return parent[x]; } bool unite(int a, int b) { a = find(a); b = find(b); if (a == b) return false; if (rank_[a] < rank_[b]) swap(a, b); parent[b] = a; if (rank_[a] == rank_[b]) rank_[a]++; components--; return true; } };
In C++, use iota to initialize parent[i] = i. Name the method unite not union since union is a reserved keyword in C/C++. The rest is identical to the Python version.
java class UnionFind { int[] parent, rank; int count; UnionFind(int n) { parent = new int[n]; rank = new int[n]; count = n; for (int i = 0; i < n; i++) parent[i] = i; } int find(int x) { if (parent[x] != x) parent[x] = find(parent[x]); // path compression return parent[x]; } boolean union(int a, int b) { a = find(a); b = find(b); if (a == b) return false; if (rank[a] < rank[b]) { int t = a; a = b; b = t; } parent[b] = a; if (rank[a] == rank[b]) rank[a]++; count--; return true; } }
The Java version is essentially the same as C++ — allocate arrays in the constructor, path compression in find, union by rank in union. Java's default stack depth handles recursion up to ~5000 frames, which is sufficient for most inputs (since rank never exceeds ~20 for n ≤ 106).
Build and query a union-find structure. Enter node IDs (0-15) and perform operations.
Union-Find does not exist in a vacuum. It is a building block for graph algorithms, a tool for dynamic connectivity, and a gateway to deeper ideas in amortized analysis.
| Concept | Key Idea | Complexity |
|---|---|---|
| Linked-list representation | Head pointers for O(1) Find | O(n) Union |
| Weighted union heuristic | Append short to long | O(log n) amortized |
| Forest representation | Parent pointers, tree structure | O(n) Find worst case |
| Union by rank | Attach short tree under tall | O(log n) Find |
| Path compression | Flatten on every Find | O(α(n)) combined |
| Inverse Ackermann | α(n) ≤ 4 for all practical n | Optimal (Tarjan '75) |
| Algorithm/Problem | Role of Union-Find |
|---|---|
| Kruskal's MST (CLRS Ch 23) | Cycle detection during edge processing |
| Connected components (CLRS Ch 22) | Alternative to BFS/DFS for incremental connectivity |
| LCA (offline Tarjan's algorithm) | Merging subtrees during DFS |
| Percolation | Detecting top-to-bottom connectivity |
| Image segmentation | Merging similar adjacent pixels |
| Network redundancy | Finding redundant connections (cycles) |
Union-Find handles incremental connectivity: you can add edges (union) but not remove them. If you need to delete edges and maintain connectivity, you need more advanced structures: link-cut trees (Sleator and Tarjan, 1983) support fully dynamic connectivity in O(log n) per operation. For offline problems where all queries are known in advance, offline deletion via time-reversed union-find can help.
Union-Find also does not support finding the actual path between two nodes — only whether a path exists. For path queries, use BFS/DFS or a shortest-path algorithm.
Union-Find is not always the right tool. Here are common mistakes:
A good rule of thumb: if the problem says "edges are added over time" or "groups merge over time" or "find if two elements are equivalent," think Union-Find. If the problem says "find the shortest path" or "edges can be removed" or "directed graph," think other tools.
In practice, there are several ways to implement Union-Find, depending on your needs:
| Variant | When to use | Extra features |
|---|---|---|
| Rank + path compression | CLRS-standard, proofs | Optimal bound proven |
| Size + path compression | Interviews, contests | Component sizes for free |
| Rank + path halving | Iterative (no stack overflow) | Safe for n > 106 |
| Weighted quick-union | Sedgewick/Princeton style | Same as size variant |
| Rollback-capable | Offline/divide-and-conquer | Undo unions (no compression) |
The rollback variant is worth noting: if you need to undo unions (for divide-and-conquer on queries), you cannot use path compression (it is destructive). Instead, use union by rank without compression, and maintain a stack of changes. Each union pushes the old parent/rank values; rollback pops and restores them. The cost is O(log n) per operation instead of O(α(n)), but the undo capability is essential for offline dynamic connectivity.
python class RollbackUF: """Union-Find with undo capability. No path compression.""" def __init__(self, n): self.parent = list(range(n)) self.rank = [0] * n self.history = [] # stack of (node, old_parent, old_rank) def find(self, x): while self.parent[x] != x: # NO path compression! x = self.parent[x] return x def union(self, a, b): a, b = self.find(a), self.find(b) if a == b: return if self.rank[a] < self.rank[b]: a, b = b, a self.history.append((b, self.parent[b], self.rank[a])) self.parent[b] = a if self.rank[a] == self.rank[b]: self.rank[a] += 1 def rollback(self): if not self.history: return b, old_parent, old_rank_a = self.history.pop() a = self.parent[b] self.rank[a] = old_rank_a self.parent[b] = old_parent
Union-Find is a masterclass in algorithm design. Two observations — "keep trees short" and "flatten paths as you go" — combine to produce a data structure that is asymptotically optimal and trivial to implement. Tarjan's 1975 analysis showed the bound is tight: no pointer-based structure can do better. And yet the implementation fits in 20 lines of Python.
The Union-Find data structure has a rich history in algorithm design:
| Year | Contribution | Complexity |
|---|---|---|
| 1964 | Galler and Fischer: first forest-based union-find | Introduced the idea |
| 1973 | Hopcroft and Ullman: union by rank analysis | O(m log* n) |
| 1975 | Tarjan: tight bound using inverse Ackermann | O(m · α(n)) |
| 1984 | Tarjan and van Leeuwen: simplified proof | Θ(m · α(n)) |
| 1989 | Fredman and Saks: lower bound for cell-probe model | Ω(m · α(n)) |
The 1975 paper by Robert Tarjan is particularly notable. It introduced the use of the Ackermann function in algorithm analysis — a connection that seemed absurd at the time. A function that grows faster than any computable function in the "fast-growing hierarchy" appeared in the analysis of a 10-line algorithm. This connection between combinatorics and logic (the Ackermann function originated in mathematical logic as an example of a total computable function that is not primitive recursive) remains one of the most surprising results in theoretical computer science.
If you want to go deeper: