Heapsort & Priority Queues

Chapter 0: The Problem

You are building a task scheduler for an operating system. Hundreds of processes compete for CPU time, each with a priority number. Every millisecond, the scheduler must answer one question: which process has the highest priority right now? And processes keep arriving, being promoted, and completing. The data structure holding them must be fast at finding the maximum, fast at inserting, and fast at removing the top element.

Let us see how badly the obvious data structures fail at this job.

This is not an academic exercise. Every time you drag a file into a download queue, every time your music app shuffles to the "most popular" track, every time a game engine decides which enemy to process first — a priority queue is running underneath. The data structure we will build in this lesson is one of the most practically useful structures in all of computer science.

The Three Contenders

Sorted array. Finding the max is trivial: it is the last element, O(1). But every time a new process arrives, you must find its correct position (binary search, O(log n)) and then shift everything after it to make room. That shift is O(n). For a scheduler handling 10,000 processes, that is 10,000 memory moves on every insertion.

Unsorted array. Insertion is trivial: just append at the end, O(1). But finding the max requires scanning every element, O(n). The scheduler cannot afford a linear scan every millisecond.

The mystery data structure. What if we could get O(1) to peek at the max, O(log n) to insert, and O(log n) to extract the max? That would be O(log n) for every important operation. For 10,000 processes, that is about 13 operations instead of 10,000. This structure exists. It is called a binary heap.

The core trade-off. Sorted arrays are great at queries but terrible at updates. Unsorted arrays are great at updates but terrible at queries. A heap is the sweet spot: it maintains just enough order to answer "what is the max?" instantly, while keeping updates logarithmic. It achieves this by relaxing the sorting requirement — instead of requiring every element to be in the right position relative to all others, it only requires each node to be larger than its children.

Think of it this way. A sorted array is like a library where every book is alphabetized — finding a title is fast, but inserting a new book means shifting half the shelf. An unsorted array is like a pile of books on the floor — tossing in a new one is instant, but finding the one you want means digging through the pile. A heap is like a tournament bracket — the champion is always at the top, and when a new contender arrives, they only need to beat a few opponents (O(log n)) to find their proper level.

Data Structure Comparison

Click each operation to see how many steps each data structure needs for n=16 elements. The bars show the relative cost.

The table below summarizes the asymptotic costs. Look at the heap column — every operation is at worst O(log n), and finding the max is O(1).

Operation	Sorted Array	Unsorted Array	Binary Heap
Insert	O(n)	O(1)	O(log n)
Find Max	O(1)	O(n)	O(1)
Extract Max	O(1)	O(n)	O(log n)
Increase Key	O(n)	O(1)	O(log n)

Hand Calculation: Why Linear Search Hurts

Let us make the pain concrete. Suppose you have n = 1,000 processes in an unsorted array and the scheduler runs 1,000 cycles per second. Each cycle requires finding the max (O(n) = 1,000 comparisons) and possibly an insert (O(1)). Per second: 1,000 × 1,000 = 1,000,000 comparisons just for find-max.

With a heap: each cycle does find-max in O(1) and extract-max in O(log 1000) ≈ 10 comparisons. Per second: 1,000 × 10 = 10,000 comparisons. That is a 100x improvement. At 10,000 processes, the unsorted array does 100,000,000 comparisons per second. The heap does 140,000. The gap grows linearly with n.

The exact crossover. For n = 7 or fewer elements, a sorted array actually beats a heap because the constant factors for array shifts are tiny and the cache behavior is perfect. In practice, heaps win starting around n = 10-20. For any non-trivial scheduler (hundreds or thousands of processes), there is no contest.

Why Not Just Use a Balanced BST?

A balanced binary search tree (like a red-black tree) can do all of these operations in O(log n) too. So why bother with heaps? Three reasons:

1. Cache locality. A heap is stored as a contiguous array. A BST is stored as scattered nodes linked by pointers. On modern hardware, walking a contiguous array is dramatically faster because the CPU cache can prefetch adjacent elements. For a heap of 10,000 elements, the entire structure fits in a few cache lines. A BST scatters those same elements across memory with pointer chasing at every step.

2. Constant factors. Heap operations are simple — compare and swap. BST operations require maintaining balance invariants (rotations, color flips). The constant factors in heap operations are roughly 2-5x smaller.

3. Build time. Building a heap from an unordered array takes O(n). Building a balanced BST takes O(n log n). If you have a batch of data and need to start scheduling immediately, the heap wins.

Concept check: You have an unsorted array of 1,000,000 elements and you need to repeatedly extract the maximum. You will perform exactly 10 extract-max operations. Which data structure should you use?

Sort the array first, then pop from the end (O(n log n) sort + 10 × O(1) pops) Use a heap: build it in O(n), then 10 extract-max operations at O(log n) each Scan the array 10 times, each time finding and removing the max (10 × O(n))

Why answer B? Building the heap is O(n) = 1,000,000 operations. Then 10 extractions cost 10 × O(log n) = 10 × 20 = 200. Total: ~1,000,200. Sorting costs O(n log n) = 20,000,000. Scanning 10 times costs 10,000,000. The heap wins by an order of magnitude when you only need a few extractions.

The Road Ahead

In this lesson, we will build the binary heap from the ground up:

Ch 1: The Structure

Complete binary tree in an array. Parent/child formulas. Max-heap property.

↓

Ch 2: MAX-HEAPIFY

The fundamental repair operation. Bubble a violating element down.

↓

Ch 3: BUILD-MAX-HEAP

Convert any array into a max-heap in O(n). The O(n) proof.

↓

Ch 4: Heapsort

Build heap + extract-max repeatedly = O(n log n) in-place sort.

↓

Ch 5: Priority Queues

THE killer application. Insert, extract-max, increase-key. Interactive CPU scheduler.

↓

Ch 6-9: Variations, Applications, Interview Arsenal

Min-heaps, d-ary heaps, top-K, median maintenance, merge K lists, LeetCode patterns.

By the end, you will be able to implement a heap from scratch in under 15 minutes, recognize heap problems in interviews instantly, and understand why every major standard library includes a heap implementation.

Chapter 1: Binary Heap Structure

A binary heap is a complete binary tree stored in a flat array. That single sentence contains two ideas we need to unpack carefully.

Complete Binary Tree

A complete binary tree is a binary tree where every level is fully filled except possibly the last, and the last level is filled from left to right. No gaps, no holes. This shape constraint is what makes the array representation possible.

Think of it like filling seats in a theater. You fill row 1 completely, then row 2, then row 3. You never start a new row until the previous one is full. And within the last row, you fill from the left.

The Array Trick

Because the tree has no gaps, we can store it level by level in an array. The root goes in position 1 (we leave index 0 unused for cleaner math). The root's children go in positions 2 and 3. Their children go in 4, 5, 6, 7. And so on.

This gives us three beautiful formulas. For a node at index i:

Parent(i) = ⌊i / 2⌋

Left(i) = 2i

Right(i) = 2i + 1

No pointers. No linked list overhead. Just integer arithmetic. Left child is 2*i, right child is 2*i+1, parent is i//2. These three formulas are the entire navigation system for a heap.

Why 1-indexed? With 1-based indexing, the formulas are clean: left=2i, right=2i+1, parent=i/2. With 0-based indexing (as in most programming languages), the formulas become left=2i+1, right=2i+2, parent=(i-1)/2. Both work — CLRS uses 1-based, Python implementations typically use 0-based. We will show both throughout this lesson.

Hand Calculation: Navigating the Heap

Let us trace through a concrete heap to make the formulas stick. Consider the array (1-indexed): A = [_, 90, 70, 80, 30, 60, 50, 40, 10, 20, 55].

The tree looks like this:

90 (index 1)
/ \
70 80 (indices 2, 3)
/ \ / \
30 60 50 40 (indices 4, 5, 6, 7)
/ \ /
10 20 55 (indices 8, 9, 10)

Let us verify the formulas:

Node 60 is at index 5. Parent(5) = floor(5/2) = 2. A[2] = 70. Correct — 70 is 60's parent. Left(5) = 10. A[10] = 55. Right(5) = 11 — out of bounds, so node 5 has only one child.

Node 80 is at index 3. Parent(3) = floor(3/2) = 1. A[1] = 90. Correct. Left(3) = 6. A[6] = 50. Right(3) = 7. A[7] = 40. Both are less than 80. Max-heap property holds.

Node 30 is at index 4. Left(4) = 8. A[8] = 10. Right(4) = 9. A[9] = 20. Both less than 30. Good. Is 30 less than its parent? Parent(4) = 2, A[2] = 70. Yes, 30 ≤ 70. Max-heap property holds.

Bit-shifting trick. On a CPU, multiplying by 2 is a left bit-shift (i << 1), dividing by 2 is a right bit-shift (i >> 1). So Parent, Left, and Right are single-instruction operations. This is why heap navigation is effectively "free" in terms of compute cost — it is the cheapest possible tree traversal.

The Max-Heap Property

A max-heap satisfies one rule: every node is greater than or equal to its children.

A[Parent(i)] ≥ A[i] for every node i except the root

This means the root always holds the maximum element. That is why FIND-MAX is O(1) — just look at A[1].

Crucially, a max-heap is NOT a sorted array. The element at index 2 is not necessarily the second-largest. All we know is that it is less than or equal to the root. Siblings have no ordering constraint between them. The max-heap property only constrains parent-child relationships, not sibling relationships.

Heap Height

A complete binary tree with n nodes has height:

h = ⌊log₂ n⌋

Level 0 (the root) has 1 node. Level 1 has 2 nodes. Level k has 2^k nodes. The total up to level h is 2^h+1 - 1. For n = 15, h = 3. For n = 1,000,000, h = 19. This is why heap operations are fast — even a million-element heap is only 20 levels deep.

Leaves vs internal nodes. In an n-element heap, the leaves are the nodes at indices ⌊n/2⌋ + 1 through n. The internal nodes (which have at least one child) are at indices 1 through ⌊n/2⌋. Exactly half the nodes are leaves. This fact will be critical when we build a heap in Chapter 3.

Why the Shape Constraint Matters

The completeness constraint is not arbitrary — it is what makes the array representation possible and what guarantees the height is O(log n). Consider what happens if we drop the constraint:

Without completeness: A binary tree with n nodes could be a degenerate chain (like a linked list), with height n-1. Then MAX-HEAPIFY would be O(n), not O(log n), and we would lose all the performance guarantees.

With completeness: The tree is as "wide" as possible at every level. Level k has exactly 2^k nodes (except possibly the last level). This means the height is always ⌊log₂ n⌋. For n = 1,000,000, that is 19 levels. For n = 1,000,000,000, that is 29 levels. The height grows incredibly slowly.

Exact node counts by level. For a heap with n = 10 nodes: Level 0 has 1 node (the root). Level 1 has 2 nodes. Level 2 has 4 nodes. Level 3 has 3 nodes (the last level, partially filled). Total: 1 + 2 + 4 + 3 = 10. The last level fills from left to right, so nodes 8, 9, 10 are the leftmost positions at level 3.

Verifying the Max-Heap Property

Given an array, how do you check whether it is a valid max-heap? Simple: check every parent-child relationship.

python
def is_max_heap(A, n):
    """Check if A[1..n] (1-indexed) is a valid max-heap."""
    for i in range(1, n + 1):
        l = 2 * i
        r = 2 * i + 1
        if l <= n and A[l] > A[i]:
            return False
        if r <= n and A[r] > A[i]:
            return False
    return True

# Test:
A = [None, 90, 70, 80, 30, 60, 50, 40]
print(is_max_heap(A, 7))  # True
A[3] = 100  # break the property
print(is_max_heap(A, 7))  # False (A[3]=100 > A[1]=90)

Interactive Heap Explorer

Enter a value and click Insert to add it to the heap. Click any node to highlight it and see its parent/children relationships. The array view and tree view stay synchronized.

Heap size: 0

python
# Binary heap with 1-based indexing (CLRS style)
class MaxHeap:
    def __init__(self):
        self.A = [None]  # index 0 unused
        self.heap_size = 0

    def parent(self, i):
        return i // 2

    def left(self, i):
        return 2 * i

    def right(self, i):
        return 2 * i + 1

# 0-based indexing (Python convention)
# parent = (i - 1) // 2
# left   = 2 * i + 1
# right  = 2 * i + 2

How Many Valid Max-Heaps Exist?

Given n distinct elements, how many different valid max-heaps can you build? This is a fascinating combinatorial question.

For n=3 with elements {1, 2, 3}: the root must be 3 (the max). Then 1 and 2 can go left/right in either order. So there are 2 valid max-heaps: [3, 1, 2] and [3, 2, 1].

For n=7 with elements {1, ..., 7}: the root is 7. The left subtree gets 3 elements (including its subtree's max), and the right subtree gets 3 elements. The count grows quickly — there are 80 valid max-heaps on 7 elements.

The formula involves choosing which elements go to the left vs right subtree, then recursing. This is related to the number of topological sorts of the heap's partial order. The exact count is:

f(n) = C(n-1, n_L) · f(n_L) · f(n_R)

Where n_L and n_R are the sizes of the left and right subtrees, and C(n-1, n_L) is the binomial coefficient for choosing which elements go left.

python
from math import comb

def count_heaps(n):
    """Count the number of distinct max-heaps on n elements."""
    if n <= 1: return 1
    # Compute left subtree size for a complete binary tree
    h = n.bit_length() - 1  # height of tree
    max_last = 2**h         # max nodes on last level
    actual_last = n - (2**h - 1)  # actual nodes on last level
    left_last = min(actual_last, max_last // 2)
    nL = (2**(h-1) - 1) + left_last
    nR = n - 1 - nL
    return comb(n - 1, nL) * count_heaps(nL) * count_heaps(nR)

print(count_heaps(3))   # 2
print(count_heaps(7))   # 80
print(count_heaps(10))  # 3024

Concept check: In a max-heap stored in a 1-indexed array A = [_, 90, 70, 80, 30, 60, 50, 40], what is the parent of the node with value 60 (at index 5)?

A[1] = 90 (the root) A[2] = 70 (because Parent(5) = floor(5/2) = 2) A[3] = 80 (because 5 is odd, so it's a right child)

Chapter 2: MAX-HEAPIFY

Imagine you have a mostly-valid max-heap, but one node is too small — it violates the heap property because it is less than one of its children. MAX-HEAPIFY fixes this by "bubbling down" the violating node until it finds its rightful place.

This is the fundamental repair operation. Everything else — building heaps, heapsort, priority queue operations — is built on top of MAX-HEAPIFY.

The Algorithm

Given a node at index i whose left and right subtrees are already valid max-heaps (but A[i] might be smaller than its children):

1. Compare

Find the largest among A[i], A[Left(i)], and A[Right(i)].

↓

2. Check

If A[i] is already the largest, STOP — the heap property holds at i.

↓

3. Swap

Swap A[i] with the larger child. Now the node at i is correct, but the child we swapped into might violate the heap property in ITS subtree.

↓

4. Recurse

Call MAX-HEAPIFY on the child position we swapped with. Repeat until the element settles into place or reaches a leaf.

Worked Example

Start with array: [_, 4, 14, 7, 10, 8, 3, 2]. The root (4) violates the max-heap property because its child 14 is larger. Let us trace MAX-HEAPIFY(A, 1):

Step 1: Compare A[1]=4, A[2]=14, A[3]=7. Largest is A[2]=14. Swap A[1] and A[2]. Array becomes [_, 14, 4, 7, 10, 8, 3, 2].

Step 2: Now at index 2. Compare A[2]=4, A[4]=10, A[5]=8. Largest is A[4]=10. Swap A[2] and A[4]. Array becomes [_, 14, 10, 7, 4, 8, 3, 2].

Step 3: Now at index 4. A[4]=4 has no children within heap bounds (Left(4)=8 > heap_size=7). The element is at a leaf. STOP. Final array: [_, 14, 10, 7, 4, 8, 3, 2]. Valid max-heap.

MAX-HEAPIFY Step-Through

The root node violates the max-heap property. Click "Step" to watch it bubble down. Click "Reset" to try with a new random violation.

Click Step to begin

Runtime Analysis

At each level, MAX-HEAPIFY does O(1) work (one comparison and one swap). It descends at most h levels, where h = floor(log₂ n). Therefore:

T(MAX-HEAPIFY) = O(log n) = O(h)

More precisely, T(n) satisfies T(n) ≤ T(2n/3) + O(1). The worst case is when the bottom level is half full, making the left subtree have size at most 2n/3. By the master theorem (case 2), this solves to T(n) = O(log n).

python
def max_heapify(A, i, heap_size):
    """Fix the max-heap property at node i.
    Assumes left and right subtrees are valid max-heaps."""
    l = 2 * i        # left child (1-indexed)
    r = 2 * i + 1    # right child

    # Find the largest among A[i], A[l], A[r]
    largest = i
    if l <= heap_size and A[l] > A[largest]:
        largest = l
    if r <= heap_size and A[r] > A[largest]:
        largest = r

    if largest != i:
        A[i], A[largest] = A[largest], A[i]  # swap
        max_heapify(A, largest, heap_size)   # recurse

The precondition matters. MAX-HEAPIFY only works when both subtrees of node i are already valid max-heaps. If the left or right subtree is also messed up, bubbling down the root will not fix those problems. This precondition is what dictates the bottom-up order in BUILD-MAX-HEAP (next chapter).

The Iterative Version

The recursive version is elegant but uses O(log n) stack space. In production code, an iterative version avoids stack overhead:

python
def max_heapify_iterative(A, i, heap_size):
    """Iterative version — O(1) extra space."""
    while True:
        l = 2 * i
        r = 2 * i + 1
        largest = i
        if l <= heap_size and A[l] > A[largest]:
            largest = l
        if r <= heap_size and A[r] > A[largest]:
            largest = r
        if largest == i:
            break  # heap property restored
        A[i], A[largest] = A[largest], A[i]
        i = largest  # continue at the child position

Correctness Argument

Loop invariant: At the start of each iteration, the subtree rooted at i satisfies the max-heap property everywhere except possibly at i itself. A[i] might be smaller than one of its children, but both children's subtrees are valid max-heaps.

Maintenance: We find the largest among A[i], A[left], A[right]. If A[i] is already largest, the heap property holds at i and we stop. Otherwise, we swap A[i] with the larger child. Now position i is fixed (it holds the largest value), but the child position might be broken. The child's subtree was a valid max-heap before, and we just placed a potentially-smaller value at its root — this is exactly the situation MAX-HEAPIFY is designed to fix. We recurse (or loop) on that child.

Termination: Each iteration moves i to a child, increasing the depth by 1. The depth is bounded by h = floor(log₂ n), so the loop terminates after at most h iterations.

What Can Go Wrong

Off-by-one errors. The most common bug is using l < heap_size instead of l <= heap_size (for 1-indexed), or l <= heap_size instead of l < heap_size (for 0-indexed). Always trace through a small example to verify bounds.

Forgetting the size parameter. MAX-HEAPIFY must respect the current heap_size, not the array length. In heapsort, the heap shrinks while the array stays the same size. If you compare against len(A) instead of heap_size, you will corrupt the sorted portion of the array.

Tracing MAX-HEAPIFY on a Larger Example

Array: [_, 2, 16, 14, 12, 8, 7, 10, 4, 6, 3]. Call MAX-HEAPIFY(A, 1, 10).

Level 0: i=1. A[1]=2. Left=2 (A[2]=16), Right=3 (A[3]=14). Largest=2 (value 16). Swap A[1] and A[2].

Array: [_, 16, 2, 14, 12, 8, 7, 10, 4, 6, 3]

Level 1: i=2. A[2]=2. Left=4 (A[4]=12), Right=5 (A[5]=8). Largest=4 (value 12). Swap A[2] and A[4].

Array: [_, 16, 12, 14, 2, 8, 7, 10, 4, 6, 3]

Level 2: i=4. A[4]=2. Left=8 (A[8]=4), Right=9 (A[9]=6). Largest=9 (value 6). Swap A[4] and A[9].

Array: [_, 16, 12, 14, 6, 8, 7, 10, 4, 2, 3]

Level 3: i=9. A[9]=2. Left=18, Right=19. Both out of bounds. STOP.

The value 2 sank from the root (level 0) all the way to level 3 — 3 swaps, which is the height of this 10-element heap. This is the worst case: the element travels the full height.

Concept check: You call MAX-HEAPIFY on a node at depth 2 in a heap of height 5. What is the maximum number of swaps this call can perform?

2 (one for each level below depth 2, down to depth 4) 3 (from depth 2, the node can sink to depth 3, 4, or 5 — three swaps) 5 (it always traverses the full height)

Chapter 3: BUILD-MAX-HEAP

You have a completely unordered array: [4, 1, 3, 2, 16, 9, 10, 14, 8, 7]. How do you turn this into a valid max-heap? The answer is deceptively simple: call MAX-HEAPIFY on every internal node, working from the bottom up.

Why Bottom-Up?

MAX-HEAPIFY at node i requires that the left and right subtrees are already valid heaps. If we start at the bottom, the leaves (indices ⌊n/2⌋+1 through n) are trivially valid heaps — a single node is always a valid heap. Then we process ⌊n/2⌋, then ⌊n/2⌋-1, and so on up to the root. By the time we reach any node, its children have already been heapified.

1. Start at last internal node

i = ⌊n/2⌋. Skip the leaves — they are already valid heaps.

↓

2. MAX-HEAPIFY(A, i)

Fix the heap property at this node. Its children's subtrees are already valid.

↓

3. Decrement i

Move to the next node to the left. Process all internal nodes.

↻ repeat until i = 1

↓

4. Done

After processing the root, the entire array is a valid max-heap.

Worked Example

Input: A = [_, 4, 1, 3, 2, 16, 9, 10, 14, 8, 7] (1-indexed, n=10)

Leaves are indices 6-10. Internal nodes are 5, 4, 3, 2, 1. We process right to left:

i=5: A[5]=16. Left(5)=10: A[10]=7. Right(5)=11: out of bounds. 16 > 7. No swap needed.

i=4: A[4]=2. Left(4)=8: A[8]=14. Right(4)=9: A[9]=8. Largest is 14. Swap 2 and 14. Array: [_, 4, 1, 3, 14, 16, 9, 10, 2, 8, 7]. Now heapify at index 8: A[8]=2, no children in range. Done.

i=3: A[3]=3. Left(3)=6: A[6]=9. Right(3)=7: A[7]=10. Largest is 10. Swap 3 and 10. Array: [_, 4, 1, 10, 14, 16, 9, 3, 2, 8, 7]. Heapify at 7: A[7]=3, no children in range. Done.

i=2: A[2]=1. Left(2)=4: A[4]=14. Right(2)=5: A[5]=16. Largest is 16. Swap 1 and 16. Array: [_, 4, 16, 10, 14, 1, 9, 3, 2, 8, 7]. Heapify at 5: A[5]=1. Left(5)=10: A[10]=7. 7 > 1. Swap. Array: [_, 4, 16, 10, 14, 7, 9, 3, 2, 8, 1]. Done.

i=1: A[1]=4. Left(1)=2: A[2]=16. Right(1)=3: A[3]=10. Largest is 16. Swap 4 and 16. Array: [_, 16, 4, 10, 14, 7, 9, 3, 2, 8, 1]. Heapify at 2: A[2]=4. Left(2)=4: A[4]=14. Right(2)=5: A[5]=7. Swap 4 and 14. Array: [_, 16, 14, 10, 4, 7, 9, 3, 2, 8, 1]. Heapify at 4: A[4]=4. Left(4)=8: A[8]=2. Right(4)=9: A[9]=8. Swap 4 and 8. Array: [_, 16, 14, 10, 8, 7, 9, 3, 2, 4, 1]. Heapify at 9: leaf. Done.

Final max-heap: [_, 16, 14, 10, 8, 7, 9, 3, 2, 4, 1].

BUILD-MAX-HEAP Animation

Watch an unordered array transform into a valid max-heap. Internal nodes are processed right-to-left (bottom-up). Red nodes are being heapified.

Processing node 5 of 5

The O(n) Proof — This Is NOT O(n log n)

The naive analysis says: we call MAX-HEAPIFY n/2 times, each costing O(log n). So the total is O(n log n). But this is a loose upper bound. The tight analysis is O(n). Here is why.

MAX-HEAPIFY at a node of height h costs O(h) — it sinks at most h levels. Not all nodes have the same height. In fact, most nodes are near the bottom and have small heights.

Height h	Number of nodes at height h	Cost per node	Total cost at height h
0 (leaves)	⌈n/2⌉ ≈ n/2	O(0) = skip	0
1	⌈n/4⌉ ≈ n/4	O(1)	n/4
2	⌈n/8⌉ ≈ n/8	O(2)	n/4
3	⌈n/16⌉ ≈ n/16	O(3)	3n/16
h	⌈n/2^h+1⌉	O(h)	hn/2^h+1

The total cost is:

T = ∑_h=0^{⌊log n⌋} ⌈n / 2^h+1⌉ · O(h) = O(n ∑_h=0^∞ h / 2^h) = O(n · 2) = O(n)

The sum ∑ h/2^h from h=0 to ∞ converges to exactly 2. (This is the derivative of the geometric series evaluated at x=1/2.) So BUILD-MAX-HEAP is Θ(n), not Θ(n log n).

The intuition. Half the nodes are leaves and do zero work. A quarter of the nodes are one level above leaves and do one swap. An eighth are two levels up and do two swaps. The work decreases geometrically as the height increases. The few nodes at the top that have to sink far are vastly outnumbered by the many nodes near the bottom that barely move.

Deriving the Sum ∑ h/2^h

This is a standard trick that appears in algorithm analysis. We want S = ∑_h=0^∞ h/2^h = 0 + 1/2 + 2/4 + 3/8 + 4/16 + ...

Start with the geometric series: ∑ x^h = 1/(1-x) for |x| < 1.

Take the derivative with respect to x: ∑ h · x^h-1 = 1/(1-x)².

Multiply both sides by x: ∑ h · x^h = x/(1-x)².

Evaluate at x = 1/2: S = (1/2)/(1 - 1/2)² = (1/2)/(1/4) = 2.

Therefore BUILD-MAX-HEAP does at most ∑ (n/2^h+1) · h = (n/2) · ∑ h/2^h = (n/2) · 2 = n total swaps. This is a tight Θ(n) bound.

Why this matters in interviews. "Prove BUILD-MAX-HEAP is O(n)" is one of the most commonly asked algorithm proof questions. The two key steps are: (1) recognize that nodes at height h cost O(h) and there are O(n/2^h) of them, and (2) evaluate the resulting sum. Memorize the derivation above — it takes 30 seconds to write on a whiteboard and it is a strong signal of mathematical maturity.

Empirical Verification

Let us verify the O(n) claim by counting actual swaps. For a random array of n elements:

n	n log₂ n (naive bound)	Actual swaps (avg)	Ratio (swaps/n)
100	664	~85	0.85
1,000	9,966	~920	0.92
10,000	132,877	~9,400	0.94
100,000	1,660,964	~95,000	0.95

The swaps/n ratio approaches ~1.0 as n grows — beautifully linear, just as the theory predicts. The naive O(n log n) bound overestimates by a factor of 10x or more.

Alternative: Top-Down Build (Repeated Insertion)

There is another way to build a heap: start empty and insert elements one by one. Each INSERT calls bubble-up, which is O(log k) for the k-th insertion. Total cost:

T = ∑_k=1ⁿ O(log k) = O(log(n!)) = O(n log n)

This is Θ(n log n) — worse than bottom-up's Θ(n). The difference: bottom-up lets most nodes (the leaves) do zero work. Top-down insertion forces every element to bubble up from the bottom, and the last n/2 elements each travel O(log n) levels.

When to use which. If you have all the data upfront, use BUILD-MAX-HEAP (bottom-up) — it is O(n). If data arrives one at a time in a stream, use repeated INSERT — you have no choice but to pay O(log n) per element. Python's heapq.heapify() uses the bottom-up approach; heapq.heappush() is the streaming approach.

Visualizing the O(n) vs O(n log n) Difference

For n = 1,000,000:

Method	Work per node	Total
Bottom-up BUILD-MAX-HEAP	Leaves (500K): 0. Next 250K: 1. Next 125K: 2. ... Root: 20.	≈ 1,000,000
Top-down repeated INSERT	1st: 0. 2nd: 1. 4th: 2. ... 1Mth: 20.	≈ 20,000,000

Bottom-up is 20x faster for a million elements. This is one of the most impactful algorithmic insights in the chapter — the order in which you process nodes changes the total cost from O(n log n) to O(n).

python
def build_max_heap(A):
    """Convert array A (1-indexed) into a max-heap in-place.
    Time: O(n). Space: O(1)."""
    n = len(A) - 1  # A[0] is unused sentinel
    for i in range(n // 2, 0, -1):  # n//2 down to 1
        max_heapify(A, i, n)

# Example
A = [None, 4, 1, 3, 2, 16, 9, 10, 14, 8, 7]
build_max_heap(A)
# A is now [None, 16, 14, 10, 8, 7, 9, 3, 2, 4, 1]

Concept check: Why does BUILD-MAX-HEAP process nodes from ⌊n/2⌋ down to 1 instead of from 1 down to ⌊n/2⌋ (top-down)?

MAX-HEAPIFY at node i requires that i's left and right subtrees are already valid max-heaps. Bottom-up processing ensures this precondition holds at every step. Top-down would call MAX-HEAPIFY on the root before its children's subtrees are fixed, violating the precondition. Top-down would be O(n log n) instead of O(n) It doesn't matter — both orderings produce the same result

Chapter 4: Heapsort

We now have all the pieces to build a sorting algorithm. The idea is beautifully simple: (1) build a max-heap from the input, (2) repeatedly extract the maximum and place it at the end.

The Algorithm

1. BUILD-MAX-HEAP

Convert the unordered array into a max-heap. O(n). Now A[1] is the largest element.

↓

2. Swap A[1] with A[n]

The largest element is now in its final position at the end. Reduce heap_size by 1.

↓

3. MAX-HEAPIFY(A, 1)

The new root might violate the heap property. Fix it. O(log n).

↻ repeat steps 2-3 for heap_size = n-1, n-2, ..., 2

↓

4. Done

The array is sorted in ascending order.

Each iteration places the next-largest element at the end. After n-1 iterations, the entire array is sorted.

Worked Example

Start with max-heap: [_, 16, 14, 10, 8, 7, 9, 3, 2, 4, 1]

Iteration 1: Swap A[1]=16 with A[10]=1. Sorted region: [16]. Heap: [_, 1, 14, 10, 8, 7, 9, 3, 2, 4 | 16]. MAX-HEAPIFY root. Result: [_, 14, 8, 10, 4, 7, 9, 3, 2, 1 | 16].

Iteration 2: Swap A[1]=14 with A[9]=1. Sorted: [14, 16]. Heapify. Result: [_, 10, 8, 9, 4, 7, 1, 3, 2 | 14, 16].

This continues until the heap has size 1. The sorted portion grows from the right.

Runtime

T(Heapsort) = O(n) + (n-1) × O(log n) = O(n log n)

BUILD-MAX-HEAP is O(n). Then n-1 calls to MAX-HEAPIFY, each O(log n). Total: O(n log n) in the worst case. This is a guarantee — unlike quicksort, which can degrade to O(n²) on adversarial inputs.

Property	Heapsort	Quicksort	Mergesort
Best case	O(n log n)	O(n log n)	O(n log n)
Worst case	O(n log n)	O(n²)	O(n log n)
Average case	O(n log n)	O(n log n)	O(n log n)
Space	O(1) in-place	O(log n) stack	O(n) auxiliary
Stable?	No	No (typically)	Yes
Cache-friendly?	Poor	Excellent	Good

Why isn't heapsort more popular? Despite O(n log n) worst-case and O(1) space — both better guarantees than quicksort — heapsort loses in practice because of cache behavior. Quicksort accesses elements sequentially (partition scans left-to-right and right-to-left). Heapsort jumps between parent and child nodes, which are far apart in the array (parent at i, children at 2i and 2i+1). These jumps cause cache misses. On modern CPUs where a cache miss costs 100+ cycles, this overhead makes heapsort 2-3x slower than quicksort on average despite the same asymptotic complexity.

Hand Calculation: Counting Swaps

Let us count exactly how many swaps heapsort performs on our 10-element array. The initial max-heap is [_, 16, 14, 10, 8, 7, 9, 3, 2, 4, 1].

Iteration 1: Swap root (16) with A[10] (1). Heapify: 1 sinks two levels (1→14→8). Swaps: 1 (initial) + 2 (heapify) = 3.

Iteration 2: Swap root (14) with A[9] (4). Heapify: 4 sinks two levels (4→10→9). Swaps: 1 + 2 = 3.

Iteration 3: Swap root (10) with A[8] (2). Heapify: 2 sinks two levels (2→9→3). Swaps: 1 + 2 = 3.

Iteration 4: Swap root (9) with A[7] (3). Heapify: 3 sinks one level (3→7). Swaps: 1 + 1 = 2.

Iteration 5: Swap root (8) with A[6] (1). Heapify: 1 sinks two levels (1→7→4). Swaps: 1 + 2 = 3.

Iterations 6-9: Remaining swaps: 2 + 2 + 1 + 1 = 6.

Total: 3 + 3 + 3 + 2 + 3 + 6 = 20 swaps for 10 elements. The theoretical upper bound is n log n = 10 × 3.32 ≈ 33 swaps. We used about 60% of the budget — heapify often terminates early when the swapped element is not terribly small.

Stability: Why Heapsort Is Not Stable

A sorting algorithm is stable if equal elements appear in the output in the same order as in the input. Heapsort is not stable. Here is a concrete example.

Input: [(5, 'A'), (3, 'B'), (5, 'C'), (1, 'D')]. The two 5s should stay in order (A before C) in a stable sort.

During BUILD-MAX-HEAP and the extract-max loop, the two 5s may be swapped. The long-range swaps between the root and the last element are inherently order-disrupting. If stability matters (and it often does — sorting database rows by multiple columns), use mergesort instead.

Introsort: The Best of Both Worlds

Modern standard libraries (C++ std::sort, Rust sort_unstable) use introsort: start with quicksort, but track recursion depth. If the recursion exceeds 2 log₂ n, switch to heapsort. This gives quicksort's average-case speed with heapsort's worst-case guarantee. The result is O(n log n) worst case with excellent practical performance.

Python's sort. Python uses Timsort, a hybrid of mergesort and insertion sort. It is stable and O(n log n) worst case, but uses O(n) extra space. Python chose stability over in-place sorting. If you need an in-place O(n log n) sort in Python, you must implement heapsort yourself.

Heapsort vs Quicksort: The Full Picture

This comparison is an interview favorite. Here is the complete analysis:

Average case: Quicksort does ~1.39n log n comparisons on average. Heapsort does ~2n log n (each heapify call compares two children per level). Quicksort is ~30% fewer comparisons.

Cache behavior: Quicksort's partition scan touches elements sequentially — the CPU prefetcher loves this. Heapsort's parent-child access pattern jumps by factors of 2 in the array — terrible for caches. On a 64-byte cache line holding 8 integers, quicksort's sequential scan gets 8 values per cache load. Heapsort might load a new cache line for every comparison at depth > 3.

Worst case: Quicksort's O(n²) worst case occurs on already-sorted input with naive pivot selection. Random pivot or median-of-three mitigates this but does not eliminate it. Heapsort is always O(n log n).

Space: Heapsort uses O(1) auxiliary space. Quicksort uses O(log n) for the recursion stack (with tail-call optimization). Mergesort uses O(n).

Adaptivity: Neither heapsort nor quicksort is adaptive (they do not benefit from partially sorted input). Timsort IS adaptive — it detects existing runs and merges them, giving O(n) on nearly sorted data.

Sorting Algorithm Comparison

Compare operation counts for different sorting algorithms on the same random data. Click "New Data" to regenerate.

Click to generate comparison

Heapsort Animation

Watch heapsort in action. The heap region shrinks from the left; the sorted region (teal) grows from the right. Click Step to advance one swap-and-heapify cycle.

Heap size: 10

python
def heapsort(A):
    """Sort array A (1-indexed) in-place using heapsort.
    Time: O(n log n). Space: O(1)."""
    n = len(A) - 1
    build_max_heap(A)  # O(n)

    for i in range(n, 1, -1):  # n down to 2
        A[1], A[i] = A[i], A[1]  # move max to end
        max_heapify(A, 1, i - 1)  # fix heap of reduced size

# 0-indexed version (practical Python)
def heapsort_0(arr):
    n = len(arr)
    # Build max-heap
    for i in range(n // 2 - 1, -1, -1):
        _sift_down(arr, i, n)
    # Extract max repeatedly
    for end in range(n - 1, 0, -1):
        arr[0], arr[end] = arr[end], arr[0]
        _sift_down(arr, 0, end)

def _sift_down(arr, i, size):
    while 2 * i + 1 < size:
        l = 2 * i + 1
        r = 2 * i + 2
        largest = i
        if l < size and arr[l] > arr[largest]: largest = l
        if r < size and arr[r] > arr[largest]: largest = r
        if largest == i: break
        arr[i], arr[largest] = arr[largest], arr[i]
        i = largest

Concept check: After the first swap-and-heapify iteration of heapsort on a 10-element max-heap, where is the largest element?

At index 1 (still the root) At index 5 (the middle) At index 10 (the last position, in its final sorted position)

Chapter 5: Priority Queues

Heapsort is theoretically elegant but rarely used in practice (quicksort is faster due to cache effects). The real killer application of heaps is the priority queue — an abstract data type that supports inserting elements with priorities and extracting the highest-priority element. This is THE reason heaps exist in the standard libraries of every programming language.

The Four Operations

A max-priority queue supports four operations, all powered by the underlying max-heap:

Operation	What it does	Time	How it works
MAXIMUM()	Return the element with the largest key	O(1)	Return A[1] (the root)
EXTRACT-MAX()	Remove and return the largest key	O(log n)	Save A[1], move A[n] to root, shrink heap, MAX-HEAPIFY(1)
INSERT(key)	Add a new element with given key	O(log n)	Append at end, then bubble UP to restore heap property
INCREASE-KEY(i, key)	Increase the key of element at position i	O(log n)	Set A[i]=key, then bubble UP (parent might now be smaller)

EXTRACT-MAX: Removing the Top

We already know this from heapsort. Save the root, replace it with the last element, shrink the heap, and heapify the root. The only twist is that we return the removed value.

python
def extract_max(self):
    if self.heap_size < 1:
        raise Exception("Heap underflow")
    max_val = self.A[1]
    self.A[1] = self.A[self.heap_size]
    self.heap_size -= 1
    max_heapify(self.A, 1, self.heap_size)
    return max_val

INSERT and INCREASE-KEY: Bubbling Up

These are the opposite of MAX-HEAPIFY. Instead of sinking a too-small element down, we float a too-large element up. When a node's key increases (or a new element arrives), it might be larger than its parent. So we swap it with its parent, then check the grandparent, and so on until the heap property is restored or we reach the root.

python
def increase_key(self, i, key):
    if key < self.A[i]:
        raise Exception("New key is smaller than current key")
    self.A[i] = key
    # Bubble up: while node is larger than its parent, swap
    while i > 1 and self.A[self.parent(i)] < self.A[i]:
        self.A[i], self.A[self.parent(i)] = self.A[self.parent(i)], self.A[i]
        i = self.parent(i)

def insert(self, key):
    self.heap_size += 1
    if self.heap_size >= len(self.A):
        self.A.append(-float('inf'))  # extend array
    else:
        self.A[self.heap_size] = -float('inf')
    self.increase_key(self.heap_size, key)  # bubble up

Two directions, one structure. MAX-HEAPIFY sinks a too-small node DOWN by swapping with the larger child. INCREASE-KEY floats a too-large node UP by swapping with its parent. Down = O(h) comparisons from current position to leaf. Up = O(h) comparisons from current position to root. Both are O(log n) in the worst case.

Worked Example: INSERT

Let us insert the value 85 into the max-heap [_, 90, 70, 80, 30, 60, 50, 40]. The heap has 7 elements. After inserting, it will have 8.

Step 1: Place 85 at the end. Array: [_, 90, 70, 80, 30, 60, 50, 40, 85]. Index 8.

Step 2: Bubble up. Parent(8) = 4. A[4] = 30. Is 85 > 30? Yes. Swap. Array: [_, 90, 70, 80, 85, 60, 50, 40, 30].

Step 3: Now at index 4. Parent(4) = 2. A[2] = 70. Is 85 > 70? Yes. Swap. Array: [_, 90, 85, 80, 70, 60, 50, 40, 30].

Step 4: Now at index 2. Parent(2) = 1. A[1] = 90. Is 85 > 90? No. STOP. Final array: [_, 90, 85, 80, 70, 60, 50, 40, 30]. The 85 settled at index 2, exactly where it belongs — second largest.

Total swaps: 2. Maximum possible for an 8-element heap: log₂(8) = 3. The element found its level in 2 out of 3 possible swaps.

Worked Example: EXTRACT-MAX

Extract the maximum from [_, 90, 85, 80, 70, 60, 50, 40, 30].

Step 1: Save max = 90. Move last element (30) to root. Shrink heap. Array: [_, 30, 85, 80, 70, 60, 50, 40]. Heap size = 7.

Step 2: MAX-HEAPIFY(1). Compare 30, 85, 80. Largest = 85 (index 2). Swap. Array: [_, 85, 30, 80, 70, 60, 50, 40].

Step 3: At index 2. Compare 30, 70, 60. Largest = 70 (index 4). Swap. Array: [_, 85, 70, 80, 30, 60, 50, 40].

Step 4: At index 4. No children in range (Left(4)=8 > heap_size=7). STOP. Return 90.

DELETE Arbitrary Element

What if you want to delete a node at position i that is NOT the root? This operation is not part of the standard CLRS priority queue interface, but it appears in practice (e.g., canceling a scheduled event).

The trick: increase the key at position i to +∞, let it bubble up to the root, then extract-max. Total cost: O(log n) for bubble-up + O(log n) for extract = O(log n).

python
def delete(self, i):
    """Delete the element at position i. O(log n)."""
    self.increase_key(i, float('inf'))  # bubble to root
    self.extract_max()                   # remove root

Alternatively (and more efficiently): replace A[i] with the last element, shrink the heap, then either bubble up or sift down depending on whether the replacement is larger or smaller than the old value.

Where Priority Queues Appear

OS task scheduling. Every modern OS uses a priority queue to decide which process runs next. Linux's CFS (Completely Fair Scheduler) uses a red-black tree (a balanced BST), but many real-time OS kernels use binary heaps for their simplicity and predictable timing.

Dijkstra's shortest path. The textbook implementation of Dijkstra's algorithm uses a min-priority queue to always process the vertex with the smallest tentative distance. With a binary heap, Dijkstra runs in O((V + E) log V). With a Fibonacci heap, it runs in O(V log V + E).

Event-driven simulation. In discrete-event simulators (network traffic, physics engines, game loops), events are stored in a priority queue ordered by timestamp. The next event to process is always the one with the smallest timestamp.

Huffman coding. The Huffman algorithm repeatedly extracts the two lowest-frequency symbols and merges them. A min-priority queue makes this O(n log n) instead of O(n²).

Building a CPU Scheduler with a Priority Queue

Let us design a simplified CPU scheduler to see the priority queue in action. The rules:

1. Each task has a name and a priority (higher = more important).

2. Each time slice, the scheduler runs the highest-priority task.

3. New tasks can arrive at any time.

4. A task's priority can be boosted (e.g., to prevent starvation).

5. When a task completes, it is removed from the queue.

python
import heapq

class CPUScheduler:
    def __init__(self):
        self.heap = []   # max-heap via negation
        self.counter = 0 # tiebreaker for equal priorities

    def add_task(self, name, priority):
        """Add a new task. O(log n)."""
        entry = (-priority, self.counter, name)
        self.counter += 1
        heapq.heappush(self.heap, entry)
        print(f"Added {name} with priority {priority}")

    def run_next(self):
        """Run the highest-priority task. O(log n)."""
        if not self.heap:
            print("No tasks to run")
            return None
        neg_pri, _, name = heapq.heappop(self.heap)
        print(f"Running {name} (priority {-neg_pri})")
        return name

    def peek(self):
        """See what runs next without removing. O(1)."""
        if self.heap:
            return self.heap[0][2], -self.heap[0][0]
        return None, None

# Example session
sched = CPUScheduler()
sched.add_task("email", 30)
sched.add_task("browser", 50)
sched.add_task("kernel", 90)
sched.add_task("backup", 10)

sched.run_next()  # Running kernel (priority 90)
sched.run_next()  # Running browser (priority 50)
sched.add_task("virus_scan", 95)  # urgent!
sched.run_next()  # Running virus_scan (priority 95)

The tiebreaker counter. When two tasks have the same priority, Python would try to compare the task names — which works for strings but fails for non-comparable objects. The monotonically increasing counter ensures that ties are broken by insertion order (FIFO for equal priorities). This is a critical pattern for production priority queues.

Interactive Priority Queue — CPU Scheduler

Simulate a CPU task scheduler. Add tasks with priorities, extract the highest-priority task, or boost a task's priority. The tree view and array view stay synchronized. Watch how each operation restructures the heap.

Click a node to select it. Then "Boost" to increase its priority by 10.

Concept check: You INSERT a new element with key 100 into a max-heap where the current root has key 80. How many swaps does INSERT perform?

0 swaps — it goes to the end and stays there 1 swap — it only needs to swap with its immediate parent h swaps (where h is the height of the heap) — the new element is larger than every ancestor, so it bubbles all the way to the root

Chapter 6: Min-Heaps & Variations

Everything we have built so far uses a max-heap: the largest element is at the root. A min-heap is the mirror image: the smallest element is at the root. Every parent is less than or equal to its children.

A[Parent(i)] ≤ A[i] (min-heap property)

The operations are identical but with reversed comparisons. MIN-HEAPIFY swaps with the smaller child. INSERT bubbles up when the new key is smaller than its parent. The code is a one-line change: flip every > to <.

Python's heapq Module

Python's standard library provides heapq, which implements a min-heap using 0-based indexing. This is the heap you will use in interviews and production code.

python
import heapq

# Create a min-heap from a list
data = [4, 1, 3, 2, 16, 9, 10, 14, 8, 7]
heapq.heapify(data)          # O(n), in-place
# data is now [1, 2, 3, 4, 7, 9, 10, 14, 8, 16]

# Push a new element
heapq.heappush(data, 5)     # O(log n)

# Pop the minimum
smallest = heapq.heappop(data)  # O(log n), returns 1

# Push and pop in one step (more efficient)
result = heapq.heappushpop(data, 6)  # push 6, then pop min

# Get the k largest elements
top3 = heapq.nlargest(3, data)  # [16, 14, 10]

# Get the k smallest elements
bot3 = heapq.nsmallest(3, data)  # [2, 3, 4]

Max-heap trick with heapq. Python's heapq is min-heap only. To simulate a max-heap, negate all values: push -x and pop -heapq.heappop(heap). For tuples, negate the priority: heapq.heappush(heap, (-priority, item)).

heapq Pitfalls and Best Practices

Pitfall 1: Heapify modifies in place. heapq.heapify(data) transforms data into a heap in place. It does NOT return a new list. The return value is None. Writing heap = heapq.heapify(data) gives you None.

python
# WRONG
heap = heapq.heapify([3, 1, 4])  # heap is None!

# CORRECT
heap = [3, 1, 4]
heapq.heapify(heap)  # heap is now [1, 3, 4]

Pitfall 2: Direct indexing is not sorted. heap[0] is the minimum, but heap[1] is NOT the second smallest. The heap is partially ordered, not fully sorted. To get elements in order, you must pop them one by one.

Pitfall 3: Tuple comparison. When pushing tuples, Python compares element by element. If the first elements are equal, it compares the second. If the second element is not comparable (e.g., a custom object), you get a TypeError. Always include a tiebreaker integer.

python
# WRONG — crashes if priorities are equal and Task is not comparable
heapq.heappush(heap, (priority, task_object))

# CORRECT — counter breaks ties
counter = 0
heapq.heappush(heap, (priority, counter, task_object))
counter += 1

Best practice: heapreplace vs heappushpop. heapq.heapreplace(heap, val) pops first, then pushes — use when you always want to replace the min. heapq.heappushpop(heap, val) pushes first, then pops — use when you want to insert and immediately remove the new min (which might be the value you just pushed). For the top-K pattern, heapreplace is correct because you have already checked that the new value is larger than the current min.

D-ary Heaps

A binary heap has 2 children per node. A d-ary heap has d children per node. For a node at index i (0-based):

Parent(i) = ⌊(i - 1) / d⌋ k-th child(i) = d · i + k + 1 (k = 0, 1, ..., d-1)

The height becomes log_d n instead of log₂ n. This makes bubble-up operations faster (fewer levels to climb), but bubble-down operations slower (must compare with d children instead of 2).

Property	Binary (d=2)	Ternary (d=3)	4-ary (d=4)
Height	log₂ n	log₃ n	log₄ n
Bubble up (insert)	log₂ n comparisons	log₃ n	log₄ n
Bubble down (extract)	2 log₂ n	3 log₃ n	4 log₄ n
Best for	General purpose	Insert-heavy loads	DECREASE-KEY-heavy (Dijkstra)

When d-ary beats binary. If your workload does many more inserts/increase-key operations than extract-max operations, a higher d reduces the cost of bubble-up (fewer levels) at the expense of slower extract (more children to compare). Dijkstra's algorithm calls DECREASE-KEY far more often than EXTRACT-MIN (E decreases vs V extracts), so a d-ary heap with d = E/V can be optimal.

Fibonacci Heaps

A Fibonacci heap is a more complex data structure that achieves amortized O(1) for INSERT and DECREASE-KEY, with O(log n) amortized EXTRACT-MIN. This matters for algorithms like Dijkstra (O(V log V + E) with Fibonacci heap vs O((V+E) log V) with binary heap) and Prim's MST.

The trade-off: Fibonacci heaps have large constant factors and complex implementation. In practice, binary heaps win for inputs under ~10,000 elements, and even for larger inputs, the simpler cache behavior of binary heaps often compensates. Fibonacci heaps are primarily a theoretical tool — they appear in complexity proofs but rarely in production code.

Implementing a D-ary Heap

A d-ary heap generalizes the binary heap. Here is a complete implementation for any d:

python
class DaryMinHeap:
    """D-ary min-heap with 0-based indexing."""

    def __init__(self, d=2):
        self.d = d
        self.data = []

    def _parent(self, i):
        return (i - 1) // self.d

    def _children(self, i):
        """Return indices of all children of node i."""
        start = self.d * i + 1
        return range(start, min(start + self.d, len(self.data)))

    def push(self, val):
        self.data.append(val)
        i = len(self.data) - 1
        while i > 0 and self.data[i] < self.data[self._parent(i)]:
            p = self._parent(i)
            self.data[i], self.data[p] = self.data[p], self.data[i]
            i = p

    def pop(self):
        self.data[0], self.data[-1] = self.data[-1], self.data[0]
        val = self.data.pop()
        if self.data:
            self._sift_down(0)
        return val

    def _sift_down(self, i):
        while True:
            smallest = i
            for c in self._children(i):
                if self.data[c] < self.data[smallest]:
                    smallest = c
            if smallest == i:
                break
            self.data[i], self.data[smallest] = \
                self.data[smallest], self.data[i]
            i = smallest

# Usage: 4-ary heap
h = DaryMinHeap(d=4)
for x in [10, 4, 15, 20, 0, 30, 1]:
    h.push(x)
print([h.pop() for _ in range(7)])
# [0, 1, 4, 10, 15, 20, 30]

The d-ary trade-off in numbers. For d=4 and n=1,000,000: height = log₄(10⁶) = 10 (vs 20 for binary). Bubble-up does 10 comparisons (vs 20). Sift-down does 4 × 10 = 40 comparisons (vs 2 × 20 = 40). For insert-heavy workloads, the 4-ary heap is 2x faster at insertion with identical extraction cost. This is why Go's runtime timer heap uses d=4.

Binomial Heaps (The Middle Ground)

Between binary heaps and Fibonacci heaps sit binomial heaps. They support merge in O(log n) (vs O(n) for binary heaps), making them useful when you frequently merge two priority queues. The structure is a collection of binomial trees, each satisfying the heap property. Java's PriorityQueue does not use this, but it appears in theoretical algorithms and some network protocols.

Pairing Heaps (The Practical Champion)

If you need DECREASE-KEY in practice (not just theory), the pairing heap is the go-to. It is simpler to implement than Fibonacci heaps and has excellent amortized performance in practice. Amortized O(1) insert and merge, O(log n) extract-min, and "probably" O(log n) decrease-key (the exact amortized bound is still an open question in theoretical CS). Many graph algorithm implementations in competitive programming use pairing heaps.

When to Use What

Scenario	Best choice	Why
General priority queue	Binary heap (array)	Simplest, best cache behavior, O(log n) everything
Python code	`heapq` (min-heap)	Built-in, zero dependencies, well-tested
Need max-heap in Python	Negate values with `heapq`	No max-heap in standard library
Need DECREASE-KEY	Binary heap + index map, or lazy deletion	Simple, practical. Fibonacci heap only if theoretically necessary.
Frequent merges	Binomial or pairing heap	O(log n) or O(1) merge vs O(n) for binary
Dijkstra on dense graphs	Fibonacci heap (theory) or binary heap (practice)	Fibonacci heap gives better asymptotic bound; binary heap has better constants
Insert-heavy workload	d-ary heap (d=4 or d=8)	Shallower tree means faster bubble-up

Min-Heap vs Max-Heap Comparison

Insert values into both a min-heap and max-heap simultaneously. Watch how the same values arrange themselves differently under opposite ordering constraints.

Concept check: You need to repeatedly find and remove the SMALLEST element from a dynamic collection in Python. Which approach is correct?

Use heapq directly — it is already a min-heap. heapq.heappop() returns the smallest element. Negate all values and use heapq as a max-heap Sort the list and pop from the front each time

Chapter 7: Heap Applications

Now we cash in on everything we have built. Heaps power four classic algorithmic patterns that appear constantly in interviews and production systems.

Pattern 1: Top-K Elements

Problem: Given a stream of n numbers, maintain the K largest at all times.

Key insight: Use a min-heap of size K (not a max-heap!). The min-heap's root is the smallest of the K largest elements — the "gatekeeper." If a new element is larger than the gatekeeper, it kicks the gatekeeper out and joins the club.

For each new element x:

If heap has fewer than K elements: just push x. If x > heap[0] (the min): pop the min, push x. Otherwise: discard x.

python
import heapq

def top_k(stream, k):
    """Maintain the k largest elements from a stream."""
    heap = []
    for x in stream:
        if len(heap) < k:
            heapq.heappush(heap, x)
        elif x > heap[0]:
            heapq.heapreplace(heap, x)  # pop min, push x
    return sorted(heap, reverse=True)

Complexity: O(n log K) time, O(K) space. When K is much smaller than n, this is far better than sorting (O(n log n)).

Worked Example: Top-3 from Stream

Stream: [10, 4, 20, 5, 100, 3, 7, 50]. K = 3.

10: Heap has room. Push 10. Heap: [10]. Size: 1.

4: Heap has room. Push 4. Heap: [4, 10]. Size: 2.

20: Heap has room. Push 20. Heap: [4, 10, 20]. Size: 3. Gatekeeper: 4.

5: 5 > 4 (gatekeeper). Replace 4 with 5. Heap: [5, 10, 20]. Gatekeeper: 5.

100: 100 > 5. Replace 5 with 100. Heap: [10, 100, 20]. Gatekeeper: 10.

3: 3 < 10 (gatekeeper). Reject. Heap unchanged.

7: 7 < 10. Reject.

50: 50 > 10. Replace 10 with 50. Heap: [20, 100, 50]. Gatekeeper: 20.

Result: Top-3 = {100, 50, 20}. Correct! We processed 8 elements but never stored more than 3. For a stream of 1 billion elements with K = 100, we use 100 integers of memory — not 1 billion.

Why min-heap for TOP-K? It seems backwards, but the logic is clean: we want to quickly identify and discard the smallest element among our K candidates. A min-heap gives us O(1) access to the smallest, and O(log K) to replace it. If we used a max-heap, we could not efficiently check whether a new element belongs in the top K.

Pattern 2: Median Maintenance (Two-Heap Trick)

Problem: Given a stream of numbers, find the median at any point.

Key insight: Maintain two heaps. A max-heap holds the smaller half of the numbers. A min-heap holds the larger half. The median is at the top of one or both heaps.

python
import heapq

class MedianFinder:
    def __init__(self):
        self.lo = []  # max-heap (negate values) — smaller half
        self.hi = []  # min-heap — larger half

    def add(self, num):
        heapq.heappush(self.lo, -num)          # add to max-heap
        heapq.heappush(self.hi, -heapq.heappop(self.lo))  # balance
        if len(self.hi) > len(self.lo):
            heapq.heappush(self.lo, -heapq.heappop(self.hi))

    def median(self):
        if len(self.lo) > len(self.hi):
            return -self.lo[0]
        return (-self.lo[0] + self.hi[0]) / 2

Complexity: O(log n) per insertion, O(1) to query the median.

How the Two-Heap Trick Works (Detailed)

This is worth understanding deeply because it appears in interviews constantly. Imagine the numbers sorted on a number line. The median splits them into two halves: the left half (smaller numbers) and the right half (larger numbers).

The max-heap (called lo) stores the left half. Its root is the largest of the small numbers — the left candidate for the median. The min-heap (called hi) stores the right half. Its root is the smallest of the large numbers — the right candidate for the median.

Invariants we maintain:

1. Every element in lo ≤ every element in hi.

2. The heaps differ in size by at most 1. lo is allowed to have one extra element.

Insert algorithm: Push the new number into lo (the max-heap). Then move the max of lo to hi (this ensures invariant 1). If hi is now larger than lo, move the min of hi back to lo (invariant 2).

Query: If sizes are equal, median = (top of lo + top of hi) / 2. If lo is bigger, median = top of lo.

Median Maintenance Simulator

Watch the two-heap trick maintain the running median. The left max-heap (orange) holds smaller values. The right min-heap (teal) holds larger values. The median is always at the boundary.

Median: --

Pattern 3: Merge K Sorted Lists

Problem: Given K sorted lists, merge them into one sorted list.

Key insight: Use a min-heap of size K. Initially push the first element from each list. Repeatedly pop the minimum and push the next element from that list.

python
import heapq

def merge_k_sorted(lists):
    """Merge K sorted lists into one sorted list.
    Time: O(N log K) where N is total elements."""
    heap = []
    for i, lst in enumerate(lists):
        if lst:
            heapq.heappush(heap, (lst[0], i, 0))  # (value, list_idx, elem_idx)

    result = []
    while heap:
        val, li, ei = heapq.heappop(heap)
        result.append(val)
        if ei + 1 < len(lists[li]):
            heapq.heappush(heap, (lists[li][ei + 1], li, ei + 1))
    return result

Worked Example: Merge 3 Sorted Lists

Lists: L1 = [1, 4, 7], L2 = [2, 5, 8], L3 = [3, 6, 9].

Init: Push (1, L1), (2, L2), (3, L3) into min-heap. Heap: [(1,L1), (2,L2), (3,L3)].

Pop (1, L1): Output: [1]. Push next from L1: (4, L1). Heap: [(2,L2), (4,L1), (3,L3)].

Pop (2, L2): Output: [1, 2]. Push (5, L2). Heap: [(3,L3), (4,L1), (5,L2)].

Pop (3, L3): Output: [1, 2, 3]. Push (6, L3). Heap: [(4,L1), (6,L3), (5,L2)].

Pop (4, L1): Output: [1, 2, 3, 4]. Push (7, L1). Heap: [(5,L2), (6,L3), (7,L1)].

...and so on. The heap always has at most K = 3 elements, so each pop/push pair costs O(log 3). Total elements: N = 9. Total cost: O(N log K) = O(9 · 2) = 18 operations.

Why not just concatenate and sort? Concatenation + sort costs O(N log N) = O(9 · 4) = 36. For K = 3 the difference is small. But for K = 1000 sorted lists with N = 10,000,000 total elements, K-way merge costs O(N log K) = O(10⁷ · 10) = 10⁸, while sorting costs O(N log N) = O(10⁷ · 23) = 2.3 × 10⁸. The heap approach is 2.3x faster AND uses O(K) space instead of O(N).

Pattern 4: Huffman Coding

Huffman's algorithm builds an optimal prefix-free binary code. It repeatedly extracts the two symbols with the lowest frequency and merges them into a new node. A min-heap makes the two extractions O(log n) each.

python
import heapq

def huffman(freq):
    """Build Huffman tree. freq = {'a': 5, 'b': 9, ...}"""
    heap = [(f, c) for c, f in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        lo = heapq.heappop(heap)   # smallest frequency
        hi = heapq.heappop(heap)   # second smallest
        merged = (lo[0] + hi[0], (lo, hi))
        heapq.heappush(heap, merged)
    return heap[0]

Worked Example: Huffman with a Heap

Characters and frequencies: a:5, b:9, c:12, d:13, e:16, f:45.

Init min-heap: [(5,'a'), (9,'b'), (12,'c'), (13,'d'), (16,'e'), (45,'f')]

Round 1: Pop 5 and 9. Merge into (14, {a,b}). Push back. Heap: [(12,'c'), (13,'d'), (14,{a,b}), (16,'e'), (45,'f')].

Round 2: Pop 12 and 13. Merge into (25, {c,d}). Heap: [(14,{a,b}), (16,'e'), (25,{c,d}), (45,'f')].

Round 3: Pop 14 and 16. Merge into (30, {a,b,e}). Heap: [(25,{c,d}), (30,{a,b,e}), (45,'f')].

Round 4: Pop 25 and 30. Merge into (55, {a,b,c,d,e}). Heap: [(45,'f'), (55,{...})].

Round 5: Pop 45 and 55. Merge into (100, root). Done.

The resulting Huffman codes: f=0, c=100, d=101, a=1100, b=1101, e=111. The most frequent character (f, freq 45) gets the shortest code (1 bit). The least frequent (a, freq 5) gets the longest (4 bits). Total bits = 5×4 + 9×4 + 12×3 + 13×3 + 16×3 + 45×1 = 224 bits, versus 300 bits for a fixed-length 3-bit encoding. Huffman saves 25%.

Top-K Stream Simulator

A stream of random numbers arrives one at a time. The min-heap maintains the top K. Numbers that make it into the top K are highlighted in teal; those rejected are dimmed. Watch the "gatekeeper" (heap root, the smallest of the top K) rise over time.

K = 5

Waiting for stream...

Concept check: You need to find the median of a stream of 1 million numbers. You use the two-heap approach. How much memory do you need?

O(1) — the heaps only store the two middle elements O(n) — both heaps together store all n elements, roughly n/2 each O(log n) — only the paths from root to leaf matter

Chapter 8: Interview Arsenal

This chapter distills everything into patterns, drills, and cheat sheets you can use in coding interviews. Every problem below has appeared in real interviews at FAANG companies.

Cheat Sheet: When to Use a Heap

Signal in the problem	Pattern	Heap type
"K largest" / "K most frequent"	Top-K	Min-heap of size K
"K smallest" / "K closest"	Top-K (reversed)	Max-heap of size K
"Median" / "middle element"	Two heaps	Max-heap + min-heap
"Merge sorted" / "K sorted lists"	K-way merge	Min-heap of size K
"Schedule" / "next event" / "priority"	Priority queue	Min or max depending on context
"Continuously add and query extreme"	Streaming	Depends on which extreme

Drill 1: Implement a Heap from Scratch

This is the most common heap interview question. You must implement push, pop, and heapify without using any library.

python
class MinHeap:
    def __init__(self):
        self.data = []

    def push(self, val):
        self.data.append(val)
        self._bubble_up(len(self.data) - 1)

    def pop(self):
        if not self.data: raise IndexError("empty")
        self.data[0], self.data[-1] = self.data[-1], self.data[0]
        val = self.data.pop()
        if self.data:
            self._sift_down(0)
        return val

    def peek(self):
        return self.data[0] if self.data else None

    def _bubble_up(self, i):
        while i > 0:
            parent = (i - 1) // 2
            if self.data[i] < self.data[parent]:
                self.data[i], self.data[parent] = self.data[parent], self.data[i]
                i = parent
            else: break

    def _sift_down(self, i):
        n = len(self.data)
        while True:
            smallest = i
            l = 2 * i + 1
            r = 2 * i + 2
            if l < n and self.data[l] < self.data[smallest]: smallest = l
            if r < n and self.data[r] < self.data[smallest]: smallest = r
            if smallest == i: break
            self.data[i], self.data[smallest] = self.data[smallest], self.data[i]
            i = smallest

Complete MaxHeap Class (Interview-Ready)

Here is the full, production-quality max-heap implementation with all operations. Memorize the structure — in an interview, write _bubble_up and _sift_down first, then build everything else on top.

python
class MaxHeap:
    """Complete max-heap with 0-based indexing.
    All operations: push O(log n), pop O(log n), peek O(1),
    heapify O(n), increase_key O(log n), delete O(log n)."""

    def __init__(self, items=None):
        self.data = list(items) if items else []
        if self.data:
            self._heapify()

    def _heapify(self):
        """Build heap from unordered data. O(n)."""
        n = len(self.data)
        for i in range(n // 2 - 1, -1, -1):
            self._sift_down(i)

    def push(self, val):
        """Insert val. O(log n)."""
        self.data.append(val)
        self._bubble_up(len(self.data) - 1)

    def pop(self):
        """Remove and return max. O(log n)."""
        if not self.data:
            raise IndexError("pop from empty heap")
        self.data[0], self.data[-1] = self.data[-1], self.data[0]
        val = self.data.pop()
        if self.data:
            self._sift_down(0)
        return val

    def peek(self):
        """Return max without removing. O(1)."""
        if not self.data:
            raise IndexError("peek at empty heap")
        return self.data[0]

    def increase_key(self, i, new_val):
        """Increase value at index i. O(log n)."""
        if new_val < self.data[i]:
            raise ValueError("new value is smaller")
        self.data[i] = new_val
        self._bubble_up(i)

    def delete(self, i):
        """Delete element at index i. O(log n)."""
        self.data[i] = self.data[-1]
        self.data.pop()
        if i < len(self.data):
            self._bubble_up(i)
            self._sift_down(i)

    def _bubble_up(self, i):
        while i > 0:
            parent = (i - 1) // 2
            if self.data[i] > self.data[parent]:
                self.data[i], self.data[parent] = \
                    self.data[parent], self.data[i]
                i = parent
            else:
                break

    def _sift_down(self, i):
        n = len(self.data)
        while True:
            largest = i
            l = 2 * i + 1
            r = 2 * i + 2
            if l < n and self.data[l] > self.data[largest]:
                largest = l
            if r < n and self.data[r] > self.data[largest]:
                largest = r
            if largest == i:
                break
            self.data[i], self.data[largest] = \
                self.data[largest], self.data[i]
            i = largest

    def __len__(self):
        return len(self.data)

    def __bool__(self):
        return bool(self.data)

    def __repr__(self):
        return f'MaxHeap({self.data})'

# Usage:
h = MaxHeap([4, 1, 3, 2, 16, 9, 10, 14, 8, 7])
print(h.peek())      # 16
print(h.pop())       # 16
print(h.peek())      # 14
h.push(20)
print(h.peek())      # 20
h.increase_key(3, 25)
print(h.peek())      # 25

Drill 2: K Largest Elements in an Array (LeetCode 215)

python
import heapq

def find_kth_largest(nums, k):
    """Return the kth largest element.
    Uses a min-heap of size k."""
    heap = []
    for num in nums:
        if len(heap) < k:
            heapq.heappush(heap, num)
        elif num > heap[0]:
            heapq.heapreplace(heap, num)
    return heap[0]  # the kth largest is the smallest in our top-k

Drill 3: Merge K Sorted Lists (LeetCode 23)

python
import heapq

def merge_k_lists(lists):
    """Merge k sorted linked lists.
    Time: O(N log K), Space: O(K)"""
    heap = []
    for i, node in enumerate(lists):
        if node:
            heapq.heappush(heap, (node.val, i, node))

    dummy = curr = ListNode(0)
    while heap:
        val, i, node = heapq.heappop(heap)
        curr.next = node
        curr = curr.next
        if node.next:
            heapq.heappush(heap, (node.next.val, i, node.next))
    return dummy.next

Drill 4: Find Median from Data Stream (LeetCode 295)

python
import heapq

class MedianFinder:
    def __init__(self):
        self.small = []  # max-heap (negated)
        self.large = []  # min-heap

    def addNum(self, num):
        heapq.heappush(self.small, -num)
        # ensure every element in small ≤ every element in large
        heapq.heappush(self.large, -heapq.heappop(self.small))
        # balance sizes: small can have at most 1 more than large
        if len(self.large) > len(self.small):
            heapq.heappush(self.small, -heapq.heappop(self.large))

    def findMedian(self):
        if len(self.small) > len(self.large):
            return -self.small[0]
        return (-self.small[0] + self.large[0]) / 2

Drill 5: Sort a Nearly Sorted Array (K-Sorted)

A k-sorted array is one where each element is at most k positions from its correct sorted position. This is common in streaming systems where data arrives roughly in order.

python
import heapq

def sort_k_sorted(arr, k):
    """Sort an array where each element is at most k positions
    from its sorted position. Time: O(n log k). Space: O(k)."""
    heap = arr[:k + 1]
    heapq.heapify(heap)  # O(k)

    result = []
    for i in range(k + 1, len(arr)):
        result.append(heapq.heapreplace(heap, arr[i]))
    # Drain remaining elements
    while heap:
        result.append(heapq.heappop(heap))
    return result

# Example: k=3 sorted array
arr = [3, 2, 1, 5, 4, 7, 6, 8]
print(sort_k_sorted(arr, 3))  # [1, 2, 3, 4, 5, 6, 7, 8]

Why O(n log k) and not O(n log n)? The heap never exceeds size k+1. Each of the n elements goes through the heap once (push + pop). Each push/pop is O(log k). Total: O(n log k). When k is a constant (like k=10), this is O(n) — linear time sorting! This is why k-sorted arrays are a common interview topic.

Drill 6: Task Scheduler (LeetCode 621)

python
import heapq
from collections import Counter

def least_interval(tasks, n):
    """Minimum intervals to execute all tasks with cooldown n."""
    counts = Counter(tasks)
    heap = [-c for c in counts.values()]
    heapq.heapify(heap)

    time = 0
    while heap:
        cycle = []
        for _ in range(n + 1):  # one full cooldown window
            if heap:
                cycle.append(heapq.heappop(heap))
        for cnt in cycle:
            if cnt + 1 < 0:  # still has remaining tasks
                heapq.heappush(heap, cnt + 1)
        time += n + 1 if heap else len(cycle)
    return time

Complexity Quick Reference

Operation	Binary Heap	Fibonacci Heap	Sorted Array
Find max/min	O(1)	O(1)	O(1)
Insert	O(log n)	O(1) amortized	O(n)
Extract max/min	O(log n)	O(log n) amortized	O(1)
Increase/Decrease key	O(log n)	O(1) amortized	O(n)
Build from array	O(n)	O(n)	O(n log n)
Merge two heaps	O(n)	O(1)	O(n)

Interview meta-strategy. When you see a heap problem, say the pattern name out loud: "This is a top-K problem, so I'll use a min-heap of size K." Or "This requires maintaining a running median, so I'll use the two-heap trick." Naming the pattern signals to the interviewer that you have a systematic framework, not just memorized solutions.

Drill 6: Sort a Nearly Sorted Array Revisited

Let us trace the k-sorted algorithm on [6, 5, 3, 2, 8, 10, 9] with k=3.

The guarantee: each element is at most 3 positions from where it belongs in the sorted output [2, 3, 5, 6, 8, 9, 10].

Init: Push first k+1 = 4 elements: heap = [2, 5, 6, 3] (heapified, but internally ordered as min-heap: [2, 3, 6, 5]).

i=4 (val=8): Pop min (2), push 8. Output: [2]. Heap: [3, 5, 6, 8].

i=5 (val=10): Pop min (3), push 10. Output: [2, 3]. Heap: [5, 8, 6, 10].

i=6 (val=9): Pop min (5), push 9. Output: [2, 3, 5]. Heap: [6, 8, 9, 10].

Drain: Pop 6, 8, 9, 10. Output: [2, 3, 5, 6, 8, 9, 10]. Sorted!

The key insight: since each element is at most k positions away, the correct output element is always within the next k+1 candidates. A min-heap of size k+1 always contains the next correct element.

Drill 8: K Closest Points to Origin (LeetCode 973)

python
import heapq

def k_closest(points, k):
    """Return k closest points to origin.
    Use a MAX-heap of size k (negate distances)."""
    heap = []
    for x, y in points:
        dist = x*x + y*y  # no need for sqrt
        if len(heap) < k:
            heapq.heappush(heap, (-dist, [x, y]))
        elif -dist > heap[0][0]:
            heapq.heapreplace(heap, (-dist, [x, y]))
    return [p for _, p in heap]

Drill 9: Reorganize String (LeetCode 767)

python
import heapq
from collections import Counter

def reorganize_string(s):
    """Rearrange so no two adjacent chars are same."""
    counts = Counter(s)
    heap = [(-cnt, ch) for ch, cnt in counts.items()]
    heapq.heapify(heap)

    result = []
    prev_cnt, prev_ch = 0, ''
    while heap:
        cnt, ch = heapq.heappop(heap)
        result.append(ch)
        if prev_cnt < 0:  # push back previous char
            heapq.heappush(heap, (prev_cnt, prev_ch))
        prev_cnt, prev_ch = cnt + 1, ch  # decrement (negative)

    return ''.join(result) if len(result) == len(s) else ''

Common Mistakes in Heap Interviews

Mistake	Why it happens	Fix
Using max-heap for top-K smallest	Confusing "I want the K smallest" with "my heap stores the smallest"	For K smallest: use a MAX-heap of size K. The root is the gatekeeper — the largest of the K smallest.
Forgetting tuple comparison	Python compares tuples element-by-element. If first elements tie, it compares second elements — which may not be comparable.	Add a tiebreaker: `(priority, counter, item)` where counter is a monotonically increasing int.
Calling heapify on each push	Misunderstanding heapify vs heappush	`heapify` = convert list to heap, O(n). `heappush` = add one element, O(log n). Never call heapify repeatedly.
Modifying heap elements directly	Changing a value in the array without re-heapifying	After changing A[i], you must bubble-up or sift-down. Or use the "lazy deletion" pattern.

The Five Interview Dimensions for Heaps

Dimension	Example question	What a strong answer includes
Concept	"Prove BUILD-MAX-HEAP is O(n)"	The sum ∑h/2^h = 2, with the intuition that most nodes are near the bottom
Design	"Design a system to find the top 10 trending topics from a stream of tweets"	Min-heap of size 10, hash map for counts, sliding window for recency
Code	"Implement a heap from scratch, then solve K largest elements"	Clean _bubble_up/_sift_down, correct 0-indexed bounds, heapify in O(n)
Debug	"My priority queue returns wrong elements after INCREASE-KEY. Why?"	Check: are you bubbling up after increase? Is the comparison direction correct?
Frontier	"When would you use a Fibonacci heap vs binary heap?"	Fibonacci for decrease-key-heavy algorithms (Dijkstra on dense graphs). Binary heap for everything else due to simpler code and better cache behavior.

Interview question: You have a stream of 10 million integers and need to find the 100 largest. What is the time and space complexity of the optimal approach?

Time O(n log K) = O(n log 100) = O(n · 7), Space O(K) = O(100). Use a min-heap of size 100. Each of the 10 million elements is compared against the heap root, and only pushed/replaced if larger. The heap never exceeds 100 elements. Time O(n log n), Space O(n). Sort the entire array and take the last 100 elements. Time O(n), Space O(n). Use quickselect to find the 100th largest.

Chapter 9: Connections

Heaps do not exist in isolation. They are a foundational data structure that powers algorithms throughout computer science. Here is how Chapter 6 connects to the rest of CLRS and beyond.

Forward Connections

Topic	How heaps appear
Quicksort (Ch 7)	Heapsort's O(n log n) worst case is the benchmark quicksort fails to match. Introsort (used in C++ std::sort) falls back to heapsort when quicksort's recursion depth exceeds 2 log n.
Counting/Radix Sort (Ch 8)	Linear-time sorts only work for integers with bounded range. Heapsort handles arbitrary comparable elements. Knowing when to use which is a key interview skill.
Red-Black Trees (Ch 13)	Both support O(log n) insert and extract. BSTs additionally support O(log n) search, predecessor/successor, and ordered iteration — but with higher constant factors and pointer overhead.
Dijkstra's Algorithm (Ch 22/24)	The canonical use of a min-priority queue. Binary heap gives O((V+E) log V). Fibonacci heap gives O(V log V + E). The choice of heap determines Dijkstra's overall complexity.
Prim's MST (Ch 23)	Like Dijkstra, Prim's algorithm uses a min-priority queue to always grow the MST by adding the lightest crossing edge.
Huffman Codes (Ch 16)	Greedy algorithm that repeatedly extracts two minimum-frequency nodes. Min-heap is the natural implementation.

What Heaps Cannot Do

Operation	Heap	Better alternative
Search for arbitrary element	O(n) — must scan everything	BST: O(log n), Hash table: O(1)
Find k-th element	O(k log n) — extract k times	Order-statistic tree: O(log n)
Sorted iteration	O(n log n) — must extract all	BST in-order walk: O(n)
Merge two heaps	O(n) for binary heap	Fibonacci/Binomial heap: O(1)/O(log n)

The hierarchy of ordered data structures. Unsorted array < Heap < BST < Balanced BST. Each step adds more structure and supports more operations, but at the cost of higher constant factors. Heaps sit at the sweet spot for priority-based access: they do less than a BST but they do it faster (smaller constants, better cache behavior). Choose the simplest structure that supports the operations you need.

Heaps in Real Systems

System	Heap variant used	Why
Linux kernel (timers)	Red-black tree (rb-tree)	Timer wheel for most timers, rb-tree for high-resolution. Heaps were considered but rb-trees allow deletion of arbitrary nodes more naturally.
Go runtime	4-ary min-heap	Timer heap uses d=4 for better cache behavior on modern CPUs (fewer cache lines per sift-down).
Python heapq	Binary min-heap (array)	Simplest possible implementation. 0-indexed. No decrease-key (use lazy deletion instead).
Java PriorityQueue	Binary min-heap (array)	Grows dynamically via array doubling. Natural ordering or custom Comparator.
C++ priority_queue	Binary max-heap over vector	Uses `std::make_heap`, `push_heap`, `pop_heap` internally.
Dijkstra (textbook)	Binary min-heap or Fibonacci heap	Binary heap: O((V+E) log V). Fibonacci heap: O(V log V + E). In practice, binary heap wins for sparse graphs.
*A search**	Binary min-heap (open set)	The open set is a priority queue ordered by f(n) = g(n) + h(n).
Huffman coding	Binary min-heap	Repeatedly extract two minimum-frequency nodes. Classic greedy algorithm.

The Lazy Deletion Pattern

A common practical issue: Python's heapq does not support DECREASE-KEY or DELETE. The standard workaround is lazy deletion: instead of removing an element, mark it as deleted. When it reaches the top during extract-min, skip it.

python
import heapq

class LazyPQ:
    def __init__(self):
        self.heap = []
        self.deleted = set()
        self.counter = 0

    def push(self, priority, item):
        entry = (priority, self.counter, item)
        self.counter += 1
        heapq.heappush(self.heap, entry)
        return entry  # caller keeps this for deletion

    def pop(self):
        while self.heap:
            pri, cnt, item = heapq.heappop(self.heap)
            if cnt not in self.deleted:
                return item
            self.deleted.discard(cnt)
        raise IndexError("empty")

    def remove(self, entry):
        self.deleted.add(entry[1])  # mark counter as deleted

Key Takeaways from Chapter 6

1. The heap is an array pretending to be a tree. Parent(i)=i/2, Left(i)=2i, Right(i)=2i+1. No pointers, no overhead, just arithmetic.

2. Two primitives, everything else. Bubble-down (MAX-HEAPIFY) and bubble-up (INCREASE-KEY) are the only two operations. Build-heap, heapsort, insert, extract — all are just these two primitives composed differently.

3. BUILD-MAX-HEAP is O(n), not O(n log n). Most nodes are near the bottom and barely move. This is one of the most commonly asked proof questions in algorithms interviews.

4. Heapsort: O(n log n) worst case, O(1) space. The only comparison sort with both guarantees. But loses to quicksort in practice due to cache behavior.

5. Priority queues are the real star. Heaps exist in production code as priority queues, not as sorting algorithms. Python's heapq, Java's PriorityQueue, C++'s priority_queue — all backed by binary heaps.

The one sentence summary. A heap is a nearly-sorted array that trades full ordering for the ability to find and maintain the extreme element in O(log n) time, and it is the engine behind priority queues, graph algorithms, and streaming computations.

Final question: You are designing a system that must support: (1) insert elements with priorities, (2) extract the highest-priority element, and (3) search for a specific element by name. Which data structure(s) should you use?

A binary heap alone — it handles all three operations A binary heap for insert/extract-max (O(log n)) combined with a hash map for name-to-index lookup (O(1)). The heap alone cannot search by name efficiently — it would require O(n) scan. The hash map stores {name: heap_index} and is updated on every swap. A balanced BST — it supports all three in O(log n)

A Complete CLRS Chapter 6 Study Plan

If you are studying for an algorithms exam or a technical interview, here is a structured plan for mastering this chapter:

Day	Focus	Exercises
Day 1	Heap structure, array representation	Draw a 10-element heap. Convert between tree and array. Verify parent/child formulas by hand.
Day 2	MAX-HEAPIFY	Trace MAX-HEAPIFY by hand on 3 different examples. Implement recursively AND iteratively.
Day 3	BUILD-MAX-HEAP, O(n) proof	Trace BUILD-MAX-HEAP on a random 10-element array. Prove the O(n) bound from memory.
Day 4	Heapsort	Implement heapsort from scratch (no libraries). Count comparisons for n=100.
Day 5	Priority queues	Implement INSERT, EXTRACT-MAX, INCREASE-KEY. Solve LeetCode 215, 295.
Day 6	Applications	Solve LeetCode 23 (merge K lists), 347 (top K frequent), 621 (task scheduler).
Day 7	Review + mock interview	Implement a complete heap from scratch in 15 minutes. Explain BUILD-MAX-HEAP O(n) in 2 minutes.

CLRS Exercises Worth Doing

Not all exercises are equally valuable. Here are the ones that deepen understanding:

6.1-1: What are the minimum and maximum number of elements in a heap of height h? Answer: 2^h (just the last level has one node) to 2^h+1-1 (all levels full).

6.1-7: Show that the leaves of an n-element heap are at indices ⌊n/2⌋+1, ⌊n/2⌋+2, ..., n. Proof: Left(i) = 2i. For a leaf, 2i > n, so i > n/2. The first leaf is at ⌊n/2⌋+1.

6.2-5: Write iterative MAX-HEAPIFY. We covered this in Chapter 2.

6.4-3: What is the running time of heapsort on an already-sorted array? Still O(n log n). BUILD-MAX-HEAP is O(n). Then each extraction does O(log n) work because the smallest element (swapped to root) always sinks to the bottom. Heapsort does not benefit from presortedness.

6.5-9: Give an O(n log k) algorithm to merge k sorted lists into one sorted list, where n is the total number of elements. This is the K-way merge pattern from Chapter 7.

Heap Complexity Summary

One final reference table to burn into memory. Every time complexity below assumes a binary heap with n elements.

Operation	Best case	Worst case	Amortized	Triggers
Find max	O(1)	O(1)	O(1)	Return A[0]
Insert	O(1)*	O(log n)	O(log n)	Append + bubble-up
Extract max	O(log n)	O(log n)	O(log n)	Swap root/last + sift-down
Increase key	O(1)*	O(log n)	O(log n)	Update + bubble-up
Delete at i	O(log n)	O(log n)	O(log n)	Replace + bubble/sift
Build heap	O(n)	O(n)	O(n)	Bottom-up heapify
Heapsort	O(n log n)	O(n log n)	—	Build + n extractions
Merge heaps	O(n)	O(n)	O(n)	Concatenate + build

* Insert best case is O(1) when the new element is smaller than its parent (no bubble-up needed). Increase-key best case is O(1) when the new value is still less than the parent.

What to Study Next

If you have mastered this chapter, the natural next steps are:

1. Quicksort (CLRS Ch 7). The other O(n log n) comparison sort. Understand the randomized pivot analysis, the O(n²) worst case, and why introsort combines both.

2. Linear-time sorting (CLRS Ch 8). Counting sort, radix sort, bucket sort. These break the O(n log n) comparison-sort lower bound by exploiting the structure of the keys (integers, fixed-length strings).

3. Red-Black Trees (CLRS Ch 13). The balanced BST that supports all heap operations PLUS search, predecessor/successor, and ordered iteration. When you need more than just min/max access.

4. Graph algorithms (CLRS Ch 22-24). Dijkstra's shortest path and Prim's MST both use priority queues as their core data structure. Implementing these algorithms solidifies your understanding of heaps in context.

"The heap is a wolf in sheep's clothing — it looks like a simple array, but it behaves like a tree." — paraphrasing Williams, 1964

Heapsort & Priority Queues

Chapter 0: The Problem

The Three Contenders

Hand Calculation: Why Linear Search Hurts

Why Not Just Use a Balanced BST?

The Road Ahead

Chapter 1: Binary Heap Structure

Complete Binary Tree

The Array Trick

Hand Calculation: Navigating the Heap

The Max-Heap Property

Heap Height

Why the Shape Constraint Matters

Verifying the Max-Heap Property

How Many Valid Max-Heaps Exist?

Chapter 2: MAX-HEAPIFY

The Algorithm

Worked Example

Runtime Analysis

The Iterative Version

Correctness Argument

What Can Go Wrong

Tracing MAX-HEAPIFY on a Larger Example

Chapter 3: BUILD-MAX-HEAP

Why Bottom-Up?

Worked Example

The O(n) Proof — This Is NOT O(n log n)

Deriving the Sum ∑ h/2h

Empirical Verification

Alternative: Top-Down Build (Repeated Insertion)

Visualizing the O(n) vs O(n log n) Difference

Chapter 4: Heapsort

The Algorithm

Worked Example

Runtime

Hand Calculation: Counting Swaps

Stability: Why Heapsort Is Not Stable

Introsort: The Best of Both Worlds

Heapsort vs Quicksort: The Full Picture

Chapter 5: Priority Queues

The Four Operations

EXTRACT-MAX: Removing the Top

INSERT and INCREASE-KEY: Bubbling Up

Worked Example: INSERT

Worked Example: EXTRACT-MAX

DELETE Arbitrary Element

Where Priority Queues Appear

Building a CPU Scheduler with a Priority Queue

Chapter 6: Min-Heaps & Variations

Python's heapq Module

heapq Pitfalls and Best Practices

D-ary Heaps

Fibonacci Heaps

Implementing a D-ary Heap

Binomial Heaps (The Middle Ground)

Pairing Heaps (The Practical Champion)

When to Use What

Chapter 7: Heap Applications

Pattern 1: Top-K Elements

Worked Example: Top-3 from Stream

Pattern 2: Median Maintenance (Two-Heap Trick)

How the Two-Heap Trick Works (Detailed)

Pattern 3: Merge K Sorted Lists

Worked Example: Merge 3 Sorted Lists

Pattern 4: Huffman Coding

Worked Example: Huffman with a Heap

Chapter 8: Interview Arsenal

Cheat Sheet: When to Use a Heap

Drill 1: Implement a Heap from Scratch

Complete MaxHeap Class (Interview-Ready)

Drill 2: K Largest Elements in an Array (LeetCode 215)

Drill 3: Merge K Sorted Lists (LeetCode 23)

Drill 4: Find Median from Data Stream (LeetCode 295)

Drill 5: Sort a Nearly Sorted Array (K-Sorted)

Drill 6: Task Scheduler (LeetCode 621)

Complexity Quick Reference

Drill 6: Sort a Nearly Sorted Array Revisited

Drill 8: K Closest Points to Origin (LeetCode 973)

Drill 9: Reorganize String (LeetCode 767)

Common Mistakes in Heap Interviews

Deriving the Sum ∑ h/2^h