DSA Tutorial
🔍

Heap

What Is a Heap?

A heap is a complete binary tree satisfying the heap property — a parent's value always compares in a fixed way to its children's values.

MIN-HEAP:  every parent ≤ both children  →  root = global minimum
MAX-HEAP:  every parent ≥ both children  →  root = global maximum

MIN-HEAP:                    MAX-HEAP:
          1                           9
        /   \                       /   \
       3     2                     7     8
      / \   / \                   / \   / \
     6   5 4   8                 2   4 5   1

Root is ALWAYS the minimum.   Root is ALWAYS the maximum.
ANY parent ≤ its children.    ANY parent ≥ its children.

NOT a heap (parent 2 > child 1 on right):
          2
         / \
        3   1  ← 1 < 2 violates min-heap property only at this pair
                  but wait: 2 > 1 means parent > child → violates min-heap ✗

Array Representation

A heap is stored as a flat array — the complete binary tree structure maps perfectly to contiguous indices with no gaps.

MIN-HEAP:          1
                 /   \
                3     2
               / \   / \
              6   5 4   8

Array: [1, 3, 2, 6, 5, 4, 8]
        0  1  2  3  4  5  6

INDEX ARITHMETIC (0-based):
  Parent of index i:       (i - 1) / 2   (integer division)
  Left child of index i:   2 * i + 1
  Right child of index i:  2 * i + 2

VERIFICATION:
  Node at index 1 (value 3):
    Parent:      (1-1)/2 = 0  → arr[0] = 1  ✓  (1 ≤ 3)
    Left child:  2*1+1 = 3    → arr[3] = 6  ✓  (3 ≤ 6)
    Right child: 2*1+2 = 4    → arr[4] = 5  ✓  (3 ≤ 5)

  Node at index 2 (value 2):
    Parent:      (2-1)/2 = 0  → arr[0] = 1  ✓  (1 ≤ 2)
    Left child:  2*2+1 = 5    → arr[5] = 4  ✓  (2 ≤ 4)
    Right child: 2*2+2 = 6    → arr[6] = 8  ✓  (2 ≤ 8)

WHY ARRAY? No pointers needed. O(1) access to parent and children.
Arrays are contiguous in memory → better cache performance than linked nodes.

Complete Min-Heap Implementation

Java
1import java.util.ArrayList; 2import java.util.NoSuchElementException; 3 4public class MinHeap { 5 private ArrayList<Integer> data; 6 7 public MinHeap() { data = new ArrayList<>(); } 8 9 private int parent(int i) { return (i - 1) / 2; } 10 private int leftChild(int i) { return 2 * i + 1; } 11 private int rightChild(int i) { return 2 * i + 2; } 12 13 private void swap(int i, int j) { 14 int tmp = data.get(i); 15 data.set(i, data.get(j)); 16 data.set(j, tmp); 17 } 18 19 // INSERT — add at end, sift up — O(log n) 20 public void insert(int value) { 21 data.add(value); // Add at end (maintain complete tree) 22 siftUp(data.size() - 1); // Restore heap property upward 23 } 24 25 private void siftUp(int i) { 26 while (i > 0 && data.get(parent(i)) > data.get(i)) { 27 swap(i, parent(i)); 28 i = parent(i); 29 } 30 } 31 32 // PEEK minimum — O(1) 33 public int peekMin() { 34 if (data.isEmpty()) throw new NoSuchElementException("Heap is empty"); 35 return data.get(0); // Root is always the minimum 36 } 37 38 // EXTRACT minimum — remove root, replace with last, sift down — O(log n) 39 public int extractMin() { 40 if (data.isEmpty()) throw new NoSuchElementException("Heap is empty"); 41 42 int min = data.get(0); // Save the minimum (root) 43 int last = data.size() - 1; 44 45 data.set(0, data.get(last)); // Move last element to root 46 data.remove(last); // Remove last slot 47 48 if (!data.isEmpty()) siftDown(0); // Restore heap property downward 49 50 return min; 51 } 52 53 private void siftDown(int i) { 54 int n = data.size(); 55 56 while (true) { 57 int smallest = i; 58 int l = leftChild(i), r = rightChild(i); 59 60 if (l < n && data.get(l) < data.get(smallest)) smallest = l; 61 if (r < n && data.get(r) < data.get(smallest)) smallest = r; 62 63 if (smallest == i) break; // Heap property satisfied 64 65 swap(i, smallest); 66 i = smallest; 67 } 68 } 69 70 // BUILD HEAP from existing array — Floyd's algorithm — O(n) 71 public static MinHeap buildHeap(int[] arr) { 72 MinHeap heap = new MinHeap(); 73 for (int v : arr) heap.data.add(v); 74 75 // Sift down from last internal node to root 76 for (int i = arr.length / 2 - 1; i >= 0; i--) { 77 heap.siftDown(i); 78 } 79 return heap; 80 } 81 82 public int size() { return data.size(); } 83 public boolean isEmpty() { return data.isEmpty(); } 84 85 @Override 86 public String toString() { return data.toString(); } 87 88 public static void main(String[] args) { 89 MinHeap heap = new MinHeap(); 90 heap.insert(5); heap.insert(3); heap.insert(8); 91 heap.insert(1); heap.insert(7); heap.insert(2); 92 93 System.out.println("Heap: " + heap); // [1, 3, 2, 5, 7, 8] 94 System.out.println("Min: " + heap.peekMin()); // 1 95 96 System.out.println("Extract: " + heap.extractMin()); // 1 97 System.out.println("Extract: " + heap.extractMin()); // 2 98 System.out.println("Heap after extracts: " + heap); // [3, 5, 8, 7] 99 100 MinHeap built = MinHeap.buildHeap(new int[]{4, 10, 3, 5, 1}); 101 System.out.println("Built: " + built); // [1, 4, 3, 5, 10] 102 } 103}
Output:
Heap: [1, 3, 2, 5, 7, 8]
Min:  1
Extract: 1
Extract: 2
After: [3, 5, 8, 7]
Built: [1, 4, 3, 5, 10]

Sift-Up and Sift-Down Traced

Sift-Up: Inserting 0 into [1, 3, 2, 6, 5, 4, 8]

Add 0 at end:  [1, 3, 2, 6, 5, 4, 8, 0]   i=7
Parent of 7: (7-1)/2=3 → value 6.   0 < 6  → SWAP
              [1, 3, 2, 0, 5, 4, 8, 6]    i=3
Parent of 3: (3-1)/2=1 → value 3.   0 < 3  → SWAP
              [1, 0, 2, 3, 5, 4, 8, 6]    i=1
Parent of 1: (1-1)/2=0 → value 1.   0 < 1  → SWAP
              [0, 1, 2, 3, 5, 4, 8, 6]    i=0
i=0 (root) → STOP

Result: [0, 1, 2, 3, 5, 4, 8, 6]   new min = 0 ✓

Sift-Down: After Extracting Min from [1, 3, 2, 6, 5, 4, 8]

Swap root with last: [8, 3, 2, 6, 5, 4]   (8 is now at root)
Remove last slot.

Sift-down from i=0 (value 8):
  Children: left=1(val 3), right=2(val 2). Smallest child = 2 at index 2.
  8 > 2 → SWAP   [2, 3, 8, 6, 5, 4]   i=2
  Children of 2: left=5(val 4), right=6(out of bounds).
  8 > 4 → SWAP   [2, 3, 4, 6, 5, 8]   i=5
  No children of index 5 → STOP

Result: [2, 3, 4, 6, 5, 8]   new min = 2 ✓   extracted = 1 ✓

Build Heap O(n) — Floyd's Algorithm

PROBLEM: build a heap from an unsorted array efficiently.

NAIVE approach — insert each element one by one:
  n insertions × O(log n) per insert = O(n log n)

FLOYD'S algorithm — sift-down from bottom up:
  Only internal nodes need sifting (leaves are trivially valid heaps).
  Last internal node = index n/2 - 1.
  Sift down from index n/2-1 to 0.

WHY O(n)?
  Nodes at height h: approximately n/2^(h+1) nodes
  Work per node at height h: O(h)
  Total work = Σ (n/2^(h+1)) × h for h=0..log n
             = n × Σ h/2^(h+1)
             = n × O(1)   ← geometric series converges to a constant
             = O(n)

Example: unsorted [4, 10, 3, 5, 1]

Array:    [4, 10, 3, 5, 1]
           0   1  2  3  4

Last internal node = 5/2 - 1 = 1

i=1 (value 10): children are index 3(5) and 4(1). Min child = 1 at index 4.
  10 > 1 → SWAP   [4, 1, 3, 5, 10]
  No more children for index 4 → done for i=1

i=0 (value 4): children are index 1(1) and 2(3). Min child = 1 at index 1.
  4 > 1 → SWAP    [1, 4, 3, 5, 10]
  Children of index 1: left=3(5), right=4(10). Min child = 5 at index 3.
  4 < 5 → STOP

Result: [1, 4, 3, 5, 10]   ← valid min-heap ✓

Built-in Priority Queues

Java
1import java.util.*; 2 3public class BuiltinPriorityQueue { 4 5 public static void main(String[] args) { 6 // MIN-HEAP (default — natural ordering) 7 PriorityQueue<Integer> minPQ = new PriorityQueue<>(); 8 minPQ.offer(5); minPQ.offer(3); minPQ.offer(8); minPQ.offer(1); 9 System.out.println("Min peek: " + minPQ.peek()); // 1 10 System.out.println("Min poll: " + minPQ.poll()); // 1 11 System.out.println("Min poll: " + minPQ.poll()); // 3 12 13 // MAX-HEAP — use reverseOrder comparator 14 PriorityQueue<Integer> maxPQ = new PriorityQueue<>(Collections.reverseOrder()); 15 maxPQ.offer(5); maxPQ.offer(3); maxPQ.offer(8); maxPQ.offer(1); 16 System.out.println("Max peek: " + maxPQ.peek()); // 8 17 System.out.println("Max poll: " + maxPQ.poll()); // 8 18 19 // CUSTOM COMPARATOR — sort by string length (min-heap by length) 20 PriorityQueue<String> strPQ = new PriorityQueue<>( 21 Comparator.comparingInt(String::length) 22 ); 23 strPQ.offer("banana"); strPQ.offer("fig"); strPQ.offer("kiwi"); 24 System.out.println("Shortest: " + strPQ.poll()); // fig 25 26 // PAIRS [value, index] — min-heap by value 27 PriorityQueue<int[]> pairPQ = new PriorityQueue<>( 28 (a, b) -> Integer.compare(a[0], b[0]) 29 ); 30 pairPQ.offer(new int[]{3, 0}); 31 pairPQ.offer(new int[]{1, 2}); 32 pairPQ.offer(new int[]{2, 1}); 33 int[] top = pairPQ.poll(); 34 System.out.println("Min pair: [" + top[0] + "," + top[1] + "]"); // [1,2] 35 36 // INITIAL CAPACITY hint (optional — for performance) 37 PriorityQueue<Integer> bigPQ = new PriorityQueue<>(1000); 38 System.out.println("Size: " + bigPQ.size()); // 0 39 } 40}
Output (Java):
Min peek:   1
Min poll:   1
Min poll:   3
Max peek:   8
Max poll:   8
Shortest:   fig
Min pair:   [1,2]

Application 1: Kth Largest Element

Maintain a min-heap of size k. The heap top is always the kth largest.

Problem: find kth largest in [3,2,1,5,6,4], k=2

Min-heap of size k=2:
  Process 3: heap=[3]
  Process 2: heap=[2,3]  (size=k=2)
  Process 1: push 1→[1,2,3], size>k → pop min(1) → heap=[2,3]
  Process 5: push 5→[2,3,5], size>k → pop min(2) → heap=[3,5]
  Process 6: push 6→[3,5,6], size>k → pop min(3) → heap=[5,6]
  Process 4: push 4→[4,5,6], size>k → pop min(4) → heap=[5,6]

heap.top() = 5 = 2nd largest ✓
Java
1import java.util.PriorityQueue; 2 3public class KthLargest { 4 public static int findKthLargest(int[] nums, int k) { 5 PriorityQueue<Integer> minHeap = new PriorityQueue<>(k); 6 7 for (int num : nums) { 8 minHeap.offer(num); 9 if (minHeap.size() > k) minHeap.poll(); // Remove smallest 10 } 11 12 return minHeap.peek(); // k-th largest 13 } 14 15 public static void main(String[] args) { 16 System.out.println(findKthLargest(new int[]{3,2,1,5,6,4}, 2)); // 5 17 System.out.println(findKthLargest(new int[]{3,2,3,1,2,4,5,5,6}, 4)); // 4 18 } 19}
Output:
5
4

Application 2: Heap Sort

Build a max-heap, then repeatedly extract the maximum to sort in ascending order — O(n log n) in-place.

Array: [4, 10, 3, 5, 1]

Step 1: Build max-heap (Floyd's O(n)):
  [10, 5, 3, 4, 1]

Step 2: Repeatedly extract max:
  Swap root(10) with last(1): [1, 5, 3, 4, |10]
  Sift-down 1 in [1,5,3,4]:   [5, 4, 3, 1, |10]

  Swap root(5) with last(1): [1, 4, 3, |5, 10]
  Sift-down 1 in [1,4,3]:    [4, 1, 3, |5, 10]

  Swap root(4) with last(3): [3, 1, |4, 5, 10]
  Sift-down 3 in [3,1]:      [3, 1, |4, 5, 10]

  Swap root(3) with last(1): [1, |3, 4, 5, 10]
  (only 1 element — done)

Sorted: [1, 3, 4, 5, 10] ✓
Java
1public class HeapSort { 2 public static void heapSort(int[] arr) { 3 int n = arr.length; 4 5 // Build max-heap — Floyd's O(n) 6 for (int i = n / 2 - 1; i >= 0; i--) siftDown(arr, n, i); 7 8 // Extract elements one by one 9 for (int i = n - 1; i > 0; i--) { 10 swap(arr, 0, i); // Move current max to end 11 siftDown(arr, i, 0); // Heapify the reduced heap 12 } 13 } 14 15 private static void siftDown(int[] arr, int heapSize, int i) { 16 while (true) { 17 int largest = i, l = 2*i+1, r = 2*i+2; 18 if (l < heapSize && arr[l] > arr[largest]) largest = l; 19 if (r < heapSize && arr[r] > arr[largest]) largest = r; 20 if (largest == i) break; 21 swap(arr, i, largest); 22 i = largest; 23 } 24 } 25 26 private static void swap(int[] arr, int i, int j) { 27 int tmp = arr[i]; arr[i] = arr[j]; arr[j] = tmp; 28 } 29 30 public static void main(String[] args) { 31 int[] arr = {4, 10, 3, 5, 1}; 32 heapSort(arr); 33 System.out.println(java.util.Arrays.toString(arr)); // [1, 3, 4, 5, 10] 34 } 35}
Output:
[1, 3, 4, 5, 10]

Complexity Summary

OperationTimeSpaceNotes
Insert (sift-up)O(log n)O(1)Add at end, bubble up
Extract min/maxO(log n)O(1)Remove root, sift-down last
Peek min/maxO(1)O(1)Root is always min/max
Build heap (Floyd)O(n)O(1)Better than n × insert = O(n log n)
Heap sortO(n log n)O(1)In-place, not stable
kth largest (heap size k)O(n log k)O(k)Better than O(n log n) sort
Merge k sorted listsO(n log k)O(k)n total elements, k lists
Find median (two heaps)O(log n) per insertO(n)Max-heap + min-heap

Common Mistakes

Using (a - b) as comparator in Java when values can be near Integer.MIN_VALUE or Integer.MAX_VALUE. Integer subtraction overflows silently: Integer.MIN_VALUE - 1 wraps to Integer.MAX_VALUE. Use Integer.compare(a, b) for safety.

Python max-heap: forgetting to negate on pop. Push -val to simulate a max-heap, but when popping, the result is also negated. Always do -heapq.heappop(max_heap) to get the actual maximum value.

Accessing the minimum before inserting anything. peekMin() on an empty heap throws. Always check isEmpty() or size() > 0 before peeking or extracting.

Build heap: starting sift-down at index n-1 instead of n/2-1. Leaf nodes (indices n/2 to n-1) have no children and don't need sifting. Starting at n-1 wastes O(n/2) calls. Always start Floyd's from the last internal node at index n/2 - 1.

Heap sort: sifting up instead of down during the sort phase. After placing the max at the end, sift-DOWN the new root (old last element). Sifting up from the last element in the remaining heap is incorrect and breaks the heap property.

Interview Questions

Q: Why is build-heap O(n) while n insertions is O(n log n)?

Floyd's algorithm sifts elements downward from the last internal node to the root. Nodes at the bottom (leaves) need no sifting work. Only nodes near the top need full O(log n) sifts, but there are very few of those. Formally: sum of (work per node × node count) at each level = n × Σ h/2^h ≈ 2n = O(n). With n insertions, every element sifts up from the bottom, paying O(log n) even for elements that should end up near the bottom.

Q: Why is heap sort rarely used despite O(n log n) guaranteed and O(1) space?

Heap sort's sift-down operation accesses children at indices 2i+1 and 2i+2. For large heaps, these jump to far-apart memory locations at each level — almost every comparison causes a cache miss. QuickSort's partition accesses elements sequentially — excellent cache locality. Empirically, QuickSort runs 2-3× faster than heap sort on real hardware despite identical asymptotic complexity.

Q: When would you choose a min-heap of size k over sorting for kth-largest?

When k << n. Sorting takes O(n log n) regardless of k. A min-heap of size k processes each element in O(log k) time — total O(n log k). When k = 10 and n = 1,000,000: sorting takes 20,000,000 operations; heap approach takes about 1,000,000 × 4 = 4,000,000 operations. The heap approach wins more dramatically as k decreases relative to n.

Summary

A heap is a complete binary tree stored as an array with the parent always ≤ (min-heap) or ≥ (max-heap) its children. The root is always the minimum or maximum.

Array index arithmetic (0-based): parent = (i-1)/2, left = 2i+1, right = 2i+2.

Four core operations:

  • Insert — add at end, sift-up — O(log n)
  • Extract min/max — remove root, move last to root, sift-down — O(log n)
  • Peek — read root — O(1)
  • Build heap — Floyd's sift-down from n/2-1 to 0 — O(n)

Built-in defaults:

  • Java: PriorityQueue = min-heap; Collections.reverseOrder() for max-heap
  • Python: heapq = min-heap; negate values for max-heap
  • C++: priority_queue = max-heap; greater<T> for min-heap
  • JavaScript: no built-in — implement with comparator-based class

Classic applications:

  • Kth largest: min-heap of size k — top = kth largest — O(n log k)
  • Heap sort: build max-heap, extract max n times — O(n log n), O(1) space
  • Merge k sorted lists: min-heap of k heads — O(n log k)
  • Find median: max-heap (lower half) + min-heap (upper half) — O(log n) per insert

In the next topic you will explore Trie — the prefix tree data structure for fast string operations, autocomplete, and word search.

Suggested Quiz

A min-heap is stored in an array. Node at index i has children at which indices (0-based)?

1/6
Heap