DSA Tutorial
🔍

Minimum Spanning Tree

What Is a Minimum Spanning Tree?

A Spanning Tree of a connected graph is a subgraph that:

  • Includes ALL vertices
  • Is connected (one component)
  • Is acyclic (no cycles)
  • Has exactly V-1 edges

A Minimum Spanning Tree (MST) is the spanning tree with the minimum total edge weight.

CONNECTED WEIGHTED GRAPH:

       A ──4── B ──8── C
       │       │       │
       11      7      2──9
       │       │       │
       F ──6── E ──10─ D
              / \
             2   5
            /     \
           G       H

SPANNING TREE (one possible — not minimum):
  Include: A-B(4), A-F(11), B-E(7), E-C(2→wait... simplified example)

Using cleaner 4-vertex example:
       A ──1── B
       │ \     │
       3   5   4
       │     \ │
       C ──2── D

Weights: A-B:1, A-C:3, A-D:5, B-D:4, C-D:2

All spanning trees and their weights:
  {A-B, A-C, A-D}:  1+3+5 = 9
  {A-B, A-C, C-D}:  1+3+2 = 6  ← MINIMUM SPANNING TREE
  {A-B, B-D, A-C}:  1+4+3 = 8
  {A-B, B-D, C-D}:  1+4+2 = 7
  ...

MST = {A-B:1, C-D:2, A-C:3}, total weight = 6

KEY PROPERTIES:
  V-1 edges (V=4 vertices → 3 edges)
  Connected (can reach any vertex from any other)
  No cycles (remove any edge → disconnected)
  Minimum total weight among all spanning trees

Real-World Applications

NETWORK DESIGN:
  Connect n cities with minimum total cable/road length.
  Each city = vertex, possible connection = edge with weight (distance/cost).
  MST = minimum cost network connecting all cities.

CLUSTER ANALYSIS:
  Remove the k-1 most expensive edges from MST → k clusters.
  Useful in data mining and image segmentation.

APPROXIMATION ALGORITHMS:
  Travelling Salesman Problem (TSP): MST-based 2-approximation.
  Build MST, then traverse it → path is at most 2× optimal TSP cost.

CIRCUIT DESIGN:
  Connect components on a chip with minimum total wire length.
  Reduces manufacturing cost and electrical interference.

NETWORK RELIABILITY:
  MST connects all nodes with minimum redundancy.
  Used in backbone network design.

Two Greedy Approaches

KRUSKAL'S ALGORITHM:               PRIM'S ALGORITHM:
  Consider edges globally.           Grow from one vertex.
  Sort all edges by weight.          At each step, add cheapest
  Greedily add cheapest edge         crossing edge (from current
  that doesn't form a cycle.         MST to unvisited vertex).

  View: EDGE-centric                 View: VERTEX-centric

  Best for: SPARSE graphs            Best for: DENSE graphs
  Data structure: Union-Find         Data structure: Min-heap
  Time: O(E log E)                   Time: O((V+E) log V)

Algorithm 1: Kruskal's

Sort all edges by weight. Use Union-Find to add cheapest edge that connects two different components.

KRUSKAL'S TRACE — 4-vertex example:
  Vertices: A, B, C, D (A=0, B=1, C=2, D=3)
  Edges sorted by weight: (A-B:1), (C-D:2), (A-C:3), (B-D:4), (A-D:5)

  parent = [0,1,2,3]  (each vertex is its own component)

  ① Process (A-B, weight=1):
      find(A)=0, find(B)=1 → different → ADD edge (A-B:1)
      union(A,B): parent[1]=0
      MST edges: {A-B}, weight=1

  ② Process (C-D, weight=2):
      find(C)=2, find(D)=3 → different → ADD edge (C-D:2)
      union(C,D): parent[3]=2
      MST edges: {A-B, C-D}, weight=3

  ③ Process (A-C, weight=3):
      find(A)=0, find(C)=2 → different → ADD edge (A-C:3)
      union(0,2): parent[2]=0 (now A,B,C,D all in same component)
      MST edges: {A-B, C-D, A-C}, weight=6

  ④ Process (B-D, weight=4):
      find(B)=0, find(D)=0 → SAME component → SKIP (would form cycle)

  ⑤ Process (A-D, weight=5):
      find(A)=0, find(D)=0 → SAME → SKIP

  MST has V-1=3 edges → DONE
  MST: {A-B:1, C-D:2, A-C:3}, total weight = 6 ✓
Java
1import java.util.*; 2 3public class KruskalsAlgorithm { 4 5 // Union-Find with path compression and union by rank 6 static int[] parent, rank; 7 8 static void init(int n) { 9 parent = new int[n]; 10 rank = new int[n]; 11 for (int i = 0; i < n; i++) parent[i] = i; 12 } 13 14 static int find(int x) { 15 if (parent[x] != x) parent[x] = find(parent[x]); // Path compression 16 return parent[x]; 17 } 18 19 static boolean union(int x, int y) { 20 int px = find(x), py = find(y); 21 if (px == py) return false; // Same component — would form cycle 22 23 if (rank[px] < rank[py]) { int t = px; px = py; py = t; } 24 parent[py] = px; 25 if (rank[px] == rank[py]) rank[px]++; 26 return true; 27 } 28 29 // Edge: [u, v, weight] 30 public static int[][] kruskal(int V, int[][] edges) { 31 // Sort edges by weight — O(E log E) 32 Arrays.sort(edges, Comparator.comparingInt(e -> e[2])); 33 34 init(V); 35 36 List<int[]> mst = new ArrayList<>(); 37 int weight = 0; 38 39 for (int[] edge : edges) { 40 int u = edge[0], v = edge[1], w = edge[2]; 41 42 if (union(u, v)) { // Different components → add to MST 43 mst.add(edge); 44 weight += w; 45 if (mst.size() == V - 1) break; // MST complete 46 } 47 } 48 49 System.out.println("MST total weight: " + weight); 50 return mst.toArray(new int[0][]); 51 } 52 53 public static void main(String[] args) { 54 Scanner sc = new Scanner(System.in); 55 int V = sc.nextInt(); 56 int E = sc.nextInt(); 57 58 int[][] edges = new int[E][3]; 59 for (int i = 0; i < E; i++) { 60 edges[i][0] = sc.nextInt(); // u 61 edges[i][1] = sc.nextInt(); // v 62 edges[i][2] = sc.nextInt(); // weight 63 } 64 65 int[][] mst = kruskal(V, edges); 66 System.out.println("MST edges:"); 67 for (int[] e : mst) { 68 System.out.println(" " + e[0] + " -- " + e[1] + " : " + e[2]); 69 } 70 } 71} 72 73/* 74Sample Input: 754 5 760 1 1 772 3 2 780 2 3 791 3 4 800 3 5 81 82Sample Output: 83MST total weight: 6 84MST edges: 85 0 -- 1 : 1 86 2 -- 3 : 2 87 0 -- 2 : 3 88*/
MST total weight: 6
MST edges:
  0 -- 1 : 1
  2 -- 3 : 2
  0 -- 2 : 3

Algorithm 2: Prim's

Grow the MST from a single vertex. At each step, add the minimum weight edge that connects the current MST to an unvisited vertex.

PRIM'S TRACE — same 4-vertex example:
  Vertices: 0,1,2,3
  Edges: (0-1:1), (2-3:2), (0-2:3), (1-3:4), (0-3:5)
  Adjacency: 0→[(1,1),(2,3),(3,5)], 1→[(0,1),(3,4)], 2→[(0,3),(3,2)], 3→[(2,2),(1,4),(0,5)]

  Start from vertex 0.
  inMST = {0}, heap = [(1,1,0), (3,2,0), (5,3,0)]  (cost, dest, src)

  ① Extract min (1, dest=1, src=0):
      1 not in MST → ADD edge (0-1:1)
      inMST = {0,1}
      Push 1's neighbours: (4,3,1) already have (5,3) → push (4,3,1)
      heap = [(3,2,0),(4,3,1),(5,3,0)]

  ② Extract min (3, dest=2, src=0):
      2 not in MST → ADD edge (0-2:3)
      inMST = {0,1,2}
      Push 2's neighbours: (2,3,2)
      heap = [(2,3,2),(4,3,1),(5,3,0)]

  ③ Extract min (2, dest=3, src=2):
      3 not in MST → ADD edge (2-3:2)
      inMST = {0,1,2,3}
      MST complete! (4 vertices = V-1=3 edges added)

  ④ Remaining heap entries: stale, all vertices in MST

  MST: {(0-1:1), (0-2:3), (2-3:2)}, total weight = 6 ✓
Java
1import java.util.*; 2 3public class PrimsAlgorithm { 4 5 public static int prim(int V, List<List<int[]>> adj) { 6 boolean[] inMST = new boolean[V]; 7 int[] key = new int[V]; // Minimum edge weight to add this vertex to MST 8 int[] parent= new int[V]; // Parent in MST (for edge reconstruction) 9 10 Arrays.fill(key, Integer.MAX_VALUE); 11 Arrays.fill(parent, -1); 12 key[0] = 0; // Start from vertex 0 13 14 // Min-heap: [weight, vertex] 15 PriorityQueue<int[]> pq = new PriorityQueue<>( 16 Comparator.comparingInt(a -> a[0]) 17 ); 18 pq.offer(new int[]{0, 0}); 19 20 int totalWeight = 0; 21 List<int[]> mstEdges = new ArrayList<>(); 22 23 while (!pq.isEmpty()) { 24 int[] top = pq.poll(); 25 int w = top[0], u = top[1]; 26 27 if (inMST[u]) continue; // Already in MST — stale entry 28 inMST[u] = true; 29 totalWeight += w; 30 31 if (parent[u] != -1) { 32 mstEdges.add(new int[]{parent[u], u, w}); 33 } 34 35 for (int[] edge : adj.get(u)) { 36 int v = edge[0], weight = edge[1]; 37 if (!inMST[v] && weight < key[v]) { 38 key[v] = weight; 39 parent[v] = u; 40 pq.offer(new int[]{weight, v}); 41 } 42 } 43 } 44 45 System.out.println("MST total weight: " + totalWeight); 46 System.out.println("MST edges:"); 47 for (int[] e : mstEdges) { 48 System.out.println(" " + e[0] + " -- " + e[1] + " : " + e[2]); 49 } 50 return totalWeight; 51 } 52 53 public static void main(String[] args) { 54 Scanner sc = new Scanner(System.in); 55 int V = sc.nextInt(); 56 int E = sc.nextInt(); 57 58 List<List<int[]>> adj = new ArrayList<>(); 59 for (int i = 0; i < V; i++) adj.add(new ArrayList<>()); 60 61 for (int i = 0; i < E; i++) { 62 int u = sc.nextInt(); 63 int v = sc.nextInt(); 64 int w = sc.nextInt(); 65 adj.get(u).add(new int[]{v, w}); 66 adj.get(v).add(new int[]{u, w}); // Undirected 67 } 68 69 prim(V, adj); 70 } 71} 72 73/* 74Sample Input: 754 5 760 1 1 772 3 2 780 2 3 791 3 4 800 3 5 81 82Sample Output: 83MST total weight: 6 84MST edges: 85 0 -- 1 : 1 86 0 -- 2 : 3 87 2 -- 3 : 2 88*/
MST total weight: 6
MST edges:
  0 -- 1 : 1
  0 -- 2 : 3
  2 -- 3 : 2

Kruskal's vs Prim's — Full Comparison

                KRUSKAL'S              PRIM'S
  ───────────────────────────────────────────────────────────
  Approach       Edge-centric           Vertex-centric
  Strategy       Sort all edges,        Grow from one vertex,
                 add cheapest non-cycle add cheapest crossing edge
  Data structure Union-Find             Min-heap (priority queue)
  Initialization Sort E edges           Start from any vertex
  Time           O(E log E)             O((V+E) log V) with heap
                                        O(V²) with array
  Space          O(V + E)               O(V)
  Best for       Sparse graphs          Dense graphs (array version)
  Works on       Only undirected        Only undirected (typically)
  Implementation Simpler                Slightly more complex
  Disconnected?  Finds MSF (forest)     Only MST of one component

DECISION RULE:
  Sparse graph (E ≈ V):    Kruskal's O(V log V) ≈ Prim's O(V log V)
  Dense graph (E ≈ V²):    Prim's array O(V²) < Kruskal's O(V² log V)
  Input as edge list:      Kruskal's (already sorted)
  Input as adjacency list: Prim's (no need to extract edges first)

Dry Run Side by Side

SAME GRAPH: 4 vertices, edges sorted by weight:
  (0-1:1), (2-3:2), (0-2:3), (1-3:4), (0-3:5)

KRUSKAL'S:                        PRIM'S (from vertex 0):
  parent=[0,1,2,3]                  key=[0,∞,∞,∞], inMST=[F,F,F,F]
                                    heap=[(0,0)]
  ① (0-1:1): diff → ADD            ① Extract(0,0): add 0
     parent=[0,0,2,3]                  Push (1,1), (3,2), (5,3)
                                       heap=[(1,1),(3,2),(5,3)]
  ② (2-3:2): diff → ADD            ② Extract(1,1): ADD edge 0─1:1
     parent=[0,0,0,2]                  Push (4,3) → heap=[(3,2),(4,3),(5,3)]

  ③ (0-2:3): diff → ADD            ③ Extract(3,2): ADD edge 0─2:3
     parent=[0,0,0,0]                  Push (2,3) → heap=[(2,3),(4,3),(5,3)]

  ④ (1-3:4): same → SKIP           ④ Extract(2,3): ADD edge 2─3:2
  ⑤ (0-3:5): same → SKIP              All in MST → DONE

  MST: {0─1:1, 2─3:2, 0─2:3}      MST: {0─1:1, 0─2:3, 2─3:2}
  Total weight: 6 ✓                 Total weight: 6 ✓

Both algorithms produce the SAME MST total weight.
Edge ordering may differ but total weight is always equal.

Special Cases

DISCONNECTED GRAPH:
  Kruskal's produces a Minimum Spanning FOREST
  (one spanning tree per connected component).
  Prim's only spans the component containing the start vertex.
  To span all components with Prim's: restart from each unvisited vertex.

GRAPH WITH EQUAL WEIGHT EDGES:
  MST may not be unique — multiple valid MSTs exist.
  Both Kruskal's and Prim's produce A valid MST.
  Total weight of all MSTs is the same (unique minimum).

NEGATIVE EDGE WEIGHTS:
  Both algorithms work correctly with negative weights.
  MST is still defined (tree with minimum total weight).
  Bellman-Ford/Dijkstra's issues with negatives don't apply here.

SINGLE VERTEX:
  MST is just the single vertex — no edges.
  Total weight = 0.

COMPLETE GRAPH (all pairs connected):
  Both algorithms still find MST correctly.
  Kruskal's: sort V(V-1)/2 edges → O(V² log V).
  Prim's with array: O(V²) — better for this case.

Complexity Summary

AlgorithmTimeSpaceBest For
Kruskal'sO(E log E) = O(E log V)O(V + E)Sparse graphs, edge list input
Prim's (heap)O((V+E) log V)O(V)General/sparse graphs
Prim's (array)O(V²)O(V)Dense graphs, adjacency matrix

For sparse graphs (E ≈ V): All variants are similar — O(V log V). For dense graphs (E ≈ V²): Prim's array O(V²) beats Kruskal's O(V² log V).

Common Mistakes

Kruskal's: not checking for cycles with Union-Find — using visited[] instead. A simple visited array cannot detect if an edge creates a cycle correctly. Union-Find is necessary: find(u) == find(v) correctly identifies when u and v are already connected, regardless of path length.

Prim's: not checking if inMST[u]: continue for stale heap entries. Same as Dijkstra's stale entry problem — when a cheaper edge to vertex v is found, a new entry is pushed but the old one remains. Without the stale check, a vertex is processed multiple times, giving wrong MST edges and weight.

Adding V edges instead of V-1. An MST has exactly V-1 edges. Stop as soon as V-1 edges are added. Continuing adds redundant edges (for Kruskal's, these would be cycles caught by Union-Find; for Prim's, the loop naturally stops when all vertices are in the MST).

Using directed graph edges for MST algorithms. Both Kruskal's and Prim's work on undirected graphs. For Kruskal's: each undirected edge is stored once in the edge list. For Prim's: each edge must appear in BOTH adjacency lists (adj[u] and adj[v]).

Prim's: incorrect starting edge weight. Initialize key[start] = 0 (not ∞) for the source vertex. Without this, the source vertex is never extracted from the heap.

Interview Questions

Q: What is the cut property and how does it justify both MST algorithms?

The cut property states: for any cut (partition of vertices into two non-empty sets S and V-S), the minimum weight edge crossing the cut belongs to some MST. Both algorithms exploit this. Kruskal's: when it adds edge (u,v), it's the minimum weight edge crossing the cut between u's component and v's component. Prim's: at each step, the current MST vertices form set S; the added edge is the minimum crossing the cut (S, V-S). Both are therefore correct by the cut property.

Q: Can we have a unique MST? Under what conditions?

Yes — the MST is unique if and only if all edge weights are distinct. With distinct weights, the cut property's "minimum weight edge crossing the cut" is uniquely determined (no ties possible). If some edges share the same weight, tie-breaking by edge selection order can produce different (but equally minimal) MSTs. All these MSTs have identical total weight.

Q: How would you find the second minimum spanning tree?

One approach: for each edge e NOT in the MST, temporarily add e to create a cycle, find the maximum weight edge in that cycle (other than e), remove it, and compute the new total weight. The second MST uses the replacement that minimizes the total weight increase. Time: O(V × E) naively. Can be done in O(V²) with LCA preprocessing on the MST.

Summary

A Minimum Spanning Tree connects all V vertices with exactly V-1 edges at minimum total cost.

Two greedy algorithms — both correct, different strategies:

Kruskal's:

  1. Sort all edges by weight — O(E log E)
  2. For each edge (u,v,w): if find(u) ≠ find(v) → add to MST, union(u,v)
  3. Stop when V-1 edges added

Prim's:

  1. Start from any vertex; push its edges to min-heap
  2. Extract min-weight edge to unvisited vertex; add to MST
  3. Push new vertex's edges; repeat until all visited

Key differences:

Kruskal'sPrim's
ProcessesEdges globallyVertices one by one
Data structureUnion-FindMin-heap
Best forSparse (edge list)Dense (adjacency matrix)
TimeO(E log E)O(V²) array / O((V+E) log V) heap

Both produce the same minimum total weight — just potentially different edge sets when tie weights exist.

In the next topic you will explore Strongly Connected Components — Kosaraju's and Tarjan's algorithms for finding maximal strongly connected subgraphs in directed graphs.

Suggested Quiz

A Minimum Spanning Tree of a connected graph with V vertices has exactly how many edges?

1/6
Minimum Spanning Tree