DSA Tutorial
🔍

Graph Basics & Terminology

Graphs vs Trees — The Key Distinction

Before diving into terminology, understand where graphs sit in the data structure hierarchy:

DATA STRUCTURE HIERARCHY:

  Graph  (most general — no constraints on structure)
    │
    ├── Connected Graph  (one piece — path exists between any two nodes)
    │     │
    │     └── Tree  (connected + acyclic + exactly n-1 edges)
    │
    ├── Directed Graph   (edges have direction)
    ├── Weighted Graph   (edges have weights)
    ├── Cyclic Graph     (contains at least one cycle)
    └── Acyclic Graph    (no cycles)

Every tree IS a graph. Not every graph is a tree.

TREE constraints (all must hold):
  1. Connected (one component)
  2. Acyclic (no cycles)
  3. Exactly n-1 edges for n nodes
  4. Exactly one path between any two nodes

GRAPH (none of these are required):
  ✓ Can be disconnected
  ✓ Can have cycles
  ✓ Can have any number of edges
  ✓ Can have multiple paths between nodes
  ✓ Can have directed edges
  ✓ Can have weighted edges
  ✓ Can have self-loops

The Reference Graph

All terminology on this page uses this single undirected graph. Study it before reading the definitions.

Reference Graph G:

       1 ─────── 2
      / \       / \
     /   \     /   \
    0     \   /     3
     \     \ /     /
      \     5 ────
       \   /
        \ /
         4

Vertices: {0, 1, 2, 3, 4, 5}
Edges:    {(0,1), (0,4), (1,2), (1,5), (2,3), (2,5), (3,5), (4,5)}

Total: 6 vertices, 8 edges

Vertex (Node)

A vertex (plural: vertices) is the fundamental unit of a graph — a point that can store data. Vertices are also called nodes.

In reference graph G:
  Vertices: 0, 1, 2, 3, 4, 5
  |V| = 6   (|V| denotes the number of vertices)

Vertices can represent:
  Cities in a road network
  Computers in a network
  People in a social graph
  Web pages in the internet graph
  States in a state machine

Edge

An edge connects two vertices. It represents a relationship between them.

In reference graph G:
  Edges: (0,1), (0,4), (1,2), (1,5), (2,3), (2,5), (3,5), (4,5)
  |E| = 8   (|E| denotes the number of edges)

Edges represent:
  Roads between cities
  Network cables between computers
  Friendships between people
  Hyperlinks between web pages
  Transitions between states

NOTATION:
  Undirected edge:  (u, v) = (v, u)   — no direction
  Directed edge:    (u → v) ≠ (v → u) — direction matters

Directed vs Undirected

UNDIRECTED GRAPH:
  Edges have no direction — relationship is symmetric.
  If A connects to B, then B connects to A.

       A ─── B ─── C
              \
               D

  Edge (A,B) = edge (B,A). You can travel either way.
  Example: friendship networks, road networks (two-way streets)

DIRECTED GRAPH (Digraph):
  Edges have direction — relationship is one-way.
  Arrow shows which direction travel is allowed.

       A ──→ B ──→ C
              ↑
              D

  Edge A→B ≠ edge B→A.
  You can go A→B but NOT B→A (unless there is also a B→A edge).
  Example: web links, Twitter follows, task dependencies

MIXED GRAPH:
  Contains both directed and undirected edges.
  Less common but models real-world scenarios like one-way and two-way streets.

Weighted vs Unweighted

UNWEIGHTED GRAPH:
  All edges are equal — just connected or not.

       A ─── B ─── C
       |           |
       D ─────────

  Used when: social connections, adjacency, reachability

WEIGHTED GRAPH:
  Each edge has a numerical value (weight/cost).

       A ──4── B ──2── C
       |               |
       └──7────D──3────┘

  Edge (A,B) has weight 4 (e.g., distance 4 km, cost $4, time 4 min)

  Used when: road distances, network bandwidth, travel costs,
             edge probabilities in Bayesian networks

NEGATIVE WEIGHTS:
  Some algorithms (Bellman-Ford) handle negative weights.
  Dijkstra's algorithm CANNOT handle negative weights.
  Negative CYCLES (where total cycle weight < 0) make shortest path undefined.

Degree

The degree of a vertex is the number of edges connected to it.

UNDIRECTED GRAPH — DEGREE:

Reference graph G:
       1 ─────── 2
      / \       / \
     /   \     /   \
    0     \   /     3
     \     \ /     /
      \     5 ────
       \   /
        \ /
         4

  deg(0) = 2  (edges to 1, 4)
  deg(1) = 3  (edges to 0, 2, 5)
  deg(2) = 3  (edges to 1, 3, 5)
  deg(3) = 2  (edges to 2, 5)
  deg(4) = 2  (edges to 0, 5)
  deg(5) = 4  (edges to 1, 2, 3, 4)

HANDSHAKING LEMMA: sum of all degrees = 2 × |E|
  2 + 3 + 3 + 2 + 2 + 4 = 16 = 2 × 8 ✓

DIRECTED GRAPH — IN-DEGREE AND OUT-DEGREE:

       A ──→ B ──→ C
       ↑           |
       └─────D ←───┘

  in-degree(A)  = 1 (D→A)     out-degree(A) = 1 (A→B)
  in-degree(B)  = 1 (A→B)     out-degree(B) = 1 (B→C)
  in-degree(C)  = 1 (B→C)     out-degree(C) = 1 (C→D)
  in-degree(D)  = 1 (C→D)     out-degree(D) = 1 (D→A)

  Total in-degree = Total out-degree = |E| (for directed graphs)

Adjacency

Two vertices are adjacent (or neighbours) if there is an edge directly connecting them.

In reference graph G:
  Neighbours of vertex 5:  {1, 2, 3, 4}   (4 neighbours — deg(5)=4)
  Neighbours of vertex 0:  {1, 4}          (2 neighbours — deg(0)=2)
  Neighbours of vertex 3:  {2, 5}          (2 neighbours — deg(3)=2)

  Are 0 and 2 adjacent?    NO  (no direct edge between them)
  Are 1 and 5 adjacent?    YES (edge (1,5) exists)
  Are 3 and 4 adjacent?    NO  (no direct edge between them)

The set of all neighbours of vertex v is its ADJACENCY SET.

Path

A path is a sequence of vertices where each consecutive pair is connected by an edge, and no vertex is repeated.

In reference graph G:

Valid paths from 0 to 3:
  0 → 1 → 2 → 3          (length 3 — passes through 3 edges)
  0 → 1 → 5 → 2 → 3      (length 4)
  0 → 1 → 5 → 3          (length 3)
  0 → 4 → 5 → 2 → 3      (length 4)
  0 → 4 → 5 → 3          (length 3)

Shortest path from 0 to 3 = length 3 (multiple options)

WALK vs TRAIL vs PATH:
  Walk:  sequence of vertices and edges, repetition allowed
  Trail: edges not repeated, vertices may repeat
  Path:  neither vertices nor edges repeated ← most common meaning

PATH LENGTH = number of edges traversed (not vertices)

Cycle

A cycle is a path that starts and ends at the same vertex — a closed loop.

In reference graph G:

Cycles (some examples):
  1 → 2 → 5 → 1          (length 3 — triangle)
  0 → 1 → 5 → 4 → 0      (length 4)
  1 → 2 → 3 → 5 → 1      (length 4)
  0 → 1 → 2 → 5 → 4 → 0  (length 5)

A graph with at least one cycle is CYCLIC.
A graph with no cycles is ACYCLIC.

SPECIAL NAMES:
  Cycle of length 3:   Triangle
  Cycle of length 4:   Quadrilateral / 4-cycle
  Self-loop:           Edge from vertex to itself (cycle of length 1)

DIRECTED CYCLE: a cycle following edge directions
  A → B → C → A  is a directed cycle
  A → B → C with no C→A edge is NOT a directed cycle

IMPORTANCE OF CYCLES:
  Detecting cycles is needed for:
    - Topological sort (only works on Directed Acyclic Graphs = DAGs)
    - Deadlock detection
    - Finding infinite loops
    - Determining if a graph is a tree (trees have no cycles)

Connected Components

A connected component is a maximal subgraph where every vertex is reachable from every other vertex.

CONNECTED GRAPH (one component):

Reference graph G — all 6 vertices in ONE component:
  From any vertex, you can reach any other vertex.
  Component: {0, 1, 2, 3, 4, 5}

DISCONNECTED GRAPH (multiple components):

  0 ─── 1 ─── 2        5 ─── 6
                        |
  3 ─── 4               7

  Component 1: {0, 1, 2}
  Component 2: {3, 4}
  Component 3: {5, 6, 7}

  3 connected components — graph is DISCONNECTED.
  You cannot reach vertex 5 from vertex 0.

DIRECTED GRAPH CONNECTIVITY:
  Weakly connected:  connected if edge directions are ignored
  Strongly connected: path exists from EVERY vertex to EVERY other vertex
                      (following directed edges)

  A → B → C → D → A   ← strongly connected (cycle visits all)
  A → B ← C → D       ← weakly connected but NOT strongly connected
                          (can't reach A from D)

Special Graph Types

COMPLETE GRAPH (Kₙ):
  Every pair of vertices has an edge. Maximum possible edges.
  n vertices → n(n-1)/2 edges (undirected)

       K₄:
        1
       /|\
      / | \
     2──┼──4
      \ | /
       \|/
        3

  Every vertex connects to every other vertex.

BIPARTITE GRAPH:
  Vertices split into two groups (U and V).
  Edges only go between groups — never within a group.

       U: {A, B, C}    V: {X, Y, Z}

       A ─── X
       A ─── Y
       B ─── X
       B ─── Z
       C ─── Y
       C ─── Z

  No edge between A and B. No edge between X and Y.
  A graph is bipartite iff it has no odd-length cycles.

DAG (Directed Acyclic Graph):
  Directed graph with no directed cycles.
  Used for: task scheduling, dependency resolution, topological sort.

       A → B → D
       A → C → D
       B → C

  No way to start at any node and return to it.

MULTIGRAPH:
  Multiple edges between the same pair of vertices.
  Parallel edges allowed.

       A ═══ B   (two edges between A and B)
           /
          /
         C

SIMPLE GRAPH:
  No self-loops, no parallel edges.
  Most algorithm discussions assume simple graphs.

Graph Representations

1. Adjacency Matrix

Reference graph G as adjacency matrix:
(rows = from vertex, columns = to vertex, 1 = edge exists)

    0  1  2  3  4  5
0 [ 0, 1, 0, 0, 1, 0 ]
1 [ 1, 0, 1, 0, 0, 1 ]
2 [ 0, 1, 0, 1, 0, 1 ]
3 [ 0, 0, 1, 0, 0, 1 ]
4 [ 1, 0, 0, 0, 0, 1 ]
5 [ 0, 1, 1, 1, 1, 0 ]

PROPERTIES:
  Symmetric for undirected graphs: matrix[u][v] = matrix[v][u]
  Space: O(V²) — all V×V entries stored regardless of actual edges
  Check edge (u,v): O(1) — just read matrix[u][v]
  Get all neighbours of v: O(V) — scan entire row
  Best for: DENSE graphs where E ≈ V²
  Bad for: SPARSE graphs (wastes memory with mostly-zero entries)

2. Adjacency List

Reference graph G as adjacency list:

  0: [1, 4]
  1: [0, 2, 5]
  2: [1, 3, 5]
  3: [2, 5]
  4: [0, 5]
  5: [1, 2, 3, 4]

PROPERTIES:
  Space: O(V + E) — V list headers + one entry per edge (2E for undirected)
  Check edge (u,v): O(degree(u)) — scan u's neighbour list
  Get all neighbours of v: O(degree(v)) — just read v's list
  Best for: SPARSE graphs (most real-world graphs are sparse)
  Standard choice for most graph algorithms (BFS, DFS, Dijkstra's, etc.)

WEIGHTED adjacency list:
  0: [(1,4), (4,7)]     — (neighbour, weight)
  1: [(0,4), (2,2), (5,6)]
  ...

3. Edge List

Reference graph G as edge list:
  [(0,1), (0,4), (1,2), (1,5), (2,3), (2,5), (3,5), (4,5)]

Weighted:
  [(0,1,4), (0,4,7), (1,2,2), (1,5,6), ...]   — (u, v, weight)

PROPERTIES:
  Space: O(E)
  Check edge (u,v): O(E) — linear scan through all edges
  Get neighbours: O(E) — scan all edges
  Best for: algorithms that iterate over all edges (Kruskal's MST)
  Bad for: neighbour queries

Comparison Table

REPRESENTATION   Space     Edge (u,v)?  All neighbours?   Best for
──────────────────────────────────────────────────────────────────────
Adjacency Matrix O(V²)     O(1)         O(V)              Dense graphs
Adjacency List   O(V+E)    O(deg(u))    O(deg(v))         Sparse graphs ← default
Edge List        O(E)      O(E)         O(E)              Edge-only algorithms

Dense graph:   E ≈ V²   (e.g., social network cliques, complete graphs)
Sparse graph:  E ≈ V    (e.g., road networks, trees, most real-world graphs)

Complete Terminology Reference

Reference graph G:
       1 ─────── 2
      / \       / \
     /   \     /   \
    0     \   /     3
     \     \ /     /
      \     5 ────
       \   /
        \ /
         4

TERM                  DEFINITION                        IN REFERENCE GRAPH G
─────────────────────────────────────────────────────────────────────────────
Vertex / Node         Basic unit of a graph             0, 1, 2, 3, 4, 5
Edge                  Connection between two vertices    (0,1), (0,4), ... (8 total)
|V|                   Number of vertices                6
|E|                   Number of edges                   8
Adjacent              Directly connected by an edge     0 and 1 are adjacent
Neighbour             Vertex connected by direct edge   Neighbours(5) = {1,2,3,4}
Degree                Number of edges at a vertex       deg(5) = 4
In-degree             Incoming edges (directed only)    —
Out-degree            Outgoing edges (directed only)    —
Path                  Vertex sequence, no repeats       0→1→5→3
Path length           Number of edges in path           0→1→5→3 has length 3
Cycle                 Path that starts = ends           1→2→5→1
Self-loop             Edge from vertex to itself        Not in this graph
Connected             All vertices reachable            Yes — one component
Component             Maximal connected subgraph         One component = {0,1,2,3,4,5}
Simple graph          No loops, no parallel edges       Yes — this graph is simple
Weighted              Edges have numerical values       No — unweighted here
Directed              Edges have direction              No — undirected here
Bipartite             Two-set partition possible        Yes! U={0,3}, V={1,2,4,5}
                                                        (check: no odd cycle here?
                                                         actually 1-2-5-1 is odd! → NOT bipartite)
Acyclic               No cycles                         No — has many cycles
DAG                   Directed + Acyclic                No — undirected + cyclic
Sparse graph          E much less than V²               8 << 36 = 6² → sparse ✓

Key Properties and Formulas

For a simple undirected graph with V vertices:
  Maximum edges:          V(V-1)/2  (complete graph Kᵥ)
  Minimum edges (connected): V-1    (tree / spanning tree)
  Sum of degrees:         2 × E     (handshaking lemma)
  Odd-degree vertices:    Always even in count

For a directed graph:
  Maximum edges:          V(V-1)     (no self-loops)
  Sum of in-degrees:      = Sum of out-degrees = E
  DAG:                    Can always be topologically sorted

For a tree (special case of connected acyclic graph):
  Edges:                  V - 1
  Connected components:   1
  Any two vertices:       Exactly one path between them

Bipartite check:
  A graph is bipartite ←→ it contains no odd-length cycle
  2-colourable ←→ bipartite

Common Mistakes

Confusing graphs and trees. A tree is a graph — a special connected acyclic undirected graph. Saying "this is a tree, not a graph" is incorrect. A tree IS a graph. The question is whether the graph has the additional tree constraints (connected, acyclic, V-1 edges).

Mixing up path and walk. A walk allows repeated vertices and edges. A trail forbids repeated edges but allows repeated vertices. A path forbids both. Most algorithm discussions mean "path" (no repetition). Using "walk" when you mean "path" leads to incorrect cycle detection reasoning.

Degree vs in/out-degree for directed graphs. In undirected graphs, degree is simply the edge count. In directed graphs, each vertex has two degrees — in-degree (incoming) and out-degree (outgoing). "Degree" alone is ambiguous for directed graphs; always specify in or out.

Weak vs strong connectivity for directed graphs. A directed graph is weakly connected if its undirected version is connected — ignoring arrow directions. Strongly connected requires a directed path from every vertex to every other vertex. Most directed graph problems require checking strong connectivity, not weak.

Assuming O(V²) space is always acceptable. Adjacency matrices use O(V²) space. For a graph with 10,000 vertices, that is 100,000,000 entries — 400 MB for integers. Most real-world graphs are sparse (E ≈ V or E ≈ V log V) — always use adjacency lists unless the graph is explicitly dense.

Summary

A graph G = (V, E) consists of a set of vertices V and a set of edges E. Unlike trees, graphs are unrestricted: they can be cyclic, disconnected, directed, weighted, or contain parallel edges.

Seven essential terms:

  • Vertex — a node in the graph
  • Edge — a connection between two vertices
  • Degree — edge count at a vertex (in/out for directed)
  • Path — vertex sequence with no repeats; length = edge count
  • Cycle — a closed path (starts = ends)
  • Component — maximal subgraph of reachable vertices
  • Adjacency — direct connection by a single edge

Four graph types by structure:

  • Undirected — edges go both ways; Directed — edges have direction
  • Unweighted — equal edges; Weighted — edges have numeric values
  • Connected — one component; Disconnected — multiple components
  • Cyclic — has cycles; Acyclic — no cycles (DAG if directed)

Three representations:

  • Adjacency matrix — O(V²) space, O(1) edge check — use for dense graphs
  • Adjacency list — O(V+E) space, O(deg) edge check — default for most algorithms
  • Edge list — O(E) space, O(E) edge check — use for edge-centric algorithms

Key formula: sum of all degrees = 2 × |E| (handshaking lemma — every edge contributes 1 to each endpoint's degree).

In the next topic you will explore Graph DFS and BFS — depth-first and breadth-first traversal of graphs, cycle detection, connected components, and shortest paths.

Suggested Quiz

What is the key structural difference between a graph and a tree?

1/6
Graph Basics & Terminology