DSA Tutorial
🔍

Sorting Algorithm Comparison

Complete Algorithm Comparison Table

Algorithm     Best       Average     Worst      Space   Stable  Adaptive  Online
────────────────────────────────────────────────────────────────────────────────
Bubble        O(n)       O(n²)       O(n²)      O(1)    ✓ Yes   ✓ Yes     ✓ Yes
Selection     O(n²)      O(n²)       O(n²)      O(1)    ✗ No    ✗ No      ✗ No
Insertion     O(n)       O(n²)       O(n²)      O(1)    ✓ Yes   ✓ Yes     ✓ Yes
Shell         O(n log n) O(n log²n)  O(n²)      O(1)    ✗ No    ✓ Yes     ✗ No
Merge         O(n log n) O(n log n)  O(n log n) O(n)    ✓ Yes   ✗ No      ✗ No
Quick         O(n log n) O(n log n)  O(n²)      O(logn) ✗ No    ✗ No      ✗ No
Heap          O(n log n) O(n log n)  O(n log n) O(1)    ✗ No    ✗ No      ✗ No
TimSort       O(n)       O(n log n)  O(n log n) O(n)    ✓ Yes   ✓ Yes     ✗ No
IntroSort     O(n log n) O(n log n)  O(n log n) O(logn) ✗ No    ✗ No      ✗ No
Counting      O(n+k)     O(n+k)      O(n+k)     O(n+k)  ✓ Yes   —         —
Radix         O(nk)      O(nk)       O(nk)      O(n+k)  ✓ Yes   —         —
Bucket        O(n+k)     O(n+k)      O(n²)      O(n+k)  ✓ Yes   —         —

Head-to-Head: The Simple Sorts

All three simple sorts are O(n²) average and worst case, but they differ meaningfully in real performance:

INSERTION SORT vs BUBBLE SORT:
  Both: O(n²) worst, O(n) best, stable, adaptive, O(1) space
  Insertion wins: each "move" is one assignment (shift), not three (swap)
                  — ~3× fewer memory writes for the same logical work
  Bubble wins: nothing. Insertion Sort strictly dominates Bubble Sort
               for all practical purposes.

INSERTION SORT vs SELECTION SORT:
  Insertion: O(n) best, O(n²) worst, stable, adaptive
  Selection: O(n²) all cases, unstable, non-adaptive, ≤n-1 swaps
  Insertion wins: adaptive (O(n) for nearly sorted), stable
  Selection wins: ≤n-1 writes regardless of input
                  → use when write cost >> read cost (flash memory, NVRAM)

SELECTION SORT vs BUBBLE SORT:
  Both: O(n²) all cases, O(1) space
  Selection: unstable, non-adaptive, ≤n-1 swaps
  Bubble: stable, adaptive (O(n) best with early exit), up to O(n²) swaps
  Selection wins: minimum writes
  Bubble wins: adaptive for nearly-sorted, stable

VERDICT:
  For small arrays in practice: Insertion Sort > Bubble Sort > Selection Sort
  Exception: write-limited hardware → Selection Sort wins

Head-to-Head: The Efficient Sorts

MERGE SORT vs QUICK SORT:
  Merge:  O(n log n) guaranteed, stable, O(n) space, great for linked lists
  Quick:  O(n log n) average (O(n²) worst with bad pivot), unstable, O(logn) space
  Merge wins: stability, guaranteed O(n log n), linked lists, external sort
  Quick wins: in-place (no auxiliary array), better cache locality in practice,
              ~2× faster than merge sort on typical random arrays
  Use Merge: when stability required, when worst-case guarantee needed,
             when sorting linked lists, for external sort
  Use Quick: when in-place required, when maximum average speed needed,
             when stability not required

MERGE SORT vs HEAP SORT:
  Merge:  stable, O(n) space, good cache (sequential access)
  Heap:   in-place O(1) space, unstable, poor cache (random heap access)
  Merge wins: stability, cache performance
  Heap wins: O(1) space with O(n log n) guarantee — no auxiliary array
  Use Heap: when in-place + guaranteed O(n log n) + no stability needed

QUICK SORT vs HEAP SORT:
  Quick: O(n log n) average, O(n²) worst, better cache, smaller constants
  Heap:  O(n log n) always, O(1) space, poor cache, larger constants
  Quick wins: real-world speed (~2-3× faster than heap in practice)
  Heap wins:  guaranteed O(n log n) worst case, in-place
  Use Heap: when quick sort's O(n²) worst case is unacceptable and
            in-place is required (e.g. real-time systems)

INTROSORT (C++ std::sort) — the pragmatic hybrid:
  Phase 1: Quick Sort (fast average case)
  Phase 2: If recursion depth > 2·log(n): switch to Heap Sort (O(n log n) guarantee)
  Phase 3: Insertion Sort for small subarrays (n < 16)
  Result:  Quick Sort speed + Heap Sort safety + Insertion Sort for small n
           No O(n²) worst case. Not stable.

Head-to-Head: Non-Comparison Sorts

COUNTING SORT:
  Complexity: O(n + k) where k = range of values
  Space:      O(k) for count array
  Stable:     ✓ Yes
  Best for:   integers in a small, known range [0, k]
  Worst case: k >> n (huge range with few elements) → wastes O(k) space

  Example good use: sort 10,000 exam scores in range [0, 100]
    k=100, n=10,000 → O(10,100) ≈ O(n)
  Example bad use: sort 100 GUIDs (128-bit integers)
    k=2^128 → completely impractical

RADIX SORT:
  Complexity: O(n × d) where d = number of digits/characters
  Space:      O(n + b) where b = base (10, 256, etc.)
  Stable:     ✓ Yes (requires stable per-digit sort)
  Best for:   fixed-length integers, fixed-length strings, phone numbers
  Limitation: element size must be fixed; comparisons between arbitrary types not supported

  LSD (Least Significant Digit first): standard, stable, easy to implement
  MSD (Most Significant Digit first): handles variable length, more complex

BUCKET SORT:
  Complexity: O(n + k) average where k = number of buckets
  Space:      O(n + k)
  Stable:     ✓ Yes (if inner sort per bucket is stable)
  Best for:   uniformly distributed floats in [0, 1)
  Worst case: all elements in one bucket → O(n²) for that bucket
  Limitation: requires knowledge of value distribution

When non-comparison sorts beat comparison sorts:
  Comparison sorts: Ω(n log n) lower bound — cannot do better
  Non-comparison:   beat O(n log n) by exploiting data properties
    → Counting Sort beats O(n log n) when k = O(n)
    → Radix Sort beats O(n log n) when d is constant and small
    → Bucket Sort beats O(n log n) for uniform distributions

Algorithm Selection Decision Framework

Question 1: Do you care about stability?
  YES →  Must use: Merge Sort, TimSort, Insertion Sort, Counting Sort, Radix Sort
  NO  →  Any algorithm is valid

Question 2: What is the data type?
  Integers in range [0, k], k small  → Counting Sort: O(n + k)
  Fixed-length integers or strings   → Radix Sort: O(n × d)
  Floats uniformly distributed       → Bucket Sort: O(n + k) average
  General objects with comparator    → Comparison sort (continue)

Question 3: What is n?
  n ≤ 10-15                → Insertion Sort: lowest overhead, no allocation
  n ≤ 50                  → Insertion Sort or Shell Sort
  n > 50, general use     → O(n log n) sort (continue)

Question 4: Memory constraint?
  O(1) space required     → Quick Sort or Heap Sort
    + no O(n²) worst case → Heap Sort or IntroSort
  O(n) space acceptable   → Merge Sort, TimSort (often fastest in practice)

Question 5: Input characteristics?
  Nearly sorted           → Insertion Sort (O(n)) or TimSort
  Many duplicates         → Quick Sort with 3-way partition, or TimSort
  Linked list             → Merge Sort (no random access needed)
  External data (disk)    → External Merge Sort
  Random, no constraints  → Quick Sort with random pivot or IntroSort

Question 6: Worst-case guarantee needed?
  YES → Merge Sort, Heap Sort, TimSort, IntroSort (all O(n log n) guaranteed)
  NO  → Quick Sort (O(n²) possible but extremely unlikely with random pivot)

QUICK REFERENCE:
  "Just sort an array in code"       → Use the built-in (TimSort/IntroSort)
  "Sort and need stability"          → Merge Sort or TimSort
  "Sort in-place, no stability"      → Quick Sort
  "Guaranteed fast, no stability"    → Heap Sort
  "Small n (≤ 15)"                   → Insertion Sort
  "Integers in small range"          → Counting Sort
  "External or disk sort"            → Merge Sort
  "Linked list"                      → Merge Sort
  "Nearly sorted data"               → Insertion Sort or TimSort

Built-In Sort: What Your Language Actually Uses

JAVA:
  Arrays.sort(int[])        → Dual-Pivot QuickSort   O(n log n) avg, unstable
  Arrays.sort(long[])       → Dual-Pivot QuickSort   O(n log n) avg, unstable
  Arrays.sort(Object[])     → TimSort                O(n log n) guar, stable
  Arrays.sort(T[], Cmp)     → TimSort                O(n log n) guar, stable
  Collections.sort(List)    → TimSort                O(n log n) guar, stable
  List.sort(Cmp)            → TimSort                O(n log n) guar, stable
  PriorityQueue             → Binary Heap            O(log n) per op

PYTHON:
  list.sort()               → TimSort                O(n) best, O(n log n) guar, stable
  sorted()                  → TimSort                same

C++:
  std::sort                 → IntroSort              O(n log n) guar, unstable
  std::stable_sort          → MergeSort              O(n log n) guar, stable
  std::partial_sort         → HeapSort               O(n log k)
  std::nth_element          → IntroSelect            O(n) avg

JAVASCRIPT:
  Array.prototype.sort      → TimSort (V8/modern)   O(n log n) guar, stable (ES2019+)
                              (old V8: QuickSort for n > 10)

Real-World Performance vs Theoretical Complexity

Theoretical O notation and real-world performance can diverge significantly:

WHY QUICK SORT BEATS MERGE SORT IN PRACTICE despite same O(n log n):

1. Cache locality
   Quick Sort: works in-place on contiguous array — elements accessed sequentially
   Merge Sort: reads from two separate regions, writes to third — more cache misses

2. Memory allocation
   Quick Sort: no auxiliary array — zero allocation overhead
   Merge Sort: allocates O(n) on every sort call

3. Constant factor
   Quick Sort inner loop: 1 comparison + conditional swap
   Merge Sort inner loop: 1 comparison + 1 copy + pointer increment

   Empirical: Quick Sort typically 2× faster than Merge Sort for random data.

WHY HEAP SORT IS SLOW IN PRACTICE despite O(n log n) guaranteed:
  Heap access pattern: index i → child at 2i+1 and 2i+2
  For large heaps, these are far apart in memory → constant cache misses
  Each sift-down operation traverses O(log n) levels of the heap,
  with each level access likely causing a cache miss.
  Result: Heap Sort is ~3-4× slower than Quick Sort on real hardware.

WHY TIMSORT IS FAST ON REAL DATA:
  Real-world data is rarely random. It contains partial order (runs).
  TimSort detects and exploits these natural runs:
    - A sorted run of k elements: O(k) to detect, 0 merges needed
    - A nearly-sorted array: O(n) total, far better than O(n log n)
    - A fully random array: O(n log n) — same as merge sort
  TimSort's minRun parameter (32-64) is tuned to match L1/L2 cache sizes.

Sorting on Different Data Structures

ARRAY:
  Best overall: Quick Sort (in-place, cache-friendly)
  Best stable:  TimSort (used by language built-ins)
  Most predictable: Merge Sort or Heap Sort
  Tiny arrays: Insertion Sort

LINKED LIST:
  Best: Merge Sort
  Reason: merge operation requires only pointer relinking — O(1) per merge step
          no random access needed (unlike quick sort's partition and heap sort)
  Quick Sort on linked list: O(n) to find mid + O(n) per swap → O(n² log n) total
  Heap Sort: requires array-based heap — not applicable to linked list

NEARLY SORTED DATA (k inversions, k small):
  Best: Insertion Sort — O(n + k) where k = inversions
  TimSort also excellent: detects existing sorted runs
  Merge Sort and Quick Sort: still O(n log n) — do not exploit existing order

DATA WITH MANY DUPLICATES:
  Problem: standard quick sort places all duplicates on one side → O(n²)
  Solution: Quick Sort with 3-way partition — O(n log n) or even O(n) for all-equal
  TimSort: naturally handles duplicates well
  Counting Sort: O(n) if range is small

STREAMING / ONLINE DATA:
  Best: Insertion Sort — inserts new element into sorted position as it arrives
  Each insertion: O(n) in worst case, O(log n) comparisons (with binary search)
  Total for n arrivals: O(n²) — worse than offline sorts but works online

EXTERNAL DATA (disk / too large for RAM):
  Algorithm: External Merge Sort (k-way merge with min-heap)
  Phase 1: Read chunks into RAM, sort each, write sorted runs to disk
  Phase 2: k-way merge using a min-heap of k file pointers
  Why merge sort: sequential disk access — merge reads two files sequentially
  Why NOT quick sort: random partition swaps → random disk seeks → catastrophic latency

Summary Comparison Cards

BUBBLE SORT
  Use when: teaching sorting concepts, or n < 10 and simplicity matters
  Never when: n > 50 in production
  Key insight: adaptive (O(n) best case), stable, but strictly worse than Insertion Sort

SELECTION SORT
  Use when: writes are extremely expensive (flash memory, remote I/O)
  Never when: stability required, or n > 50 with normal write cost
  Key insight: exactly ≤ n-1 writes regardless of input — minimum possible writes

INSERTION SORT
  Use when: n < 50, nearly sorted data, online/streaming, small subroutine in hybrid sorts
  Never when: n > 1000 with random input
  Key insight: O(n) best case, shifts not swaps, lowest constant factor for tiny arrays

MERGE SORT
  Use when: stability required, linked lists, external sort, guaranteed O(n log n)
  Never when: O(n) extra space is unacceptable
  Key insight: O(n) auxiliary array is the price of stability and guaranteed performance

QUICK SORT
  Use when: maximum average speed, in-place, no stability needed
  Never when: stability required, or O(n²) worst case is unacceptable without mitigations
  Key insight: use random pivot or median-of-three to avoid O(n²) in practice

HEAP SORT
  Use when: in-place + guaranteed O(n log n) + no stability needed
  Never when: cache performance matters (almost always prefer Quick/TimSort)
  Key insight: theoretically optimal but practically slow due to cache thrashing

TIMSORT
  Use when: the built-in sort (language default) — nearly always the right choice
  Never when: you need to implement it yourself (use merge sort instead)
  Key insight: exploits real-world partial order, O(n) for sorted/nearly-sorted data

COUNTING SORT
  Use when: integers in small range [0, k] with k ≈ O(n)
  Never when: k >> n, or data is not integers in a bounded range
  Key insight: breaks O(n log n) lower bound by not comparing elements

RADIX SORT
  Use when: fixed-length integers, stable required, k (digit count) is small
  Never when: variable-length keys, arbitrary comparison needed
  Key insight: sort digit by digit using stable counting sort — O(n × digits)

Choosing Between Built-In and Custom

ALWAYS use the built-in unless:
  ✓ You are sorting a linked list    → Custom Merge Sort
  ✓ You need external sort           → Custom External Merge Sort
  ✓ You have a non-comparison domain → Custom Counting/Radix/Bucket Sort
  ✓ You are implementing a sort for  → As specified
    an interview problem

Built-in advantages:
  + Tuned for your hardware (cache size, branch predictor)
  + Handles edge cases (empty, single-element, all-equal)
  + Extensively tested
  + TimSort/IntroSort outperforms naive implementations of the same algorithm
  + Dual-Pivot QuickSort (Java primitives) has ~10% better constant factor
    than single-pivot QuickSort for most inputs

When comparing custom implementations to built-ins:
  A textbook Merge Sort may be 2-4× slower than TimSort on real data
  A textbook QuickSort may be 2× slower than IntroSort
  Not because the algorithm is different, but because the built-in is
  optimised for cache line size, SIMD, branch prediction, and memory layout.

Common Mistakes

Choosing Quick Sort when stability is required. Quick Sort is not stable. If equal elements must preserve their original relative order (multi-key sorting, database records, UI rendering order), use Merge Sort, TimSort, or Insertion Sort. This is the most common algorithm selection mistake.

Using Bubble Sort when Insertion Sort would work equally well. Both are O(n²) and serve the same use cases, but Insertion Sort does shifts (one assignment) instead of swaps (three assignments) — it is strictly faster with no tradeoffs. If you reach for Bubble Sort, reach for Insertion Sort instead.

Applying Counting Sort when the range k is large. If you sort n=100 elements with values in [0, 10^9], Counting Sort allocates 10^9 integers — gigabytes of memory. Always check that k = O(n) before using Counting Sort.

Not using the built-in sort for production code. Language built-ins (TimSort, Dual-Pivot QuickSort, IntroSort) are optimised, tested, and hardware-tuned. A custom implementation of the "same" algorithm is almost always slower and more buggy. Only deviate from the built-in for the specific reasons listed above.

Choosing Heap Sort over Quick Sort for performance. Despite identical O(n log n) complexity, Heap Sort is 3-4× slower than Quick Sort in practice due to cache thrashing. Heap Sort is for the constrained case (in-place + guaranteed O(n log n)) — not for general-purpose sorting where speed is the goal.

Interview Questions

Q: When would you choose Merge Sort over Quick Sort?

Four situations: (1) Stability is required — Merge Sort is stable, Quick Sort is not. (2) A guaranteed O(n log n) worst case is needed — Merge Sort always runs in O(n log n); Quick Sort can degrade to O(n²) with bad pivot choices. (3) Sorting a linked list — Merge Sort needs only pointer traversal, no random access; Quick Sort's partition step is inefficient on linked lists. (4) External sorting — data too large for RAM — Merge Sort's sequential access pattern matches disk I/O perfectly.

Q: Why does Heap Sort have poor cache performance despite being O(n log n) in-place?

Heap Sort accesses array elements by index arithmetic: parent at i, children at 2i+1 and 2i+2. For a large heap, these indices jump across memory with no locality — elements accessed at level k of the heap are distance 2^k apart. Every sift-down operation traverses O(log n) levels, each potentially a cache miss. Compare to Quick Sort, which partitions around a pivot by scanning elements sequentially — cache-friendly, near-zero cache misses during a partition pass.

Q: For what type of data does TimSort give O(n) performance?

TimSort detects natural sorted runs in the input. If the input is already sorted (ascending or descending), it detects one run covering the entire array and performs no merges — O(n) to scan and confirm the run, zero merges. For nearly-sorted data with k out-of-place elements, TimSort handles it in O(n + k log k) — dramatically better than O(n log n) for small k. For fully random data, TimSort falls back to O(n log n) merge sort behaviour.

Summary

No single algorithm wins on all dimensions. The choice depends on constraints:

Three-property decision (pick any two optimally):

Fast (O(n log n) practical) | In-place | Stable
  Fast + In-place:  Quick Sort     (not stable, O(n²) worst case)
  Fast + Stable:    Merge Sort     (not in-place, O(n) space)
  In-place + Stable: Insertion Sort (not fast, O(n²))
  All three:        No comparison sort achieves all three optimally

Practical defaults:

  • General use: Use the language built-in — TimSort/IntroSort is the answer
  • Stability required: Merge Sort or language's stable sort
  • In-place + guaranteed O(n log n): Heap Sort
  • Tiny arrays (n ≤ 15): Insertion Sort
  • Integers in range [0,k], k small: Counting Sort
  • Fixed-width integers/strings: Radix Sort
  • Nearly sorted: Insertion Sort or TimSort
  • Linked list: Merge Sort
  • External data: External Merge Sort

Performance reality check:

  • Quick Sort ~2× faster than Merge Sort on random arrays (cache + no allocation)
  • Heap Sort ~3-4× slower than Quick Sort (cache thrashing)
  • TimSort ~same as Merge Sort on random, much faster on structured data
  • Insertion Sort fastest for n ≤ 15 regardless of asymptotic complexity

In the next topic, you will explore Time and Space Complexity — the definitive reference for every sorting algorithm's complexity analysis with derivations and space breakdown.

Suggested Quiz

You need to sort 10 million records from a database where equal salaries must preserve employee ID order. Which algorithm and why?

1/6
Sorting Algorithm Comparison