Sorting Algorithm Comparison
Complete Algorithm Comparison Table
Algorithm Best Average Worst Space Stable Adaptive Online ──────────────────────────────────────────────────────────────────────────────── Bubble O(n) O(n²) O(n²) O(1) ✓ Yes ✓ Yes ✓ Yes Selection O(n²) O(n²) O(n²) O(1) ✗ No ✗ No ✗ No Insertion O(n) O(n²) O(n²) O(1) ✓ Yes ✓ Yes ✓ Yes Shell O(n log n) O(n log²n) O(n²) O(1) ✗ No ✓ Yes ✗ No Merge O(n log n) O(n log n) O(n log n) O(n) ✓ Yes ✗ No ✗ No Quick O(n log n) O(n log n) O(n²) O(logn) ✗ No ✗ No ✗ No Heap O(n log n) O(n log n) O(n log n) O(1) ✗ No ✗ No ✗ No TimSort O(n) O(n log n) O(n log n) O(n) ✓ Yes ✓ Yes ✗ No IntroSort O(n log n) O(n log n) O(n log n) O(logn) ✗ No ✗ No ✗ No Counting O(n+k) O(n+k) O(n+k) O(n+k) ✓ Yes — — Radix O(nk) O(nk) O(nk) O(n+k) ✓ Yes — — Bucket O(n+k) O(n+k) O(n²) O(n+k) ✓ Yes — —
Head-to-Head: The Simple Sorts
All three simple sorts are O(n²) average and worst case, but they differ meaningfully in real performance:
INSERTION SORT vs BUBBLE SORT:
Both: O(n²) worst, O(n) best, stable, adaptive, O(1) space
Insertion wins: each "move" is one assignment (shift), not three (swap)
— ~3× fewer memory writes for the same logical work
Bubble wins: nothing. Insertion Sort strictly dominates Bubble Sort
for all practical purposes.
INSERTION SORT vs SELECTION SORT:
Insertion: O(n) best, O(n²) worst, stable, adaptive
Selection: O(n²) all cases, unstable, non-adaptive, ≤n-1 swaps
Insertion wins: adaptive (O(n) for nearly sorted), stable
Selection wins: ≤n-1 writes regardless of input
→ use when write cost >> read cost (flash memory, NVRAM)
SELECTION SORT vs BUBBLE SORT:
Both: O(n²) all cases, O(1) space
Selection: unstable, non-adaptive, ≤n-1 swaps
Bubble: stable, adaptive (O(n) best with early exit), up to O(n²) swaps
Selection wins: minimum writes
Bubble wins: adaptive for nearly-sorted, stable
VERDICT:
For small arrays in practice: Insertion Sort > Bubble Sort > Selection Sort
Exception: write-limited hardware → Selection Sort wins
Head-to-Head: The Efficient Sorts
MERGE SORT vs QUICK SORT:
Merge: O(n log n) guaranteed, stable, O(n) space, great for linked lists
Quick: O(n log n) average (O(n²) worst with bad pivot), unstable, O(logn) space
Merge wins: stability, guaranteed O(n log n), linked lists, external sort
Quick wins: in-place (no auxiliary array), better cache locality in practice,
~2× faster than merge sort on typical random arrays
Use Merge: when stability required, when worst-case guarantee needed,
when sorting linked lists, for external sort
Use Quick: when in-place required, when maximum average speed needed,
when stability not required
MERGE SORT vs HEAP SORT:
Merge: stable, O(n) space, good cache (sequential access)
Heap: in-place O(1) space, unstable, poor cache (random heap access)
Merge wins: stability, cache performance
Heap wins: O(1) space with O(n log n) guarantee — no auxiliary array
Use Heap: when in-place + guaranteed O(n log n) + no stability needed
QUICK SORT vs HEAP SORT:
Quick: O(n log n) average, O(n²) worst, better cache, smaller constants
Heap: O(n log n) always, O(1) space, poor cache, larger constants
Quick wins: real-world speed (~2-3× faster than heap in practice)
Heap wins: guaranteed O(n log n) worst case, in-place
Use Heap: when quick sort's O(n²) worst case is unacceptable and
in-place is required (e.g. real-time systems)
INTROSORT (C++ std::sort) — the pragmatic hybrid:
Phase 1: Quick Sort (fast average case)
Phase 2: If recursion depth > 2·log(n): switch to Heap Sort (O(n log n) guarantee)
Phase 3: Insertion Sort for small subarrays (n < 16)
Result: Quick Sort speed + Heap Sort safety + Insertion Sort for small n
No O(n²) worst case. Not stable.
Head-to-Head: Non-Comparison Sorts
COUNTING SORT:
Complexity: O(n + k) where k = range of values
Space: O(k) for count array
Stable: ✓ Yes
Best for: integers in a small, known range [0, k]
Worst case: k >> n (huge range with few elements) → wastes O(k) space
Example good use: sort 10,000 exam scores in range [0, 100]
k=100, n=10,000 → O(10,100) ≈ O(n)
Example bad use: sort 100 GUIDs (128-bit integers)
k=2^128 → completely impractical
RADIX SORT:
Complexity: O(n × d) where d = number of digits/characters
Space: O(n + b) where b = base (10, 256, etc.)
Stable: ✓ Yes (requires stable per-digit sort)
Best for: fixed-length integers, fixed-length strings, phone numbers
Limitation: element size must be fixed; comparisons between arbitrary types not supported
LSD (Least Significant Digit first): standard, stable, easy to implement
MSD (Most Significant Digit first): handles variable length, more complex
BUCKET SORT:
Complexity: O(n + k) average where k = number of buckets
Space: O(n + k)
Stable: ✓ Yes (if inner sort per bucket is stable)
Best for: uniformly distributed floats in [0, 1)
Worst case: all elements in one bucket → O(n²) for that bucket
Limitation: requires knowledge of value distribution
When non-comparison sorts beat comparison sorts:
Comparison sorts: Ω(n log n) lower bound — cannot do better
Non-comparison: beat O(n log n) by exploiting data properties
→ Counting Sort beats O(n log n) when k = O(n)
→ Radix Sort beats O(n log n) when d is constant and small
→ Bucket Sort beats O(n log n) for uniform distributions
Algorithm Selection Decision Framework
Question 1: Do you care about stability?
YES → Must use: Merge Sort, TimSort, Insertion Sort, Counting Sort, Radix Sort
NO → Any algorithm is valid
Question 2: What is the data type?
Integers in range [0, k], k small → Counting Sort: O(n + k)
Fixed-length integers or strings → Radix Sort: O(n × d)
Floats uniformly distributed → Bucket Sort: O(n + k) average
General objects with comparator → Comparison sort (continue)
Question 3: What is n?
n ≤ 10-15 → Insertion Sort: lowest overhead, no allocation
n ≤ 50 → Insertion Sort or Shell Sort
n > 50, general use → O(n log n) sort (continue)
Question 4: Memory constraint?
O(1) space required → Quick Sort or Heap Sort
+ no O(n²) worst case → Heap Sort or IntroSort
O(n) space acceptable → Merge Sort, TimSort (often fastest in practice)
Question 5: Input characteristics?
Nearly sorted → Insertion Sort (O(n)) or TimSort
Many duplicates → Quick Sort with 3-way partition, or TimSort
Linked list → Merge Sort (no random access needed)
External data (disk) → External Merge Sort
Random, no constraints → Quick Sort with random pivot or IntroSort
Question 6: Worst-case guarantee needed?
YES → Merge Sort, Heap Sort, TimSort, IntroSort (all O(n log n) guaranteed)
NO → Quick Sort (O(n²) possible but extremely unlikely with random pivot)
QUICK REFERENCE:
"Just sort an array in code" → Use the built-in (TimSort/IntroSort)
"Sort and need stability" → Merge Sort or TimSort
"Sort in-place, no stability" → Quick Sort
"Guaranteed fast, no stability" → Heap Sort
"Small n (≤ 15)" → Insertion Sort
"Integers in small range" → Counting Sort
"External or disk sort" → Merge Sort
"Linked list" → Merge Sort
"Nearly sorted data" → Insertion Sort or TimSort
Built-In Sort: What Your Language Actually Uses
JAVA:
Arrays.sort(int[]) → Dual-Pivot QuickSort O(n log n) avg, unstable
Arrays.sort(long[]) → Dual-Pivot QuickSort O(n log n) avg, unstable
Arrays.sort(Object[]) → TimSort O(n log n) guar, stable
Arrays.sort(T[], Cmp) → TimSort O(n log n) guar, stable
Collections.sort(List) → TimSort O(n log n) guar, stable
List.sort(Cmp) → TimSort O(n log n) guar, stable
PriorityQueue → Binary Heap O(log n) per op
PYTHON:
list.sort() → TimSort O(n) best, O(n log n) guar, stable
sorted() → TimSort same
C++:
std::sort → IntroSort O(n log n) guar, unstable
std::stable_sort → MergeSort O(n log n) guar, stable
std::partial_sort → HeapSort O(n log k)
std::nth_element → IntroSelect O(n) avg
JAVASCRIPT:
Array.prototype.sort → TimSort (V8/modern) O(n log n) guar, stable (ES2019+)
(old V8: QuickSort for n > 10)
Real-World Performance vs Theoretical Complexity
Theoretical O notation and real-world performance can diverge significantly:
WHY QUICK SORT BEATS MERGE SORT IN PRACTICE despite same O(n log n):
1. Cache locality
Quick Sort: works in-place on contiguous array — elements accessed sequentially
Merge Sort: reads from two separate regions, writes to third — more cache misses
2. Memory allocation
Quick Sort: no auxiliary array — zero allocation overhead
Merge Sort: allocates O(n) on every sort call
3. Constant factor
Quick Sort inner loop: 1 comparison + conditional swap
Merge Sort inner loop: 1 comparison + 1 copy + pointer increment
Empirical: Quick Sort typically 2× faster than Merge Sort for random data.
WHY HEAP SORT IS SLOW IN PRACTICE despite O(n log n) guaranteed:
Heap access pattern: index i → child at 2i+1 and 2i+2
For large heaps, these are far apart in memory → constant cache misses
Each sift-down operation traverses O(log n) levels of the heap,
with each level access likely causing a cache miss.
Result: Heap Sort is ~3-4× slower than Quick Sort on real hardware.
WHY TIMSORT IS FAST ON REAL DATA:
Real-world data is rarely random. It contains partial order (runs).
TimSort detects and exploits these natural runs:
- A sorted run of k elements: O(k) to detect, 0 merges needed
- A nearly-sorted array: O(n) total, far better than O(n log n)
- A fully random array: O(n log n) — same as merge sort
TimSort's minRun parameter (32-64) is tuned to match L1/L2 cache sizes.
Sorting on Different Data Structures
ARRAY:
Best overall: Quick Sort (in-place, cache-friendly)
Best stable: TimSort (used by language built-ins)
Most predictable: Merge Sort or Heap Sort
Tiny arrays: Insertion Sort
LINKED LIST:
Best: Merge Sort
Reason: merge operation requires only pointer relinking — O(1) per merge step
no random access needed (unlike quick sort's partition and heap sort)
Quick Sort on linked list: O(n) to find mid + O(n) per swap → O(n² log n) total
Heap Sort: requires array-based heap — not applicable to linked list
NEARLY SORTED DATA (k inversions, k small):
Best: Insertion Sort — O(n + k) where k = inversions
TimSort also excellent: detects existing sorted runs
Merge Sort and Quick Sort: still O(n log n) — do not exploit existing order
DATA WITH MANY DUPLICATES:
Problem: standard quick sort places all duplicates on one side → O(n²)
Solution: Quick Sort with 3-way partition — O(n log n) or even O(n) for all-equal
TimSort: naturally handles duplicates well
Counting Sort: O(n) if range is small
STREAMING / ONLINE DATA:
Best: Insertion Sort — inserts new element into sorted position as it arrives
Each insertion: O(n) in worst case, O(log n) comparisons (with binary search)
Total for n arrivals: O(n²) — worse than offline sorts but works online
EXTERNAL DATA (disk / too large for RAM):
Algorithm: External Merge Sort (k-way merge with min-heap)
Phase 1: Read chunks into RAM, sort each, write sorted runs to disk
Phase 2: k-way merge using a min-heap of k file pointers
Why merge sort: sequential disk access — merge reads two files sequentially
Why NOT quick sort: random partition swaps → random disk seeks → catastrophic latency
Summary Comparison Cards
BUBBLE SORT Use when: teaching sorting concepts, or n < 10 and simplicity matters Never when: n > 50 in production Key insight: adaptive (O(n) best case), stable, but strictly worse than Insertion Sort SELECTION SORT Use when: writes are extremely expensive (flash memory, remote I/O) Never when: stability required, or n > 50 with normal write cost Key insight: exactly ≤ n-1 writes regardless of input — minimum possible writes INSERTION SORT Use when: n < 50, nearly sorted data, online/streaming, small subroutine in hybrid sorts Never when: n > 1000 with random input Key insight: O(n) best case, shifts not swaps, lowest constant factor for tiny arrays MERGE SORT Use when: stability required, linked lists, external sort, guaranteed O(n log n) Never when: O(n) extra space is unacceptable Key insight: O(n) auxiliary array is the price of stability and guaranteed performance QUICK SORT Use when: maximum average speed, in-place, no stability needed Never when: stability required, or O(n²) worst case is unacceptable without mitigations Key insight: use random pivot or median-of-three to avoid O(n²) in practice HEAP SORT Use when: in-place + guaranteed O(n log n) + no stability needed Never when: cache performance matters (almost always prefer Quick/TimSort) Key insight: theoretically optimal but practically slow due to cache thrashing TIMSORT Use when: the built-in sort (language default) — nearly always the right choice Never when: you need to implement it yourself (use merge sort instead) Key insight: exploits real-world partial order, O(n) for sorted/nearly-sorted data COUNTING SORT Use when: integers in small range [0, k] with k ≈ O(n) Never when: k >> n, or data is not integers in a bounded range Key insight: breaks O(n log n) lower bound by not comparing elements RADIX SORT Use when: fixed-length integers, stable required, k (digit count) is small Never when: variable-length keys, arbitrary comparison needed Key insight: sort digit by digit using stable counting sort — O(n × digits)
Choosing Between Built-In and Custom
ALWAYS use the built-in unless:
✓ You are sorting a linked list → Custom Merge Sort
✓ You need external sort → Custom External Merge Sort
✓ You have a non-comparison domain → Custom Counting/Radix/Bucket Sort
✓ You are implementing a sort for → As specified
an interview problem
Built-in advantages:
+ Tuned for your hardware (cache size, branch predictor)
+ Handles edge cases (empty, single-element, all-equal)
+ Extensively tested
+ TimSort/IntroSort outperforms naive implementations of the same algorithm
+ Dual-Pivot QuickSort (Java primitives) has ~10% better constant factor
than single-pivot QuickSort for most inputs
When comparing custom implementations to built-ins:
A textbook Merge Sort may be 2-4× slower than TimSort on real data
A textbook QuickSort may be 2× slower than IntroSort
Not because the algorithm is different, but because the built-in is
optimised for cache line size, SIMD, branch prediction, and memory layout.
Common Mistakes
Choosing Quick Sort when stability is required. Quick Sort is not stable. If equal elements must preserve their original relative order (multi-key sorting, database records, UI rendering order), use Merge Sort, TimSort, or Insertion Sort. This is the most common algorithm selection mistake.
Using Bubble Sort when Insertion Sort would work equally well. Both are O(n²) and serve the same use cases, but Insertion Sort does shifts (one assignment) instead of swaps (three assignments) — it is strictly faster with no tradeoffs. If you reach for Bubble Sort, reach for Insertion Sort instead.
Applying Counting Sort when the range k is large. If you sort n=100 elements with values in [0, 10^9], Counting Sort allocates 10^9 integers — gigabytes of memory. Always check that k = O(n) before using Counting Sort.
Not using the built-in sort for production code. Language built-ins (TimSort, Dual-Pivot QuickSort, IntroSort) are optimised, tested, and hardware-tuned. A custom implementation of the "same" algorithm is almost always slower and more buggy. Only deviate from the built-in for the specific reasons listed above.
Choosing Heap Sort over Quick Sort for performance. Despite identical O(n log n) complexity, Heap Sort is 3-4× slower than Quick Sort in practice due to cache thrashing. Heap Sort is for the constrained case (in-place + guaranteed O(n log n)) — not for general-purpose sorting where speed is the goal.
Interview Questions
Q: When would you choose Merge Sort over Quick Sort?
Four situations: (1) Stability is required — Merge Sort is stable, Quick Sort is not. (2) A guaranteed O(n log n) worst case is needed — Merge Sort always runs in O(n log n); Quick Sort can degrade to O(n²) with bad pivot choices. (3) Sorting a linked list — Merge Sort needs only pointer traversal, no random access; Quick Sort's partition step is inefficient on linked lists. (4) External sorting — data too large for RAM — Merge Sort's sequential access pattern matches disk I/O perfectly.
Q: Why does Heap Sort have poor cache performance despite being O(n log n) in-place?
Heap Sort accesses array elements by index arithmetic: parent at i, children at 2i+1 and 2i+2. For a large heap, these indices jump across memory with no locality — elements accessed at level k of the heap are distance 2^k apart. Every sift-down operation traverses O(log n) levels, each potentially a cache miss. Compare to Quick Sort, which partitions around a pivot by scanning elements sequentially — cache-friendly, near-zero cache misses during a partition pass.
Q: For what type of data does TimSort give O(n) performance?
TimSort detects natural sorted runs in the input. If the input is already sorted (ascending or descending), it detects one run covering the entire array and performs no merges — O(n) to scan and confirm the run, zero merges. For nearly-sorted data with k out-of-place elements, TimSort handles it in O(n + k log k) — dramatically better than O(n log n) for small k. For fully random data, TimSort falls back to O(n log n) merge sort behaviour.
Summary
No single algorithm wins on all dimensions. The choice depends on constraints:
Three-property decision (pick any two optimally):
Fast (O(n log n) practical) | In-place | Stable Fast + In-place: Quick Sort (not stable, O(n²) worst case) Fast + Stable: Merge Sort (not in-place, O(n) space) In-place + Stable: Insertion Sort (not fast, O(n²)) All three: No comparison sort achieves all three optimally
Practical defaults:
- ›General use: Use the language built-in — TimSort/IntroSort is the answer
- ›Stability required: Merge Sort or language's stable sort
- ›In-place + guaranteed O(n log n): Heap Sort
- ›Tiny arrays (n ≤ 15): Insertion Sort
- ›Integers in range [0,k], k small: Counting Sort
- ›Fixed-width integers/strings: Radix Sort
- ›Nearly sorted: Insertion Sort or TimSort
- ›Linked list: Merge Sort
- ›External data: External Merge Sort
Performance reality check:
- ›Quick Sort ~2× faster than Merge Sort on random arrays (cache + no allocation)
- ›Heap Sort ~3-4× slower than Quick Sort (cache thrashing)
- ›TimSort ~same as Merge Sort on random, much faster on structured data
- ›Insertion Sort fastest for n ≤ 15 regardless of asymptotic complexity
In the next topic, you will explore Time and Space Complexity — the definitive reference for every sorting algorithm's complexity analysis with derivations and space breakdown.
You need to sort 10 million records from a database where equal salaries must preserve employee ID order. Which algorithm and why?