DSA Tutorial
🔍

Sorting Basics

What Is Sorting?

Sorting is the process of arranging elements in a collection into a defined order — typically ascending or descending for numbers, or lexicographic for strings.

Unsorted: [64, 25, 12, 22, 11]
Sorted:   [11, 12, 22, 25, 64]   ← ascending

Unsorted: ["banana", "apple", "cherry"]
Sorted:   ["apple", "banana", "cherry"]   ← lexicographic (dictionary order)

Sorting is one of the most fundamental operations in computer science. It is the prerequisite for binary search, the foundation for efficient algorithms like merge intervals and two-pointer, and appears implicitly in databases, operating systems, and rendering pipelines.

Why Sorting Matters

Without sorting:              With sorted data:
  Search:    O(n)               Binary search:    O(log n)
  Find min:  O(n)               Read arr[0]:      O(1)
  Duplicates: O(n²) naive       Two-pointer:      O(n)
  Merge data: O(n×m)            Merge two arrays: O(n+m)
  Median:    O(n) (QuickSelect) Read arr[n/2]:    O(1)

Many problems reduce to: sort first, then apply a O(n) or O(log n) technique.
The sorting cost is paid once; subsequent operations are dramatically faster.

Key Vocabulary

Ascending order
  Smallest to largest: [1, 3, 5, 7, 9]
  Each element ≤ the next element.

Descending order
  Largest to smallest: [9, 7, 5, 3, 1]
  Each element ≥ the next element.

Lexicographic / Alphabetical order
  Dictionary order for strings: "apple" < "banana" < "cherry"
  Compared character by character left to right.

Comparator / Key function
  A function that defines the ordering rule.
  Example: sort Person objects by age, then by name alphabetically.

In-place sort
  Sorts within the original array — uses O(1) or O(log n) extra space.
  Examples: Bubble Sort, Selection Sort, Insertion Sort, Quick Sort (O(log n) stack).

Out-of-place sort
  Allocates a new auxiliary array — uses O(n) extra space.
  Examples: Merge Sort, Counting Sort, Radix Sort.

Stable sort
  Equal elements keep their original relative order.
  Example: if Alice and Bob both have age=30, and Alice appeared first,
           Alice appears first in the sorted output too.
  Stable: Merge Sort, Insertion Sort, Bubble Sort, TimSort.

Unstable sort
  Equal elements may appear in any order after sorting.
  Unstable: Selection Sort, Quick Sort, Heap Sort.

Adaptive sort
  Performs better when input is already partially sorted.
  Examples: Insertion Sort (O(n) for nearly sorted), TimSort.

Online sort
  Can sort a stream of data as it arrives — no need to see all elements first.
  Example: Insertion Sort.

Comparison-based sort
  Determines order by comparing pairs of elements.
  All comparison sorts have a lower bound of Ω(n log n).
  Examples: Merge Sort, Quick Sort, Heap Sort.

Non-comparison sort
  Uses properties of the data (digit values, frequencies) — not direct comparison.
  Can beat O(n log n) for specific data types.
  Examples: Counting Sort O(n+k), Radix Sort O(nk), Bucket Sort O(n).

In-Place vs Out-of-Place

IN-PLACE (O(1) extra space):
  Sort happens within the original array.
  Elements are swapped/shifted in-place.

  arr = [3, 1, 4, 1, 5]
         ↕   ↕           ← swap elements directly
  arr = [1, 1, 3, 4, 5]  ← sorted in the same array

  Pros: memory-efficient, no allocation overhead
  Cons: harder to implement, often destructive (original lost)

OUT-OF-PLACE (O(n) extra space):
  Sort writes results to a new array.

  arr = [3, 1, 4, 1, 5]   ← original unchanged
  aux = [1, 1, 3, 4, 5]   ← sorted copy

  Pros: original preserved, often simpler logic (Merge Sort)
  Cons: uses extra memory

Borderline case:
  Quick Sort uses O(log n) stack space (recursion depth) —
  sometimes called "in-place" because the array itself is not copied,
  though it is not strictly O(1) extra space.

Stable vs Unstable Sorting

Input: [(Alice, 30), (Bob, 25), (Charlie, 30), (Diana, 25)]
       Sort by age ascending.

STABLE SORT:
  [(Bob, 25), (Diana, 25), (Alice, 30), (Charlie, 30)]
   ^^^^^^^^^^^^^^^^^^^      ^^^^^^^^^^^^^^^^^^^^^^^^^^^
   Bob before Diana        Alice before Charlie
   (original order         (original order
    preserved for 25s)      preserved for 30s)

UNSTABLE SORT:
  [(Diana, 25), (Bob, 25), (Charlie, 30), (Alice, 30)]   ← possible output
   ^^^^^^^^^^^ order of 25s REVERSED    ^^^^^^^^^^^^ order of 30s REVERSED

When stability matters:
  Multi-key sorting: sort by last name, then by first name.
    If first-name sort is stable, last names sorted first are preserved.
  Database records: sorting rows by one column while preserving relative
    order from a previous sort on another column.
  UI rendering: sorting a list by score while keeping tied items in
    original insertion order (chronological).

When stability does NOT matter:
  Sorting unique primitives (integers, floats with no ties).
  Any problem where equal elements are truly interchangeable.
  Maximum performance is the priority.

Stable sorts:   Merge Sort, Insertion Sort, Bubble Sort, TimSort
Unstable sorts: Selection Sort, Quick Sort, Heap Sort

Comparison vs Non-Comparison Sorting

COMPARISON-BASED SORT:
  Determines order only by comparing pairs: is a < b?
  Examples: Merge Sort, Quick Sort, Heap Sort, Bubble Sort

  Lower bound: Ω(n log n)
    Proof: n! possible orderings. Each comparison has 2 outcomes.
    Decision tree needs at least log₂(n!) ≈ n log n leaves.
    Cannot do better without additional assumptions about the data.

NON-COMPARISON SORT:
  Uses properties of the data's values — not pairwise comparisons.
  Can beat O(n log n) for specific domains.

  Counting Sort:  O(n + k)   where k = range of values
    Counts frequency of each value, accumulates counts.
    Best for integers in a small range [0, k].
    Space: O(k) — bad if k is huge.

  Radix Sort:     O(n × d)   where d = number of digits
    Sorts digit by digit from least significant to most significant.
    Uses a stable sort (counting sort) per digit pass.
    Best for fixed-length integers or strings.

  Bucket Sort:    O(n + k)   average
    Distributes elements into buckets, sorts each bucket.
    Best for uniformly distributed real numbers in [0, 1].

When to use non-comparison sorts:
  ✓ Integers in a known range [0, k] with k << n² → Counting Sort
  ✓ Fixed-length integers (phone numbers, IDs) → Radix Sort
  ✓ Floating-point values uniformly distributed → Bucket Sort
  ✗ General objects with arbitrary comparators → Comparison sort

The O(n log n) Lower Bound

Why can't we do better than O(n log n) for comparison sorts?

There are n! ways to arrange n elements.
A sorting algorithm must distinguish between all n! orderings.
Each comparison has 2 possible outcomes (a < b or a ≥ b).

The algorithm's decisions form a binary decision tree.
To reach any of n! orderings, the tree must have ≥ n! leaves.
A binary tree with ≥ n! leaves has height ≥ log₂(n!).

By Stirling's approximation:
  log₂(n!) ≈ n log₂ n − n log₂ e ≈ n log₂ n

Therefore: any comparison-based sort needs Ω(n log n) comparisons.

Merge Sort and Heap Sort achieve this lower bound exactly.
Quick Sort achieves it on average but not in the worst case.
TimSort achieves it and additionally exploits pre-existing order.

Built-in Sort Functions

Every major language provides a fast, production-quality sort function. In interviews and production code, always use the built-in unless explicitly asked to implement a sorting algorithm.

Java
1import java.util.*; 2 3public class BuiltinSort { 4 5 public static void main(String[] args) { 6 // ── PRIMITIVE ARRAYS — Arrays.sort uses Dual-Pivot QuickSort ────── 7 int[] nums = {64, 25, 12, 22, 11}; 8 Arrays.sort(nums); // Ascending — in-place, O(n log n) 9 System.out.println(Arrays.toString(nums)); // [11, 12, 22, 25, 64] 10 11 // Sort a subrange [fromIndex, toIndex) 12 int[] partial = {5, 3, 8, 1, 9, 2}; 13 Arrays.sort(partial, 1, 4); // Sort indices 1,2,3 only 14 System.out.println(Arrays.toString(partial)); // [5, 1, 3, 8, 9, 2] 15 16 // Reverse sort for primitives — no direct method, sort then reverse 17 int[] rev = {3, 1, 4, 1, 5}; 18 Arrays.sort(rev); 19 for (int i = 0, j = rev.length-1; i < j; i++, j--) { 20 int tmp = rev[i]; rev[i] = rev[j]; rev[j] = tmp; 21 } 22 System.out.println(Arrays.toString(rev)); // [5, 4, 3, 1, 1] 23 24 // ── OBJECT ARRAYS — Arrays.sort uses TimSort (STABLE) ──────────── 25 String[] words = {"banana", "apple", "cherry", "date"}; 26 Arrays.sort(words); // Natural order 27 System.out.println(Arrays.toString(words)); // [apple, banana, cherry, date] 28 29 // Custom comparator — sort by string length 30 String[] byLen = {"banana", "apple", "cherry", "fig"}; 31 Arrays.sort(byLen, Comparator.comparingInt(String::length)); 32 System.out.println(Arrays.toString(byLen)); // [fig, apple, banana, cherry] 33 34 // Reverse order for objects 35 Integer[] ints = {3, 1, 4, 1, 5}; 36 Arrays.sort(ints, Collections.reverseOrder()); 37 System.out.println(Arrays.toString(ints)); // [5, 4, 3, 1, 1] 38 39 // ── LISTS — Collections.sort uses TimSort (STABLE) ─────────────── 40 List<Integer> list = new ArrayList<>(Arrays.asList(5, 2, 8, 1, 9)); 41 Collections.sort(list); 42 System.out.println(list); // [1, 2, 5, 8, 9] 43 44 // Or use List.sort (preferred in Java 8+) 45 list.sort(Comparator.reverseOrder()); 46 System.out.println(list); // [9, 8, 5, 2, 1] 47 48 // Sort 2D array — by first element 49 int[][] pairs = {{3,1},{1,5},{2,3},{1,2}}; 50 Arrays.sort(pairs, (a, b) -> a[0] != b[0] ? a[0] - b[0] : a[1] - b[1]); 51 for (int[] p : pairs) System.out.print(Arrays.toString(p) + " "); 52 // [1,2] [1,5] [2,3] [3,1] 53 System.out.println(); 54 } 55}
Output (Java):
[11, 12, 22, 25, 64]
[5, 1, 3, 8, 9, 2]
[fig, apple, banana, cherry]
[(Bob, 25), (Diana, 25), (Alice, 30), (Charlie, 30)]
[1,2] [1,5] [2,3] [3,1]

Built-in Sort Algorithm Reference

LanguageFunctionAlgorithmStableSpaceNotes
Java (primitives)Arrays.sort(int[])Dual-Pivot QuickSort✗ NoO(log n)Fastest for primitive arrays
Java (objects)Arrays.sort(Object[])TimSort✓ YesO(n)Stable, guaranteed O(n log n)
Java (objects)Collections.sort(List)TimSort✓ YesO(n)Delegates to Arrays.sort
Pythonlist.sort(), sorted()TimSort✓ YesO(n)Always stable
C++std::sortIntroSort✗ NoO(log n)Hybrid: quicksort + heapsort + insertion
C++std::stable_sortMerge Sort✓ YesO(n) or O(n log² n)Stable, slightly slower
JavaScriptArray.prototype.sortTimSort (V8)✓ Yes (ES2019+)O(n)Stable since ECMAScript 2019

Sorting Algorithms Overview

The six algorithms covered in dedicated pages, with their key properties at a glance:

Algorithm       Best     Average   Worst    Space   Stable  Adaptive
──────────────────────────────────────────────────────────────────────
Bubble Sort     O(n)     O(n²)     O(n²)    O(1)    ✓ Yes   ✓ Yes
Selection Sort  O(n²)    O(n²)     O(n²)    O(1)    ✗ No    ✗ No
Insertion Sort  O(n)     O(n²)     O(n²)    O(1)    ✓ Yes   ✓ Yes
Merge Sort      O(n logn) O(n logn) O(n logn) O(n)  ✓ Yes   ✗ No
Quick Sort      O(n logn) O(n logn) O(n²)   O(logn) ✗ No    ✗ No
Heap Sort       O(n logn) O(n logn) O(n logn) O(1)  ✗ No    ✗ No
TimSort         O(n)     O(n logn) O(n logn) O(n)   ✓ Yes   ✓ Yes
Counting Sort   O(n+k)   O(n+k)    O(n+k)  O(n+k)  ✓ Yes   —
Radix Sort      O(nk)    O(nk)     O(nk)   O(n+k)  ✓ Yes   —

Simple sorts (O(n²)): Bubble, Selection, Insertion — used for small arrays (n < ~50) or educational purposes. Insertion Sort is the exception: it is competitive for nearly-sorted data and is used internally by TimSort and IntroSort for small subarrays.

Efficient sorts (O(n log n)): Merge Sort (stable, guaranteed), Quick Sort (unstable, fastest in practice), Heap Sort (in-place, guaranteed but poor cache behaviour).

Special sorts (sub-O(n log n)): Counting, Radix, Bucket — only applicable to specific data types; beat the O(n log n) lower bound by exploiting data properties rather than comparisons.

When to Use Each Sort

Use the built-in sort (always first choice in production and interviews):
  Java:  Arrays.sort for primitives, Collections.sort/List.sort for objects
  Python: list.sort() or sorted()
  C++:   std::sort (non-stable), std::stable_sort (stable)
  JS:    Array.prototype.sort with (a,b)=>a-b comparator

Use a specific algorithm when:
  Insertion Sort:  input is nearly sorted (O(n) best case), or n < 50
  Merge Sort:      stability required + out-of-place is acceptable
  Quick Sort:      in-place, average-case performance critical, no stability needed
  Heap Sort:       guaranteed O(n log n) in-place, when quick sort's O(n²) worst case is unacceptable
  Counting Sort:   integers in range [0, k] where k is small (k << n²)
  Radix Sort:      fixed-width integers or strings, want better than O(n log n)

NEVER use for large n in production:
  Bubble Sort:     O(n²) — only for teaching comparison-based sort concepts
  Selection Sort:  O(n²) — never adaptive, never competitive

Choosing a Sort: Quick Decision Guide

Question 1: Do you need a stable sort?
  Yes → Merge Sort, TimSort, Insertion Sort
  No  → Any sort (Quick Sort, Heap Sort often fastest)

Question 2: Is memory a constraint?
  O(1) space required → Insertion Sort, Selection Sort, Quick Sort, Heap Sort
  O(n) space acceptable → Merge Sort (simpler, stable)

Question 3: What is the data type?
  Integers in small range → Counting Sort O(n+k)
  Fixed-length integers    → Radix Sort O(nk)
  General objects          → Comparison sort (Merge, Quick, TimSort)

Question 4: What is the input pattern?
  Nearly sorted → Insertion Sort (O(n)), TimSort
  Random        → Quick Sort (best average constant)
  Worst-case guarantee → Merge Sort, Heap Sort

Question 5: How large is n?
  n < 20         → Insertion Sort (low overhead)
  20 ≤ n < 1000  → Any O(n log n), or Insertion Sort if nearly sorted
  n > 1000       → O(n log n) required — use built-in

Common Mistakes

Not providing a comparator for JavaScript numeric sort. [10, 9, 2].sort() produces [10, 2, 9] — JavaScript's default comparator converts elements to strings and sorts lexicographically. Always pass (a, b) => a - b for ascending numeric sort. This is the most common JavaScript sorting bug.

Assuming Java's Arrays.sort is stable for primitive arrays. Arrays.sort(int[]) uses Dual-Pivot QuickSort — it is NOT stable. If you need stable sort for an int[], convert to Integer[] and use Arrays.sort(Integer[], comparator), which uses TimSort.

Mutating the original array when only a sorted copy is needed. In Python, arr.sort() modifies arr in-place; sorted(arr) returns a new sorted list leaving arr unchanged. In JavaScript, arr.sort() mutates arr; use [...arr].sort(...) for an immutable sort.

Using subtraction a - b as a comparator when values can overflow. (a, b) => a - b works correctly for integers in the range [-2^30, 2^30]. For very large or very small integers, a - b can overflow in Java (int subtraction) or give incorrect results. Use Integer.compare(a, b) in Java or a < b ? -1 : a > b ? 1 : 0 for safety.

Interview Questions

Q: What is the difference between a stable and an unstable sort, and when does it matter?

A stable sort guarantees that elements with equal keys appear in the same relative order in the output as in the input. An unstable sort may reorder equal elements arbitrarily. Stability matters for multi-key sorting: if you sort employee records by salary (stable), employees with equal salaries retain their original relative order — which might be a previous sort by name or ID. Unstable sorts are fine when equal elements are truly interchangeable, such as sorting a list of unique integers.

Q: Why can no comparison-based sort do better than O(n log n)?

The information-theoretic argument: there are n! possible orderings of n elements. A sorting algorithm must identify the correct one. Each comparison has two outcomes, building a binary decision tree. To distinguish n! orderings, the tree needs at least n! leaves, requiring height at least log₂(n!). By Stirling's approximation, log₂(n!) ≈ n log n. Therefore any comparison-based sort needs at least Ω(n log n) comparisons in the worst case. Merge Sort and Heap Sort achieve this bound exactly.

Q: When would you use a non-comparison sort, and what is its limitation?

Non-comparison sorts (Counting Sort, Radix Sort, Bucket Sort) beat O(n log n) by using properties of the data values rather than pairwise comparisons. Use Counting Sort for integers in a small range [0, k] — O(n + k). Use Radix Sort for fixed-length integers — O(n × d). The limitation: these sorts only work on specific data types. They cannot handle arbitrary objects with custom comparators. The space requirements also grow with k or d.

FAQs

Why does Python always use TimSort?

TimSort (invented by Tim Peters for Python 2.2) exploits natural runs of already-sorted data, making it O(n) for sorted or nearly-sorted inputs while maintaining O(n log n) worst case. It is stable, which simplifies multi-key sorting with Python's key functions. Java adopted it for object arrays (Java 7+), and JavaScript engines (V8, SpiderMonkey) now implement it for Array.prototype.sort. Its real-world performance on typical datasets outperforms purely asymptotic competitors like Heap Sort.

Is Quick Sort always faster than Merge Sort?

Quick Sort is faster in practice on average due to better cache locality (in-place, sequential access) and lower constant factors. However, Merge Sort has better guarantees — O(n log n) worst case, always stable — and outperforms Quick Sort on nearly-sorted data and adversarial inputs. TimSort (used in most language runtimes) combines both: merge sort for structure, with fallback to insertion sort for small runs.

Can you sort a linked list efficiently?

Yes — Merge Sort is ideal for linked lists. It does not need random access (unlike Quick Sort's partition step), runs in O(n log n), and can be implemented with O(1) extra space by relinking nodes. Quick Sort on linked lists requires O(n) to find the middle for partitioning — poor cache behaviour. Most interview problems asking to sort a linked list expect Merge Sort.

Summary

Sorting arranges a collection into a defined order. It is the prerequisite for binary search, two-pointer, and most interval-based algorithms.

Key properties every sorting algorithm is judged on:

  • Time complexity — best, average, worst case
  • Space complexity — O(1) in-place vs O(n) out-of-place
  • Stability — equal elements preserve their relative order
  • Adaptiveness — performs better on already-sorted input

The O(n log n) lower bound is fundamental — no comparison-based sort can do better in the worst case. Non-comparison sorts (Counting, Radix) can beat this for specific data types by exploiting value properties rather than pairwise comparisons.

Built-in sort recommendations:

  • Java: Arrays.sort for primitives (Dual-Pivot QuickSort), Collections.sort / List.sort for objects (TimSort — stable)
  • Python: list.sort() or sorted() — always TimSort, always stable
  • C++: std::sort (IntroSort — unstable), std::stable_sort (stable)
  • JavaScript: arr.sort((a,b) => a-b) — always provide a numeric comparator; default is lexicographic

In the next topic, you will explore Bubble Sort — the simplest exchange-based sort, its O(n) best case with the early-exit optimisation, and why it is the entry point for understanding all comparison-based sorts.

Suggested Quiz

What is the minimum number of comparisons needed to sort an array of n elements in the worst case?

1/6
Sorting Basics