DSA Tutorial
🔍

HashMap Basics

HashMap Basics

Hash maps (also called hash tables or dictionaries) are one of the most powerful and frequently used data structures in programming. They provide near-instant access to data through key-value pairs.

Goal: Understand how hash maps work, when to use them, and master common patterns for coding interviews.

Key Insight: Hash maps trade space for time - using O(n) extra space to achieve O(1) average-case lookup, insertion, and deletion. This makes them perfect for problems requiring frequent lookups.

What is a HashMap?

A hash map is a data structure that maps keys to values using a hash function.

Simple Analogy: Think of a library catalog. Instead of searching through every book (O(n)), you look up the book's location using its catalog number (O(1)).

Key-Value Pairs

Key → Value "Alice" → 25 "Bob" → 30 "Charlie" → 35

Operations:

  • Insert: Add a key-value pair
  • Search: Find value by key
  • Delete: Remove a key-value pair

Basic Usage

Python
1# Creating a hash map 2person_ages = {} 3 4# Inserting key-value pairs 5person_ages["Alice"] = 25 6person_ages["Bob"] = 30 7person_ages["Charlie"] = 35 8 9# Accessing values 10print(person_ages["Alice"]) # Output: 25 11 12# Checking if key exists 13if "Bob" in person_ages: 14 print("Bob's age:", person_ages["Bob"]) 15 16# Deleting a key 17del person_ages["Charlie"] 18 19# Iterating 20for name, age in person_ages.items(): 21 print(f"{name} is {age} years old")

How Hash Maps Work

The Hash Function

A hash function converts a key into an array index.

Key "Alice" → hash("Alice") → 3 Key "Bob" → hash("Bob") → 7 Key "Eve" → hash("Eve") → 3 ← Collision!

Properties of Good Hash Functions:

  1. Deterministic: Same key always produces same hash
  2. Fast: O(1) computation time
  3. Uniform: Distributes keys evenly across buckets

Internal Structure

Hash Table (array of buckets) Index: 0 1 2 3 4 5 [ ] [ ] [ ] [A] [ ] [B] ↓ ↓ Alice=25 Bob=30

Basic Operations Flow

Insert "Alice" → 25:

  1. Compute hash: hash("Alice") = 3
  2. Go to bucket 3
  3. Store key-value pair

Search "Alice":

  1. Compute hash: hash("Alice") = 3
  2. Go to bucket 3
  3. Return value: 25

Collision Handling

When two keys hash to the same index, we have a collision.

Method 1: Chaining (Linked Lists)

Each bucket stores a linked list of key-value pairs.

Index: 3 ↓ [Alice=25] → [Eve=22] → null
Python
1# Python handles collisions automatically 2# But here's a simple implementation concept 3 4class HashMapChaining: 5 def __init__(self, size=10): 6 self.size = size 7 self.buckets = [[] for _ in range(size)] 8 9 def _hash(self, key): 10 return hash(key) % self.size 11 12 def put(self, key, value): 13 index = self._hash(key) 14 # Update if key exists, else append 15 for i, (k, v) in enumerate(self.buckets[index]): 16 if k == key: 17 self.buckets[index][i] = (key, value) 18 return 19 self.buckets[index].append((key, value)) 20 21 def get(self, key): 22 index = self._hash(key) 23 for k, v in self.buckets[index]: 24 if k == key: 25 return v 26 return None 27 28# Usage 29hmap = HashMapChaining() 30hmap.put("Alice", 25) 31hmap.put("Bob", 30) 32print(hmap.get("Alice")) # Output: 25

Method 2: Open Addressing (Linear Probing)

When collision occurs, find the next empty slot.

Index: 0 1 2 3 4 [A] [ ] [ ] [B] [C] ↑ ↑ Bob Eve (collided at 3, moved to 4)
Python
1class HashMapOpenAddressing: 2 def __init__(self, size=10): 3 self.size = size 4 self.keys = [None] * size 5 self.values = [None] * size 6 7 def _hash(self, key): 8 return hash(key) % self.size 9 10 def put(self, key, value): 11 index = self._hash(key) 12 13 # Linear probing to find empty slot 14 while self.keys[index] is not None: 15 if self.keys[index] == key: 16 # Update existing key 17 self.values[index] = value 18 return 19 index = (index + 1) % self.size 20 21 # Insert at empty slot 22 self.keys[index] = key 23 self.values[index] = value 24 25 def get(self, key): 26 index = self._hash(key) 27 28 # Linear probing to find key 29 while self.keys[index] is not None: 30 if self.keys[index] == key: 31 return self.values[index] 32 index = (index + 1) % self.size 33 34 return None # Not found 35 36# Usage 37hmap = HashMapOpenAddressing() 38hmap.put("Alice", 25) 39hmap.put("Bob", 30) 40print(hmap.get("Alice")) # Output: 25

Time Complexity

OperationAverage CaseWorst Case
InsertO(1)O(n)
SearchO(1)O(n)
DeleteO(1)O(n)

Why Worst Case O(n)?

  • All keys hash to same bucket (poor hash function)
  • Requires traversing entire chain or probing entire table

In Practice:

  • Good hash function → Average case O(1)
  • Load factor < 0.75 → Performance stays good

Common HashMap Patterns

Pattern 1: Frequency Counter

Count occurrences of each element.

Python
1def count_frequencies(arr): 2 """Count frequency of each element""" 3 freq = {} 4 5 for num in arr: 6 freq[num] = freq.get(num, 0) + 1 7 8 return freq 9 10# Example 11arr = [1, 2, 2, 3, 3, 3] 12print(count_frequencies(arr)) 13# Output: {1: 1, 2: 2, 3: 3}

Pattern 2: Two Sum Problem

Find two numbers that add up to target.

Python
1def two_sum(nums, target): 2 """Find indices of two numbers that sum to target""" 3 seen = {} # value → index 4 5 for i, num in enumerate(nums): 6 complement = target - num 7 if complement in seen: 8 return [seen[complement], i] 9 seen[num] = i 10 11 return [] 12 13# Example 14nums = [2, 7, 11, 15] 15target = 9 16print(two_sum(nums, target)) # Output: [0, 1]

Pattern 3: Group Anagrams

Group strings that are anagrams of each other.

Python
1def group_anagrams(strs): 2 """Group anagrams together""" 3 anagram_map = {} 4 5 for s in strs: 6 # Sort string to use as key 7 key = ''.join(sorted(s)) 8 if key not in anagram_map: 9 anagram_map[key] = [] 10 anagram_map[key].append(s) 11 12 return list(anagram_map.values()) 13 14# Example 15strs = ["eat", "tea", "tan", "ate", "nat", "bat"] 16print(group_anagrams(strs)) 17# Output: [['eat', 'tea', 'ate'], ['tan', 'nat'], ['bat']]

Pattern 4: Longest Substring Without Repeating Characters

Find length of longest substring without duplicates.

Python
1def longest_unique_substring(s): 2 """Find longest substring without repeating characters""" 3 char_index = {} # character → last seen index 4 max_length = 0 5 start = 0 6 7 for end in range(len(s)): 8 char = s[end] 9 10 # If character seen and in current window 11 if char in char_index and char_index[char] >= start: 12 start = char_index[char] + 1 13 14 char_index[char] = end 15 max_length = max(max_length, end - start + 1) 16 17 return max_length 18 19# Example 20s = "abcabcbb" 21print(longest_unique_substring(s)) # Output: 3 ("abc")

HashMap vs Array

FeatureHashMapArray
Access by indexO(n)O(1)
Access by keyO(1)O(n)
SpaceO(n) extraO(1) extra
OrderedNoYes
Use whenNeed fast lookup by keyNeed ordered data

When to Use HashMap

✅ Use HashMap When:

  1. Need fast lookups - Finding elements in O(1)
  2. Counting frequencies - Track occurrences
  3. Caching/memoization - Store computed results
  4. Finding pairs - Two sum, complement problems
  5. Grouping items - Group anagrams, etc.

❌ Don't Use HashMap When:

  1. Need ordering - Use sorted containers instead
  2. Memory constrained - HashMap uses extra space
  3. Keys are integers in small range - Use array instead
  4. Need to iterate frequently - Arrays are faster

Key Takeaways

HashMap Essentials:

Definition: Data structure mapping keys to values using hash function

Core Operations:

  • Insert: map[key] = value - O(1) average
  • Search: value = map[key] - O(1) average
  • Delete: del map[key] - O(1) average

How It Works:

  1. Hash function converts key to index
  2. Store key-value pair at that index
  3. Handle collisions with chaining or open addressing

Collision Handling:

  • Chaining: Each bucket stores a linked list
  • Open Addressing: Find next empty slot

Time Complexity:

  • Average: O(1) for all operations
  • Worst: O(n) with poor hash function

Common Patterns:

  • Frequency counter
  • Two sum / finding pairs
  • Group by key (anagrams)
  • Sliding window with uniqueness

When to Use:

  • Fast lookups by key needed
  • Counting/grouping elements
  • Caching results
  • Finding complements/pairs

Remember:

  • HashMap trades space for time
  • Good hash function is crucial
  • Average O(1) only with good hash function
  • Load factor affects performance

Interview Tip: Hash maps solve most "find in O(1)" problems!

What's Next?

Now that you understand hash maps:

  1. HashSet - Set implementation using hash map
  2. Advanced Hashing - Rolling hash, custom hash functions
  3. Practice Problems - Master through practice

Practice Problems

Problem 1: Valid Anagram

Check if two strings are anagrams using a hash map.

Problem 2: First Unique Character

Find first non-repeating character in a string.

Problem 3: Subarray Sum Equals K

Find number of subarrays with sum equal to K.

Problem 4: Longest Consecutive Sequence

Find longest sequence of consecutive numbers in unsorted array.

Congratulations! You now understand hash maps and can use them to solve problems efficiently!

HashMap Basics