Portfolio/Learn/Hash Tables: O(1) Average Lookup Explained
Algorithms & DSBeginner

Hash Tables: O(1) Average Lookup Explained

Understand how hash tables work internally — hash functions, collision resolution, load factors — and master hash map patterns for solving problems efficiently.

14 min read
March 18, 2026
Hash TablesHash MapsSetsCountingPython

The Power of O(1) Lookup

Hash tables are arguably the most important data structure in practical programming. They provide average O(1) insertion, deletion, and lookup by mapping keys to array indices through a hash function. Python's dict and set, Java's HashMap, and JavaScript's Object/Map all use hash tables internally.

How Hashing Works

A hash function converts a key into an integer index. Good hash functions distribute keys uniformly across the table. When two keys hash to the same index (collision), we resolve it via chaining (linked lists at each bucket) or open addressing (probing for the next empty slot). Python uses open addressing with perturbation.

Average case is O(1) for all operations, but worst case is O(n) when all keys collide. In practice, with a good hash function and load factor < 0.75, collisions are rare.

Pattern: Frequency Counting

The most common hash map pattern: count occurrences of elements. This solves problems like finding duplicates, majority elements, anagram detection, and top-k frequent elements.

python
from collections import Counter

def top_k_frequent(nums: list[int], k: int) -> list[int]:
    """Return the k most frequent elements."""
    count = Counter(nums)
    # Bucket sort: index = frequency, value = list of nums with that freq
    buckets = [[] for _ in range(len(nums) + 1)]
    for num, freq in count.items():
        buckets[freq].append(num)

    result = []
    for i in range(len(buckets) - 1, 0, -1):
        for num in buckets[i]:
            result.append(num)
            if len(result) == k:
                return result
    return result


def group_anagrams(strs: list[str]) -> list[list[str]]:
    """Group strings that are anagrams of each other."""
    groups: dict[str, list[str]] = {}
    for s in strs:
        key = "".join(sorted(s))
        groups.setdefault(key, []).append(s)
    return list(groups.values())

Pattern: Two Sum with Hash Map

The classic two sum problem: find two numbers that add to a target. A hash map stores seen values, turning O(n²) brute force into O(n). This pattern extends to 3Sum, 4Sum, and subarray sum problems.

python
def two_sum(nums: list[int], target: int) -> list[int]:
    """Find indices of two numbers that sum to target."""
    seen = {}  # value -> index

    for i, num in enumerate(nums):
        complement = target - num
        if complement in seen:
            return [seen[complement], i]
        seen[num] = i

    return []

Pattern: Hash Set for Existence Checks

When you only care about existence (not counts), use a set. Sets provide O(1) membership testing. Classic applications: detecting duplicates, finding intersections, and longest consecutive sequence.

python
def longest_consecutive(nums: list[int]) -> int:
    """Find the length of the longest consecutive sequence."""
    num_set = set(nums)
    max_length = 0

    for num in num_set:
        # Only start counting from the beginning of a sequence
        if num - 1 not in num_set:
            length = 1
            while num + length in num_set:
                length += 1
            max_length = max(max_length, length)

    return max_length

When choosing between sorting (O(n log n)) and hashing (O(n) time, O(n) space): if memory is tight, sort in-place. If speed matters most, use a hash map. In interviews, discuss both trade-offs.