Portfolio/Learn/Heaps & Priority Queues: Top-K, Median, and Scheduling
Algorithms & DSIntermediate

Heaps & Priority Queues: Top-K, Median, and Scheduling

Learn how heaps maintain partial order for efficient min/max operations. Master the patterns: top-k elements, running median, merge k sorted lists, and task scheduling.

15 min read
March 13, 2026
HeapsPriority QueueTop-KMerge KPython

How Heaps Work

A binary heap is a complete binary tree where every parent is smaller (min-heap) or larger (max-heap) than its children. It's stored as an array where parent of index i is at (i-1)//2, and children are at 2i+1 and 2i+2. This gives O(log n) push/pop and O(1) peek at the min/max.

Python's heapq module implements a min-heap. For a max-heap, negate values: push -val and negate when popping. heapq.heappush() and heapq.heappop() are O(log n). heapq.heapify() is O(n).

Pattern: Top-K Elements

To find the k largest elements, maintain a min-heap of size k. For each new element, if it's larger than the heap's minimum, replace it. At the end, the heap contains the k largest. This runs in O(n log k) — better than O(n log n) sorting when k << n.

python
import heapq

def find_k_largest(nums: list[int], k: int) -> list[int]:
    """Find k largest elements using a min-heap of size k."""
    heap = nums[:k]
    heapq.heapify(heap)

    for num in nums[k:]:
        if num > heap[0]:
            heapq.heapreplace(heap, num)  # Pop min, push new

    return sorted(heap, reverse=True)


def kth_largest_element(nums: list[int], k: int) -> int:
    """Find the kth largest element."""
    # Min-heap of size k — root is the kth largest
    heap = []
    for num in nums:
        heapq.heappush(heap, num)
        if len(heap) > k:
            heapq.heappop(heap)
    return heap[0]

Pattern: Merge K Sorted Lists

When merging k sorted sequences, a heap of size k tracks the smallest available element from each list. Pop the smallest, advance that list's pointer, and push the next element. This runs in O(n log k) where n is the total number of elements.

python
def merge_k_sorted(lists: list[list[int]]) -> list[int]:
    """Merge k sorted lists into one sorted list."""
    heap = []
    for i, lst in enumerate(lists):
        if lst:
            heapq.heappush(heap, (lst[0], i, 0))

    result = []
    while heap:
        val, list_idx, elem_idx = heapq.heappop(heap)
        result.append(val)
        if elem_idx + 1 < len(lists[list_idx]):
            next_val = lists[list_idx][elem_idx + 1]
            heapq.heappush(heap, (next_val, list_idx, elem_idx + 1))

    return result

Pattern: Running Median (Two Heaps)

Maintain a max-heap for the lower half and a min-heap for the upper half. The median is always at the top of one or both heaps. Balance the heaps so their sizes differ by at most 1. Each insertion is O(log n).

python
class MedianFinder:
    def __init__(self):
        self.lo = []  # Max-heap (negate values) — lower half
        self.hi = []  # Min-heap — upper half

    def add_num(self, num: int):
        heapq.heappush(self.lo, -num)
        # Ensure max of lower <= min of upper
        heapq.heappush(self.hi, -heapq.heappop(self.lo))
        # Balance sizes: lo can have at most 1 more than hi
        if len(self.hi) > len(self.lo):
            heapq.heappush(self.lo, -heapq.heappop(self.hi))

    def find_median(self) -> float:
        if len(self.lo) > len(self.hi):
            return -self.lo[0]
        return (-self.lo[0] + self.hi[0]) / 2

The two-heap pattern also applies to scheduling problems and any scenario where you need to track a partition point in a stream of data.