Find the Median of an Array in Python: Sorted Approach and QuickSelect

The median of a dataset is its middle value when sorted. For an odd-length list, that is the value at position n // 2. For an even-length list, it is the average of the two middle values. Simple definition — but the implementations range from a one-liner to a genuinely interesting algorithm.

Approach 1: Sort and Index

Sort the array, then compute the middle position. This is correct, readable, and O(n log n).

def find_median(arr):
    """Return the median of a list. Handles both odd and even length."""
    if not arr:
        raise ValueError("Cannot find median of an empty list")

    sorted_arr = sorted(arr)   # returns a new sorted list, arr unchanged
    n = len(sorted_arr)
    mid = n // 2

    if n % 2 == 1:
        # Odd length: exact middle element
        return sorted_arr[mid]
    else:
        # Even length: average of two middle elements
        return (sorted_arr[mid - 1] + sorted_arr[mid]) / 2

# Odd-length example
print(find_median([7, 2, 5, 1, 9, 3, 7]))
# Output: 5

# Even-length example
print(find_median([3, 1, 4, 1, 5, 9]))
# Output: 3.5

For most practical purposes, this is all you need. Python’s statistics library also has statistics.median() if you want a one-liner.

Using the statistics Module

import statistics

data = [7, 2, 5, 1, 9, 3, 7]
print(statistics.median(data))
# Output: 5

# For integer median (no averaging for even-length)
print(statistics.median_low(data))   # lower of the two middle values
print(statistics.median_high(data))  # higher of the two middle values

Approach 2: QuickSelect — O(n) Average Time

Sorting everything just to find one element is wasteful. QuickSelect finds the kth smallest element (and thus the median) without fully sorting the array. Average time: O(n). Worst case: O(n²), but randomised pivoting makes that unlikely.

import random

def quickselect(arr, k):
    """Return the kth smallest element (0-indexed) using QuickSelect."""
    if len(arr) == 1:
        return arr[0]

    pivot = random.choice(arr)   # random pivot reduces worst-case probability

    smaller = [x for x in arr if x < pivot]
    equal   = [x for x in arr if x == pivot]
    larger  = [x for x in arr if x > pivot]

    if k < len(smaller):
        return quickselect(smaller, k)
    elif k < len(smaller) + len(equal):
        return pivot
    else:
        return quickselect(larger, k - len(smaller) - len(equal))


def find_median_quickselect(arr):
    """Find median using QuickSelect — O(n) average time."""
    if not arr:
        raise ValueError("Cannot find median of an empty list")

    n = len(arr)
    if n % 2 == 1:
        return quickselect(arr, n // 2)
    else:
        lo = quickselect(arr, n // 2 - 1)
        hi = quickselect(arr, n // 2)
        return (lo + hi) / 2

data = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]
print(find_median_quickselect(data))
# Output: 4.0

The QuickSelect approach is more useful when you only need the median from a large dataset and want to avoid the O(n log n) sorting cost.

Edge Cases Worth Testing

# Single element
print(find_median([42]))       # 42

# Two elements
print(find_median([3, 7]))     # 5.0

# All same values
print(find_median([4, 4, 4]))  # 4

# Already sorted
print(find_median([1, 2, 3, 4, 5]))  # 3

# Reverse sorted
print(find_median([9, 7, 5, 3, 1]))  # 5

Which Approach to Use

Use sorted() + index arithmetic for everyday code. It is readable and correct. Use QuickSelect when you are specifically optimising for a large array and need better than O(n log n). The statistics.median() function is fine for data analysis scripts where you want the library to handle edge cases.

In interviews, implement the sort-based approach first, state its complexity, then mention QuickSelect as the optimisation if they ask.

Written by NPBlue Engineering Team — Software & Data Engineers who ships production Python across data, backend, and ML systems.

Reviewed for technical accuracy. Spot an error? Let us know.