Remove Duplicates from a List in Python: Five Approaches and When to Use Each
Deduplicating a list is one of those problems that looks trivial until someone asks whether order matters, or whether the input can contain unhashable types. Here are five ways to solve it, with honest notes on when each one fits.
Approach 1: Convert to a Set
The shortest version. A set stores only unique elements, so casting a list to a set drops duplicates immediately.
numbers = [4, 2, 7, 2, 4, 9, 1, 7]unique = list(set(numbers))print(unique)# Output: [1, 2, 4, 7, 9] — order is NOT guaranteedTrade-off: fast (O(n) average), but the original order is lost. If the caller only cares about the unique values and not their sequence, this is the right answer.
Approach 2: dict.fromkeys() — Fast and Order-Preserving
Since Python 3.7 dictionaries are ordered by insertion. dict.fromkeys() creates a dict where each element becomes a key (duplicates overwrite without adding), then you convert the keys back to a list.
numbers = [4, 2, 7, 2, 4, 9, 1, 7]unique = list(dict.fromkeys(numbers))print(unique)# Output: [4, 2, 7, 9, 1] — original order preservedThis is the idiomatic order-preserving deduplication in modern Python. O(n) time and space.
Approach 3: Manual Loop with a Seen Set
More explicit, and easy to extend if you need custom equality logic.
def remove_duplicates_ordered(items): """Remove duplicates while keeping the first occurrence of each element.""" seen = set() result = [] for item in items: if item not in seen: result.append(item) seen.add(item) return result
data = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]print(remove_duplicates_ordered(data))# Output: [3, 1, 4, 5, 9, 2, 6]O(n) time. The seen set makes membership checks O(1), so the overall loop stays linear.
Approach 4: List Comprehension with Running Seen Set
A compact version of the manual loop:
numbers = [10, 20, 10, 30, 20, 40]seen = set()unique = [x for x in numbers if not (x in seen or seen.add(x))]print(unique)# Output: [10, 20, 30, 40]This works but is a bit of a trick — seen.add(x) always returns None, which is falsy, so the short-circuit evaluation causes the item to be included the first time. Clear enough once you recognise the pattern, but the manual loop is more readable for a team.
Approach 5: Sorted Approach for Unhashable Types
If your list contains unhashable items (like nested lists), sets will not work. Sort first, then compare adjacent elements.
def remove_duplicates_sorted(items): """Deduplicate a sortable list — does NOT preserve original order.""" if not items: return [] sorted_items = sorted(items) result = [sorted_items[0]] for item in sorted_items[1:]: if item != result[-1]: result.append(item) return result
data = [5, 3, 1, 3, 5, 2]print(remove_duplicates_sorted(data))# Output: [1, 2, 3, 5]O(n log n) due to sorting. Useful when elements cannot be hashed.
Quick Reference
| Method | Order preserved | Works on unhashable | Time |
|---|---|---|---|
set() | No | No | O(n) |
dict.fromkeys() | Yes | No | O(n) |
| Manual loop + set | Yes | No | O(n) |
| Sorted comparison | No | Yes | O(n log n) |
Interview tip: the question almost always specifies whether order must be preserved. If it does, reach for dict.fromkeys() or the manual loop. If not, set() is the one-liner. Mention time complexity either way.