How to Use the DISTINCT Keyword in SQL: Remove Duplicates with Ease
When working with databases, there are times when you need only unique values from a column or a combination of columns. This is where the DISTINCT keyword in SQL becomes incredibly helpful. Whether you're trying to find all unique countries in a customer table or avoid duplicate entries in reports, DISTINCT is the way to go.
In this article, we'll walk through what DISTINCT is, how it works, when to use it, and some common mistakes to avoid, all in a clear and simple way that's perfect for new learners.
What is the DISTINCT Keyword in SQL?
The DISTINCT keyword is used in SQL to remove duplicate rows from the results of a SELECT query. It ensures that the data returned contains only unique rows.
By default, SQL returns all records, including duplicates. DISTINCT filters out repeated rows based on the selected columns.
Basic Syntax
SELECT DISTINCT column1, column2, ...
FROM table_name;
- column1, column2, ...: The columns you want to retrieve unique values from.
- table_name: The name of the table you're querying.
Example Table: students
id | name | course |
---|---|---|
1 | Alice | Math |
2 | Bob | Science |
3 | Alice | Math |
4 | Diana | History |
5 | Evan | Math |
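To follow along, the sample table can be recreated with a short script. This is a minimal sketch: the column types are assumptions (the article only shows the data), and the script is written to run on common databases such as SQLite, PostgreSQL, or MySQL.

```sql
-- Assumed schema: the column types are illustrative, not taken from the article.
DROP TABLE IF EXISTS students;
CREATE TABLE students (
    id     INTEGER PRIMARY KEY,
    name   VARCHAR(50) NOT NULL,
    course VARCHAR(50) NOT NULL
);

-- The five rows shown in the example table.
INSERT INTO students (id, name, course) VALUES
    (1, 'Alice', 'Math'),
    (2, 'Bob',   'Science'),
    (3, 'Alice', 'Math'),
    (4, 'Diana', 'History'),
    (5, 'Evan',  'Math');

-- Confirm that all five rows were loaded.
SELECT id, name, course
FROM students
ORDER BY id;
```

With the table in place, each of the example queries in this article can be run as written.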
Example 1: Select All Courses (Including Duplicates)
SELECT course
FROM students;
Result:
course |
---|
Math |
Science |
Math |
History |
Math |
Example 2: Select Unique Courses Using DISTINCT
SELECT DISTINCT course
FROM students;
Result:
course |
---|
Math |
Science |
History |
Here, the duplicates for "Math" are removed, and only unique course names are returned.
Example 3: Select Unique Names and Courses
SELECT DISTINCT name, course
FROM students;
Result:
name | course |
---|---|
Alice | Math |
Bob | Science |
Diana | History |
Evan | Math |
Even though Alice appears twice in the table, the combination of name and course appears only once here.
Key Points to Remember
- DISTINCT applies to the entire row based on the selected columns.
- It works with one or more columns.
- It removes duplicate rows, not individual values.
Practical Use Cases
1. List All Unique Departments
SELECT DISTINCT department
FROM employees;
Useful in HR dashboards to display department filters.
2. Count Unique Customers
SELECT COUNT(DISTINCT customer_id)
FROM orders;
This returns the number of unique customers who placed orders.
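To see how COUNT(DISTINCT ...) differs from a plain COUNT(*), here is a small sketch. The orders table layout and its rows are hypothetical, made up purely for illustration.

```sql
-- Hypothetical sample data; not from the article.
DROP TABLE IF EXISTS orders;
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL
);

INSERT INTO orders (order_id, customer_id) VALUES
    (1, 101),
    (2, 102),
    (3, 101),
    (4, 103),
    (5, 102);

SELECT
    COUNT(*)                    AS total_orders,     -- 5: counts every row
    COUNT(DISTINCT customer_id) AS unique_customers  -- 3: customers 101, 102, 103
FROM orders;
```

Customers 101 and 102 each placed two orders, so the row count is higher than the number of distinct customers.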
3. Remove Duplicates in Reports
Imagine generating a monthly report that lists all products sold. Using DISTINCT ensures no product appears twice.
SELECT DISTINCT product_name
FROM sales;
Common Mistakes with DISTINCT
Using DISTINCT on the Wrong Columns
If you select additional columns that vary between rows, the duplicates won't be removed as expected.
-- This won't remove duplicates if timestamp or other columns differ
SELECT DISTINCT customer_id, order_time
FROM orders;
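The effect is easy to see on a small sample. In this sketch the orders table layout and its rows are hypothetical, and timestamps are stored as text only to keep the script portable across databases.

```sql
-- Hypothetical sample data; not from the article.
DROP TABLE IF EXISTS orders;
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL,
    order_time  VARCHAR(19) NOT NULL  -- text timestamp, an assumption for portability
);

INSERT INTO orders (order_id, customer_id, order_time) VALUES
    (1, 101, '2024-01-05 09:00:00'),
    (2, 101, '2024-01-07 14:30:00'),
    (3, 102, '2024-01-07 14:30:00');

-- Returns 3 rows: every (customer_id, order_time) pair is different,
-- so customer 101 still appears twice.
SELECT DISTINCT customer_id, order_time
FROM orders;

-- Returns 2 rows (101 and 102): only customer_id is compared.
SELECT DISTINCT customer_id
FROM orders;
```

If the goal is one row per customer, select only the columns that define what "duplicate" means for that result.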
Thinking It Works Like GROUP BY
DISTINCT removes duplicates but doesn't perform aggregation like GROUP BY. If you need totals or averages, use GROUP BY.
DISTINCT vs. GROUP BY
Feature | DISTINCT | GROUP BY |
---|---|---|
Removes Duplicates | Yes | Yes |
Aggregation | No | Yes (e.g., COUNT, SUM) |
Usage | Simple uniqueness check | Grouping with functions |
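The comparison can be tried on the students table from earlier. The sketch below recreates that table so it runs on its own; the column types are assumptions, since the article only shows the data.

```sql
-- Recreate the article's students table (column types are assumed).
DROP TABLE IF EXISTS students;
CREATE TABLE students (
    id     INTEGER PRIMARY KEY,
    name   VARCHAR(50) NOT NULL,
    course VARCHAR(50) NOT NULL
);

INSERT INTO students (id, name, course) VALUES
    (1, 'Alice', 'Math'),
    (2, 'Bob',   'Science'),
    (3, 'Alice', 'Math'),
    (4, 'Diana', 'History'),
    (5, 'Evan',  'Math');

-- Uniqueness only: returns the same three courses as SELECT DISTINCT course.
SELECT course
FROM students
GROUP BY course;

-- Uniqueness plus aggregation, which DISTINCT alone cannot do:
-- History 1, Math 3, Science 1.
SELECT course, COUNT(*) AS student_count
FROM students
GROUP BY course
ORDER BY course;
```

When only the unique values are needed, DISTINCT states the intent more directly; GROUP BY earns its place once a per-group figure such as a count is required.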
Performance Considerations
DISTINCT can be resource-intensive on large datasets, especially if:
- You're selecting many columns.
- The table has millions of rows.
- There are no indexes on the columns used.
Tip: Use DISTINCT only when necessary and avoid overusing it in subqueries or complex joins.
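One common way to address the missing-index case is to index the column being de-duplicated. The sketch below is illustrative only: the sales table layout, its rows, and the index name are assumptions, and whether the query planner actually uses the index depends on the database and the data, so it is worth checking the query plan on your own system.

```sql
-- Hypothetical layout for the sales table; only product_name appears in the article.
DROP TABLE IF EXISTS sales;
CREATE TABLE sales (
    sale_id      INTEGER PRIMARY KEY,
    product_name VARCHAR(100) NOT NULL,
    quantity     INTEGER NOT NULL
);

INSERT INTO sales (sale_id, product_name, quantity) VALUES
    (1, 'Keyboard', 2),
    (2, 'Mouse',    1),
    (3, 'Keyboard', 1);

-- Index on the column used with DISTINCT (the index name is made up).
CREATE INDEX idx_sales_product_name ON sales (product_name);

-- Selecting only the column you need keeps the comparison narrow.
-- Returns 2 rows: Keyboard and Mouse.
SELECT DISTINCT product_name
FROM sales;
```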
Combine DISTINCT with Other Clauses
With WHERE Clause
SELECT DISTINCT course
FROM students
WHERE name = 'Alice';
This filters records before applying DISTINCT.
With ORDER BY
SELECT DISTINCT course
FROM students
ORDER BY course ASC;
This sorts the distinct results alphabetically.
With Joins
SELECT DISTINCT c.customer_name
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id;
This retrieves the unique names of customers who have placed at least one order.
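To see why the join needs DISTINCT in the first place, here is a small sketch. The table layouts and rows are hypothetical, and the EXISTS query at the end is offered as one possible alternative, not as something the article prescribes.

```sql
-- Hypothetical sample data; not from the article.
DROP TABLE IF EXISTS orders;
DROP TABLE IF EXISTS customers;
CREATE TABLE customers (
    customer_id   INTEGER PRIMARY KEY,
    customer_name VARCHAR(50) NOT NULL
);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL
);

INSERT INTO customers (customer_id, customer_name) VALUES
    (1, 'Alice'),
    (2, 'Bob'),
    (3, 'Carol');

INSERT INTO orders (order_id, customer_id) VALUES
    (10, 1),
    (11, 1),
    (12, 2);

-- Without DISTINCT: 3 rows, because Alice is listed once per order.
SELECT c.customer_name
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id;

-- With DISTINCT: 2 rows, Alice and Bob. Carol has no orders.
SELECT DISTINCT c.customer_name
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id;

-- Alternative: EXISTS tests for a matching order without multiplying rows,
-- so each customer with at least one order is returned once.
SELECT c.customer_name
FROM customers c
WHERE EXISTS (
    SELECT 1
    FROM orders o
    WHERE o.customer_id = c.customer_id
);
```

One difference to keep in mind: DISTINCT collapses two different customers who share a name into one row, while the EXISTS form returns one row per customer.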
Summary Table
Aspect | Description |
---|---|
Purpose | Removes duplicate rows from the result set |
Works on | One or more columns |
Can combine with | WHERE, ORDER BY, JOIN, COUNT() |
Caution | Can impact performance on large tables |
Final Thoughts
The DISTINCT keyword in SQL is an incredibly useful tool when you want to remove duplicate records from your query results. Whether you're looking to show a clean list of products, avoid repetition in reports, or simply count unique users, DISTINCT has your back.
It's easy to use, beginner-friendly, and can be combined with clauses like WHERE and ORDER BY, and with functions like COUNT(), to make your SQL queries more powerful.
Start experimenting with DISTINCT in your own queries today. Your reports and dashboards will look much cleaner!