Question: Median of Search Queries
"Given users' total number of searches in a month, write a query to determine the median number of searches submitted."
Step-by-Step Explanation:
You have a table called monthly_searches
with:
user_id
total_searches
(number of searches by the user in a month)
You need to:
- Find the median of all users’ total_searches for the month.
Sample Data (monthly_searches
):
user_id | total_searches |
---|---|
1 | 40 |
2 | 50 |
3 | 30 |
4 | 60 |
5 | 20 |
Step 1: Understand What Median Means
- Median = the middle number when sorted.
- If odd number of rows → the middle value.
- If even number of rows → the average of the two middle values.
Sorted total_searches
:
- [20, 30, 40, 50, 60]
Middle value = 40 → Median is 40.
Step 2: Sort total_searches
You must sort the search counts in ascending order.
Step 3: Assign Row Numbers
Assign row numbers after sorting, using ROW_NUMBER()
.
Example:
Row Number | user_id | total_searches |
---|---|---|
1 | 5 | 20 |
2 | 3 | 30 |
3 | 1 | 40 |
4 | 2 | 50 |
5 | 4 | 60 |
Step 4: Find the Middle Row (or Rows)
- Total rows = 5
- Middle row = (5 + 1)/2 = 3
So pick row number 3.
If there were 6 rows, we would need to pick rows 3 and 4 and take the average of their values.
SQL Query:
WITH ordered_searches AS (
SELECT
total_searches,
ROW_NUMBER() OVER (ORDER BY total_searches) AS rn
FROM
monthly_searches
),
counts AS (
SELECT COUNT(*) AS total_rows FROM monthly_searches
)
SELECT
AVG(total_searches) AS median_searches
FROM
ordered_searches, counts
WHERE
rn IN (FLOOR((total_rows + 1)/2), CEIL((total_rows + 1)/2));
Explanation of SQL:
ROW_NUMBER() OVER (ORDER BY total_searches)
:
Assigns a row number after sorting total_searches in ascending order.COUNT(*)
:
Counts how many users (how many rows).FLOOR((total_rows + 1)/2)
andCEIL((total_rows + 1)/2)
:- If the number of rows is odd → both floor and ceil are the same → pick the middle row.
- If the number of rows is even → pick two middle rows and average them.
AVG(total_searches)
:- If one row is selected, it returns the value.
- If two rows are selected, it averages them.
Expected Output:
median_searches |
---|
40 |
Important Points:
- Always sort by
total_searches
before assigning row numbers. - Use
AVG
even if it's a single row because it will work in both odd and even row cases. - In an interview, make sure to explain how you are handling both odd and even number of rows.