Question: Median of Search Queries
"Given users' total number of searches in a month, write a query to determine the median number of searches submitted."
Step-by-Step Explanation:
You have a table called monthly_searches with:
user_idtotal_searches(number of searches by the user in a month)
You need to:
- Find the median of all users’ total_searches for the month.
Sample Data (monthly_searches):
| user_id | total_searches |
|---|---|
| 1 | 40 |
| 2 | 50 |
| 3 | 30 |
| 4 | 60 |
| 5 | 20 |
Step 1: Understand What Median Means
- Median = the middle number when sorted.
- If odd number of rows → the middle value.
- If even number of rows → the average of the two middle values.
Sorted total_searches:
- [20, 30, 40, 50, 60]
Middle value = 40 → Median is 40.
Step 2: Sort total_searches
You must sort the search counts in ascending order.
Step 3: Assign Row Numbers
Assign row numbers after sorting, using ROW_NUMBER().
Example:
| Row Number | user_id | total_searches |
|---|---|---|
| 1 | 5 | 20 |
| 2 | 3 | 30 |
| 3 | 1 | 40 |
| 4 | 2 | 50 |
| 5 | 4 | 60 |
Step 4: Find the Middle Row (or Rows)
- Total rows = 5
- Middle row = (5 + 1)/2 = 3
So pick row number 3.
If there were 6 rows, we would need to pick rows 3 and 4 and take the average of their values.
SQL Query:
WITH ordered_searches AS (
SELECT
total_searches,
ROW_NUMBER() OVER (ORDER BY total_searches) AS rn
FROM
monthly_searches
),
counts AS (
SELECT COUNT(*) AS total_rows FROM monthly_searches
)
SELECT
AVG(total_searches) AS median_searches
FROM
ordered_searches, counts
WHERE
rn IN (FLOOR((total_rows + 1)/2), CEIL((total_rows + 1)/2));
Explanation of SQL:
ROW_NUMBER() OVER (ORDER BY total_searches):
Assigns a row number after sorting total_searches in ascending order.COUNT(*):
Counts how many users (how many rows).FLOOR((total_rows + 1)/2)andCEIL((total_rows + 1)/2):- If the number of rows is odd → both floor and ceil are the same → pick the middle row.
- If the number of rows is even → pick two middle rows and average them.
AVG(total_searches):- If one row is selected, it returns the value.
- If two rows are selected, it averages them.
Expected Output:
| median_searches |
|---|
| 40 |
Important Points:
- Always sort by
total_searchesbefore assigning row numbers. - Use
AVGeven if it's a single row because it will work in both odd and even row cases. - In an interview, make sure to explain how you are handling both odd and even number of rows.