Question: Median of Search Queries
"Given users' total number of searches in a month, write a query to determine the median number of searches submitted."

Step-by-Step Explanation:

You have a table called monthly_searches with:

  • user_id
  • total_searches (number of searches by the user in a month)

You need to:

  1. Find the median of all users’ total_searches for the month.

Sample Data (monthly_searches):

user_idtotal_searches
140
250
330
460
520

Step 1: Understand What Median Means

  • Median = the middle number when sorted.
  • If odd number of rows → the middle value.
  • If even number of rows → the average of the two middle values.

Sorted total_searches:

  • [20, 30, 40, 50, 60]

Middle value = 40 → Median is 40.


Step 2: Sort total_searches

You must sort the search counts in ascending order.


Step 3: Assign Row Numbers

Assign row numbers after sorting, using ROW_NUMBER().

Example:

Row Numberuser_idtotal_searches
1520
2330
3140
4250
5460

Step 4: Find the Middle Row (or Rows)

  • Total rows = 5
  • Middle row = (5 + 1)/2 = 3

So pick row number 3.

If there were 6 rows, we would need to pick rows 3 and 4 and take the average of their values.


SQL Query:

WITH ordered_searches AS (
  SELECT 
    total_searches,
    ROW_NUMBER() OVER (ORDER BY total_searches) AS rn
  FROM 
    monthly_searches
),
counts AS (
  SELECT COUNT(*) AS total_rows FROM monthly_searches
)

SELECT 
  AVG(total_searches) AS median_searches
FROM 
  ordered_searches, counts
WHERE 
  rn IN (FLOOR((total_rows + 1)/2), CEIL((total_rows + 1)/2));


Explanation of SQL:

  • ROW_NUMBER() OVER (ORDER BY total_searches):
    Assigns a row number after sorting total_searches in ascending order.
  • COUNT(*):
    Counts how many users (how many rows).
  • FLOOR((total_rows + 1)/2) and CEIL((total_rows + 1)/2):
    • If the number of rows is odd → both floor and ceil are the same → pick the middle row.
    • If the number of rows is even → pick two middle rows and average them.
  • AVG(total_searches):
    • If one row is selected, it returns the value.
    • If two rows are selected, it averages them.

Expected Output:

median_searches
40

Important Points:

  • Always sort by total_searches before assigning row numbers.
  • Use AVG even if it's a single row because it will work in both odd and even row cases.
  • In an interview, make sure to explain how you are handling both odd and even number of rows.