Question: Product Usage Metrics
"Analyze logins and usage across multiple Microsoft products (Outlook, Teams, Excel). Find the top 3 most used products per user."
Step-by-Step Explanation:
You have a table called product_usage
with:
user_id
product_name
(e.g., 'Outlook', 'Teams', 'Excel')login_time
(timestamp when the user logged into the product)
You need to:
- Count how many times each user used each product.
- Find the top 3 most used products for each user.
Sample Data (product_usage
):
user_id | product_name | login_time |
---|---|---|
1 | Outlook | 2024-02-01 08:00:00 |
1 | Teams | 2024-02-01 09:00:00 |
1 | Outlook | 2024-02-02 08:00:00 |
1 | Excel | 2024-02-02 10:00:00 |
1 | Outlook | 2024-02-03 08:00:00 |
2 | Teams | 2024-02-01 11:00:00 |
2 | Excel | 2024-02-01 12:00:00 |
Step 1: Count Logins per User and Product
Use GROUP BY user_id, product_name
and COUNT(*)
to find how many times each product was used by each user.
Example for user 1:
- Outlook: 3 logins
- Teams: 1 login
- Excel: 1 login
Step 2: Rank Products per User
Use ROW_NUMBER()
or RANK()
partitioned by user_id
to rank products based on the number of logins for each user.
Step 3: Select Top 3 Products per User
Pick products where the rank is 1, 2, or 3 for each user.
SQL Query:
WITH product_counts AS (
SELECT
user_id,
product_name,
COUNT(*) AS usage_count
FROM
product_usage
GROUP BY
user_id, product_name
),
ranked_products AS (
SELECT
user_id,
product_name,
usage_count,
ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY usage_count DESC) AS product_rank
FROM
product_counts
)
SELECT
user_id,
product_name,
usage_count
FROM
ranked_products
WHERE
product_rank <= 3
ORDER BY
user_id, product_rank;
Explanation of SQL:
product_counts
CTE:- Counts the number of times each user used each product.
ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY usage_count DESC)
:- Ranks the products for each user based on usage count.
- Final
SELECT
:- Picks the top 3 products per user by filtering
WHERE product_rank <= 3
.
- Picks the top 3 products per user by filtering
Expected Output:
user_id | product_name | usage_count |
---|---|---|
1 | Outlook | 3 |
1 | Teams | 1 |
1 | Excel | 1 |
2 | Teams | 1 |
2 | Excel | 1 |
(For user 2, there are only 2 products used.)
Important Points:
ROW_NUMBER()
is good when you want to strictly pick top N products without ties.- If ties should be handled (e.g., multiple products with the same usage count), use
RANK()
instead. - Always
PARTITION BY user_id
when ranking because you want separate rankings for each user.