Question: Product Usage Metrics
"Analyze logins and usage across multiple Microsoft products (Outlook, Teams, Excel). Find the top 3 most used products per user."

Step-by-Step Explanation:

You have a table called product_usage with:

  • user_id
  • product_name (e.g., 'Outlook', 'Teams', 'Excel')
  • login_time (timestamp when the user logged into the product)

You need to:

  1. Count how many times each user used each product.
  2. Find the top 3 most used products for each user.

Sample Data (product_usage):

user_idproduct_namelogin_time
1Outlook2024-02-01 08:00:00
1Teams2024-02-01 09:00:00
1Outlook2024-02-02 08:00:00
1Excel2024-02-02 10:00:00
1Outlook2024-02-03 08:00:00
2Teams2024-02-01 11:00:00
2Excel2024-02-01 12:00:00

Step 1: Count Logins per User and Product

Use GROUP BY user_id, product_name and COUNT(*) to find how many times each product was used by each user.

Example for user 1:

  • Outlook: 3 logins
  • Teams: 1 login
  • Excel: 1 login

Step 2: Rank Products per User

Use ROW_NUMBER() or RANK() partitioned by user_id to rank products based on the number of logins for each user.


Step 3: Select Top 3 Products per User

Pick products where the rank is 1, 2, or 3 for each user.


SQL Query:

WITH product_counts AS (
  SELECT
    user_id,
    product_name,
    COUNT(*) AS usage_count
  FROM
    product_usage
  GROUP BY
    user_id, product_name
),

ranked_products AS (
  SELECT
    user_id,
    product_name,
    usage_count,
    ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY usage_count DESC) AS product_rank
  FROM
    product_counts
)

SELECT
  user_id,
  product_name,
  usage_count
FROM
  ranked_products
WHERE
  product_rank <= 3
ORDER BY
  user_id, product_rank;


Explanation of SQL:

  • product_counts CTE:
    • Counts the number of times each user used each product.
  • ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY usage_count DESC):
    • Ranks the products for each user based on usage count.
  • Final SELECT:
    • Picks the top 3 products per user by filtering WHERE product_rank <= 3.

Expected Output:

user_idproduct_nameusage_count
1Outlook3
1Teams1
1Excel1
2Teams1
2Excel1

(For user 2, there are only 2 products used.)


Important Points:

  • ROW_NUMBER() is good when you want to strictly pick top N products without ties.
  • If ties should be handled (e.g., multiple products with the same usage count), use RANK() instead.
  • Always PARTITION BY user_id when ranking because you want separate rankings for each user.