Question: Customer Retention Analysis
"Determine how many users made repeat purchases within 30 days of their first transaction."
Step-by-Step Explanation:
You have a table called transactions
with:
user_id
transaction_time
(timestamp of the purchase)
You need to:
- For each user, find their first transaction.
- Check if they made another purchase within 30 days after their first transaction.
- Count how many users made a repeat purchase within those 30 days.
Sample Data (transactions
):
user_id | transaction_time |
---|---|
1 | 2024-01-01 10:00:00 |
1 | 2024-01-20 09:00:00 |
2 | 2024-01-05 12:00:00 |
3 | 2024-01-10 14:00:00 |
3 | 2024-03-15 16:00:00 |
Step 1: Find Each User’s First Transaction
Use MIN(transaction_time)
grouped by user_id
.
Example:
- User 1: 2024-01-01
- User 2: 2024-01-05
- User 3: 2024-01-10
Step 2: Join Back to Transactions
Join each user's first transaction date back to the original transactions table.
This way, you can compare every later transaction with the first transaction.
Step 3: Find Transactions Within 30 Days After First Purchase
For each transaction (other than the first one):
- Check if
transaction_time > first_transaction_time
- AND
transaction_time <= first_transaction_time + 30 days
Step 4: Count How Many Users Had at Least One Repeat Purchase Within 30 Days
Use DISTINCT user_id
to avoid counting the same user multiple times.
SQL Query:
WITH first_transaction AS (
SELECT
user_id,
MIN(transaction_time) AS first_transaction_time
FROM
transactions
GROUP BY
user_id
)
SELECT
COUNT(DISTINCT t.user_id) AS users_with_repeat_purchase
FROM
transactions t
JOIN
first_transaction f
ON
t.user_id = f.user_id
WHERE
t.transaction_time > f.first_transaction_time
AND t.transaction_time <= DATE_ADD(f.first_transaction_time, INTERVAL 30 DAY);
Explanation of SQL:
first_transaction
CTE:- Gets the first purchase date for every user.
- In the main query:
- We join transactions with their first transaction date.
WHERE
conditions:t.transaction_time > f.first_transaction_time
: must be after the first purchase.t.transaction_time <= first_transaction_time + 30 days
: within 30 days.
COUNT(DISTINCT user_id)
:- Counts how many users made at least one repeat purchase within 30 days.
Expected Output:
users_with_repeat_purchase |
---|
1 |
Explanation:
- User 1 made a second purchase on 2024-01-20 → within 30 days → counted.
- User 2 had no second transaction → not counted.
- User 3 made second purchase much later (March), not within 30 days → not counted.
Important Points:
- Be careful to only look at transactions after the first one.
- Use
DATE_ADD(first_transaction_time, INTERVAL 30 DAY)
to get the 30-day window. - Always use
DISTINCT user_id
because a user may have multiple repeat transactions but should be counted only once.