Question: Transaction Anomalies
"Write a query to detect potential fraudulent transactions — specifically looking for users who made purchases above 2x their average spend within a week."
Step-by-Step Explanation:
You have a table called transactions
with:
user_id
transaction_time
(timestamp when the transaction happened)amount
You need to:
- Find users whose individual transaction amount is more than 2 times their average transaction amount.
- Only consider transactions that happened within the last 7 days.
Sample Data (transactions
):
user_id | transaction_time | amount |
---|---|---|
1 | 2024-02-18 10:00:00 | 50 |
1 | 2024-02-19 11:00:00 | 120 |
2 | 2024-02-18 12:00:00 | 30 |
2 | 2024-02-20 13:00:00 | 100 |
3 | 2024-02-15 09:00:00 | 200 |
Assume today's date is 2024-02-20.
Step 1: Filter Transactions in the Last 7 Days
Only work with transactions that happened between:
- 2024-02-13 and 2024-02-20.
Use a WHERE
clause to filter by transaction_time
.
Step 2: Calculate Each User's Average Transaction Amount
For each user, calculate:
AVG(amount)
over the last 7 days.
Step 3: Compare Each Transaction to the User’s Average
Check if:
amount > 2 × average_amount
If yes, then it’s flagged as a potential anomaly.
SQL Query:
WITH last_week_transactions AS (
SELECT
user_id,
transaction_time,
amount
FROM
transactions
WHERE
transaction_time >= DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY)
)
, user_avg_spend AS (
SELECT
user_id,
AVG(amount) AS avg_amount
FROM
last_week_transactions
GROUP BY
user_id
)
SELECT
t.user_id,
t.transaction_time,
t.amount,
u.avg_amount
FROM
last_week_transactions t
JOIN
user_avg_spend u
ON
t.user_id = u.user_id
WHERE
t.amount > 2 * u.avg_amount;
Explanation of SQL:
last_week_transactions
CTE:- Selects only the last 7 days of transactions.
user_avg_spend
CTE:- Calculates each user's average spend over the last 7 days.
- Main query:
- Joins transaction data with user average spend.
- Filters where the transaction amount is greater than 2 times the average.
Expected Output:
user_id | transaction_time | amount | avg_amount |
---|---|---|---|
1 | 2024-02-19 11:00:00 | 120 | 50 |
2 | 2024-02-20 13:00:00 | 100 | 30 |
Explanation:
- For User 1:
- Average spend = 50
- 2 × 50 = 100
- 120 > 100 → Flagged
- For User 2:
- Average spend = 30
- 2 × 30 = 60
- 100 > 60 → Flagged
Important Points:
- Always filter first for the correct time window (7 days).
- Calculate per user average, not overall average.
- Compare each transaction individually to the user’s average.
DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY)
helps automatically move the time window.