Question: Transaction Anomalies
"Write a query to detect potential fraudulent transactions — specifically looking for users who made purchases above 2x their average spend within a week."

Step-by-Step Explanation:

You have a table called transactions with:

  • user_id
  • transaction_time (timestamp when the transaction happened)
  • amount

You need to:

  1. Find users whose individual transaction amount is more than 2 times their average transaction amount.
  2. Only consider transactions that happened within the last 7 days.

Sample Data (transactions):

user_idtransaction_timeamount
12024-02-18 10:00:0050
12024-02-19 11:00:00120
22024-02-18 12:00:0030
22024-02-20 13:00:00100
32024-02-15 09:00:00200

Assume today's date is 2024-02-20.


Step 1: Filter Transactions in the Last 7 Days

Only work with transactions that happened between:

  • 2024-02-13 and 2024-02-20.

Use a WHERE clause to filter by transaction_time.


Step 2: Calculate Each User's Average Transaction Amount

For each user, calculate:

  • AVG(amount) over the last 7 days.

Step 3: Compare Each Transaction to the User’s Average

Check if:

  • amount > 2 × average_amount

If yes, then it’s flagged as a potential anomaly.


SQL Query:

WITH last_week_transactions AS (
  SELECT
    user_id,
    transaction_time,
    amount
  FROM
    transactions
  WHERE
    transaction_time >= DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY)
)

, user_avg_spend AS (
  SELECT
    user_id,
    AVG(amount) AS avg_amount
  FROM
    last_week_transactions
  GROUP BY
    user_id
)

SELECT
  t.user_id,
  t.transaction_time,
  t.amount,
  u.avg_amount
FROM
  last_week_transactions t
JOIN
  user_avg_spend u
ON
  t.user_id = u.user_id
WHERE
  t.amount > 2 * u.avg_amount;


Explanation of SQL:

  • last_week_transactions CTE:
    • Selects only the last 7 days of transactions.
  • user_avg_spend CTE:
    • Calculates each user's average spend over the last 7 days.
  • Main query:
    • Joins transaction data with user average spend.
    • Filters where the transaction amount is greater than 2 times the average.

Expected Output:

user_idtransaction_timeamountavg_amount
12024-02-19 11:00:0012050
22024-02-20 13:00:0010030

Explanation:

  • For User 1:
    • Average spend = 50
    • 2 × 50 = 100
    • 120 > 100 → Flagged
  • For User 2:
    • Average spend = 30
    • 2 × 30 = 60
    • 100 > 60 → Flagged

Important Points:

  • Always filter first for the correct time window (7 days).
  • Calculate per user average, not overall average.
  • Compare each transaction individually to the user’s average.
  • DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY) helps automatically move the time window.