Question:Average Post Hiatus
"Given user posts with post timestamps, calculate the average number of days between consecutive posts for each user."

Step-by-Step Explanation:

You are given a table called user_posts with:

  • user_id
  • post_time (timestamp when the post was made)

You need to:

  1. Find the gap (in days) between each user's consecutive posts.
  2. Then, for each user, calculate the average number of days between their posts.

Sample Data (user_posts):

user_idpost_time
12024-01-01 10:00:00
12024-01-03 10:00:00
12024-01-10 10:00:00
22024-01-02 12:00:00
22024-01-05 14:00:00

Step 1: Compare Each Post to the Previous Post

For each user:

  • Find the previous post_time.

We can use the SQL function LAG(post_time) OVER (PARTITION BY user_id ORDER BY post_time).

This function lets you look at the previous row for the same user.


Step 2: Calculate the Difference in Days

For each post, calculate the difference between post_time and the previous post_time.

Use DATEDIFF(post_time, previous_post_time).

Example for user 1:

  • 2024-01-01 → No previous post → Skip
  • 2024-01-03 → 2 days after 2024-01-01
  • 2024-01-10 → 7 days after 2024-01-03

Step 3: Average the Gaps for Each User

After calculating the day differences for all posts, average the gaps for each user.


SQL Query:

WITH post_gaps AS (
  SELECT
    user_id,
    post_time,
    LAG(post_time) OVER (PARTITION BY user_id ORDER BY post_time) AS previous_post_time
  FROM
    user_posts
)

SELECT
  user_id,
  AVG(DATEDIFF(post_time, previous_post_time)) AS avg_days_between_posts
FROM
  post_gaps
WHERE
  previous_post_time IS NOT NULL
GROUP BY
  user_id;


Explanation of SQL:

  • In the CTE post_gaps:
    • LAG(post_time) gets the previous post time for each user.
  • In the main query:
    • DATEDIFF(post_time, previous_post_time) calculates the gap in days.
    • AVG() finds the average gap per user.
  • We filter WHERE previous_post_time IS NOT NULL because the very first post for each user has no previous post to compare.

Expected Output:

user_idavg_days_between_posts
14.5
23

User 1:

  • Gaps: 2 days, 7 days
  • Average = (2 + 7) ÷ 2 = 4.5 days

User 2:

  • Gaps: 3 days
  • Average = 3 days

Important Points:

  • Always PARTITION BY user_id in LAG() because each user's posts must be compared separately.
  • ORDER BY post_time is critical to ensure correct time sequence.
  • The first post for a user does not have a previous post, so it is ignored when calculating gaps.
  • DATEDIFF(end, start) calculates how many days between two dates.