Question:Average Post Hiatus
"Given user posts with post timestamps, calculate the average number of days between consecutive posts for each user."
Step-by-Step Explanation:
You are given a table called user_posts
with:
user_id
post_time
(timestamp when the post was made)
You need to:
- Find the gap (in days) between each user's consecutive posts.
- Then, for each user, calculate the average number of days between their posts.
Sample Data (user_posts
):
user_id | post_time |
---|---|
1 | 2024-01-01 10:00:00 |
1 | 2024-01-03 10:00:00 |
1 | 2024-01-10 10:00:00 |
2 | 2024-01-02 12:00:00 |
2 | 2024-01-05 14:00:00 |
Step 1: Compare Each Post to the Previous Post
For each user:
- Find the previous post_time.
We can use the SQL function LAG(post_time) OVER (PARTITION BY user_id ORDER BY post_time)
.
This function lets you look at the previous row for the same user.
Step 2: Calculate the Difference in Days
For each post, calculate the difference between post_time
and the previous post_time
.
Use DATEDIFF(post_time, previous_post_time)
.
Example for user 1:
- 2024-01-01 → No previous post → Skip
- 2024-01-03 → 2 days after 2024-01-01
- 2024-01-10 → 7 days after 2024-01-03
Step 3: Average the Gaps for Each User
After calculating the day differences for all posts, average the gaps for each user.
SQL Query:
WITH post_gaps AS (
SELECT
user_id,
post_time,
LAG(post_time) OVER (PARTITION BY user_id ORDER BY post_time) AS previous_post_time
FROM
user_posts
)
SELECT
user_id,
AVG(DATEDIFF(post_time, previous_post_time)) AS avg_days_between_posts
FROM
post_gaps
WHERE
previous_post_time IS NOT NULL
GROUP BY
user_id;
Explanation of SQL:
- In the CTE
post_gaps
:LAG(post_time)
gets the previous post time for each user.
- In the main query:
DATEDIFF(post_time, previous_post_time)
calculates the gap in days.AVG()
finds the average gap per user.
- We filter
WHERE previous_post_time IS NOT NULL
because the very first post for each user has no previous post to compare.
Expected Output:
user_id | avg_days_between_posts |
---|---|
1 | 4.5 |
2 | 3 |
User 1:
- Gaps: 2 days, 7 days
- Average = (2 + 7) ÷ 2 = 4.5 days
User 2:
- Gaps: 3 days
- Average = 3 days
Important Points:
- Always
PARTITION BY user_id
inLAG()
because each user's posts must be compared separately. ORDER BY post_time
is critical to ensure correct time sequence.- The first post for a user does not have a previous post, so it is ignored when calculating gaps.
DATEDIFF(end, start)
calculates how many days between two dates.