Question: User Watchlist Behavior
"Analyze the behavior of users on the Watchlist feature: how many items they add, how many they actually watch, and how long they keep them unwatched."
Step-by-Step Explanation:
You have two tables:
watchlist_activity
user_id
content_id
added_time
(timestamp when item was added to the watchlist)
watch_history
user_id
content_id
watch_time
(timestamp when the content was actually watched)
You need to:
- Find out for each user:
- How many items they added to their watchlist.
- How many items they actually watched.
- Average time difference between when they added and watched (for items that were watched).
Sample Data:
watchlist_activity
user_id | content_id | added_time |
---|---|---|
1 | 101 | 2024-02-01 08:00:00 |
1 | 102 | 2024-02-02 09:00:00 |
2 | 103 | 2024-02-01 10:00:00 |
2 | 104 | 2024-02-03 11:00:00 |
watch_history
user_id | content_id | watch_time |
---|---|---|
1 | 101 | 2024-02-05 08:00:00 |
2 | 104 | 2024-02-10 11:00:00 |
Step 1: How Many Items Added
Simply count rows in watchlist_activity
per user.
Step 2: How Many Items Watched
Join watchlist_activity
and watch_history
on user_id
and content_id
.
Count how many matches (watched after adding).
Step 3: How Long Items Stayed Unwatched (Average Delay)
For each matched item:
- Calculate the difference between
watch_time
andadded_time
(in days). - Find the average delay per user.
SQL Query:
WITH watchlist_counts AS (
SELECT
user_id,
COUNT(*) AS items_added
FROM
watchlist_activity
GROUP BY
user_id
),
watched_items AS (
SELECT
w.user_id,
w.content_id,
w.added_time,
h.watch_time,
DATEDIFF(h.watch_time, w.added_time) AS days_to_watch
FROM
watchlist_activity w
JOIN
watch_history h
ON
w.user_id = h.user_id
AND w.content_id = h.content_id
)
, watched_counts AS (
SELECT
user_id,
COUNT(*) AS items_watched,
AVG(days_to_watch) AS average_days_to_watch
FROM
watched_items
GROUP BY
user_id
)
SELECT
a.user_id,
a.items_added,
COALESCE(b.items_watched, 0) AS items_watched,
COALESCE(b.average_days_to_watch, 0) AS average_days_to_watch
FROM
watchlist_counts a
LEFT JOIN
watched_counts b
ON
a.user_id = b.user_id;
Explanation of SQL:
watchlist_counts
:- Counts how many items each user added.
watched_items
:- Joins added and watched data.
- Calculates how many days it took to watch.
watched_counts
:- Counts how many items each user actually watched.
- Averages the days taken to watch.
- Final
SELECT
:- Combines added and watched data.
COALESCE()
is used to show 0 when a user has no watched items.
Expected Output:
user_id | items_added | items_watched | average_days_to_watch |
---|---|---|---|
1 | 2 | 1 | 4.0 |
2 | 2 | 1 | 7.0 |
Explanation:
- User 1 added 2 items, watched 1 item in 4 days.
- User 2 added 2 items, watched 1 item in 7 days.
Important Points:
DATEDIFF(watch_time, added_time)
measures delay in days.- Users who haven't watched any items will show
items_watched = 0
. - Always use a
LEFT JOIN
when combining counts to not lose users who watched 0 items.