Q: Average Review Ratings
"From a reviews table (product_id, rating), write a query to compute the average rating per product, and return only those products with more than 50 reviews."
Step By Step Answer:
You are given a reviews
table with two important columns:
product_id
rating
(for example, 1 to 5 stars)
You need to:
- Find the average rating for each product.
- Only show products that have more than 50 reviews.
Sample Data (reviews
):
product_id | rating |
---|---|
1001 | 5 |
1001 | 4 |
1002 | 3 |
1001 | 5 |
1003 | 2 |
1002 | 4 |
... | ... |
(Assume many more rows. Some products have 3 reviews, some have 100.)
Step-by-Step Explanation:
Step 1: Calculate the Average Rating per Product
Group all rows with the same product_id
, and for each group:
- Find the average of
rating
values.
Use AVG(rating)
to do this.
Example:
For product 1001:
- Ratings = [5, 4, 5]
- Average = (5 + 4 + 5) ÷ 3 = 4.67
For product 1002:
- Ratings = [3, 4]
- Average = (3 + 4) ÷ 2 = 3.5
Step 2: Count How Many Reviews per Product
While calculating the average, also count how many reviews (COUNT(*)
).
Example:
- Product 1001 → 3 reviews
- Product 1002 → 2 reviews
Step 3: Only Keep Products with More Than 50 Reviews
Use HAVING COUNT(*) > 50
to filter after grouping.
We use HAVING
(not WHERE
) because we are filtering after grouping.
SQL Query:
SELECT
product_id,
AVG(rating) AS average_rating,
COUNT(*) AS review_count
FROM
reviews
GROUP BY
product_id
HAVING
review_count > 50;
Or, if your SQL system doesn't let you use column alias (review_count
) in HAVING
, then:
SELECT
product_id,
AVG(rating) AS average_rating,
COUNT(*) AS review_count
FROM
reviews
GROUP BY
product_id
HAVING
COUNT(*) > 50;
Explanation of SQL:
AVG(rating)
: Calculates the average rating.COUNT(*)
: Counts the number of reviews.GROUP BY product_id
: Groups the reviews by product.HAVING COUNT(*) > 50
: Keeps only the products that have more than 50 reviews.
Expected Final Output Example:
product_id | average_rating | review_count |
---|---|---|
1001 | 4.67 | 120 |
1005 | 4.20 | 75 |
1010 | 3.95 | 66 |
(Only products with more than 50 reviews are shown.)
Important Points:
- Use
GROUP BY
whenever you want to calculate something per group (like per product). - Use
HAVING
when you filter after aggregation (AVG
,COUNT
). AVG(rating)
gives decimal numbers unless you round it.