Q: Average Review Ratings
"From a reviews table (product_id, rating), write a query to compute the average rating per product, and return only those products with more than 50 reviews."

Step By Step Answer:

You are given a reviews table with two important columns:

  • product_id
  • rating (for example, 1 to 5 stars)

You need to:

  1. Find the average rating for each product.
  2. Only show products that have more than 50 reviews.

Sample Data (reviews):

product_idrating
10015
10014
10023
10015
10032
10024
......

(Assume many more rows. Some products have 3 reviews, some have 100.)


Step-by-Step Explanation:

Step 1: Calculate the Average Rating per Product

Group all rows with the same product_id, and for each group:

  • Find the average of rating values.

Use AVG(rating) to do this.

Example:

For product 1001:

  • Ratings = [5, 4, 5]
  • Average = (5 + 4 + 5) ÷ 3 = 4.67

For product 1002:

  • Ratings = [3, 4]
  • Average = (3 + 4) ÷ 2 = 3.5

Step 2: Count How Many Reviews per Product

While calculating the average, also count how many reviews (COUNT(*)).

Example:

  • Product 1001 → 3 reviews
  • Product 1002 → 2 reviews

Step 3: Only Keep Products with More Than 50 Reviews

Use HAVING COUNT(*) > 50 to filter after grouping.

We use HAVING (not WHERE) because we are filtering after grouping.


SQL Query:

SELECT 
  product_id,
  AVG(rating) AS average_rating,
  COUNT(*) AS review_count
FROM 
  reviews
GROUP BY 
  product_id
HAVING 
  review_count > 50;

Or, if your SQL system doesn't let you use column alias (review_count) in HAVING, then:

SELECT 
  product_id,
  AVG(rating) AS average_rating,
  COUNT(*) AS review_count
FROM 
  reviews
GROUP BY 
  product_id
HAVING 
  COUNT(*) > 50;


Explanation of SQL:

  • AVG(rating): Calculates the average rating.
  • COUNT(*): Counts the number of reviews.
  • GROUP BY product_id: Groups the reviews by product.
  • HAVING COUNT(*) > 50: Keeps only the products that have more than 50 reviews.

Expected Final Output Example:

product_idaverage_ratingreview_count
10014.67120
10054.2075
10103.9566

(Only products with more than 50 reviews are shown.)


Important Points:

  • Use GROUP BY whenever you want to calculate something per group (like per product).
  • Use HAVING when you filter after aggregation (AVG, COUNT).
  • AVG(rating) gives decimal numbers unless you round it.