LeetCode-in-Java

3554. Find Category Recommendation Pairs

Table: ProductPurchases

+-------------+------+
| Column Name | Type |
+-------------+------+
| user_id     | int  |
| product_id  | int  |
| quantity    | int  |
+-------------+------+
(user_id, product_id) is the unique identifier for this table.
Each row represents a purchase of a product by a user in a specific quantity. 

Table: ProductInfo

+-------------+---------+
| Column Name | Type    |
+-------------+---------+
| product_id  | int     |
| category    | varchar |
| price       | decimal |
+-------------+---------+
product_id is the unique identifier for this table.
Each row assigns a category and price to a product. 

Amazon wants to understand shopping patterns across product categories. Write a solution to:

  1. Find all category pairs (where category1 < category2)
  2. For each category pair, determine the number of unique customers who purchased products from both categories

A category pair is considered reportable if at least 3 different customers have purchased products from both categories.

Return the result table of reportable category pairs ordered by customer_count in descending order, and in case of a tie, by category1 in ascending order lexicographically, and then by category2 in ascending order.

The result format is in the following example.

Example:

Input:

ProductPurchases table:

+---------+------------+----------+
| user_id | product_id | quantity |
+---------+------------+----------+
| 1       | 101        | 2        |
| 1       | 102        | 1        |
| 1       | 201        | 3        |
| 1       | 301        | 1        |
| 2       | 101        | 1        |
| 2       | 102        | 2        |
| 2       | 103        | 1        |
| 2       | 201        | 5        |
| 3       | 101        | 2        |
| 3       | 103        | 1        |
| 3       | 301        | 4        |
| 3       | 401        | 2        |
| 4       | 101        | 1        |
| 4       | 201        | 3        |
| 4       | 301        | 1        |
| 4       | 401        | 2        |
| 5       | 102        | 2        |
| 5       | 103        | 1        |
| 5       | 201        | 2        |
| 5       | 202        | 3        | 
+---------+------------+----------+ 

ProductInfo table:

+------------+-------------+-------+
| product_id | category    | price |
+------------+-------------+-------+
| 101        | Electronics | 100   |
| 102        | Books       | 20    |
| 103        | Books       | 35    |
| 201        | Clothing    | 45    |
| 202        | Clothing    | 60    |
| 301        | Sports      | 75    |
| 401        | Kitchen     | 50    |
+------------+-------------+-------+ 

Output:

+-------------+-------------+----------------+
| category1   | category2   | customer_count |
+-------------+-------------+----------------+
| Books       | Clothing    | 3              |
| Books       | Electronics | 3              |
| Clothing    | Electronics | 3              |
| Electronics | Sports      | 3              |
+-------------+-------------+----------------+ 

Explanation:

The result is ordered by customer_count in descending order. Since all pairs have the same customer_count of 3, they are ordered by category1 (then category2) in ascending order.

Solution

# Write your MySQL query statement below
SELECT
    pi1.category AS category1,
    pi2.category AS category2,
    COUNT(DISTINCT pp1.user_id) AS customer_count
FROM
    ProductPurchases pp1,
    ProductPurchases pp2,
    ProductInfo pi1,
    ProductInfo pi2
WHERE
    pp1.user_id = pp2.user_id
    AND pi1.category < pi2.category
    AND pp1.product_id = pi1.product_id
    AND pp2.product_id = pi2.product_id
GROUP BY
    pi1.category,
    pi2.category
HAVING
    COUNT(DISTINCT pp1.user_id) >= 3
ORDER BY
    customer_count DESC,
    category1 ASC,
    category2 ASC;