SQL Patterns
- When you wrap an expression in parentheses, it returns a table in and of itself, so you can capture sequential selects in this way. See here
PostgreSQL
In PostgreSQL, object names are case insensitive by default. So if you have an INT named
Salary
being returned, and a column namedsalary
, the parser will throw a runtime error stating "column reference 'salary' is ambiguous"
Problems
LC 175 Combine Two Tables
Solution: Left join on personId
SELECT Person.firstName, Person.lastName, Address.city, Address.state
FROM Person
LEFT JOIN Address
ON Person.personId = Address.personId
SELECT Person.firstName, Person.lastName, Address.city, Address.state
FROM Person
LEFT JOIN Address
ON Person.personId = Address.personId
LC 176 Second Highest Salary
Solution: Use a select distinct, order descending so the top salary is first, limit the output to 1 and offset by 1 to return the second highest entry.
Trick 1: Ensure that you use the DISTINCT
keyword when gathering salaries, otherwise, if the two highest salaries are equal then you'll still return the highest salary. Trick 2: You have to select the data from a sub-query so that if there is no second highest salary you'll get a NULL
table instead of an empty table. As stated here "When a sub-query doesn't produce any result, its return value is NULL
. This behaviour is consistent with SQL standards."
MySQL code
SELECT (
SELECT DISTINCT salary
FROM Employee
ORDER BY salary DESC
LIMIT 1 OFFSET 1)
AS SecondHighestSalary;
SELECT (
SELECT DISTINCT salary
FROM Employee
ORDER BY salary DESC
LIMIT 1 OFFSET 1)
AS SecondHighestSalary;
Pandas code
def second_highest_salary(employee: pd.DataFrame) -> pd.DataFrame:
salaries = employee['salary'].drop_duplicates()
second_highest = salaries.nlargest(2).iloc[-1] if len(salaries) >=2 else None
return pd.DataFrame({'SecondHighestSalary': [second_highest]})
def second_highest_salary(employee: pd.DataFrame) -> pd.DataFrame:
salaries = employee['salary'].drop_duplicates()
second_highest = salaries.nlargest(2).iloc[-1] if len(salaries) >=2 else None
return pd.DataFrame({'SecondHighestSalary': [second_highest]})
LC 178 Nth Highest Salary
This question tests your ability to declare and set variables in SQL.
CREATE FUNCTION getNthHighestSalary(N INT) RETURNS INT
BEGIN
DECLARE M INT;
SET M = N - 1;
RETURN (
# Write your MySQL query statement below.
SELECT DISTINCT salary FROM Employee
ORDER BY salary DESC
LIMIT 1
OFFSET M
);
END
CREATE FUNCTION getNthHighestSalary(N INT) RETURNS INT
BEGIN
DECLARE M INT;
SET M = N - 1;
RETURN (
# Write your MySQL query statement below.
SELECT DISTINCT salary FROM Employee
ORDER BY salary DESC
LIMIT 1
OFFSET M
);
END
Some alternative solutions use the DENSE_RANK()
function to get the ranking, then pass the constructed table to a table that filters out distinct entries.
LC 1527 Patients With a Condition
Solution: Use a WHERE ... LIKE ... AND ... LIKE ...
style statement to catch occurrences when the target string is at the beginning or somewhere in the middle.
SELECT * FROM Patients
WHERE
conditions LIKE 'DIAB1%' OR
conditions LIKE '% DIAB1%';
SELECT * FROM Patients
WHERE
conditions LIKE 'DIAB1%' OR
conditions LIKE '% DIAB1%';
LC 1484 Group Sold Products By The Date
Solution: For this problem, it is important to use the DISTINCT
keyword in the right location. Start by selecting the sell date, distinct count of the products, and then calling a string aggregator function.
Tricks: The tricks to this problem are to know how to concatenate the string correctly and order the outputs.
Postgres code
SELECT
sell_date,
COUNT(DISTINCT product) as num_sold,
STRING_AGG(DISTINCT product, ',' ORDER BY product ASC) as products
FROM Activities
GROUP BY sell_date
ORDER BY sell_date ASC;
SELECT
sell_date,
COUNT(DISTINCT product) as num_sold,
STRING_AGG(DISTINCT product, ',' ORDER BY product ASC) as products
FROM Activities
GROUP BY sell_date
ORDER BY sell_date ASC;
MySQL code
SELECT
sell_date,
COUNT(DISTINCT product) as num_sold,
GROUP_CONCAT(DISTINCT product ORDER BY product ASC separator ',') as products
FROM Activities
GROUP BY sell_date
ORDER BY sell_date ASC;
SELECT
sell_date,
COUNT(DISTINCT product) as num_sold,
GROUP_CONCAT(DISTINCT product ORDER BY product ASC separator ',') as products
FROM Activities
GROUP BY sell_date
ORDER BY sell_date ASC;
LC 595 Big Countries
This is a simple problem that tests your ability to filter results given multiple conditions.
Solution: Select all of the requested columns and filter using a WHERE ... OR ...
clause.
LC 178 Rank Scores
This tests your ability to use the RANK()
and DENSE_RANK()
functions.
SELECT
score,
DENSE_RANK() OVER(ORDER BY score DESC) AS "rank"
FROM Scores;
SELECT
score,
DENSE_RANK() OVER(ORDER BY score DESC) AS "rank"
FROM Scores;
LC 180 Consecutive Numbers
This question tests your ability to use windowing functions, specifically the lead and lag functions.
Solution: Write one query, selecting the id
, current num
, and the next two numbers using the LEAD()
window function. Bind this table selection using WITH ... AS
and then use it to filter out any entries where the three numbers don't match
WITH consecutives AS (
SELECT
id,
num as "num1",
LEAD(num, 1) OVER(ORDER BY id) as "num2",
LEAD(num, 2) OVER(ORDER BY id) as "num3"
FROM Logs
)
SELECT DISTINCT num1 as "ConsecutiveNums"
FROM consecutives
WHERE
num1 = num2 AND
num2 = num3;
WITH consecutives AS (
SELECT
id,
num as "num1",
LEAD(num, 1) OVER(ORDER BY id) as "num2",
LEAD(num, 2) OVER(ORDER BY id) as "num3"
FROM Logs
)
SELECT DISTINCT num1 as "ConsecutiveNums"
FROM consecutives
WHERE
num1 = num2 AND
num2 = num3;
LC 185 Department Top Three Salaries
This question is difficult, and tests your ability to conduct complex sub-queries, aggregate functions, and filtering.
Solution 1: Use a sub-query to establish ranks for each department, then a join to replace the department ID with the actual department name. Finally, filter out any rankings less than 3 using a WHERE
statement.
SELECT
D.name AS "Department",
E.name AS "Employee",
E.salary AS "Salary"
FROM (
SELECT
*,
DENSE_RANK() OVER(PARTITION BY departmentId ORDER BY salary DESC) AS "rank"
FROM Employee
) E
LEFT JOIN Department D
ON E.departmentId = D.id
WHERE E.rank <= 3;
SELECT
D.name AS "Department",
E.name AS "Employee",
E.salary AS "Salary"
FROM (
SELECT
*,
DENSE_RANK() OVER(PARTITION BY departmentId ORDER BY salary DESC) AS "rank"
FROM Employee
) E
LEFT JOIN Department D
ON E.departmentId = D.id
WHERE E.rank <= 3;
Solution 2: (not recommended, see WITH
) Establish a rank for the salaries in each department and bind it to a new table using a WITH
statement. Next, join this table with the department table and filter out any rankings less than 3 using a WHERE
statement.
WITH EmployeeRankings AS (
SELECT
name,
salary,
departmentId,
DENSE_RANK() OVER(PARTITION BY departmentId ORDER BY salary DESC) AS "rank"
FROM Employee
)
SELECT
Department.name AS "Department",
EmployeeRankings.name AS "Employee",
EmployeeRankings.salary AS "Salary"
FROM EmployeeRankings
JOIN Department
ON EmployeeRankings.departmentId = Department.id
WHERE EmployeeRankings.rank <= 3;
WITH EmployeeRankings AS (
SELECT
name,
salary,
departmentId,
DENSE_RANK() OVER(PARTITION BY departmentId ORDER BY salary DESC) AS "rank"
FROM Employee
)
SELECT
Department.name AS "Department",
EmployeeRankings.name AS "Employee",
EmployeeRankings.salary AS "Salary"
FROM EmployeeRankings
JOIN Department
ON EmployeeRankings.departmentId = Department.id
WHERE EmployeeRankings.rank <= 3;
LC 1873 Calculate Special Bonus
This tests you ability to use the CASE WHEN ... THEN ELSE END
clause.
Solution: Select the employee ID and match on a case whereby you check if the ID is odd and their name doesn't start with an 'M' via id % 2 = 1 AND name NOT LIKE "M%"
.
SELECT
employee_id,
CASE
WHEN employee_id % 2 = 1 AND name NOT LIKE "M%" THEN salary
ELSE 0
END AS "bonus"
FROM Employees;
SELECT
employee_id,
CASE
WHEN employee_id % 2 = 1 AND name NOT LIKE "M%" THEN salary
ELSE 0
END AS "bonus"
FROM Employees;
LC 511 Game Play Analysis
This question tests your ability to recognize when to use the MIN()
function on dates and adjust the output using a GROUP BY
call
Solution: Select player ID and the first game played via MIN(event_date)
.
SELECT
player_id,
MIN(event_date) AS "first_login"
FROM Activity
GROUP BY player_id;
SELECT
player_id,
MIN(event_date) AS "first_login"
FROM Activity
GROUP BY player_id;
LC 1667 Fix Names in a Table
This question tests your ability to manipulate and concatenate strings.
Solution: In MySQL, you don't have access to INITCAP()
so you can't directly capitalize the first character. Instead combine UPPER
and LOWER
calls with the SUBSTRING
method, and concatenate with CONCAT
.
SELECT
user_id,
CONCAT(UPPER(SUBSTRING(name, 1, 1)), LOWER(SUBSTRING(name, 2, LENGTH(name)))) AS "name"
FROM Users
ORDER BY user_id;
SELECT
user_id,
CONCAT(UPPER(SUBSTRING(name, 1, 1)), LOWER(SUBSTRING(name, 2, LENGTH(name)))) AS "name"
FROM Users
ORDER BY user_id;
LC 1581 Customer Who Visited but Did Not Make Any Transactions
This tests you ability to conduct complex table joins and counting.
SELECT
V.customer_id,
COUNT(V.visit_id) AS "count_no_trans"
FROM Visits V
LEFT JOIN Transactions T
ON V.visit_id = T.visit_id
WHERE T.transaction_id IS NULL
GROUP BY V.customer_id;
SELECT
V.customer_id,
COUNT(V.visit_id) AS "count_no_trans"
FROM Visits V
LEFT JOIN Transactions T
ON V.visit_id = T.visit_id
WHERE T.transaction_id IS NULL
GROUP BY V.customer_id;
LC 197 Rising Temperature
This questions tests you knowledge of date management and knowledge of that the LAG() OVER(ORDER BY ...)
clause.
Trick 1: The records may not be ordered. You cannot sort by id
. Trick 2: There are missing records. You need to add a final condition to assert that the previous record was indeed yesterday, and not some day last week.
SELECT id
FROM (
SELECT
id,
recordDate,
LAG(recordDate, 1) OVER(ORDER BY recordDate) AS 'last_record',
LAG(temperature, 1) OVER(ORDER BY recordDate) AS 't_yesterday',
temperature AS 't_today'
FROM Weather
) W
WHERE
t_yesterday < t_today
AND
recordDate = DATE_ADD(last_record, INTERVAL 1 DAY);
SELECT id
FROM (
SELECT
id,
recordDate,
LAG(recordDate, 1) OVER(ORDER BY recordDate) AS 'last_record',
LAG(temperature, 1) OVER(ORDER BY recordDate) AS 't_yesterday',
temperature AS 't_today'
FROM Weather
) W
WHERE
t_yesterday < t_today
AND
recordDate = DATE_ADD(last_record, INTERVAL 1 DAY);
LC 1084 Sales Analysis III
SQL code with a sub-query
SELECT
P.product_id,
P.product_name
FROM Product P
LEFT JOIN (
SELECT
product_id,
MAX(sale_date) AS "max_sale_date",
MIN(sale_date) AS "min_sale_date"
FROM Sales
GROUP BY product_id
) S
ON P.product_id = S.product_id
WHERE
S.max_sale_date <= '2019-03-31'
AND
S.min_sale_date >= '2019-01-01'
SELECT
P.product_id,
P.product_name
FROM Product P
LEFT JOIN (
SELECT
product_id,
MAX(sale_date) AS "max_sale_date",
MIN(sale_date) AS "min_sale_date"
FROM Sales
GROUP BY product_id
) S
ON P.product_id = S.product_id
WHERE
S.max_sale_date <= '2019-03-31'
AND
S.min_sale_date >= '2019-01-01'
Optimized code without a sub-query
SELECT
P.product_id,
P.product_name
FROM Product P
LEFT JOIN Sales S
ON P.product_id = S.product_id
GROUP BY P.product_id
HAVING
MAX(sale_date) <= '2019-03-31'
AND
MIN(sale_date) >= '2019-01-01'
SELECT
P.product_id,
P.product_name
FROM Product P
LEFT JOIN Sales S
ON P.product_id = S.product_id
GROUP BY P.product_id
HAVING
MAX(sale_date) <= '2019-03-31'
AND
MIN(sale_date) >= '2019-01-01'
LC 608 Tree Node
...redo...
LC 1050 Actors and Directors Who Cooperated At Least Three Times
This question tests your ability to use GROUP BY
with more than one grouping column.
SELECT
actor_id,
director_id,
COUNT(timestamp)
FROM ActorDirector
GROUP BY actor_id, director_id;
SELECT
actor_id,
director_id,
COUNT(timestamp)
FROM ActorDirector
GROUP BY actor_id, director_id;
LC 1667 Fix Names in a Table
This question tests your understanding of string manipulation via functions like CONCAT
, SUBSTRING
, UPPER
, and LOWER
.
SELECT
user_id,
CONCAT( UPPER(SUBSTRING(name, 1, 1)), LOWER(SUBSTRING(name, 2, LENGTH(name))) ) AS "name"
FROM Users
ORDER BY user_id;
SELECT
user_id,
CONCAT( UPPER(SUBSTRING(name, 1, 1)), LOWER(SUBSTRING(name, 2, LENGTH(name))) ) AS "name"
FROM Users
ORDER BY user_id;
LC 181 Employees Earning More Than Their Managers
This question tests your ability to conduct a self-join and correctly select items to filter.
SELECT
E1.name AS "Employee"
FROM Employee E1
JOIN Employee E2
ON E1.managerId = E2.id
WHERE
E1.salary > E2.salary;
SELECT
E1.name AS "Employee"
FROM Employee E1
JOIN Employee E2
ON E1.managerId = E2.id
WHERE
E1.salary > E2.salary;
LC 1158 Market Analysis I
This question tests your understanding of joins and how to filter specific entries so that a zero value is returned instead of a null.
SELECT
U.user_id,
U.join_date,
IFNULL(COUNT(O.order_id), 0) AS 'orders_in_2019'
FROM Users U
LEFT JOIN Orders O
ON O.buyer_id = U.user_id
AND YEAR(O.order_date) = '2019'
GROUP BY U.user_id;
SELECT
U.user_id,
U.join_date,
IFNULL(COUNT(O.order_id), 0) AS 'orders_in_2019'
FROM Users U
LEFT JOIN Orders O
ON O.buyer_id = U.user_id
AND YEAR(O.order_date) = '2019'
GROUP BY U.user_id;
LC 1873 Calculate Special Bonus
This question tests your ability to conditionally return rows depending on the other entries in that row using CASE
or potentially IF()
.
SELECT
employee_id,
CASE
WHEN employee_id % 2 = 1 AND name NOT LIKE 'M%' THEN salary
ELSE 0
END AS 'bonus'
FROM Employees
ORDER BY employee_id;
SELECT
employee_id,
CASE
WHEN employee_id % 2 = 1 AND name NOT LIKE 'M%' THEN salary
ELSE 0
END AS 'bonus'
FROM Employees
ORDER BY employee_id;
LC 626 Exchange Seats
This question tests your ability to manipulate entries with a CASE
or and IF()
statement.
Solution: My solution. This solution swaps the names of the students, however most solutions found on LeetCode prefer to swap the id directly. The other solutions may perhaps be simpler since you don't have to depend on a lead and lag column to conduct the swapping.
SELECT
id,
IFNULL(swapped, student) AS 'student'
FROM (
SELECT
id,
student,
IF(
id % 2 = 1,
LEAD(student, 1) OVER(ORDER BY id),
LAG(student, 1) OVER(ORDER BY id)
) AS 'swapped'
FROM Seat
) S
ORDER BY id ASC;
SELECT
id,
IFNULL(swapped, student) AS 'student'
FROM (
SELECT
id,
student,
IF(
id % 2 = 1,
LEAD(student, 1) OVER(ORDER BY id),
LAG(student, 1) OVER(ORDER BY id)
) AS 'swapped'
FROM Seat
) S
ORDER BY id ASC;