高阶SQL

2024/01/10 sql 共 9385 字,约 27 分钟

Null

不等于 不包含 null 会把 null的过滤掉,需要 先把null转成 0 再进行判断

select  name from Customer where ifnull(referee_id,0) <> 2

left join是一种连接方式,它以左表为基准,返回左表中所有的行,同时返回右表中与左表匹配的行。如果右表中没有匹配的行,则用NULL填充。

select unique_id, name from Employees  as e left join
EmployeeUNI as eu on e.id = eu.id

每个奖金 少于 1000 的员工的姓名和奖金数额

select name,bonus from Employee as ee  left join Bonus as b
on  ee.empId = b.empId where ifnull(bonus,0) < 1000

Join

找出与之前(昨天的)日期相比温度更高的所有日期的 id

select w.id from Weather w inner join Weather a
on w.recordDate = DATE_ADD(a.recordDate,INTERVAL 1 day)
Where w.Temperature > a.Temperature;

有一个工厂网站由几台机器运行,每台机器上运行着 相同数量的进程 。编写解决方案,计算每台机器各自完成一个进程任务的平均耗时。 AVG 函数

select start.machine_id,round(avg(end.timestamp-start.timestamp),3) as processing_time 
from Activity start
inner join Activity `end` on start.machine_id= end.machine_id
and start.process_id = end.process_id
where start.activity_type='start' and end.activity_type='end'
group by machine_id;

查询出每个学生参加每一门科目测试的次数,结果按 student_id 和 subject_name 排序。

select Students.student_id ,
student_name,
Subjects.subject_name,
count(Examinations.subject_name) as attended_exams
from Students
join Subjects
left join Examinations
on Students.student_id = Examinations.student_id
and Subjects.subject_name= Examinations.subject_name
group by student_id,student_name,subject_name
order by Students.student_id,Subjects.subject_name

找出至少有五个直接下属的经理

select m.name
from Employee e
left join Employee m on e.managerId = m.id
where m.id is not null
group by e.managerId
having count(*) >= 5;

奇偶

找出所有影片描述为 非 boring (不无聊) 的并且 id 为奇数 的影片

select *
from cinema
where description != 'boring'
and mod(id, 2) != 0
order by rating desc;

平均不能用AVG

以查找每种产品的平均售价。average_price 应该 四舍五入到小数点后两位。

select p.product_id, ifnull(round(sum(price * units) / sum(units), 2), 0) as average_price
from Prices p
left join UnitsSold u on u.product_id = p.product_id
and u.purchase_date between p.start_date and p.end_date
group by p.product_id

查询每一个项目中员工的 平均 工作年限,精确到小数点后两位

子查询

出各赛事的用户注册百分率,保留两位小数 在统计函数里面可以写个子查询

select
r.contest_id,round(count(*)/(select count(*) from users)*100,2) percentage
from Users u
join Register r
on u.user_id=r.user_id
group by r.contest_id
order by percentage desc,r.contest_id;

将查询结果的质量 quality 定义为:

各查询结果的评分与其位置之间比率的平均值。

将劣质查询百分比 poor_query_percentage 为:

评分小于 3 的查询结果占全部查询结果的百分比。

编写解决方案,找出每次的 query_name 、 quality 和 poor_query_percentage。

quality 和 poor_query_percentage 都应 四舍五入到小数点后两位 。

SELECT
query_name,
ROUND(AVG(rating/position), 2) quality,
ROUND(SUM(IF(rating < 3, 1, 0))*100/COUNT(*), 2) poor_query_percentage
FROM Queries
WHERE query_name IS NOT NULL #这句记得跑
GROUP BY query_name

时间维度关联聚合的统计

sql 查询来查找每个月和每个国家/地区的事务数及其总金额、已批准的事务数及其总金额。

SELECT
CONCAT(YEAR(t.trans_date), '-', LPAD(MONTH(t.trans_date), 2, '0')) as month,
t.country as country,
COUNT(*) as trans_count,
SUM(CASE WHEN t.state=1 THEN 1 ELSE 0 END) as approved_count,
SUM(t.amount) as trans_total_amount,
SUM(CASE WHEN t.state=1 THEN t.amount ELSE 0 END) as approved_total_amount
FROM
Transactions t
GROUP BY
YEAR(t.trans_date), MONTH(t.trans_date), t.country

取即时订单在所有用户的首次订单中的比例。保留两位小数

select
round(sum(order_date = customer_pref_delivery_date)/count(*)*100,2) immediate_percentage
from Delivery
where (customer_id,order_date) in
(select customer_id,
min(order_date)
from Delivery group by customer_id)

报告在首次登录的第二天再次登录的玩家的 比率,四舍五入到小数点后两位。换句话说,你需要计算从首次登录日期开始至少连续两天登录的玩家的数量,然后除以玩家总数。

SELECT ROUND(COUNT(1)/(SELECT COUNT(DISTINCT player_id) FROM activity),2) AS fraction
FROM activity a1
JOIN(
SELECT player_id, min(event_date) as mdate FROM activity
    GROUP BY player_id
) a2
ON a1.player_id = a2.player_id
AND DATEDIFF(a1.event_date, a2.mdate) = 1;

排序和分组

查询每位老师在大学里教授的科目种类的数量。

select teacher_id,count(distinct subject_id) as cnt
from teacher
group by teacher_id

近 30 天的每日活跃用户数(当天只要有一条活动记录,即为活跃用户)

select activity_date'day',count(distinct user_id)'active_users'
from activity
where activity_date between '2019-06-28' and '2019-07-27' -- DateDiff 30天
group by activity_date

报告2019年春季才售出的产品。即仅在2019-01-01至2019-03-31(含)之间出售的商品。 Having 做条件删选

select p.product_id, product_name
from product p join sales s
on p.product_id = s.product_id
group by product_id
having min(sale_date) >= '2019-01-01' and max(sale_date) <= '2019-03-31';

至少有5个学生 的所有班级

select Class
from Courses
group by Class
having count(student) >= 5

对于每一个用户,返回该用户的关注者数量

select
user_id,
count(user_id) followers_count
from Followers
group by user_id
order by user_id

单一数字 是在 MyNumbers 表中只出现一次的数字。

找出最大的 单一数字 。如果不存在 单一数字 ,则返回 null

select (select num
from MyNumbers
group by num
having count(num)=1
order by num desc limit 1) as num

报告 Customer 表中购买了 Product 表中所有产品的客户的 id

select distinct customer_id
from Customer
group by customer_id
having count(distinct product_key) = (select count(*) from Product)

高级查询和连接

需要听取汇报的所有经理的ID、名称、直接向该经理汇报的员工人数,以及这些员工的平均年龄,其中该平均年龄需要四舍五入到最接近的整数。

select Result.employee_id, name ,reports_count  ,round(average_age, 0) average_age
from (
select reports_to as employee_id, avg(age) as average_age, count(*) reports_count
from Employees
where reports_to is not null
group by reports_to
) Result left join Employees on Result.employee_id = Employees.employee_id
order by employee_id asc

查出员工所属的直属部门

select employee_id,department_id from Employee
where primary_flag='Y'
union
select employee_id,department_id from Employee
group by employee_id having count(1)=1

对每三个线段报告它们是否可以形成一个三角形。

select
x,
y,
z,
case
when x + y > z and x + z > y and y + z > x then 'Yes'
else
'No'
end triangle
from
Triangle

所有至少连续出现三次的数字

select num ConsecutiveNums from Logs
where (id+1,num) in (select * from Logs)
and (id+2,num) in (select *from Logs)
group by num

找出在 2019-08-16 时全部产品的价格,假设所有产品在修改前的价格都是 10 。

select t.product_id, t.new_price as price
from (select *, row_number() over (PARTITION BY product_id order by change_date desc) as row_num
from Products
where change_date<='2019-08-16') as t
where t.row_num=1
union
select product_id, 10 as price
from Products
group by product_id
having min(change_date)>'2019-08-16'

最后一个 上巴士且不超过重量限制的乘客,并报告 person_name 。题目测试用例确保顺位第一的人可以上巴士且不会超重。

select
person_name
from
(select person_name, turn, sum(weight) over(order by turn) as total from Queue) t
where total <= 1000
order by turn desc
limit 1

按分类统计薪水

select 'Low Salary' category,count(*) accounts_count from Accounts
where income<20000
union
select 'Average Salary' category,count(*) accounts_count from Accounts
where income between 20000 and 50000
union
select 'High Salary' category,count(*) accounts_count from Accounts
where income>50000

子查询

查找这些员工的id,他们的薪水严格少于$30000 并且他们的上级经理已离职。 当一个经理离开公司时,他们的信息需要从员工表中删除掉,但是表中的员工的manager_id,这一列还是设置的离职经理的id 。

select e.employee_id
from Employees e
left join Employees ee
on e.manager_id = ee.employee_id
where e.manager_id is not null and  ee.employee_id is null and e.salary < 30000
order by e.employee_id

解决方案来交换每两个连续的学生的座位号。如果学生的数量是奇数,则最后一个学生的id不交换。

select s1.id, ifnull(s2.student, s1.student) student
from Seat s1
left join Seat s2
on s1.id = s2.id - 1
where s1.id % 2 != 0
union
select s1.id, s2.student
from Seat s1
inner join Seat s2
on s1.id = s2.id + 1
where s1.id % 2 = 0
order by id

查找评论电影数量最多的用户名。如果出现平局,返回字典序较小的用户名。 查找在 February 2020 平均评分最高 的电影名称。如果出现平局,返回字典序较小的电影名称。

(
select u.name results
from MovieRating mr
left join Users u
on mr.user_id = u.user_id
group by mr.user_id
order by count(1) desc, u.name asc
limit 1
)
union all
(
select m.title results
from MovieRating mr
left join Movies m
on mr.movie_id = m.movie_id
where created_at between Date('2020-02-01') and Date('2020-02-29')
group by mr.movie_id
order by avg(mr.rating) desc, m.title asc
limit 1
)

计算以 7 天(某日期 + 该日期前的 6 天)为一个时间段的顾客消费平均值。average_amount 要 保留两位小数。

结果按 visited_on 升序排序。

select
visited_on, amount, average_amount
from (
select
distinct visited_on,  # 因为窗口函数是按照日期计算的。所以相同日期的结果也是相同的,直接去重即可
sum(amount) over(order by visited_on range interval 6 day preceding) amount, # 按照日期排序,范围是当前日期和当前日期的前六天
round(sum(amount) over(order by visited_on range interval 6 day preceding)/7, 2) average_amount # 同理
from
Customer) t
where datediff(visited_on, (select min(visited_on) from Customer))>=6 #去除日期不足7日的结果
order by visited_on;

找出拥有最多的好友的人和他拥有的好友数目

select t.requester_id id, count(1) num
from (
select requester_id
from requestaccepted
union all
select accepter_id
from requestaccepted) t
group by t.requester_id
order by num desc
limit 1;

解决方案报告 2016 年 (tiv_2016) 所有满足下述条件的投保人的投保金额之和:

他在 2015 年的投保额 (tiv_2015) 至少跟一个其他投保人在 2015 年的投保额相同。 他所在的城市必须与其他投保人都不同(也就是说 (lat, lon) 不能跟其他任何一个投保人完全相同)。

select round(sum(tiv_2016), 2) tiv_2016
from (
select tiv_2016, count(*) over(partition by tiv_2015) c1,
count(*) over(partition by concat(lat, lon)) c2
from Insurance
) t1
where c1 != 1 and c2 = 1;

公司的主管们感兴趣的是公司每个部门中谁赚的钱最多。一个部门的 高收入者 是指一个员工的工资在该部门的 不同 工资中 排名前三 。

编写解决方案,找出每个部门中 收入高的员工 。

SELECT
d.NAME AS Department,
e.NAME AS Employee,
e.salary AS Salary
FROM
employee e
LEFT JOIN department d ON e.departmentId = d.id
WHERE
e.id IN (
SELECT
e1.id
FROM
employee e1,
employee e2
WHERE
e1.salary <= e2.salary
AND e1.departmentId = e2.departmentId
GROUP BY
e1.id
HAVING
count( DISTINCT e2.salary ) <= 3
)
ORDER BY
Department,
Salary DESC

高级字符串函数

修复名字,使得只有第一个字符是大写的,其余都是小写的。

返回按 user_id 排序的结果表。

select user_id,
concat(upper(left(name, 1)), lower(right(name, length(name) - 1))) name
from Users
order by user_id

查询患有 I 类糖尿病的患者 ID (patient_id)、患者姓名(patient_name)以及其患有的所有疾病代码(conditions)。I 类糖尿病的代码总是包含前缀 DIAB1 。

SELECT patient_id, patient_name, conditions
FROM Patients
WHERE conditions REGEXP '\\bDIAB1'

删除 所有重复的电子邮件,只保留一个具有最小 id 的唯一电子邮件。

(对于 SQL 用户,请注意你应该编写一个 DELETE 语句而不是 SELECT 语句。)

(对于 Pandas 用户,请注意你应该直接修改 Person 表。)

运行脚本后,显示的答案是 Person 表。驱动程序将首先编译并运行您的代码片段,然后再显示 Person 表。Person 表的最终顺序 无关紧要 。

DELETE FROM Person
WHERE id NOT IN (
SELECT id FROM (
SELECT MIN(id) AS id FROM Person GROUP BY email
) AS u
);

查询并返回 Employee 表中第二高的薪水 。如果不存在第二高的薪水,查询应该返回 null(Pandas 则返回 None) 。

select max(salary) as SecondHighestSalary
from Employee
where salary < (select max(salary) from Employee);

找出每个日期、销售的不同产品的数量及其名称。 每个日期的销售产品名称应按词典序排列。 返回按 sell_date 排序的结果表。

SELECT sell_date,
COUNT(DISTINCT product) AS num_sold,
group_concat(DISTINCT product) AS products
FROM Activities
GROUP BY sell_date;

要求获取在 2020 年 2 月份下单的数量不少于 100 的产品的名字和数目。

select p.product_name,sum(o.unit) as unit from Products as p join Orders as o
on o.order_date between ('2020-02-01') and ('2020-02-29') and p.product_id = o.product_id
group by product_name
having unit >= 100;

一个解决方案,以查找具有有效电子邮件的用户。

一个有效的电子邮件具有前缀名称和域,其中:

前缀 名称是一个字符串,可以包含字母(大写或小写),数字,下划线 ‘_’ ,点 ‘.’ 和/或破折号 ‘-‘ 。前缀名称 必须 以字母开头。 域 为 ‘@leetcode.com’ 。

select user_id,name,mail from Users
where mail regexp '^[a-zA-Z][a-zA-Z0-9\\_\\.\\-]*@leetcode\\.com$'

文档信息

Search

    Table of Contents