Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

52 寻找面试候选人 #64

Open
astak16 opened this issue Mar 16, 2022 · 0 comments
Open

52 寻找面试候选人 #64

astak16 opened this issue Mar 16, 2022 · 0 comments
Labels

Comments

@astak16
Copy link
Owner

astak16 commented Mar 16, 2022

题目

题目连接:寻找面试候选人

contest_id 是该 contest 的主键。该表包含 LeetCode 竞赛的 id 和该场比赛中金牌、银牌、铜牌的 user_id。可以保证,所有连续的比赛都有连续的 id,没有 id 被跳过。

user_id 是该 user_id 的主键。该表包含用户信息。

编写 SQL 语句来返回面试候选人的 姓名和 邮件.当用户满足以下两个要求中的任意一条,其成为面试候选人:

  • 该用户在连续三场及更多比赛中赢得奖牌。
  • 该用户在三场及更多不同的比赛中赢得金牌(这些比赛可以不是连续的)

可以以任何顺序返回结果。

create table contests (
  contest_id int,
  gold_medal int,
  silver_medal int,
  bronze_medal int
);
insert into contests (contest_id, gold_medal, silver_medal, bronze_medal) values 
(190, 1, 5, 2),
(191, 2, 3, 5),
(192, 5, 2, 3),
(193, 1, 3, 5),
(194, 4, 5, 2),
(195, 4, 2, 1),
(196, 1, 5, 2);

create table users (
  user_id int,
  mail varchar(50),
  name varchar(30)
);
insert into users (user_id, mail, name) values 
(1, '[email protected]', 'Sarah'),
(2, '[email protected]', 'Bob'),
(3, '[email protected]', 'Alice'),
(4, '[email protected]', 'Hercy'),
(5, '[email protected]', 'Quarz');

分析

需要查询出的用户是

  1. 连续三场比赛或者更多比赛的奖牌
  2. 三场或者更多的比赛获得金牌

三场或者更多比赛的金牌比较好查询只需要筛选出金牌数大于等于 3 的用户。

主要的关键点在于如何查询连续三场比赛获得奖牌。

这里使用了四种方法去解题:

  • 方法一使用 **窗口函数,**方法二使用 变量以及 if 函数
    • 思路:判断连续的数方法:将数进行升序或者降序排列,然后用这个数减去排列的序号,如果是连续的数,那么它们的结果是相同的。
  • 方法三使用 **多表联查,**方法四使用 窗口函数
    • 思路: 1 + 1 = 2 and 1 + 2 = 3

SQL:方法一

select
  name, mail
from (
  select
    user_id
  from (
    select
      user_id,
      contest_id - 
        row_number() over(partition by user_id order by contest_id) diff
    from (
      select contest_id, gold_medal user_id from contests
      union all
      select contest_id, silver_medal user_id from contests
      union all
      select contest_id, bronze_medal user_id from contests
    ) temp1
  ) temp2 group by user_id, diff having count(diff) >= 3
  union
  select
    gold_medal user_id
  from contests group by gold_medal having count(gold_medal) >= 3
) temp3 join users using(user_id);

解析

  1. 临时表 temp1 输出 contest_iduser_id 字段。
    • gold_medalsilver_medalbronze_medal 获奖者分别查出来,使用 union all 将它们连接起来
  2. 临时表 temp2 输出 user_iddiff 字段
    • 使用 row_number() 进行排序,然后用 contest_id 减去序号,得到 diff,如果是连续的数 diff 是相同的
  3. 查询临时表 temp2,按照 user_iddiff 进行分组,筛选出 count(diff) >= 3,连续三场及以上的比赛,diff 的数量是大于 3
  4. 筛选出获得三块及三块以上金牌的用户
  5. 临时表 temp3 ,将第 3 步和第 4 步使用 union 连接
  6. 查询临时表 temp3 ,将它和 users 表连接,连接条件是 user_id,输出 namemail

SQL:方法二

with temp as (
  select contest_id, gold_medal user_id from contests
  union all
  select contest_id, silver_medal user_id from contests
  union all
  select contest_id, bronze_medal user_id from contests
)
select name, mail from (
  select user_id from (
    select
      user_id,
      if(@pre_user_id = user_id,
        if(@pre_contest_id = contest_id - 1, @rank:=@rank, @rank:=@rank + 1), 
        @rank:=1) diff,
      @pre_contest_id:=contest_id,
      @pre_user_id:=user_id
    from temp,(select @pre_contest_id:=null, @pre_user_id:=null, @rank:=1) init
    order by user_id, contest_id
  ) temp1 group by user_id, diff having count(diff) >= 3
  union
  select 
    gold_medal
  from contests group by gold_medal having count(gold_medal) >= 3
) temp2 join users using(user_id);

解析

思路和方法一一样,区别是这里使用变量代替 row_number

因为其他步骤和方法一一样,所以这里只讲变量部分

  1. 设置三个变量, @pre_contest_id@pre_user_id@rank

    • @pre_contest_id 保存的是上一场比赛的 id,初始化为 null
    • @pre_user_id 保存的是上一个用户的 id ,初始化 null
    • @rank 保存的是当前排名,初始化为 1
    select @pre_contest_id:=null, @pre_user_id:=null, @rank:=1
  2. user_id = 2 为例

    • 当运行第一条数据时,@pre_user_id = null,代入 @pre_user_id = user_idnull = 2false 进入 @rank:=1,此时 diff1
      • if 语句结束后, @pre_contest_id = 190@pre_user_id = 2
    • 当运行第二条数据是, @pre_user_id = 2 ,代入 @pre_user_id = user_id2 = 2true ,进入 if(@pre_contest_id = contest_id - 1, @rank:=@rank, @rank:=@rank + 1),此时 @pre_contest_id = 190,代入 @pre_contest_id = contest_id - 1190 = 191 - 1true,进入 @rank:=@rank,代入得 @rank = 1 ,此时 diff1
      • if 语句结束后, @pre_contest_id = 191@pre_user_id = 2
    • 以此类推,直到运行到最后一条数据

    Tips

    if(@pre_contest_id = contest_id - 1, @rank:=@rank, @rank:=@rank + 1) 是关键,通过这步,可以直接算出 diff ,而需要先算出排名rank,再用题号减去 rank 得到 diff

    select
      ...
      if(@pre_user_id = user_id,
        if(@pre_contest_id = contest_id - 1, @rank:=@rank, @rank:=@rank + 1),
        @rank:=1) diff,
      @pre_contest_id:=contest_id,
      @pre_user_id:=user_id
    from ...

SQL:方法三

with temp as (
  select contest_id, gold_medal user_id from contests
  union all
  select contest_id, silver_medal user_id from contests
  union all
  select contest_id, bronze_medal user_id from contests
)
select name, mail from (
  select 
    distinct temp1.user_id 
  from temp temp1, temp temp2, temp temp3
  where temp1.user_id = temp2.user_id and temp2.user_id = temp3.user_id
  and temp1.contest_id + 1 = temp2.contest_id
  and temp2.contest_id + 1 = temp3.contest_id
  union
  select
    gold_medal
  from contests group by gold_medal having count(gold_medal) >= 3
) t join users using(user_id);

解析

  1. 首先创建一个临时表 temp,将 gold_medalsilver_medalbronze_medal 获奖者分别查出来,使用 union all 将它们连接起来。
  2. temp 表自连,连接三次,分别是 temp1temp2temp3,除了 user_id 的基本连接条件,最重要的是 temp1.contest_id + 1 = temp2.contest_id and temp2.contest_id + 1 = temp3.contest_id,意思是: 上一场比赛 + 1 = 当前这场比赛当前这场比赛 + 1 = 下一场比赛
    • 这里把 temp2.contest_id 看成当前这场比赛, temp1.contest_id 是上一场比赛, temp3.contest_id 是下一场比赛
  3. 查询出获得三块及以上金牌的用户
  4. 将第 2 步和第 3 步使用 union 连接起来作为临时表 t
  5. 查询临时表 t ,将它和 users 表连接,连接条件是 user_id,输出 namemail

Tips

select 
  distinct temp1.user_id 
from temp temp1, temp temp2, temp temp3
where temp1.user_id = temp2.user_id and temp2.user_id = temp3.user_id

可以替换成 join

select 
  distinct temp1.user_id 
from temp temp1 join temp temp2 using(user_id) join temp temp3 using(user_id)

方法四

with temp as (
  select contest_id, gold_medal user_id from contests
  union all
  select contest_id, silver_medal user_id from contests
  union all
  select contest_id, bronze_medal user_id from contests
)
select name, mail from (
  select user_id from (
    select
      user_id,
      contest_id,
      lead(contest_id, 1) over(partition by user_id order by contest_id) second,
      lead(contest_id, 2) over(partition by user_id order by contest_id) third
    from temp
  ) temp1 where contest_id + 1 = second and contest_id + 2 = third
  union
  select
    gold_medal
  from contests group by gold_medal having count(gold_medal) >= 3
) t join users using(user_id);

解析

方法四和方法三的思路是一样的,只有第 2 步不一样,所以只讲解第 2 步,其他步骤参考方法三。

  • 使用窗口函数 lead(),计算出第二场比赛的 contest_id 和第三场比赛的 contest_id

    • 举例 lead(contest_id, 1) ... secondlead(contest_id, 2) ... thirdlead 带领或者领导的意思,当前行带领一行,带领两行
    user_id  |  contest_id  |  second  |  third
    2        |  190         |  191     |  192
    2        |  191         |  192     |  194
    2        |  192         |  194     |  195
    2        |  194         |  195     |  196
    2        |  195         |  196     |  null
    2        |  196         |  null    |  null
    • 筛选条件 contest_id + 1 = second and contest_id + 2 = third
  • 也可以使用窗口函数 lag(),计算出上一场比赛的 contest_id 和上两场比赛的 contest_id

    • 举例 lag(contest_id, 1) ... lastlag(contest_id, 2) ... last_twolag 滞后或落后的意思,当前行滞后一行,滞后两行
    user_id  |  contest_id  |  last    |  last_two
    2        |  190         |  null    |  null
    2        |  191         |  190     |  null
    2        |  192         |  191     |  190
    2        |  194         |  192     |  191
    2        |  195         |  194     |  192
    2        |  196         |  195     |  194
    • 筛选条件 contest_id - 1 = last and contest_id - 2 = last_two
  • 也可以同时使用 lag(contest_id, 1) ... lastlead(contest_id, 1) ... second

    user_id  |  last  |  contest_id  |  second
    2        |  null  |  190         |  191
    2        |  190   |  191         |  192
    2        |  191   |  192         |  194
    2        |  192   |  194         |  195
    2        |  194   |  195         |  196
    2        |  195   |  196         |  null
    • 筛选条件 contest_id - 1 = last and contest_id + 1 = second
@astak16 astak16 added the 中等 label Mar 16, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

1 participant