SQL Practice: LeetCode Solutions and Summary (1)

For my personal study and review only.

178. Rank Scores

1. Table information

2. Problem statement

Sort the scores in the table above from highest to lowest and list the rank of each. When two people have the same score they share the same rank, and the ranks have no gaps.

Write a SQL query to rank scores. If there is a tie between two scores, both should have the same ranking. Note that after a tie, the next ranking number should be the next consecutive integer value. In other words, there should be no "holes" between ranks.

For example, given the above Scores table, your query should generate the following report (order by highest score):

3. Reference SQL

(1) Method 1: join the table to itself

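select s1.Score,count(distinct s2.Score) as `Rank`
from Scores s1
inner join Scores s2
on s1.Score<=s2.Score
group by s1.id
order by count(distinct s2.Score);

When the grouping column and the selected column differ, you can wrap the query in one more select.

Reasoning:

1. To get a rank you need count() to do the counting, and a single copy of the table is not enough.

2. Join condition: for each score, collect the set of scores greater than or equal to it. For example, the set for 3.50 is {3.50,3.65,4.00,3.85,4.00,3.65}.

3. Grouping: this yields six such sets, one per row.

4. Distinct count: because the ranking has no gaps, duplicates must be removed before counting. Otherwise you would be counting how many rows are greater than or equal to the value, not the value's rank within the set.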

(2) Method 2: window function dense_rank() (MySQL 8.0)

SELECT  Score,
 DENSE_RANK() OVER(ORDER BY Score DESC) AS `Rank`
FROM Scores;

Window function review: https://zhuanlan.zhihu.com/p/135119865
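To check that the two methods really agree, here is a minimal sketch that runs both queries against an in-memory SQLite database from Python. SQLite (3.25 or later, for window functions) stands in for MySQL here, which is an assumption of this sketch; the sample scores are the ones from the reasoning above, and the function name rank_scores is made up for illustration.

```python
import sqlite3


def rank_scores():
    """Return (self_join_rows, window_rows) for a small sample Scores table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Scores (id INTEGER PRIMARY KEY, Score REAL)")
    conn.executemany(
        "INSERT INTO Scores (id, Score) VALUES (?, ?)",
        [(1, 3.50), (2, 3.65), (3, 4.00), (4, 3.85), (5, 4.00), (6, 3.65)],
    )
    # Method 1: the rank is the number of distinct scores >= the current score.
    self_join = conn.execute(
        """
        SELECT s1.Score, COUNT(DISTINCT s2.Score) AS rnk
        FROM Scores s1
        JOIN Scores s2 ON s1.Score <= s2.Score
        GROUP BY s1.id
        ORDER BY rnk, s1.id
        """
    ).fetchall()
    # Method 2: DENSE_RANK() assigns the same gap-free ranks directly.
    window = conn.execute(
        """
        SELECT Score, DENSE_RANK() OVER (ORDER BY Score DESC) AS rnk
        FROM Scores
        ORDER BY rnk, id
        """
    ).fetchall()
    conn.close()
    return self_join, window


if __name__ == "__main__":
    for score, rnk in rank_scores()[0]:
        print(score, rnk)
```

Both lists come back identical, with the two 4.00 rows sharing rank 1 and 3.50 ending at rank 4 rather than 6.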

180. Consecutive Numbers

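1. Table information

2. Problem statement

Find every number that appears at least three times in a row. For the table above, the result should be 1.

Write a SQL query to find all numbers that appear at least three times consecutively. For example, given the above Logs table, 1 is the only number that appears consecutively for at least three times.

3. Reference SQL

Method 1: multiple self joins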

select distinct a.num as ConsecutiveNums
from Logs a
inner join Logs b on a.id=b.id+1
inner join Logs c on a.id=c.id+2
where a.num=b.num and a.num=c.num;

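Reasoning:

1. Three consecutive occurrences means the IDs are consecutive and the values are equal.

2. The two joins stitch the current row (a) together with the row before it (b) and the row two before it (c).

3. Keep the rows where all three values are equal. A number may appear more than three times in a row, so distinct is needed to return it only once.

Method 2: window function lead(), which looks at the following rows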

select distinct Num as ConsecutiveNums
from
    (select Num,
    lead(num,1) over(order by id) as next_num,
    lead(num,2) over(order by id) as next_next_num
    from Logs) t
where t.Num=t.next_num
and t.Num=t.next_next_num;

Window function lead(): https://www.begtut.com/mysql/mysql-lead-function.html
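The lead() approach can be tried out with a small sketch like the one below. It assumes SQLite 3.25+ (via Python's sqlite3) in place of MySQL, and the helper name consecutive_nums is hypothetical; ids are simply assigned 1, 2, 3, ... in list order.

```python
import sqlite3


def consecutive_nums(nums):
    """Return the numbers that appear at least three times in a row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Logs (id INTEGER PRIMARY KEY, num INTEGER)")
    # enumerate() gives (id, num) pairs with ids starting at 1.
    conn.executemany(
        "INSERT INTO Logs (id, num) VALUES (?, ?)", list(enumerate(nums, start=1))
    )
    rows = conn.execute(
        """
        SELECT DISTINCT num
        FROM (
            SELECT num,
                   LEAD(num, 1) OVER (ORDER BY id) AS next_num,
                   LEAD(num, 2) OVER (ORDER BY id) AS next_next_num
            FROM Logs
        ) t
        WHERE num = next_num AND num = next_next_num
        ORDER BY num
        """
    ).fetchall()
    conn.close()
    return [r[0] for r in rows]


if __name__ == "__main__":
    print(consecutive_nums([1, 1, 1, 2, 1, 2, 2]))
```

One difference worth keeping in mind: the join version depends on ids having no gaps, while lead() only depends on the ordering of ids.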

181. Employees Earning More Than Their Managers[e]

1. Table information

The Employee table holds all employees including their managers. Every employee has an Id, and there is also a column for the manager Id.

2. Problem statement

Using the Employee table above, find the names of employees who earn more than their managers.

Given the Employee table, write a SQL query that finds out employees who earn more than their managers. For the above table, Joe is the only employee who earns more than his manager.

3. Reference SQL

Self join:

select e1.Name as Employee
from Employee e1
inner join Employee e2
on e1.ManagerId=e2.Id
where e1.Salary>e2.Salary;

182. Duplicate Emails[e]

1. Table information

2. Problem statement

Find the duplicate emails.

Write a SQL query to find all duplicate emails in a table named Person. For example, your query should return the following for the above table:

3. Reference SQL

Method 1: my own solution

select Email
from Person
group by Email
having count(*)>1;

Method 2: official answer

SELECT Email FROM
 (SELECT Email, COUNT(id) AS num
  FROM Person
  GROUP BY Email) AS tmp
WHERE num > 1;

183. Customers Who Never Order[e]

1. Table information

Suppose that a website contains two tables, the Customers table and the Orders table.

Table 1: Customers

Table 2: Orders

2. Problem statement

Find the names of customers who have never placed an order.

Write a SQL query to find all customers who never order anything. Using the above tables as an example, return the following:

3. Reference SQL

Method 1: left outer join

select Name as Customers
from Customers c
left join Orders o
on c.Id=o.CustomerId
where o.Id is null;

Method 2: subquery (official method)

select Name as Customers
from Customers
where Id not in(
    select CustomerId from Orders
);

184. Department Highest Salary[M]

1. Table information

Table 1: Employee

Table 2: Department

2. Problem statement

For each department, find the name and salary of the highest-paid employee.

Write a SQL query to find employees who have the highest salary in each of the departments. For the above tables, your SQL query should return the following rows (order of rows does not matter).

3. Reference SQL

Method 1: window function dense_rank()

select d.Name as Department,t.Name as Employee,t.Salary
from
    (
    select *,
    dense_rank() over(partition by DepartmentId order by Salary DESC) as ranking
    from Employee
    ) t
inner join Department d
on t.DepartmentId=d.Id
where ranking=1;

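A column alias cannot be used in the where clause of the same select level! This follows from the execution order: from, ..., where, ..., select, ... That is why ranking is computed in a subquery and filtered one level up.

Reasoning:

1. Use dense_rank(), which leaves no gaps, to rank salaries within each department.

2. Every row with rank 1 carries the department's highest salary.

Method 2: subquery with a multi-column in

select
    d.Name as Department,
    t.Name as Employee,
    t.Salary
from Department d
inner join
    (select Name,DepartmentId,Salary
     from Employee e
     where (e.DepartmentId,Salary) in
        (select DepartmentId,max(Salary)
        from Employee
        group by DepartmentId)
    ) t
on d.Id=t.DepartmentId;

in can filter on several columns at once, written as (column1_name, column2_name, ...), with each value matched against the corresponding column of the subquery.

Reasoning:

1. Find the highest salary in each department.

2. Keep the rows of the original table whose (department, salary) pair equals a department maximum; this lists everyone who earns the maximum.

3. Inner join to pick up the remaining information.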

185. Department Top Three Salaries[h]

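A classic top-N problem: return the top N rows of each group, which needs both grouping and ordering.

1. Table information

Table 1: Employee

Table 2: Department

2. Problem statement

Find the names and salaries of the employees with the top three salaries in each department.

Write a SQL query to find employees who earn the top three salaries in each of the department. For the above tables, your SQL query should return the following rows (order of rows does not matter).

3. Reference SQL

Method 1: window function dense_rank()

select
    d.name as Department,
    e.name as Employee,
    Salary
from Department d
inner join
(
    select name,Salary,DepartmentId,
    dense_rank() over(partition by DepartmentId order by Salary desc) as ranking
    from Employee
) e
on d.Id=e.DepartmentId
where ranking<=3;

Reasoning:

1. Use dense_rank(), partitioned by department and ordered by salary descending, to assign gap-free ranks.

2. Keep the rows with rank less than or equal to 3; these are the top three salaries.

(ps: with a window function the idea is almost the same as in the previous problem; only the number of rows kept differs)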

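Method 2: self join, then group and filter

select
    d.name as Department,
    e.name as Employee,
    Salary
from Department d
inner join
(
    select e1.Id,e1.DepartmentId,e1.name,e1.Salary
    from Employee e1
    inner join Employee e2
    on e1.DepartmentId=e2.DepartmentId
    and e1.Salary<=e2.Salary
    group by e1.Id
    having count(distinct e2.Salary)<=3
) e
on d.Id=e.DepartmentId;

Reasoning:

1. The key is to find the rows with the top three salaries in each department:

  Self join on equal department, where the other salary is greater than or equal to mine;

  Group by employee, so each group holds every employee record whose salary is greater than or equal to mine;

  Count the salaries in the group: if there are 3 or fewer, my salary is within the top three. One thing to note is that count(*) cannot be used here. Besides myself, other employees may have the same salary as me; without de-duplication the row count could exceed 3 (say I am exactly third) and I would be filtered out, which is not the intended result.

2. Then run the rest of the query as required.

(ps: the self-join condition here follows nearly the same idea as problem 178)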
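The point about count(*) versus count(distinct ...) is easy to see on data with a tie. The sketch below is only an illustration under stated assumptions: it uses SQLite 3.25+ through Python's sqlite3 instead of MySQL, the sample rows are hypothetical, and the helper names are made up. It returns the employee ids kept by the window version, by the self join with count(distinct), and by the self join with count(*).

```python
import sqlite3

# Hypothetical sample rows: (Id, Name, Salary, DepartmentId).
# Joe and Randy tie for second place in department 1.
EMPLOYEES = [
    (1, "Joe", 85000, 1),
    (2, "Henry", 80000, 2),
    (3, "Sam", 60000, 2),
    (4, "Max", 90000, 1),
    (5, "Janet", 69000, 1),
    (6, "Randy", 85000, 1),
    (7, "Will", 70000, 1),
]


def _ids(conn, sql):
    """Run a query and return its first column as a list."""
    return [row[0] for row in conn.execute(sql)]


def top_three_ids():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Employee (Id INTEGER PRIMARY KEY, Name TEXT, "
        "Salary INTEGER, DepartmentId INTEGER)"
    )
    conn.executemany("INSERT INTO Employee VALUES (?, ?, ?, ?)", EMPLOYEES)
    window = _ids(
        conn,
        """
        SELECT Id FROM (
            SELECT Id, DENSE_RANK() OVER (
                       PARTITION BY DepartmentId ORDER BY Salary DESC) AS ranking
            FROM Employee
        ) t
        WHERE ranking <= 3
        ORDER BY Id
        """,
    )
    join_template = """
        SELECT e1.Id
        FROM Employee e1
        JOIN Employee e2
          ON e1.DepartmentId = e2.DepartmentId AND e1.Salary <= e2.Salary
        GROUP BY e1.Id
        HAVING {counter} <= 3
        ORDER BY e1.Id
    """
    distinct_count = _ids(conn, join_template.format(counter="COUNT(DISTINCT e2.Salary)"))
    star_count = _ids(conn, join_template.format(counter="COUNT(*)"))
    conn.close()
    return window, distinct_count, star_count


if __name__ == "__main__":
    print(top_three_ids())
```

With this data Will (id 7) has the third-highest distinct salary in department 1, but four rows are greater than or equal to his, so the count(*) variant drops him.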

196. Delete Duplicate Emails[E]

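1. Table information

2. Problem statement

Delete the rows with duplicate emails; when an email is duplicated, keep the row with the smallest Id.

Write a SQL query to delete all duplicate email entries in a table named Person, keeping only unique emails based on its smallest Id. For example, after running your query, the above Person table should have the following rows:

3. Reference SQL

Method 1: subquery

delete from Person
where Id not in(
    select Id from
    (
        select min(Id) AS Id
        from Person
        group by Email
    ) t
);

Reasoning:

1. The subquery finds the set of Ids that must be kept (the smallest Id of each duplicated email plus the Ids of non-duplicated emails): group by email and take the smallest Id.

2. When deleting, test that the Id is not in this set.

(ps: MySQL does not allow modifying a table and selecting from that same table in a subquery of the same statement, so an extra derived-table layer is needed; remember to give min(Id) an alias)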

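Method 2: self join

delete p1 from Person p1
inner join Person p2
on p1.Email=p2.Email
and p1.Id>p2.Id;

Reasoning:

1. Joining on equal emails gives three kinds of pairs: (1) equal email and equal Id, which covers the non-duplicated rows; (2) equal email with p1.Id>p2.Id; (3) equal email with p1.Id<p2.Id.

2. Pick out the pairs with equal email and p1.Id>p2.Id and delete the p1 side. This keeps the smallest Id of each email and the non-duplicated rows.

(ps: multi-table deletes use this same syntax: delete table_name from .....)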

197. Rising Temperature[E]

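1. Table information

2. Problem statement

Using the figure below as an example, find the ids of days that were warmer than the day before.

Write an SQL query to find all dates' id with higher temperature compared to its previous dates (yesterday).

Return the result table in any order.

The query result format is in the following example:

3. Reference SQL

select w1.id as Id from Weather w1
inner join Weather w2
on datediff(w1.recordDate,w2.recordDate)=1
where w1.temperature>w2.temperature;

Reasoning:

1. From the Cartesian product of the self join, keep the pairs of records exactly one day apart, using the datediff() function.

2. Then keep the ids whose temperature is higher than the previous day's.

ps: dates cannot simply be added or subtracted; it is best to use date functions. https://www.w3school.com.cn/sql/sql_dates.asp

Date function review: https://leetcode-cn.com/problems/rising-temperature/solution/tu-jie-sqlmian-shi-ti-ru-he-bi-jiao-ri-qi-shu-ju-b/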

262. Trips and Users[H]

1. Table information

Table 1: Trips

The Trips table holds all taxi trips. Each trip has a unique Id, while Client_Id and Driver_Id are both foreign keys to the Users_Id at the Users table. Status is an ENUM type of (‘completed’, ‘cancelled_by_driver’, ‘cancelled_by_client’).

Table 2: Users

The Users table holds all users. Each user has an unique Users_Id, and Role is an ENUM type of (‘client’, ‘driver’, ‘partner’).

2. Problem statement

For each day from 2013-10-01 to 2013-10-03, find the cancellation rate of requests made by unbanned users.

Write a SQL query to find the cancellation rate of requests made by unbanned users (both client and driver must be unbanned) between Oct 1, 2013 and Oct 3, 2013. The cancellation rate is computed by dividing the number of canceled (by client or driver) requests made by unbanned users by the total number of requests made by unbanned users.

Cancellation rate = (requests by unbanned users that were cancelled by the driver or the client) / (total requests by unbanned users)

For the above tables, your SQL query should return the following rows with the cancellation rate being rounded to two decimal places.

3. Reference SQL

Method 1: use subqueries to filter the valid trip records

SELECT
    Request_at AS 'Day',
    round( count( CASE t_Status WHEN 'completed' THEN NULL ELSE 1 END ) / count( * ), 2 ) AS 'Cancellation Rate' 
FROM
    (
SELECT
    Request_at,
    Status as t_Status
FROM
    Trips 
WHERE
    Client_Id NOT IN ( SELECT Users_Id FROM Users WHERE Banned = 'Yes' ) 
    AND Driver_Id NOT IN ( SELECT Users_Id FROM Users WHERE Banned = 'Yes' ) 
    ) t 
WHERE
    Request_at BETWEEN '2013-10-01' 
    AND '2013-10-03' 
GROUP BY
    Request_at

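Reasoning:

1. The key step is to filter down to the set of valid trips: both the client and the driver must be unbanned!

Subquery conditions:

  Client_Id NOT IN ( SELECT Users_Id FROM Users WHERE Banned ='Yes' )

  AND Driver_Id NOT IN ( SELECT Users_Id FROM Users WHERE Banned = 'Yes')

2. On top of that, count the cancelled trips per day with a case when expression: when the trip status is completed it returns null, so count() skips the row.

Method 2: use joins to filter the set of valid trip records

https://leetcode-cn.com/problems/trips-and-users/solution/san-chong-jie-fa-cong-nan-dao-yi-zong-you-gua-he-n/

The cancellation rate can also be computed with avg(Status!='completed'):

https://leetcode-cn.com/problems/trips-and-users/solution/ci-ti-bu-nan-wei-fu-za-er-by-luanz/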
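To see that the count(case ...) form and the avg(Status!='completed') form give the same rate, here is a small sketch. It is run on SQLite through Python's sqlite3 rather than MySQL (an assumption of this example), the trips are hypothetical rows that are taken to be already filtered to unbanned users, and cancellation_rates is a made-up helper name. Note the extra * 1.0: SQLite divides two integers as integers, whereas MySQL's / already returns a decimal.

```python
import sqlite3


def cancellation_rates(trips):
    """trips: list of (request_at, status), assumed already limited to unbanned users."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Trips (Request_at TEXT, Status TEXT)")
    conn.executemany("INSERT INTO Trips VALUES (?, ?)", trips)
    rows = conn.execute(
        """
        SELECT Request_at,
               -- COUNT skips the NULLs produced for completed trips.
               ROUND(COUNT(CASE Status WHEN 'completed' THEN NULL ELSE 1 END)
                     * 1.0 / COUNT(*), 2) AS by_count,
               -- The comparison yields 1 or 0, so its average is the rate.
               ROUND(AVG(Status != 'completed'), 2) AS by_avg
        FROM Trips
        GROUP BY Request_at
        ORDER BY Request_at
        """
    ).fetchall()
    conn.close()
    return rows


SAMPLE_TRIPS = [
    ("2013-10-01", "completed"),
    ("2013-10-01", "cancelled_by_driver"),
    ("2013-10-01", "completed"),
    ("2013-10-02", "completed"),
    ("2013-10-02", "completed"),
    ("2013-10-03", "completed"),
    ("2013-10-03", "cancelled_by_client"),
]

if __name__ == "__main__":
    for row in cancellation_rates(SAMPLE_TRIPS):
        print(row)
```

The avg form is shorter because the boolean comparison does the work of the case expression.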

511. Game Play Analysis I[E]

1. Table information

(player_id, event_date) is the primary key of this table. This table shows the activity of players of some game. Each row is a record of a player who logged in and played a number of games (possibly 0) before logging out on some day using some device.

2. Problem statement

Find the first login date of each player.

Write an SQL query that reports the first login date for each player.

The query result format is in the following example:

3. Reference SQL

SELECT
    player_id,
    MIN( event_date ) AS first_login
FROM
    Activity
GROUP BY
    player_id
ORDER BY
    player_id;

512. Game Play Analysis II[E]

1. Table information

Same as the previous problem.

2. Problem statement

Find the device each player used on the date of their first login.

Write a SQL query that reports the device that is first logged in for each player.

3. Reference SQL

Method 1: inner join + subquery

SELECT
    a.player_id,
    a.device_id
FROM
    Activity AS a
    INNER JOIN ( SELECT player_id, MIN( event_date ) AS first_login FROM Activity GROUP BY player_id ) AS b
    ON a.player_id = b.player_id AND a.event_date = b.first_login
ORDER BY
    a.player_id;

Reasoning:

1. The subquery builds table b: the earliest login date of each player.

2. Then inner join on it (filtering with where works just as well).

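Method 2: window function rank()

SELECT
    player_id,
    device_id
FROM
    ( SELECT player_id, device_id, RANK ( ) OVER ( PARTITION BY player_id ORDER BY event_date ) AS rnk FROM Activity ) AS tmp
WHERE
    rnk = 1;

Reasoning:

1. Partition by player_id and rank by event_date ascending. Because (player_id, event_date) is the primary key there are no ties, so rank(), dense_rank() and row_number() all give the same numbering here.

2. The row with rank 1 is the player's earliest login record.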

534. Game Play Analysis III[M]: running total within groups

1. Table information

Same as above.

2. Problem statement

For each player and date, find the cumulative number of games the player has played so far.

Write an SQL query that reports for each player and date, how many games played so far by the player. That is, the total number of games played by the player until that date. Check the example for clarity.

3. Reference SQL

Method 1: window function sum()

SELECT
    player_id,
    event_date,
    SUM( games_played ) over ( PARTITION BY player_id ORDER BY event_date ) AS games_played_so_far
FROM
    activity;

Reasoning:

1. Partition by player_id and order by event_date ascending.

2. Accumulate games_played within each partition.

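Method 2: inner join, then group and aggregate

SELECT
    a1.player_id,
    a1.event_date,
    SUM( a2.games_played ) AS games_played_so_far
FROM
    activity a1
    INNER JOIN activity a2 ON a1.event_date >= a2.event_date
    AND a1.player_id = a2.player_id
GROUP BY
    a1.player_id,
    a1.event_date;

Reasoning:

1. For grouped cumulative statistics like this (sums or counts alike), the inner join condition is usually a non-equi join, so that one value of a column in the main table matches several values of the same column in the joined table. After grouping by that column of the main table, a column of the joined table can be aggregated.

2. When two levels of grouping are involved, the join condition also needs an equality test, so that values in one group do not join to values in other groups. Without that equality condition, values from different groups match each other freely and the grouped result comes out wrong.

3. Once the joined table is built, aggregate as required.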
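Points 1 and 2 above can be checked side by side. The following is a sketch under assumptions: SQLite 3.25+ via Python's sqlite3 instead of MySQL, hypothetical sample rows, and a made-up helper name running_totals. It returns the window result, the self-join result, and the result of a self join that leaves out the player_id equality.

```python
import sqlite3

# Hypothetical sample rows: (player_id, event_date, games_played).
ACTIVITY = [
    (1, "2016-03-01", 5),
    (1, "2016-05-02", 6),
    (1, "2017-06-25", 1),
    (3, "2016-03-02", 0),
    (3, "2018-07-03", 5),
]


def running_totals():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE activity (player_id INTEGER, event_date TEXT, games_played INTEGER)"
    )
    conn.executemany("INSERT INTO activity VALUES (?, ?, ?)", ACTIVITY)
    window = conn.execute(
        """
        SELECT player_id, event_date,
               SUM(games_played) OVER (PARTITION BY player_id ORDER BY event_date)
        FROM activity
        ORDER BY player_id, event_date
        """
    ).fetchall()
    join_template = """
        SELECT a1.player_id, a1.event_date, SUM(a2.games_played)
        FROM activity a1
        JOIN activity a2 ON a1.event_date >= a2.event_date {extra}
        GROUP BY a1.player_id, a1.event_date
        ORDER BY a1.player_id, a1.event_date
    """
    # Non-equi join on the date plus the equality that keeps players apart.
    joined = conn.execute(
        join_template.format(extra="AND a1.player_id = a2.player_id")
    ).fetchall()
    # Same join without the equality: rows of other players leak into the sum.
    leaky = conn.execute(join_template.format(extra="")).fetchall()
    conn.close()
    return window, joined, leaky


if __name__ == "__main__":
    for name, rows in zip(("window", "joined", "leaky"), running_totals()):
        print(name, rows)
```

In the leaky variant player 3's first row picks up player 1's games from 2016-03-01, which is exactly the cross-group matching that point 2 warns about.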

550. Game Play Analysis IV[M]

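1. Table information

Same as the previous problem.

2. Problem statement

Find the fraction of players who logged in again on the day after their first login.

Write an SQL query that reports the fraction of players that logged in again on the day after the day they first logged in, rounded to 2 decimal places. In other words, you need to count the number of players that logged in for at least two consecutive days starting from their first login date, then divide that number by the total number of players.

The query result format is in the following example:

3. Reference SQL

Method 1: inner join (counts logins on two consecutive days)

SELECT
    ROUND(
    COUNT( CASE datediff( a1.event_date, a2.event_date ) WHEN 1 THEN 1 ELSE NULL END ) / COUNT(DISTINCT a1.player_id),2) AS fraction
FROM
    activity a1
    INNER JOIN activity a2
    ON a1.player_id = a2.player_id
    AND a1.event_date >= a2.event_date;

Reasoning:

1. The join condition and the idea are the same as in the previous problem.

2. To count logins on two consecutive days, it is enough that the same player has two login dates one day apart. (Note: this does not count next-day logins after the first login. It counts any pair of consecutive login days, because the earlier date may not be the player's earliest date. So it does not answer the problem as stated, and a player with several consecutive pairs is counted more than once.)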

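Method 2: subquery + outer join (counts logins on the day after the first login)

SELECT
    ROUND( COUNT( DISTINCT t.player_id ) / COUNT( DISTINCT a1.player_id ), 2 ) AS fraction
FROM
    activity a1
    LEFT JOIN ( SELECT player_id, MIN( event_date ) AS first_login FROM activity GROUP BY player_id ) t
    ON a1.player_id = t.player_id
    AND DATEDIFF( a1.event_date, t.first_login ) = 1;

Reasoning:

1. First use a subquery to find each player's earliest login date, first_login.

2. Left outer join on equal id and a date exactly one day after first_login. In the resulting table, the record of a player who logged in the next day is joined to first_login, while for every other record first_login is null (the dates are not one day apart).

3. Count the players who logged in the next day with COUNT( DISTINCT t.player_id ); null values are not counted. COUNT( t.player_id is not null ) cannot be used, because the comparison yields 0 or 1 and never null, so every row would be counted.

(PS: in principle t.player_id is already unique here, unless a player's next-day login produced several records instead of only the last login being recorded)

Method 3: window function first_value()

SELECT
    ROUND(COUNT(DISTINCT t.player_id)/COUNT(DISTINCT a1.player_id),2) AS fraction
FROM activity a1
LEFT JOIN (SELECT player_id,first_value(event_date) over(partition by player_id ORDER BY event_date) AS first_login FROM activity) t
ON a1.player_id=t.player_id
AND DATEDIFF(a1.event_date,t.first_login)=1;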
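The difference between "logged in on the day after the first login" and "logged in on any two consecutive days" shows up as soon as one player has a consecutive pair later on. Below is a rough sketch, assuming SQLite through Python's sqlite3 in place of MySQL: julianday() differences stand in for DATEDIFF(), * 1.0 forces non-integer division, the rows are hypothetical, and the helper name fractions is made up.

```python
import sqlite3

# Hypothetical sample rows: (player_id, event_date).
# Player 4 logs in on consecutive days, but not right after the first login.
LOGINS = [
    (1, "2016-03-01"),
    (1, "2016-03-02"),
    (2, "2017-06-25"),
    (3, "2016-03-02"),
    (3, "2018-07-03"),
    (4, "2020-01-01"),
    (4, "2020-01-05"),
    (4, "2020-01-06"),
]


def fractions():
    """Return (first_login_fraction, any_consecutive_pair_fraction)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE activity (player_id INTEGER, event_date TEXT, "
        "PRIMARY KEY (player_id, event_date))"
    )
    conn.executemany("INSERT INTO activity VALUES (?, ?)", LOGINS)
    # Method 2 style: compare every login with the player's first login.
    first_login = conn.execute(
        """
        SELECT ROUND(COUNT(DISTINCT t.player_id) * 1.0
                     / COUNT(DISTINCT a1.player_id), 2)
        FROM activity a1
        LEFT JOIN (SELECT player_id, MIN(event_date) AS first_login
                   FROM activity GROUP BY player_id) t
          ON a1.player_id = t.player_id
         AND julianday(a1.event_date) - julianday(t.first_login) = 1
        """
    ).fetchone()[0]
    # Method 1 style: any two logins of the same player one day apart.
    any_pair = conn.execute(
        """
        SELECT ROUND(COUNT(CASE WHEN julianday(a1.event_date)
                                     - julianday(a2.event_date) = 1
                                THEN 1 END) * 1.0
                     / COUNT(DISTINCT a1.player_id), 2)
        FROM activity a1
        JOIN activity a2
          ON a1.player_id = a2.player_id AND a1.event_date >= a2.event_date
        """
    ).fetchone()[0]
    conn.close()
    return first_login, any_pair


if __name__ == "__main__":
    print(fractions())
```

Only player 1 qualifies under the problem's definition, so the first value is one in four; the pair-based count also picks up player 4 and doubles the result.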

569. Median Employee Salary[H]

1. Table information

The Employee Table holds all employees. The employee table has three columns: Employee Id, Company Name, and Salary.

2. Problem statement

Find the median salary of each company, without using SQL built-in functions.

Write a SQL query to find the median salary of each company. Bonus points if you can solve it without using any built-in SQL functions.

3. Reference SQL

Method 1: straight from the definition of the median

SELECT
    e1.Id,
    e1.company,
    e1.salary
FROM
    (SELECT Id,company,salary,@rnk:=IF(@pre=company, @rnk:=@rnk+1,1) AS rnk,@pre:=company
    FROM employee,(SELECT @rnk:=0, @pre:=NULL) AS init
    ORDER BY company,salary,Id
    ) e1
    INNER JOIN
    (SELECT company,COUNT(*) AS cnt FROM employee GROUP BY Company
    ) e2
    ON e1.company=e2.company
WHERE e1.rnk IN (cnt/2+0.5,cnt/2,cnt/2+1);

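Reasoning:

Definition of the median: with an odd number of values, the median is the middle value; with an even number, it is the mean of the two middle values (here the two values are only listed, not averaged). So for a sequence of N values:

  • N odd: the median sits at sorted position (N+1)/2=N/2+0.5

  • N even: the median sits at sorted positions N/2 and N/2+1

N is either odd or even, never both, so N/2+0.5 and the pair (N/2, N/2+1) are never integers at the same time. Whatever the parity of N, the test can therefore be written directly as:

  median position IN (N/2+0.5,N/2,N/2+1)

This relies on MySQL's / returning a decimal result rather than truncating to an integer.

From this the overall approach follows:

  1. Number the salaries within each company in sorted order (consecutive, no gaps), using either user-defined variables or the ROW_NUMBER() window function in MySQL 8.0.

  2. Get the total count cnt with count(*).

  3. Keep the rows at the median positions: e1.rnk IN (cnt/2+0.5,cnt/2,cnt/2+1)

Other approaches:

http://www.javashuo.com/article/p-yyewjial-nw.html

https://zhuanlan.zhihu.com/p/257081415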
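The position rule can be tried with ROW_NUMBER() instead of user variables. This is only a sketch under assumptions: SQLite 3.25+ via Python's sqlite3 stands in for MySQL, the rows are hypothetical, and median_rows is a made-up helper. Because SQLite truncates integer division, the sketch divides by 2.0 to reproduce the decimal behaviour the rule depends on.

```python
import sqlite3

# Hypothetical sample rows: (Id, Company, Salary).
# Company A has an odd number of rows, company B an even number.
SALARIES = [
    (1, "A", 100),
    (2, "A", 200),
    (3, "A", 300),
    (4, "B", 10),
    (5, "B", 20),
    (6, "B", 30),
    (7, "B", 40),
]


def median_rows(rows):
    """Return the rows sitting at the median position(s) of each company."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employee (Id INTEGER PRIMARY KEY, Company TEXT, Salary INTEGER)"
    )
    conn.executemany("INSERT INTO employee VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT e1.Id, e1.Company, e1.Salary
        FROM (
            SELECT Id, Company, Salary,
                   ROW_NUMBER() OVER (PARTITION BY Company ORDER BY Salary, Id) AS rnk
            FROM employee
        ) e1
        JOIN (SELECT Company, COUNT(*) AS cnt FROM employee GROUP BY Company) e2
          ON e1.Company = e2.Company
        -- 2.0 keeps the division fractional, as MySQL's / would.
        WHERE e1.rnk IN (e2.cnt / 2.0 + 0.5, e2.cnt / 2.0, e2.cnt / 2.0 + 1)
        ORDER BY e1.Company, e1.Salary, e1.Id
        """
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    print(median_rows(SALARIES))
```

For N=3 the candidate positions are 2, 1.5 and 2.5, so only position 2 matches; for N=4 they are 2.5, 2 and 3, so positions 2 and 3 match.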

570. Managers with at Least 5 Direct Reports[M]

1. Table information

The Employee table holds all employees including their managers. Every employee has an Id, and there is also a column for the manager Id.

2. Problem statement

Find the names of the managers who manage at least 5 employees.

Given the Employee table, write a SQL query that finds out managers with at least 5 direct report. For the above table, your SQL query should return:

3. Reference SQL

Method 1: subquery

SELECT NAME
FROM
    employee
WHERE
    Id IN ( SELECT Manager_id FROM employee GROUP BY Manager_id HAVING COUNT( * ) >= 5 );

571. Find Median Given Frequency of Numbers[H]

1. Table information

The Numbers table keeps the value of number and its frequency.

2. Problem statement

Find the median from the frequency of each number.

Write a query to find the median of all numbers and name the result as median. In this table, the numbers are 0, 0, 0, 0, 0, 0, 0, 1, 2, 2, 2, 3, so the median is (0 + 0)/2 = 0.

3. Reference SQL

Reference answer:

https://www.e-learn.cn/topic/3843270

574. Winning Candidate[M]

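1. Table information

Table 1: Candidate

This table holds each candidate's id and name.

Table 2: Vote

In this table id is an auto-increment column, and CandidateId refers to the id in the Candidate table.

2. Problem statement

Find the name of the winning candidate. Note: ties are not considered in this problem, so there is exactly one winner.

Write a sql to find the name of the winning candidate, the above example will return the winner B.

Notes: You may assume there is no tie, in other words there will be only one winning candidate.

3. Reference SQL

Method 1: subquery

SELECT NAME
FROM
    candidate
WHERE
    id = ( SELECT candidateid FROM vote GROUP BY candidateid ORDER BY COUNT( * ) DESC LIMIT 1 );

1. The problem guarantees a single winner, so only one candidate has the highest vote count.

2. Finding the top group: group by candidateid, count each candidate's votes, sort descending and take the first row.

3. Since there is only one, 'id=' is enough to pick it out.

ps: if several candidates could tie, 'id in' would be needed, but applying 'id in' directly to a subquery with a limit clause raises the error: This version of MySQL doesn't yet support 'LIMIT & IN/ALL/ANY/SOME subquery'. One more subquery layer has to be wrapped around it. https://blog.csdn.net/sjzs5590/article/details/7337552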

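Method 2: inner join

SELECT
    c.NAME
FROM
    candidate c
    INNER JOIN ( SELECT candidateid FROM vote GROUP BY candidateid ORDER BY COUNT( * ) DESC LIMIT 1 ) t
    ON c.id = t.candidateid;

Extension: what if several candidates tie for the most votes? Add one more vote (6,5), so that the vote table becomes:

The query result is then:

Reasoning:

1. Group and count the votes, then use the dense_rank() window function to rank the counts without gaps.

2. The candidates with rank 1 have the most votes.

SELECT
    NAME
FROM
    candidate
WHERE
    id IN
    (
        SELECT
            candidateid
        FROM
            (
            SELECT
                candidateid,
                dense_rank ( ) over ( ORDER BY poll DESC ) AS ranking
            FROM
                ( SELECT candidateid, COUNT( * ) AS poll FROM vote GROUP BY candidateid ) t1
            ) t2
        WHERE
            ranking = 1
    );

Summary: so far the dense_rank() window function has handled the following kinds of problems

  1. Rank a column directly and take the top N, with a single group. Problem 178.
  2. Partition first, then rank a column within each partition and take the top N per group. Problem 185.
  3. Group first, aggregate each group, then rank the aggregated results and take the top N by some group statistic (row count, sum of a column, average, and so on). The extension of this problem.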
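The tie extension can be reproduced with a short sketch. Assumptions: SQLite 3.25+ via Python's sqlite3 instead of MySQL, hypothetical candidate names A to E with ids 1 to 5, a hypothetical base vote list in which candidate 2 (B) leads, and a made-up helper name winners. Adding the extra vote (6, 5) from the extension produces a tie.

```python
import sqlite3

# Hypothetical sample data.
CANDIDATES = [(1, "A"), (2, "B"), (3, "C"), (4, "D"), (5, "E")]
BASE_VOTES = [(1, 2), (2, 4), (3, 3), (4, 2), (5, 5)]  # (id, CandidateId)


def winners(votes):
    """Return the names of all candidates tied for the most votes."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE candidate (id INTEGER PRIMARY KEY, Name TEXT)")
    conn.execute("CREATE TABLE vote (id INTEGER PRIMARY KEY, CandidateId INTEGER)")
    conn.executemany("INSERT INTO candidate VALUES (?, ?)", CANDIDATES)
    conn.executemany("INSERT INTO vote VALUES (?, ?)", votes)
    rows = conn.execute(
        """
        SELECT Name
        FROM candidate
        WHERE id IN (
            SELECT CandidateId
            FROM (
                -- Rank the per-candidate vote counts without gaps.
                SELECT CandidateId,
                       DENSE_RANK() OVER (ORDER BY poll DESC) AS ranking
                FROM (SELECT CandidateId, COUNT(*) AS poll
                      FROM vote GROUP BY CandidateId) t1
            ) t2
            WHERE ranking = 1
        )
        ORDER BY Name
        """
    ).fetchall()
    conn.close()
    return [r[0] for r in rows]


if __name__ == "__main__":
    print(winners(BASE_VOTES))
    print(winners(BASE_VOTES + [(6, 5)]))
```

Unlike limit 1, the rank filter returns every candidate that shares the top count.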

577. Employee Bonus[e]

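1. Table information

Table 1: Employee

In the Employee table, empId is the primary key.

Table 2: Bonus

In the Bonus table, empId is the primary key.

2. Problem statement

Select the name and bonus of every employee whose bonus is less than 1000.

Select all employee's name and bonus whose bonus is < 1000.

3. Reference SQL

Method 1: left join

SELECT
    e.NAME,
    b.bonus
FROM
    Employee e
    LEFT JOIN Bonus b ON e.empId = b.empId
WHERE
    b.bonus < 1000
    OR b.bonus IS NULL;

Reasoning:

1. A bonus below 1000 covers two cases: no bonus at all (null), and a bonus that exists but is <1000.

2. The Bonus table only records employees who have a bonus, which naturally suggests a left outer join.

Method 2: subquery

SELECT
    a.NAME,
    b.bonus
FROM
    Employee AS a
    LEFT JOIN Bonus AS b ON a.empId = b.empId
WHERE
    a.empId NOT IN ( SELECT empId FROM Bonus WHERE bonus >= 1000 );

Reasoning:

A bonus below 1000 has two cases, but a bonus of 1000 or more has only one, so it is simpler to exclude that single case.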

578. Get Highest Answer Rate Question[M]

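1. Table information

The survey_log table below contains id, action, question_id, answer_id, q_num, and timestamp. id is the user id; action is one of "show", "answer", or "skip"; answer_id is non-null when action is "answer" and null otherwise; q_num is the order in which the question appeared.

2. Problem statement

Find the id of the question with the highest answer rate, where answer rate = (number of answer actions) / (number of show actions).

Write a sql query to identify the question which has the highest answer rate.

Note:The highest answer rate meaning is: answer number's ratio in show number in the same question.

3. Reference SQL

My reading of the problem: it resembles question recommendations on Zhihu. The system or other users broadcast (show) a question to a number of users with the same profile; a recipient may just skip it or may answer it. Knowing which question has the highest answer rate tells you what kind of question that user group likes, so more questions of that kind can be pushed to them next time. The end goal is to maximize answers and raise the answer rate. It could even support a basic user classification, or refine one, when users did not pick tags themselves at the start.

Method 1: group, aggregate as needed, sort as needed, then take the first row

SELECT question_id AS survey_log
FROM
    (SELECT
        question_id,
        SUM(IF(action='answer',1,0)) as answer_num,
        SUM(IF(action='show',1,0)) as show_num
--      SUM(case when action="answer" THEN 1 ELSE 0 END) AS num_answer,
--      SUM(case when action="show" THEN 1 ELSE 0 END) AS num_show,
    FROM survey_log
    GROUP BY question_id) t
ORDER BY answer_num/show_num DESC
LIMIT 1;

Reasoning:

1. In the subquery, group by question and count how many times each question was shown and how many times it was answered. (Do not group by user: the table in the problem is only a sample, the same question may be shown to many users and not all of them answer, and the object of study is the question.)

2. In the outer query, do the usual sort descending and take the top row.

Method 2: simplified version

SELECT question_id AS survey_log
FROM
survey_log
GROUP BY question_id
ORDER BY COUNT(answer_id)/COUNT(IF(action='show',1,NULL)) DESC
LIMIT 1;

Reasoning: answer_id is only populated when the question was answered, and count() does not count NULL values.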

579. Find Cumulative Salary of an Employee[H]

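1. Table information

The Employee table holds the salary information in a year.

2. Problem statement

For each employee, find the cumulative salary over three months, excluding the most recent month. Order by employee id ascending and month descending.

Write a SQL to get the cumulative sum of an employee's salary over a period of 3 months but exclude the most recent month.

The result should be displayed by 'Id' ascending, and then by 'Month' descending.

3. Reference SQL

Method 1: window functions sum() and dense_rank()

SELECT Id,MONTH,cum_sum AS Salary
FROM
    (
    SELECT *,dense_rank ( ) over ( PARTITION BY id ORDER BY cum_sum DESC ) AS ranking
    FROM
        ( SELECT id, MONTH, SUM( Salary ) over ( PARTITION BY Id ORDER BY month ) AS cum_sum FROM employee_579 ) t1
    ) t2
WHERE
    ranking <> 1
ORDER BY Id,MONTH DESC;

Reasoning: (partition, aggregate, rank, filter)

1. Subquery t1: the windowed sum() partitions by employee id and accumulates salary in ascending month order, aliased as cum_sum.

2. To drop the cumulative row of the last (most recent) month, subquery t2 ranks cum_sum in descending order with dense_rank(), so the last month's row gets rank 1.

3. One more query layer filters out the last month's row (ranking<>1), and the final select displays what is needed.

(PS: if the most recent month's salary is 0, dense_rank() produces two rows with rank 1, so row_number() is probably the better choice. I have not tried it, but the logic should hold! Note also that this running sum covers every earlier month, not only the latest three, so it matches the three-month requirement only when an employee has at most four months of records; a window frame such as RANGE BETWEEN 2 PRECEDING AND CURRENT ROW would limit the sum to three months.)

If that looks too ugly, it can be tidied up with a with ... as statement:

WITH s AS
(SELECT Id, month, Salary,
Sum(Salary) OVER (PARTITION BY Id ORDER BY Month) as SumSal,
ROW_NUMBER() OVER (PARTITION BY id ORDER BY month DESC) rn
FROM employee_579)

SELECT Id,Month,SumSal as Salary
FROM s
WHERE rn > 1
ORDER BY Id,Month DESC;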

Method 2: official answer

SELECT E1.id, E1.month,
(IFNULL(E1.salary, 0) + IFNULL(E2.salary, 0) + IFNULL(E3.salary, 0)) AS Salary
FROM
(
SELECT id, MAX(month) AS month FROM Employee
GROUP BY id
HAVING COUNT(*) > 1
) AS maxmonth
LEFT JOIN Employee AS E1
ON (maxmonth.id = E1.id AND maxmonth.month > E1.month)
LEFT JOIN Employee AS E2
ON (E2.id = E1.id AND E2.month = E1.month - 1)
LEFT JOIN Employee AS E3
ON (E3.id = E1.id AND E3.month = E1.month - 2)
ORDER BY id ASC, month DESC;

Reasoning:

1. Group to find each employee's largest (most recent) month; employees with only one month are dropped by having, because the most recent month is not counted.

2. The condition of the first left join filters out that most recent month.

3. The later left joins prepare the accumulation, producing a staircase-shaped table:

  E1.salary    E2.salary    E3.salary

  month 1      NULL         NULL

  month 2      month 1      NULL

  month 3      month 2      month 1

  .........

The sum can then be computed as (IFNULL(E1.salary, 0) + IFNULL(E2.salary, 0) + IFNULL(E3.salary, 0)) AS Salary. Really clever, though I could not have written it myself.

580. Count Student Number in Departments[M]

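1. Table information

Table 1: Student

Table 2: Department

2. Problem statement

Count the students in each department, listing every department even if it has no students. Order by student count descending, then by department name ascending.

3. Reference SQL

SELECT a.dept_name, COUNT(b.student_id) AS student_number FROM department AS a
LEFT JOIN student AS b
ON a.dept_id = b.dept_id
GROUP BY a.dept_name
ORDER BY student_number DESC, a.dept_name;

Reasoning: with a left join, mind which table is the main (left) table and which is the joined one; when grouping, keep the grouping column consistent with the selected column.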

584. Find Customer Referee[E]

1. Table information

The customer table holds the customer id, the customer name, and the referee id.

Given a table customer holding customers information and the referee.

2. Problem statement

Find the names of the customers who were not referred by customer 2.

Write a query to return the list of customers NOT referred by the person with id '2'.

For the sample data above, the result is:

3. Reference SQL

Method 1: subquery

SELECT name FROM customer WHERE
id NOT IN
(SELECT id FROM customer WHERE referee_id = 2);

(ps: if the subquery contains a clause such as limit, remember to wrap it in one more query layer)

Method 2: OR IS NULL

SELECT name FROM customer
WHERE referee_id <> 2 OR referee_id IS NULL;

(PS: because of SQL's three-valued logic, a condition of just WHERE referee_id <> 2 does not return the customers whose referee_id is null. Writing the condition as referee_id = NULL is equally wrong, because null must be tested with IS NULL/IS NOT NULL.)

Method 3: IFNULL() function

SELECT name FROM customer
WHERE IFNULL(referee_id, 0) <> 2;

PS: on the various null-handling functions: http://www.javashuo.com/article/p-cudkdsya-ba.html

Method 4: coalesce(), which checks a chain of values for NULL

SELECT name FROM customer
WHERE COALESCE(referee_id, 0) <> 2;

PS: about this function: https://blog.csdn.net/weixin_38750084/article/details/83034294
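The three-valued logic behind methods 2 to 4 can be seen directly on a small table. This is a sketch that assumes SQLite through Python's sqlite3 (it treats NULL comparisons the same way and also provides IFNULL and COALESCE); the sample rows and the helper name not_referred_by_2 are hypothetical.

```python
import sqlite3

# Hypothetical sample rows: (id, name, referee_id); None is stored as NULL.
CUSTOMERS = [
    (1, "Will", None),
    (2, "Jane", None),
    (3, "Alex", 2),
    (4, "Bill", None),
    (5, "Zack", 1),
    (6, "Mark", 2),
]


def not_referred_by_2(condition):
    """Return customer names matching the given WHERE condition, ordered by id."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, referee_id INTEGER)"
    )
    conn.executemany("INSERT INTO customer VALUES (?, ?, ?)", CUSTOMERS)
    rows = conn.execute(
        f"SELECT name FROM customer WHERE {condition} ORDER BY id"
    ).fetchall()
    conn.close()
    return [r[0] for r in rows]


if __name__ == "__main__":
    # NULL <> 2 evaluates to NULL, not true, so those rows are dropped.
    print(not_referred_by_2("referee_id <> 2"))
    print(not_referred_by_2("referee_id <> 2 OR referee_id IS NULL"))
    print(not_referred_by_2("IFNULL(referee_id, 0) <> 2"))
    print(not_referred_by_2("COALESCE(referee_id, 0) <> 2"))
    # = NULL is never true, so this matches nothing.
    print(not_referred_by_2("referee_id = NULL"))
```

The bare <> comparison returns only the customer with a non-null referee other than 2, while the other three forms also return everyone without a referee.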

585. Investments in 2016[M]

1. Table information

2. Problem statement

3. Reference SQL

586. Customer Placing the Largest Number of Orders[E]

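1. Table information

The orders table in the figure below contains the order number, customer number, order date, required date, shipped date, status, and comment.

2. Problem statement

Find the customer who placed the most orders and output customer_number. Note: the result is a single value; assume the maximum is unique.

3. Reference SQL

Method 1:

SELECT
    customer_number
FROM
    orders
GROUP BY
    customer_number
ORDER BY
    COUNT( * ) DESC
    LIMIT 1;

Reasoning:

After grouping, an aggregate function can be used directly in order by for sorting, which saves a lot of SQL.

Method 2:

SELECT customer_number FROM orders
GROUP BY customer_number
HAVING COUNT(customer_number) >= ALL
(SELECT COUNT(customer_number) FROM orders GROUP BY customer_number);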

595. Big Countries[e]

1. Table information

The World table holds each country's name, continent, area, population, and GDP.

2. Problem statement

Find the countries with an area over 3 million square km or a population over 25 million, and show their population and area.

A country is big if it has an area of bigger than 3 million square km or a population of more than 25 million.

Write a SQL solution to output big countries' name, population and area.

For example, according to the above table, we should output:

3. Reference SQL

SELECT name, population, area FROM World
WHERE area > 3000000
OR population > 25000000;

596. Classes More Than 5 Students[E]

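1. Table information

The courses table holds student ids and class names.

2. Problem statement

List the classes that have at least 5 students. Note: each student counts only once.

Please list out all classes which have more than or equal to 5 students.

Note: The students should not be counted duplicate in each course.

For example, the table:

3. Reference SQL

Method 1:

select class from courses
group by class
having count(distinct student)>=5;

Reasoning:

Filter the qualifying groups (classes) directly with an aggregate function in having.

Method 2: subquery

select class
from
(
    select count(distinct student) as num,class
    from courses
    group by class
) t
where num>=5;

Notes:

1. where cannot filter on an aggregate function directly; the aggregate needs the alias num in the subquery.

2. The derived table from the subquery needs the alias t.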

597. Friend Requests I: Overall Acceptance Rate[E]

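1. Table information

In social network like Facebook or Twitter, people send friend requests and accept others’ requests as well. Now given two tables as below:

Table 1: friend_request (friend requests)

Table 2: request_accepted (accepted requests)

2. Problem statement

Find the acceptance rate, rounded to two decimal places.

Write a query to find the overall acceptance rate of requests rounded to 2 decimals, which is the number of acceptance divides the number of requests.

Notes:

  • Accepted requests do not necessarily come from the friend_request table. So simply count each table separately and divide: acceptance rate = total accepted requests / total requests.
  • A requester may send more than one request to the same person, and the same request may be accepted more than once. Duplicate records must therefore be removed.
  • If there are no requests at all, return 0.00.

For the two tables above, the result should be 0.80.

3. Reference SQL

SELECT
    ROUND(
        IFNULL(
            (SELECT COUNT(DISTINCT requester_id,accepter_id) FROM request_accepted597)
            /
            (SELECT COUNT(DISTINCT sender_id,send_to_id) FROM friend_request597)
        ,0)
    ,2)
    AS accept_rate;

Reasoning:

1. Use two subqueries to count, in their respective tables, the total accepted requests and the total requests sent.

2. Take the ratio of the two.

(PS: when counting distinct rows, a row is a true duplicate only if both columns match; usually count() is given just one column for de-duplication)

group by can also be used to see which column pairs are duplicated, for example:

SELECT requester_id,accepter_id,COUNT(*) FROM request_accepted597 GROUP BY requester_id,accepter_id;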

601. Human Traffic of Stadium[H]

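1. Table information

A city built a new stadium that many people visit every day. The table below records an auto-increment id, the visit date, and the number of visitors.

X city built a new stadium, each day many people visit it and the stats are saved as these columns: id, visit_date, people

2. Problem statement

Find the records where at least three consecutive rows each have at least 100 visitors.

Please write a query to display the records which have 3 or more consecutive rows and the amount of people more than 100(inclusive).

Note: Each day only have one row record, and the dates are increasing with id increasing.

3. Reference SQL

Method 1:

SELECT s1.* FROM
stadium601 s1
LEFT JOIN stadium601 s2
ON s1.id=s2.id-1
LEFT JOIN stadium601 s3
ON s2.id=s3.id-1
WHERE s1.people>=100
AND (s2.people>=100 OR s2.people IS NULL)
AND (s3.people>=100 OR s3.people IS NULL)
ORDER BY s1.id;

Reasoning:

1. Two consecutive left joins with the condition id-1 effectively shift the table up by one row each time, stitching three days of records into one row and producing the table below:

2. Filtering: note that in the joined table s1's people column can never be null. If s3's people is null the row is the second-to-last one, and if s2's people is null the row is the last one; this is why a left join is used. So the filter has to allow for s2.people and s3.people being null.

3. Return the s1.* rows.

Caveat: this query only looks at the rows after s1. It misses the later rows of a qualifying run when the row that follows them drops below 100, and the IS NULL branches let the last one or two rows of the table through even when they are not part of a run of three. Method 2 handles these cases.

(PS: for this kind of consecutive-occurrence problem, the join condition should shift the table up by one row; problem 180 is similar. The real business scenario is roughly finding a time span that satisfies some condition, such as a user's active period or a store's busy period)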

Method 2:

WITH
tmp AS (
SELECT a.visit_date AS date1,
b.visit_date AS date2,
c.visit_date AS date3
FROM stadium601 AS a
LEFT JOIN stadium601 AS b
ON b.id = a.id + 1
LEFT JOIN stadium601 AS c
ON c.id = a.id + 2
WHERE a.people >= 100
AND b.people >= 100
AND c.people >= 100
),

tmp1 AS (
SELECT date1 AS total_date FROM tmp
UNION
SELECT date2 AS total_date FROM tmp
UNION
SELECT date3 AS total_date FROM tmp
)

SELECT * FROM stadium601
WHERE visit_date IN
(SELECT * FROM tmp1);

Reasoning:

1. Note the effect of id-1 versus id+1 in the join condition: one shifts the table up and the other shifts it down. Do not take it for granted that id+1 means shifting down!

2. Note how union is used:

SELECT column_list
UNION [DISTINCT | ALL]
SELECT column_list
UNION [DISTINCT | ALL]
SELECT column_list

  The default is distinct, which de-duplicates the result set, whereas all keeps duplicates.
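As a cross-check for this problem, here is a different technique from the two methods above: the common "gaps and islands" trick, where id minus a row number is constant within a run of consecutive qualifying ids. It is not the author's method, just a sketch under assumptions: SQLite 3.25+ via Python's sqlite3 instead of MySQL, ids that are consecutive integers starting at 1, hypothetical visitor counts, and a made-up helper name long_runs.

```python
import sqlite3


def long_runs(people_by_id):
    """people_by_id: visitor counts for ids 1, 2, 3, ... (consecutive ids assumed).

    Returns the ids that belong to a run of at least three consecutive
    rows with people >= 100.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE stadium (id INTEGER PRIMARY KEY, people INTEGER)")
    conn.executemany(
        "INSERT INTO stadium VALUES (?, ?)", list(enumerate(people_by_id, start=1))
    )
    rows = conn.execute(
        """
        SELECT id
        FROM (
            SELECT id, COUNT(*) OVER (PARTITION BY grp) AS run_len
            FROM (
                -- Within a run of consecutive ids, id - row_number is constant.
                SELECT id, id - ROW_NUMBER() OVER (ORDER BY id) AS grp
                FROM stadium
                WHERE people >= 100
            ) q
        ) r
        WHERE run_len >= 3
        ORDER BY id
        """
    ).fetchall()
    conn.close()
    return [r[0] for r in rows]


if __name__ == "__main__":
    print(long_runs([10, 109, 150, 99, 145, 1455, 199, 188]))
    print(long_runs([200, 50, 300, 300]))
```

The second call is the kind of input where a trailing pair of busy days must not be reported, since they do not form a run of three.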

602. Friend Requests II: Who Has the Most Friends[M]

1. Table information

(The cnblogs editor is lagging badly, probably because there are too many images.)

2. Problem statement

3. Reference SQL
