New Feature Deep Dive | MySQL 8.0 Window Functions Explained

Date: 2019-11-16

Original author: 楊濤濤

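Background

For a long time, MySQL offered only summary-style features built on aggregate functions such as MAX and AVG. At the SQL layer there was no way to process the individual rows of each group without collapsing them. MySQL does expose a UDF interface, so you can write your own UDF in C, but that makes the feature considerably harder to implement.

This kind of per-row processing within each group is called a window function; some databases call it an analytic function.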
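To make the distinction concrete, here is a minimal sketch. It assumes MySQL 8.0 or later and the score table (columns sid, cid, score) defined later in this article; the aliases avg_score and course_avg are illustrative names only. The first query collapses each course into a single summary row, while the second keeps every row and attaches the course average to it.

```sql
-- Aggregate function: one output row per course.
SELECT cid, AVG(score) AS avg_score
FROM score
GROUP BY cid;

-- Window function: every input row is kept, and each row
-- also carries the average of the course it belongs to.
SELECT sid, cid, score,
       AVG(score) OVER (PARTITION BY cid) AS course_avg
FROM score;
```

With the 50-row test data, the first query would return 5 rows and the second 50.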

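Before MySQL 8.0, getting this kind of result required one of the following approaches:

1. Session variables

2. Combining with the group_concat function

3. Writing your own stored routines

Next, we'll use the classic student/course/score schema to demonstrate window functions.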

Preparation

Student table

mysql> show create table student \G
*************************** 1. row ***************************
Table: student
Create Table: CREATE TABLE student (
sid int(10) unsigned NOT NULL,
sname varchar(64) DEFAULT NULL,
PRIMARY KEY (sid)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
1 row in set (0.00 sec)

Course table

mysql> show create table course\G
*************************** 1. row ***************************
Table: course
Create Table: CREATE TABLE `course` (
`cid` int(10) unsigned NOT NULL,
`cname` varchar(64) DEFAULT NULL,
PRIMARY KEY (`cid`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
1 row in set (0.00 sec)

Score table

mysql> show create table score\G
*************************** 1. row ***************************
Table: score
Create Table: CREATE TABLE `score` (
`sid` int(10) unsigned NOT NULL,
`cid` int(10) unsigned NOT NULL,
`score` tinyint(3) unsigned DEFAULT NULL,
PRIMARY KEY (`sid`,`cid`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
1 row in set (0.00 sec)

Test data

mysql> select * from student;
+-----------+--------------+
| sid | sname |
+-----------+--------------+
| 201910001 | 張三 |
| 201910002 | 李四 |
| 201910003 | 武松 |
| 201910004 | 潘金蓮 |
| 201910005 | 菠菜 |
| 201910006 | 楊發財 |
| 201910007 | 歐陽修 |
| 201910008 | 郭靖 |
| 201910009 | 黃蓉 |
| 201910010 | 東方不敗 |
+-----------+--------------+
10 rows in set (0.00 sec)

mysql> select * from score;
+-----------+----------+-------+
| sid | cid | score |
+-----------+----------+-------+
| 201910001 | 20192001 | 50 |
| 201910001 | 20192002 | 88 |
| 201910001 | 20192003 | 54 |
| 201910001 | 20192004 | 43 |
| 201910001 | 20192005 | 89 |
| 201910002 | 20192001 | 79 |
| 201910002 | 20192002 | 97 |
| 201910002 | 20192003 | 82 |
| 201910002 | 20192004 | 85 |
| 201910002 | 20192005 | 80 |
| 201910003 | 20192001 | 48 |
| 201910003 | 20192002 | 98 |
| 201910003 | 20192003 | 47 |
| 201910003 | 20192004 | 41 |
| 201910003 | 20192005 | 34 |
| 201910004 | 20192001 | 81 |
| 201910004 | 20192002 | 69 |
| 201910004 | 20192003 | 67 |
| 201910004 | 20192004 | 99 |
| 201910004 | 20192005 | 61 |
| 201910005 | 20192001 | 40 |
| 201910005 | 20192002 | 52 |
| 201910005 | 20192003 | 39 |
| 201910005 | 20192004 | 74 |
| 201910005 | 20192005 | 86 |
| 201910006 | 20192001 | 42 |
| 201910006 | 20192002 | 52 |
| 201910006 | 20192003 | 36 |
| 201910006 | 20192004 | 58 |
| 201910006 | 20192005 | 84 |
| 201910007 | 20192001 | 79 |
| 201910007 | 20192002 | 43 |
| 201910007 | 20192003 | 79 |
| 201910007 | 20192004 | 98 |
| 201910007 | 20192005 | 88 |
| 201910008 | 20192001 | 45 |
| 201910008 | 20192002 | 65 |
| 201910008 | 20192003 | 90 |
| 201910008 | 20192004 | 89 |
| 201910008 | 20192005 | 74 |
| 201910009 | 20192001 | 73 |
| 201910009 | 20192002 | 42 |
| 201910009 | 20192003 | 95 |
| 201910009 | 20192004 | 46 |
| 201910009 | 20192005 | 45 |
| 201910010 | 20192001 | 58 |
| 201910010 | 20192002 | 52 |
| 201910010 | 20192003 | 55 |
| 201910010 | 20192004 | 87 |
| 201910010 | 20192005 | 36 |
+-----------+----------+-------+
50 rows in set (0.00 sec)

mysql> select * from course;
+----------+------------+
| cid      | cname      |
+----------+------------+
| 20192001 | mysql      |
| 20192002 | oracle     |
| 20192003 | postgresql |
| 20192004 | mongodb    |
| 20192005 | dble       |
+----------+------------+
5 rows in set (0.00 sec)

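Before MySQL 8.0

Suppose we want the top three students by score in each course. Here are two implementations, one using session variables and one using the group_concat function:

Session variable approach

At the start of each group, reset the sequence number and the grouping field to their initial values.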

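SELECT
    b.cname,
    a.sname,
    c.score,
    c.ranking_score
FROM
    student a,
    course b,
    (
    SELECT
        c.*,
        -- Same course as the previous row: increment the counter; otherwise restart at 1.
        IF(
            @cid = c.cid,
            @rn := @rn + 1,
            @rn := 1
        ) AS ranking_score,
        -- Remember the current course for the comparison on the next row.
        @cid := c.cid AS tmpcid
    FROM
        (
        -- Sort so that each course's rows are contiguous, highest score first.
        SELECT
            *
        FROM
            score
        ORDER BY cid, score DESC
        ) c,
        (
        -- Initialize the session variables.
        SELECT
            @rn := 0 rn,
            @cid := ''
        ) initialize_table
    ) c
WHERE a.sid = c.sid
AND b.cid = c.cid
AND c.ranking_score <= 3
ORDER BY b.cname, c.ranking_score;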

+------------+-----------+-------+---------------+
| cname | sname | score | ranking_score |
+------------+-----------+-------+---------------+
| dble | 張三 | 89 | 1 |
| dble | 歐陽修 | 88 | 2 |
| dble | 菠菜 | 86 | 3 |
| mongodb | 潘金蓮 | 99 | 1 |
| mongodb | 歐陽修 | 98 | 2 |
| mongodb | 郭靖 | 89 | 3 |
| mysql | 李四 | 100 | 1 |
| mysql | 潘金蓮 | 81 | 2 |
| mysql | 歐陽修 | 79 | 3 |
| oracle | 武松 | 98 | 1 |
| oracle | 李四 | 97 | 2 |
| oracle | 張三 | 88 | 3 |
| postgresql | 黃蓉 | 95 | 1 |
| postgresql | 郭靖 | 90 | 2 |
| postgresql | 李四 | 82 | 3 |
+------------+-----------+-------+---------------+
15 rows in set, 5 warnings (0.01 sec)

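group_concat function approach

Use the built-in FIND_IN_SET function to return each score's position in the concatenated list, and use that position as the rank.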

SELECT
    *
FROM
     (
     SELECT
       b.cname,
       a.sname,
       c.score,
       FIND_IN_SET(c.score, d.gp) score_ranking
     FROM
       student a,
       course b,
       score c,
      (
       SELECT
          cid,
          GROUP_CONCAT(
             score
             ORDER BY score DESC SEPARATOR ','
           ) gp
       FROM
          score
          GROUP BY cid
          ORDER BY score DESC
       ) d
   WHERE a.sid = c.sid
   AND b.cid = c.cid
   AND c.cid = d.cid
   ORDER BY d.cid,
   score_ranking
   ) ytt
WHERE score_ranking <= 3;

+------------+-----------+-------+---------------+
| cname | sname | score | score_ranking |
+------------+-----------+-------+---------------+
| dble       | 張三   | 89   | 1 |
| dble       | 歐陽修 | 88   | 2 |
| dble       | 菠菜   | 86   | 3 |
| mongodb    | 潘金蓮 | 99   | 1 |
| mongodb    | 歐陽修 | 98   | 2 |
| mongodb    | 郭靖   | 89   | 3 |
| mysql      | 李四   | 100  | 1 |
| mysql      | 潘金蓮 | 81   | 2 |
| mysql      | 歐陽修 | 79   | 3 |
| oracle     | 武松   | 98   | 1 |
| oracle     | 李四   | 97   | 2 |
| oracle     | 張三   | 88   | 3 |
| postgresql | 黃蓉   | 95   | 1 |
| postgresql | 郭靖   | 90   | 2 |
| postgresql | 李四   | 82   | 3 |
+------------+-----------+-------+---------------+
15 rows in set (0.00 sec)

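MySQL 8.0 window functions

MySQL 8.0 provides native window function support, with syntax similar to that of most other databases. Taking the same example as before:

Use row_number() over () to retrieve the ranking directly.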

mysql> 
SELECT
    *
FROM
    (
     SELECT
       b.cname,
       a.sname,
       c.score,
       row_number() over (
         PARTITION BY b.cname
         ORDER BY c.score DESC
       ) score_rank
     FROM
       student AS a,
       course AS b,
       score AS c
     WHERE a.sid = c.sid
     AND b.cid = c.cid
     ) ytt
WHERE score_rank <= 3;

+------------+-----------+-------+------------+
| cname | sname | score | score_rank |
+------------+-----------+-------+------------+
| dble       | 張三   | 89  | 1 |
| dble       | 歐陽修 | 88  | 2 |
| dble       | 菠菜   | 86  | 3 |
| mongodb    | 潘金蓮 | 99  | 1 |
| mongodb    | 歐陽修 | 98  | 2 |
| mongodb    | 郭靖   | 89  | 3 |
| mysql      | 李四   | 100 | 1 |
| mysql      | 潘金蓮 | 81  | 2 |
| mysql      | 歐陽修 | 79  | 3 |
| oracle     | 武松   | 98  | 1 |
| oracle     | 李四   | 97  | 2 |
| oracle     | 張三   | 88  | 3 |
| postgresql | 黃蓉   | 95  | 1 |
| postgresql | 郭靖   | 90  | 2 |
| postgresql | 李四   | 82  | 3 |
+------------+-----------+-------+------------+
15 rows in set (0.00 sec)

Next, let's list the two lowest-scoring students among those who failed (scored below 60) in the MySQL and DBLE courses.

mysql> 
SELECT
    *
FROM
    (
    SELECT
       b.cname,
       a.sname,
       c.score,
       row_number () over (
       PARTITION BY b.cid
       ORDER BY c.score ASC
       ) score_ranking
    FROM
       student AS a,
       course AS b,
       score AS c
    WHERE a.sid = c.sid
    AND b.cid = c.cid
    AND b.cid IN (20192005, 20192001)
    AND c.score < 60
    ) ytt
WHERE score_ranking < 3;

+-------+--------------+-------+---------------+
| cname | sname | score | score_ranking |
+-------+--------------+-------+---------------+
| mysql | 菠菜     | 40  | 1 |
| mysql | 楊發財   | 42  | 2 |
| dble  | 武松    | 34  | 1 |
| dble  | 東方不敗 | 36  | 2 |
+-------+--------------+-------+---------------+
4 rows in set (0.00 sec)

So far we have only demonstrated row_number() over(). Readers who are interested can try the other window functions themselves; they are used in much the same way.
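As one possible starting point, the sketch below puts ROW_NUMBER() next to two other ranking functions, RANK() and DENSE_RANK(), which differ only in how they treat ties. It assumes the same student, course, and score tables used throughout, restricted to the oracle course, where the test data listed earlier has three students tied at 52. The window name w and the aliases row_num, rnk, and dense_rnk are illustrative choices (rank itself is a reserved word in MySQL 8.0, so it is avoided as an alias).

```sql
SELECT
    b.cname,
    a.sname,
    c.score,
    ROW_NUMBER() OVER w AS row_num,    -- always distinct; order among ties is not guaranteed
    RANK()       OVER w AS rnk,        -- ties share a rank, and the following ranks are skipped
    DENSE_RANK() OVER w AS dense_rnk   -- ties share a rank, with no gaps afterwards
FROM
    student AS a,
    course AS b,
    score AS c
WHERE a.sid = c.sid
AND b.cid = c.cid
AND b.cname = 'oracle'
-- A named window avoids repeating the same OVER (...) definition three times.
WINDOW w AS (PARTITION BY b.cid ORDER BY c.score DESC);
```

Based on the listed test data, the three students tied at 52 should all receive rank 6 from both RANK() and DENSE_RANK(), and the next student (score 43) should be ranked 9 by RANK() but 7 by DENSE_RANK(), whereas ROW_NUMBER() simply numbers the rows 1 through 10.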