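This translation mainly covers how to store nested set data in MySQL. Along the way I have added some of my own understanding and cut passages I considered unnecessary.
The article is mostly about the nested set model, so the adjacency list model is not the focus here and is only covered briefly.
This may be the address of the original article, though I am not sure whether it really is the original.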
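What is hierarchical data?
Hierarchical data resembles a tree structure: every node except the root has exactly one parent, and every node except the leaves has one or more children.
So how do we handle hierarchical data in MySQL?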
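The original article introduces two models for hierarchical data: the adjacency list model and the nested set model.
First, create a test table and load the test data: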
```
CREATE TABLE category(
    category_id INT AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(20) NOT NULL,
    parent INT DEFAULT NULL
);

INSERT INTO category VALUES
    (1,'ELECTRONICS',NULL), (2,'TELEVISIONS',1), (3,'TUBE',2), (4,'LCD',2),
    (5,'PLASMA',2), (6,'PORTABLE ELECTRONICS',1), (7,'MP3 PLAYERS',6),
    (8,'FLASH',7), (9,'CD PLAYERS',6), (10,'2 WAY RADIOS',6);

SELECT * FROM category ORDER BY category_id;

+-------------+----------------------+--------+
| category_id | name                 | parent |
+-------------+----------------------+--------+
|           1 | ELECTRONICS          |   NULL |
|           2 | TELEVISIONS          |      1 |
|           3 | TUBE                 |      2 |
|           4 | LCD                  |      2 |
|           5 | PLASMA               |      2 |
|           6 | PORTABLE ELECTRONICS |      1 |
|           7 | MP3 PLAYERS          |      6 |
|           8 | FLASH                |      7 |
|           9 | CD PLAYERS           |      6 |
|          10 | 2 WAY RADIOS         |      6 |
+-------------+----------------------+--------+
10 rows in set (0.00 sec)
```
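In the adjacency list model, every row has a parent column that stores its parent node. If the current node is the root, its parent is NULL.
To traverse the data, you can use recursion to query the whole tree: start from the root and keep looking up child nodes (parent -> child -> parent -> child).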
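The recursive walk described above can be sketched outside the database. This is only an illustration in Python, not something from the original article: the rows are copied from the `category` table, and the names `ROWS` and `walk` are made up for the example.

```python
from collections import defaultdict

# Rows copied from the category table: (category_id, name, parent).
ROWS = [
    (1, "ELECTRONICS", None), (2, "TELEVISIONS", 1), (3, "TUBE", 2),
    (4, "LCD", 2), (5, "PLASMA", 2), (6, "PORTABLE ELECTRONICS", 1),
    (7, "MP3 PLAYERS", 6), (8, "FLASH", 7), (9, "CD PLAYERS", 6),
    (10, "2 WAY RADIOS", 6),
]


def walk(rows):
    """Return (depth, name) pairs in parent-before-child order."""
    children = defaultdict(list)  # parent id -> list of (id, name)
    for cid, name, parent in rows:
        children[parent].append((cid, name))

    result = []

    def visit(parent_id, depth):
        for cid, name in children.get(parent_id, []):
            result.append((depth, name))
            visit(cid, depth + 1)  # recurse into this node's children

    visit(None, 0)  # the root is the row whose parent is NULL (None here)
    return result


tree = walk(ROWS)
for depth, name in tree:
    print("  " * depth + name)
```

Each level of the tree costs one more round of lookups, which is the same cost the SQL version pays with one extra join per level.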
A common task is retrieving the paths through the hierarchy:
```
-- Retrieve the full tree: one self-join per level below the root
SELECT t1.name AS lev1, t2.name AS lev2, t3.name AS lev3, t4.name AS lev4
FROM category AS t1
LEFT JOIN category AS t2 ON t2.parent = t1.category_id
LEFT JOIN category AS t3 ON t3.parent = t2.category_id
LEFT JOIN category AS t4 ON t4.parent = t3.category_id
WHERE t1.name = 'ELECTRONICS';

+-------------+----------------------+--------------+-------+
| lev1        | lev2                 | lev3         | lev4  |
+-------------+----------------------+--------------+-------+
| ELECTRONICS | TELEVISIONS          | TUBE         | NULL  |
| ELECTRONICS | TELEVISIONS          | LCD          | NULL  |
| ELECTRONICS | TELEVISIONS          | PLASMA       | NULL  |
| ELECTRONICS | PORTABLE ELECTRONICS | MP3 PLAYERS  | FLASH |
| ELECTRONICS | PORTABLE ELECTRONICS | CD PLAYERS   | NULL  |
| ELECTRONICS | PORTABLE ELECTRONICS | 2 WAY RADIOS | NULL  |
+-------------+----------------------+--------------+-------+
6 rows in set (0.00 sec)
```
```
-- Find all leaf nodes: rows that no other row names as its parent
SELECT t1.name
FROM category AS t1
LEFT JOIN category AS t2 ON t1.category_id = t2.parent
WHERE t2.category_id IS NULL;

+--------------+
| name         |
+--------------+
| TUBE         |
| LCD          |
| PLASMA       |
| FLASH        |
| CD PLAYERS   |
| 2 WAY RADIOS |
+--------------+
```
```
-- Retrieve a single path from the root down to FLASH
SELECT t1.name AS lev1, t2.name AS lev2, t3.name AS lev3, t4.name AS lev4
FROM category AS t1
LEFT JOIN category AS t2 ON t2.parent = t1.category_id
LEFT JOIN category AS t3 ON t3.parent = t2.category_id
LEFT JOIN category AS t4 ON t4.parent = t3.category_id
WHERE t1.name = 'ELECTRONICS' AND t4.name = 'FLASH';

+-------------+----------------------+-------------+-------+
| lev1        | lev2                 | lev3        | lev4  |
+-------------+----------------------+-------------+-------+
| ELECTRONICS | PORTABLE ELECTRONICS | MP3 PLAYERS | FLASH |
+-------------+----------------------+-------------+-------+
1 row in set (0.01 sec)
```
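When retrieving a path, every level below the first needs its own LEFT JOIN. So what happens when the depth is not known in advance, or when there are too many levels?
When deleting an intermediate node, you must also delete every node beneath it, otherwise orphaned nodes are left behind.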
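The main purpose of the original article is to introduce the nested set model, as follows.
The nested set model represents a hierarchy through set containment: each node of the hierarchy is a set (drawn as a circle), and a parent's circle contains the circles of all its children.
To represent these sets in MySQL, we define two columns, `left` and `right`, which mark the extent of a set.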
```
CREATE TABLE nested_category (
    category_id INT AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(20) NOT NULL,
    lft INT NOT NULL,
    rgt INT NOT NULL
);

INSERT INTO nested_category VALUES
    (1,'ELECTRONICS',1,20), (2,'TELEVISIONS',2,9), (3,'TUBE',3,4),
    (4,'LCD',5,6), (5,'PLASMA',7,8), (6,'PORTABLE ELECTRONICS',10,19),
    (7,'MP3 PLAYERS',11,14), (8,'FLASH',12,13), (9,'CD PLAYERS',15,16),
    (10,'2 WAY RADIOS',17,18);

SELECT * FROM nested_category ORDER BY category_id;

+-------------+----------------------+-----+-----+
| category_id | name                 | lft | rgt |
+-------------+----------------------+-----+-----+
|           1 | ELECTRONICS          |   1 |  20 |
|           2 | TELEVISIONS          |   2 |   9 |
|           3 | TUBE                 |   3 |   4 |
|           4 | LCD                  |   5 |   6 |
|           5 | PLASMA               |   7 |   8 |
|           6 | PORTABLE ELECTRONICS |  10 |  19 |
|           7 | MP3 PLAYERS          |  11 |  14 |
|           8 | FLASH                |  12 |  13 |
|           9 | CD PLAYERS           |  15 |  16 |
|          10 | 2 WAY RADIOS         |  17 |  18 |
+-------------+----------------------+-----+-----+
```
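Because `left` and `right` are reserved words in MySQL, the columns are named `lft` and `rgt` instead. Each set starts at `lft` and ends at `rgt`; these are the two boundaries of the set.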
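The same numbering also applies to a tree.
When numbering a tree, we work from left to right, one level at a time, descending into each node's children before assigning its right-hand number and moving on to the right. This approach is called the modified preorder tree traversal algorithm.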
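The numbering procedure can be written down in a few lines. The following is a minimal sketch in Python, assuming the adjacency list rows from the `category` table as input; `number_tree` and `ROWS` are names invented for the example and do not come from the original article.

```python
from collections import defaultdict

# Rows copied from the category table: (category_id, name, parent).
ROWS = [
    (1, "ELECTRONICS", None), (2, "TELEVISIONS", 1), (3, "TUBE", 2),
    (4, "LCD", 2), (5, "PLASMA", 2), (6, "PORTABLE ELECTRONICS", 1),
    (7, "MP3 PLAYERS", 6), (8, "FLASH", 7), (9, "CD PLAYERS", 6),
    (10, "2 WAY RADIOS", 6),
]


def number_tree(rows):
    """Return {name: (lft, rgt)} computed by a modified preorder traversal."""
    children = defaultdict(list)  # parent id -> list of (id, name)
    for cid, name, parent in rows:
        children[parent].append((cid, name))

    bounds = {}
    counter = 0

    def visit(cid, name):
        nonlocal counter
        counter += 1
        lft = counter  # left number is assigned on the way down
        for child_id, child_name in children.get(cid, []):
            visit(child_id, child_name)
        counter += 1
        bounds[name] = (lft, counter)  # right number on the way back up

    for cid, name in children.get(None, []):  # rows with no parent are roots
        visit(cid, name)
    return bounds


bounds = number_tree(ROWS)
for name, (lft, rgt) in sorted(bounds.items(), key=lambda item: item[1]):
    print(name, lft, rgt)
```

Run on the sample data, this reproduces the `lft` and `rgt` values stored in `nested_category`, for example (1, 20) for ELECTRONICS and (12, 13) for FLASH.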
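Because a child's lft value always lies between its parent's lft and rgt, the whole tree can be retrieved by joining each node to its parent on that range.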
```
SELECT node.name
FROM nested_category AS node,
     nested_category AS parent
WHERE node.lft BETWEEN parent.lft AND parent.rgt
  AND parent.name = 'ELECTRONICS'
ORDER BY node.lft;

+----------------------+
| name                 |
+----------------------+
| ELECTRONICS          |
| TELEVISIONS          |
| TUBE                 |
| LCD                  |
| PLASMA               |
| PORTABLE ELECTRONICS |
| MP3 PLAYERS          |
| FLASH                |
| CD PLAYERS           |
| 2 WAY RADIOS         |
+----------------------+
```
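This approach does not depend on the number of levels, and the node's rgt value never has to be examined.
Every leaf node satisfies rgt = lft + 1, so that single condition is enough to find all leaf nodes.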
```
SELECT name
FROM nested_category
WHERE rgt = lft + 1;

+--------------+
| name         |
+--------------+
| TUBE         |
| LCD          |
| PLASMA       |
| FLASH        |
| CD PLAYERS   |
| 2 WAY RADIOS |
+--------------+
```
Retrieving a single path no longer requires multiple joins.
```
-- Order by parent.lft so the path runs from the root down to the node
SELECT parent.name
FROM nested_category AS node,
     nested_category AS parent
WHERE node.lft BETWEEN parent.lft AND parent.rgt
  AND node.name = 'FLASH'
ORDER BY parent.lft;

+----------------------+
| name                 |
+----------------------+
| ELECTRONICS          |
| PORTABLE ELECTRONICS |
| MP3 PLAYERS          |
| FLASH                |
+----------------------+
```
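All three lookups so far (a whole subtree, the leaf nodes, a single path) come down to comparing intervals. The sketch below restates them in plain Python over the sample rows, purely to make the comparisons explicit; the function names `subtree`, `leaves` and `path_to` are invented for the example.

```python
# Rows copied from nested_category: (name, lft, rgt).
NESTED = [
    ("ELECTRONICS", 1, 20), ("TELEVISIONS", 2, 9), ("TUBE", 3, 4),
    ("LCD", 5, 6), ("PLASMA", 7, 8), ("PORTABLE ELECTRONICS", 10, 19),
    ("MP3 PLAYERS", 11, 14), ("FLASH", 12, 13), ("CD PLAYERS", 15, 16),
    ("2 WAY RADIOS", 17, 18),
]


def subtree(rows, root_name):
    """Names of root_name and all its descendants, ordered by lft."""
    root_lft, root_rgt = next((l, r) for n, l, r in rows if n == root_name)
    inside = [(l, n) for n, l, r in rows if root_lft <= l <= root_rgt]
    return [n for l, n in sorted(inside)]


def leaves(rows):
    """Names of all nodes without children: rgt is exactly lft + 1."""
    found = [(l, n) for n, l, r in rows if r == l + 1]
    return [n for l, n in sorted(found)]


def path_to(rows, node_name):
    """Names from the root down to node_name, ordered by the ancestors' lft."""
    node_lft = next(l for n, l, r in rows if n == node_name)
    above = [(l, n) for n, l, r in rows if l <= node_lft <= r]
    return [n for l, n in sorted(above)]


print(subtree(NESTED, "PORTABLE ELECTRONICS"))
print(leaves(NESTED))
print(path_to(NESTED, "FLASH"))
```

Note that the path is sorted by the ancestors' `lft`, which is what yields the root-to-node order.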
The depth of each node can be obtained by counting its ancestors with `COUNT` and `GROUP BY`.
```
-- Note: with ONLY_FULL_GROUP_BY enabled (the default since MySQL 5.7.5),
-- node.lft must also appear in the GROUP BY list.
SELECT node.name, (COUNT(parent.name) - 1) AS depth
FROM nested_category AS node,
     nested_category AS parent
WHERE node.lft BETWEEN parent.lft AND parent.rgt
GROUP BY node.name
ORDER BY node.lft;

+----------------------+-------+
| name                 | depth |
+----------------------+-------+
| ELECTRONICS          |     0 |
| TELEVISIONS          |     1 |
| TUBE                 |     2 |
| LCD                  |     2 |
| PLASMA               |     2 |
| PORTABLE ELECTRONICS |     1 |
| MP3 PLAYERS          |     2 |
| FLASH                |     3 |
| CD PLAYERS           |     2 |
| 2 WAY RADIOS         |     2 |
+----------------------+-------+
```
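To try the depth query without a MySQL server, here is a small sketch that runs it against an in-memory SQLite database through Python's standard `sqlite3` module. SQLite stands in for MySQL here, which is my substitution and not part of the original article, and the query groups by the node's key columns so that it also satisfies strict GROUP BY rules.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE nested_category ("
    " category_id INTEGER PRIMARY KEY, name TEXT NOT NULL,"
    " lft INTEGER NOT NULL, rgt INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO nested_category VALUES (?, ?, ?, ?)",
    [
        (1, "ELECTRONICS", 1, 20), (2, "TELEVISIONS", 2, 9), (3, "TUBE", 3, 4),
        (4, "LCD", 5, 6), (5, "PLASMA", 7, 8),
        (6, "PORTABLE ELECTRONICS", 10, 19), (7, "MP3 PLAYERS", 11, 14),
        (8, "FLASH", 12, 13), (9, "CD PLAYERS", 15, 16),
        (10, "2 WAY RADIOS", 17, 18),
    ],
)

# Each node joins to every row whose interval contains it (itself included),
# so the number of matches minus one is the node's depth.
depth_rows = conn.execute(
    """
    SELECT node.name, COUNT(parent.category_id) - 1 AS depth
    FROM nested_category AS node
    JOIN nested_category AS parent
      ON node.lft BETWEEN parent.lft AND parent.rgt
    GROUP BY node.category_id, node.name, node.lft
    ORDER BY node.lft
    """
).fetchall()

for name, depth in depth_rows:
    print(name, depth)
```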
It is even possible to produce an indented rendering of the hierarchy:
```
SELECT CONCAT( REPEAT(' ', COUNT(parent.name) - 1), node.name) AS name
FROM nested_category AS node,
     nested_category AS parent
WHERE node.lft BETWEEN parent.lft AND parent.rgt
GROUP BY node.name
ORDER BY node.lft;

+-----------------------+
| name                  |
+-----------------------+
| ELECTRONICS           |
|  TELEVISIONS          |
|   TUBE                |
|   LCD                 |
|   PLASMA              |
|  PORTABLE ELECTRONICS |
|   MP3 PLAYERS         |
|    FLASH              |
|   CD PLAYERS          |
|   2 WAY RADIOS        |
+-----------------------+
```
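To get depth information for a sub-tree, we cannot restrict either the node or the parent table in the self-join, because that would corrupt the results. Instead, a third self-join is added, together with a sub-query that determines the depth used as the starting point of the sub-tree.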
```
SELECT node.name, (COUNT(parent.name) - (sub_tree.depth + 1)) AS depth
FROM nested_category AS node,
     nested_category AS parent,
     nested_category AS sub_parent,
     (
         SELECT node.name, (COUNT(parent.name) - 1) AS depth
         FROM nested_category AS node,
              nested_category AS parent
         WHERE node.lft BETWEEN parent.lft AND parent.rgt
           AND node.name = 'PORTABLE ELECTRONICS'
         GROUP BY node.name
         ORDER BY node.lft
     ) AS sub_tree
WHERE node.lft BETWEEN parent.lft AND parent.rgt
  AND node.lft BETWEEN sub_parent.lft AND sub_parent.rgt
  AND sub_parent.name = sub_tree.name
GROUP BY node.name
ORDER BY node.lft;

+----------------------+-------+
| name                 | depth |
+----------------------+-------+
| PORTABLE ELECTRONICS |     0 |
| MP3 PLAYERS          |     1 |
| FLASH                |     2 |
| CD PLAYERS           |     1 |
| 2 WAY RADIOS         |     1 |
+----------------------+-------+
```
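Consider this scenario: when a user clicks a category on an electronics retailer's website, the products of that category are shown together with its immediate sub-categories, but not the entire tree of categories beneath it.
To limit how many levels are shown, add a `HAVING` clause: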
```
SELECT node.name, (COUNT(parent.name) - (sub_tree.depth + 1)) AS depth
FROM nested_category AS node,
     nested_category AS parent,
     nested_category AS sub_parent,
     (
         SELECT node.name, (COUNT(parent.name) - 1) AS depth
         FROM nested_category AS node,
              nested_category AS parent
         WHERE node.lft BETWEEN parent.lft AND parent.rgt
           AND node.name = 'PORTABLE ELECTRONICS'
         GROUP BY node.name
         ORDER BY node.lft
     ) AS sub_tree
WHERE node.lft BETWEEN parent.lft AND parent.rgt
  AND node.lft BETWEEN sub_parent.lft AND sub_parent.rgt
  AND sub_parent.name = sub_tree.name
GROUP BY node.name
HAVING depth <= 1
ORDER BY node.lft;

+----------------------+-------+
| name                 | depth |
+----------------------+-------+
| PORTABLE ELECTRONICS |     0 |
| MP3 PLAYERS          |     1 |
| CD PLAYERS           |     1 |
| 2 WAY RADIOS         |     1 |
+----------------------+-------+
```
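The sections above covered how to query the tree; so how do we add a new node?
To add a new node between TELEVISIONS and PORTABLE ELECTRONICS, the new node gets lft and rgt values of 10 and 11, and every lft or rgt value to its right (every value greater than 9, the rgt of TELEVISIONS) is increased by 2, as shown in the figure above.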
```
LOCK TABLE nested_category WRITE;

SELECT @myRight := rgt FROM nested_category
WHERE name = 'TELEVISIONS';

-- Open a gap of width 2 to the right of TELEVISIONS
UPDATE nested_category SET rgt = rgt + 2 WHERE rgt > @myRight;
UPDATE nested_category SET lft = lft + 2 WHERE lft > @myRight;

INSERT INTO nested_category(name, lft, rgt) VALUES('GAME CONSOLES', @myRight + 1, @myRight + 2);

UNLOCK TABLES;
```
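To check the arithmetic of the insertion by hand, the same steps can be replayed on an in-memory copy of the rows. This is only a sketch in Python under the assumption that the table still holds the ten sample rows; `insert_after` and `make_rows` are hypothetical helper names, not part of the original article.

```python
# Rows copied from nested_category: (name, lft, rgt).
NESTED = [
    ("ELECTRONICS", 1, 20), ("TELEVISIONS", 2, 9), ("TUBE", 3, 4),
    ("LCD", 5, 6), ("PLASMA", 7, 8), ("PORTABLE ELECTRONICS", 10, 19),
    ("MP3 PLAYERS", 11, 14), ("FLASH", 12, 13), ("CD PLAYERS", 15, 16),
    ("2 WAY RADIOS", 17, 18),
]


def make_rows():
    """Fresh, mutable copy of the sample data as a list of dicts."""
    return [{"name": n, "lft": l, "rgt": r} for n, l, r in NESTED]


def insert_after(rows, sibling_name, new_name):
    """Insert new_name immediately to the right of sibling_name (in place)."""
    my_right = next(r["rgt"] for r in rows if r["name"] == sibling_name)
    for r in rows:  # open a gap of width 2 to the right of the sibling
        if r["rgt"] > my_right:
            r["rgt"] += 2
        if r["lft"] > my_right:
            r["lft"] += 2
    rows.append({"name": new_name, "lft": my_right + 1, "rgt": my_right + 2})


rows = make_rows()
insert_after(rows, "TELEVISIONS", "GAME CONSOLES")
by_name = {r["name"]: (r["lft"], r["rgt"]) for r in rows}
for name, pair in sorted(by_name.items(), key=lambda item: item[1]):
    print(name, pair)
```

After the insert, GAME CONSOLES occupies (10, 11), PORTABLE ELECTRONICS has moved to (12, 21) and the root now ends at 22, while everything to the left of the gap is untouched.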
To add a node as the child of a leaf node (one that has no children yet), the statements need a small change:
```
LOCK TABLE nested_category WRITE;

SELECT @myLeft := lft FROM nested_category
WHERE name = '2 WAY RADIOS';

UPDATE nested_category SET rgt = rgt + 2 WHERE rgt > @myLeft;
UPDATE nested_category SET lft = lft + 2 WHERE lft > @myLeft;

INSERT INTO nested_category(name, lft, rgt) VALUES('FRS', @myLeft + 1, @myLeft + 2);

UNLOCK TABLES;
```

### Deleting nodes

Deleting a leaf node is easy: only the node itself has to be removed. Deleting an intermediate node means deleting all of its descendants as well. In this model, the descendants of a node are exactly the rows whose lft and rgt values fall between the node's own lft and rgt.

```
LOCK TABLE nested_category WRITE;

SELECT @myLeft := lft, @myRight := rgt, @myWidth := rgt - lft + 1
FROM nested_category
WHERE name = 'GAME CONSOLES';

DELETE FROM nested_category WHERE lft BETWEEN @myLeft AND @myRight;

-- Close the gap left behind by the deleted interval
UPDATE nested_category SET rgt = rgt - @myWidth WHERE rgt > @myRight;
UPDATE nested_category SET lft = lft - @myWidth WHERE lft > @myRight;

UNLOCK TABLES;
```

In some cases only the node itself should be deleted while its child nodes are kept. This is done by decreasing the left and right values of every node to its right by 2 and those of the deleted node's descendants by 1, which moves the children up one level.

```
LOCK TABLE nested_category WRITE;

SELECT @myLeft := lft, @myRight := rgt, @myWidth := rgt - lft + 1
FROM nested_category
WHERE name = 'PORTABLE ELECTRONICS';

DELETE FROM nested_category WHERE lft = @myLeft;

-- Descendants shift left by 1; everything to the right shifts left by 2
UPDATE nested_category SET rgt = rgt - 1, lft = lft - 1 WHERE lft BETWEEN @myLeft AND @myRight;
UPDATE nested_category SET rgt = rgt - 2 WHERE rgt > @myRight;
UPDATE nested_category SET lft = lft - 2 WHERE lft > @myRight;

UNLOCK TABLES;
```
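Both kinds of deletion can be replayed on an in-memory copy of the ten sample rows to see how the numbers close up again. The sketch below is an illustration in Python only. The helper names are invented, and it deletes MP3 PLAYERS rather than GAME CONSOLES, because GAME CONSOLES exists only after the insert example has been run.

```python
# Rows copied from nested_category: (name, lft, rgt).
NESTED = [
    ("ELECTRONICS", 1, 20), ("TELEVISIONS", 2, 9), ("TUBE", 3, 4),
    ("LCD", 5, 6), ("PLASMA", 7, 8), ("PORTABLE ELECTRONICS", 10, 19),
    ("MP3 PLAYERS", 11, 14), ("FLASH", 12, 13), ("CD PLAYERS", 15, 16),
    ("2 WAY RADIOS", 17, 18),
]


def delete_subtree(rows, name):
    """Return {name: (lft, rgt)} after removing `name` and its descendants."""
    left, right = next((l, r) for n, l, r in rows if n == name)
    width = right - left + 1
    result = {}
    for n, l, r in rows:
        if left <= l <= right:
            continue  # inside the deleted interval
        # Close the gap: values to the right of it move left by its width.
        result[n] = (l - width if l > right else l,
                     r - width if r > right else r)
    return result


def delete_and_promote(rows, name):
    """Return {name: (lft, rgt)} after removing only `name`, keeping children."""
    left, right = next((l, r) for n, l, r in rows if n == name)
    result = {}
    for n, l, r in rows:
        if l == left:
            continue  # the deleted node itself
        if left < l < right:
            l, r = l - 1, r - 1  # descendant: move up one level
        else:
            if r > right:
                r -= 2  # ancestors and nodes to the right
            if l > right:
                l -= 2  # nodes to the right only
        result[n] = (l, r)
    return result


after_subtree = delete_subtree(NESTED, "MP3 PLAYERS")
after_promote = delete_and_promote(NESTED, "PORTABLE ELECTRONICS")
print(after_subtree)
print(after_promote)
```

In both cases the remaining lft and rgt values again form an unbroken sequence starting at 1, which is a convenient sanity check after any modification of a nested set table.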
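The original author recommends the book "Joe Celko's Trees and Hierarchies in SQL for Smarties". Its author, Joe Celko, is a highly respected figure in the SQL field and is often credited with the nested set model. The book covers a number of advanced topics that this article does not touch on.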