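In computer science, a radix tree (also known as a Patricia trie/tree, crit-bit tree, or compressed prefix tree) is a more space-efficient trie (prefix tree). In a radix tree, any node that is an only child is merged with its parent.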
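The merge rule can be illustrated with a minimal sketch. It is not taken from gin: `trieNode`, `insert`, `compress`, and `labels` are hypothetical names, and the code simply builds a one-byte-per-edge trie and then folds every non-terminal only child into its parent's edge.

```go
package main

import (
	"fmt"
	"sort"
)

// trieNode is a hypothetical node type for this sketch; each edge to a
// child is labelled with a string.
type trieNode struct {
	children map[string]*trieNode
	terminal bool // true if a key ends at this node
}

func newTrieNode() *trieNode {
	return &trieNode{children: map[string]*trieNode{}}
}

// insert adds key one byte per edge, as a plain (uncompressed) trie would.
func (n *trieNode) insert(key string) {
	for i := 0; i < len(key); i++ {
		label := key[i : i+1]
		next, ok := n.children[label]
		if !ok {
			next = newTrieNode()
			n.children[label] = next
		}
		n = next
	}
	n.terminal = true
}

// compress folds chains of non-terminal nodes with exactly one child into
// a single edge, which is the radix-tree merge rule.
func (n *trieNode) compress() {
	merged := make(map[string]*trieNode, len(n.children))
	for label, child := range n.children {
		for !child.terminal && len(child.children) == 1 {
			for l, grandchild := range child.children { // exactly one entry
				label += l
				child = grandchild
			}
		}
		child.compress()
		merged[label] = child
	}
	n.children = merged
}

// labels returns the sorted edge labels leaving n.
func labels(n *trieNode) []string {
	out := make([]string, 0, len(n.children))
	for l := range n.children {
		out = append(out, l)
	}
	sort.Strings(out)
	return out
}

func main() {
	root := newTrieNode()
	root.insert("search")
	root.insert("support")
	root.compress()
	fmt.Println(labels(root))               // [s]
	fmt.Println(labels(root.children["s"])) // [earch upport]
}
```

Real implementations such as gin's compress while inserting rather than in a separate pass; the two-step version here is only meant to make the rule visible.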
The Go web frameworks echo and gin both use a radix tree as their route-lookup algorithm. Let's walk through gin's implementation.
In gin's router, each HTTP method (GET, PUT, POST, ...) has its own radix tree:
func (engine *Engine) addRoute(method, path string, handlers HandlersChain) {
    // ...
    // Get the tree for this method; create it if it does not exist yet
    root := engine.trees.get(method)
    if root == nil {
        // Create a radix tree that contains only the root node
        root = new(node)
        engine.trees = append(engine.trees, methodTree{method: method, root: root})
    }
    root.addRoute(path, handlers)
}
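A radix tree can be thought of as a compact prefix tree: nodes that share a common prefix also share a parent node. Below is the structure of the routing tree for the GET method: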
Priority   Path             Handle
9          /                *<1>
3          ├s               nil
2          |├earch/         *<2>
1          |└upport/        *<3>
2          ├blog/           *<4>
1          |    └:post      nil
1          |         └/     *<5>
2          ├about-us/       *<6>
1          |        └team/  *<7>
1          └contact/        *<8>
*<num> is a pointer to the handler function. Walking from the root down to a leaf yields a complete route, and the example tree implements the following routes:
GET("/", func1)
GET("/search/", func2)
GET("/support/", func3)
GET("/blog/", func4)
GET("/blog/:post/", func5)
GET("/about-us/", func6)
GET("/about-us/team/", func7)
GET("/contact/", func8)
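:post is a placeholder (that is, a parameter) for an actual post name. This shows one advantage of a radix tree over a hash map: the tree structure allows dynamic parts (parameters) in a path, because we match against the route pattern rather than against a hash value.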
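To make the contrast concrete, here is a small sketch, not gin's algorithm. The helper `matchPattern` is hypothetical: it compares a pattern and a request path segment by segment and treats any segment starting with `:` as a parameter. An exact-key map lookup for the same request path finds nothing, because the concrete URL was never registered as a key.

```go
package main

import (
	"fmt"
	"strings"
)

// matchPattern is a hypothetical helper for this sketch. It reports whether
// path matches pattern, where a pattern segment such as ":post" matches any
// non-empty path segment and records its value.
func matchPattern(pattern, path string) (map[string]string, bool) {
	patSegs := strings.Split(pattern, "/")
	pathSegs := strings.Split(path, "/")
	if len(patSegs) != len(pathSegs) {
		return nil, false
	}
	params := map[string]string{}
	for i, seg := range patSegs {
		if len(seg) > 1 && strings.HasPrefix(seg, ":") {
			if pathSegs[i] == "" {
				return nil, false
			}
			params[seg[1:]] = pathSegs[i]
			continue
		}
		if seg != pathSegs[i] {
			return nil, false
		}
	}
	return params, true
}

func main() {
	// A hash map can only answer exact-key lookups.
	exact := map[string]string{"/blog/:post/": "func5"}
	_, ok := exact["/blog/hello-world/"]
	fmt.Println(ok) // false

	// Pattern matching recovers the dynamic part.
	params, ok := matchPattern("/blog/:post/", "/blog/hello-world/")
	fmt.Println(params["post"], ok) // hello-world true
}
```

A linear scan over all patterns with such a helper would work but costs time proportional to the number of routes; the radix tree shares prefixes so that each byte of the request path is examined roughly once.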
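To scale better, the child nodes at each level are ordered by priority, where a node's priority is the number of handlers registered in its sub-nodes (children, grandchildren, and so on). This has two benefits. First, nodes that are part of the most routing paths are evaluated first, so as many routes as possible can be reached as quickly as possible. Second, it acts as a kind of cost compensation: the longest reachable path (the highest cost) can always be evaluated first. In the scheme below, nodes are evaluated from top to bottom and from left to right: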
├------------
├---------
├-----
├----
├--
├--
└-
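The reordering that keeps children sorted can be sketched as follows. This is a simplified illustration modelled on the `incrementChildPrio` call that appears in gin's `addRoute`; the types `prioNode` and `prioParent` and the method name `bump` are assumptions made for this example only.

```go
package main

import "fmt"

// prioNode and prioParent are hypothetical, stripped-down stand-ins for
// gin's node type, kept only to show the reordering.
type prioNode struct {
	path     string
	priority uint32
}

type prioParent struct {
	indices  string      // first byte of each child's path, same order as children
	children []*prioNode // kept sorted by descending priority
}

// bump increments the priority of the child at pos, moves it toward the
// front while its priority exceeds that of its left neighbour, and keeps
// indices in the same order as children. It returns the child's new position.
func (p *prioParent) bump(pos int) int {
	p.children[pos].priority++
	prio := p.children[pos].priority

	newPos := pos
	for newPos > 0 && p.children[newPos-1].priority < prio {
		p.children[newPos-1], p.children[newPos] = p.children[newPos], p.children[newPos-1]
		newPos--
	}
	if newPos != pos {
		p.indices = p.indices[:newPos] + // unchanged prefix
			p.indices[pos:pos+1] + // the index byte that moved forward
			p.indices[newPos:pos] + // bytes shifted one slot to the right
			p.indices[pos+1:] // unchanged suffix
	}
	return newPos
}

func main() {
	p := &prioParent{
		indices: "sbc",
		children: []*prioNode{
			{path: "s", priority: 1},
			{path: "blog/", priority: 1},
			{path: "contact/", priority: 1},
		},
	}
	newPos := p.bump(2)
	fmt.Println(newPos, p.indices, p.children[0].path) // 0 csb contact/
}
```

Because the slice stays sorted, lookup can scan `indices` from left to right and meet the busiest subtrees first.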
The node data structure is as follows:
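type node struct {
    // Path fragment held by this node, e.g. s, earch, and upport above
    path string
    // Whether this node's child is a wildcard (parameter) node, such as :post above
    wildChild bool
    // Node type: static, root, param, or catchAll
    // static: a static node, e.g. the s and earch nodes above
    // root: the root of the tree
    // catchAll: a node matched by *
    // param: a parameter node
    nType nodeType
    // Maximum number of parameters on the path
    maxParams uint8
    // Parallel to the children field: holds the first character of each branch.
    // For example, with search and support, the indices of node s is "eu",
    // meaning there are two branches whose first letters are e and u
    indices string
    // Child nodes
    children []*node
    // Handler functions
    handlers HandlersChain
    // Priority: the number of handlers registered in sub-nodes
    priority uint32
}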
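The relationship between `indices` and `children` can be shown in isolation. The sketch below uses a hypothetical `idxNode` type and `childFor` method (neither is gin's API); it only demonstrates that byte `i` of `indices` selects `children[i]`, which is the same scan used in the insertion and lookup code.

```go
package main

import "fmt"

// idxNode is a hypothetical, reduced version of the node struct that keeps
// only the fields needed to show how indices selects a child.
type idxNode struct {
	path     string
	indices  string
	children []*idxNode
}

// childFor returns the child whose path starts with c, or nil if no branch
// starts with that byte. Only one byte per child is compared.
func (n *idxNode) childFor(c byte) *idxNode {
	for i := 0; i < len(n.indices); i++ {
		if n.indices[i] == c {
			return n.children[i]
		}
	}
	return nil
}

func main() {
	s := &idxNode{
		path:    "s",
		indices: "eu",
		children: []*idxNode{
			{path: "earch"},
			{path: "upport"},
		},
	}
	fmt.Println(s.childFor('u').path)   // upport
	fmt.Println(s.childFor('x') == nil) // true
}
```

Keeping the first bytes in one string means the branch decision touches a single small, contiguous value instead of dereferencing every child.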
func (n *node) addRoute(path string, handlers HandlersChain) {
    fullPath := path
    n.priority++
    numParams := countParams(path)
    // non-empty tree
    if len(n.path) > 0 || len(n.children) > 0 {
    walk:
        for {
            // Update maxParams of the current node
            if numParams > n.maxParams {
                n.maxParams = numParams
            }
            // Find the longest common prefix.
            // This also implies that the common prefix contains no ':' or '*'
            // since the existing key can't contain those chars.
            i := 0
            max := min(len(path), len(n.path))
            for i < max && path[i] == n.path[i] {
                i++
            }
            // Split edge
            // Example: the existing path is search and support is added. Their common
            // part is s, so s becomes the parent and earch and upport become children
            if i < len(n.path) {
                child := node{
                    path:      n.path[i:], // the unmatched part becomes the child node
                    wildChild: n.wildChild,
                    indices:   n.indices,
                    children:  n.children,
                    handlers:  n.handlers,
                    priority:  n.priority - 1, // demoted to a child, so priority drops by 1
                }
                // Update maxParams (max of all children)
                for i := range child.children {
                    if child.children[i].maxParams > child.maxParams {
                        child.maxParams = child.children[i].maxParams
                    }
                }
                // The node that was just split off becomes the only child of the current node
                n.children = []*node{&child}
                // []byte for proper unicode char conversion, see #65
                n.indices = string([]byte{n.path[i]})
                n.path = path[:i]
                n.handlers = nil
                n.wildChild = false
            }
            // Make new node a child of this node
            // Insert the new route below the (possibly new) parent node
            if i < len(path) {
                path = path[i:]
                // The child is a wildcard node (contains : or *)
                if n.wildChild {
                    n = n.children[0]
                    n.priority++
                    // Update maxParams of the child node
                    if numParams > n.maxParams {
                        n.maxParams = numParams
                    }
                    numParams--
                    // Check if the wildcard matches
                    // e.g. for /blog/:pp and /blog/:ppp the longer wildcard must be checked
                    if len(path) >= len(n.path) && n.path == path[:len(n.path)] {
                        // check for longer wildcard, e.g. :name and :names
                        if len(n.path) >= len(path) || path[len(n.path)] == '/' {
                            continue walk
                        }
                    }
                    panic("path segment '" + path +
                        "' conflicts with existing wildcard '" + n.path +
                        "' in path '" + fullPath + "'")
                }
                // First byte, compared against indices below
                c := path[0]
                // slash after param
                if n.nType == param && c == '/' && len(n.children) == 1 {
                    n = n.children[0]
                    n.priority++
                    continue walk
                }
                // Check if a child with the next path byte exists
                // Only the first byte of each child's path (i.e. indices) has to be compared.
                // For example, s has children earch and upport, so indices is "eu".
                // If the new route is super, it shares u with upport, so we descend
                // and go on to split the upport node
                for i := 0; i < len(n.indices); i++ {
                    if c == n.indices[i] {
                        i = n.incrementChildPrio(i)
                        n = n.children[i]
                        continue walk
                    }
                }
                // Otherwise insert it
                if c != ':' && c != '*' {
                    // []byte for proper unicode char conversion, see #65
                    // Record the first byte in indices
                    n.indices += string([]byte{c})
                    child := &node{
                        maxParams: numParams,
                    }
                    // Add the child node
                    n.children = append(n.children, child)
                    n.incrementChildPrio(len(n.indices) - 1)
                    n = child
                }
                n.insertChild(numParams, path, fullPath, handlers)
                return
            } else if i == len(path) { // Make node a (in-path) leaf
                // Same path: panic if handlers already exist, otherwise assign them
                if n.handlers != nil {
                    panic("handlers are already registered for path '" + fullPath + "'")
                }
                n.handlers = handlers
            }
            return
        }
    } else { // Empty tree: insert the node and mark it as root
        n.insertChild(numParams, path, fullPath, handlers)
        n.nType = root
    }
}
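The split step is easier to follow without wildcards, priorities, and `maxParams`. The sketch below is a reduced, static-routes-only version written for illustration: `rnode`, `add`, and `commonPrefix` are hypothetical names, and a plain string stands in for `HandlersChain`. It assumes non-empty paths.

```go
package main

import "fmt"

// rnode is a hypothetical static-only radix node; handler is a stand-in
// for gin's HandlersChain.
type rnode struct {
	path     string
	indices  string
	children []*rnode
	handler  string
}

// commonPrefix returns the length of the longest common prefix of a and b.
func commonPrefix(a, b string) int {
	i := 0
	for i < len(a) && i < len(b) && a[i] == b[i] {
		i++
	}
	return i
}

// add inserts a static path, splitting an existing node when the new path
// shares only part of that node's path.
func (n *rnode) add(path, handler string) {
	if n.path == "" && len(n.children) == 0 { // empty tree
		n.path, n.handler = path, handler
		return
	}
	for {
		i := commonPrefix(path, n.path)
		if i < len(n.path) {
			// Split: the unmatched tail of n moves into a new child.
			child := &rnode{
				path:     n.path[i:],
				indices:  n.indices,
				children: n.children,
				handler:  n.handler,
			}
			n.children = []*rnode{child}
			n.indices = string([]byte{n.path[i]})
			n.path = path[:i]
			n.handler = ""
		}
		if i == len(path) { // the new path ends exactly at this node
			n.handler = handler
			return
		}
		path = path[i:]
		c := path[0]
		var next *rnode
		for j := 0; j < len(n.indices); j++ {
			if n.indices[j] == c {
				next = n.children[j]
				break
			}
		}
		if next == nil { // no branch starts with c: append a new leaf
			n.indices += string([]byte{c})
			n.children = append(n.children, &rnode{path: path, handler: handler})
			return
		}
		n = next // descend and repeat
	}
}

func main() {
	root := &rnode{}
	root.add("/search", "h2")
	root.add("/support", "h3")
	fmt.Println(root.path, root.indices, root.children[0].path, root.children[1].path)
	// /s eu earch upport
}
```

Adding `/super` afterwards descends into `upport` through the `u` index and splits it into `up` with children `port` and `er`, which is the case described in the comments of the full function. Unlike gin, this sketch silently overwrites a handler on a duplicate path instead of panicking.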
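The main job of addRoute is to find the position at which to insert the new node. If the new path shares a prefix with an existing node, the existing node is split first and the new node is inserted afterwards. The insertChild function that it calls is shown below: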
// numParams: number of parameters
// path:      the path still to be inserted
// fullPath:  the full path
// handlers:  handler functions
func (n *node) insertChild(numParams uint8, path string, fullPath string, handlers HandlersChain) {
    var offset int // already handled bytes of the path
    // find prefix until first wildcard (beginning with ':' or '*')
    // Scan the prefix until a wildcard is found
    for i, max := 0, len(path); numParams > 0; i++ {
        c := path[i]
        if c != ':' && c != '*' {
            continue
        }
        // find wildcard end (either '/' or path end)
        end := i + 1
        for end < max && path[end] != '/' {
            switch path[end] {
            // the wildcard name must not contain ':' and '*'
            case ':', '*':
                panic("only one wildcard per path segment is allowed, has: '" +
                    path[i:] + "' in path '" + fullPath + "'")
            default:
                end++
            }
        }
        // check if this node has existing children which would be
        // unreachable if we insert the wildcard here
        if len(n.children) > 0 {
            panic("wildcard route '" + path[i:end] +
                "' conflicts with existing children in path '" + fullPath + "'")
        }
        // check if the wildcard has a name
        if end-i < 2 {
            panic("wildcards must be named with a non-empty name in path '" + fullPath + "'")
        }
        if c == ':' { // param
            // split path at the beginning of the wildcard
            if i > 0 {
                n.path = path[offset:i]
                offset = i
            }
            child := &node{
                nType:     param,
                maxParams: numParams,
            }
            n.children = []*node{child}
            n.wildChild = true
            n = child
            n.priority++
            numParams--
            // if the path doesn't end with the wildcard, then there
            // will be another non-wildcard subpath starting with '/'
            if end < max {
                n.path = path[offset:end]
                offset = end
                child := &node{
                    maxParams: numParams,
                    priority:  1,
                }
                n.children = []*node{child}
                // The next iteration continues with this new child node
                n = child
            }
        } else { // catchAll
            if end != max || numParams > 1 {
                panic("catch-all routes are only allowed at the end of the path in path '" + fullPath + "'")
            }
            if len(n.path) > 0 && n.path[len(n.path)-1] == '/' {
                panic("catch-all conflicts with existing handle for the path segment root in path '" + fullPath + "'")
            }
            // currently fixed width 1 for '/'
            i--
            if path[i] != '/' {
                panic("no / before catch-all in path '" + fullPath + "'")
            }
            n.path = path[offset:i]
            // first node: catchAll node with empty path
            child := &node{
                wildChild: true,
                nType:     catchAll,
                maxParams: 1,
            }
            n.children = []*node{child}
            n.indices = string(path[i])
            n = child
            n.priority++
            // second node: node holding the variable
            child = &node{
                path:      path[i:],
                nType:     catchAll,
                maxParams: 1,
                handlers:  handlers,
                priority:  1,
            }
            n.children = []*node{child}
            return
        }
    }
    // insert remaining path part and handle to the leaf
    n.path = path[offset:]
    n.handlers = handlers
}
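insertChild splits the path itself at its wildcards: the static prefix before a wildcard, the wildcard segment, and whatever follows it are each stored as separate nodes, forming a tree. Note the difference between : and * in parameter matching: the former matches a single segment, while the latter matches the entire rest of the path.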
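The difference between the two wildcard kinds can be sketched with a small helper. `capture` is a hypothetical function for this example, not part of gin: given a wildcard name and the part of the request path that remains at that point, it returns what the wildcard would capture and what is left to match.

```go
package main

import (
	"fmt"
	"strings"
)

// capture is a hypothetical helper. A ':' wildcard captures up to the next
// '/' (one segment); a '*' wildcard captures everything that remains.
func capture(wildcard, rest string) (value, remaining string) {
	if strings.HasPrefix(wildcard, "*") {
		return rest, ""
	}
	end := strings.IndexByte(rest, '/')
	if end < 0 {
		end = len(rest)
	}
	return rest[:end], rest[end:]
}

func main() {
	v, r := capture(":post", "hello-world/comments")
	fmt.Printf("%q %q\n", v, r) // "hello-world" "/comments"

	v, r = capture("*filepath", "/css/site.css")
	fmt.Printf("%q %q\n", v, r) // "/css/site.css" ""
}
```

Since a catch-all consumes the whole remainder, nothing can follow it in a pattern, which matches the "catch-all routes are only allowed at the end of the path" panic in insertChild.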
Lookup walks the tree and matches the request path against each child's path, always following the longest match:
// Returns the handle registered with the given path (key). The values of
// wildcards are saved to a map.
// If no handle can be found, a TSR (trailing slash redirect) recommendation is
// made if a handle exists with an extra (without the) trailing slash for the
// given path.
func (n *node) getValue(path string, po Params, unescape bool) (handlers HandlersChain, p Params, tsr bool) {
    p = po
walk: // Outer loop for walking the tree
    for {
        // The end of path has not been reached yet
        if len(path) > len(n.path) {
            // The leading part must match n.path
            if path[:len(n.path)] == n.path {
                path = path[len(n.path):]
                // If this node does not have a wildcard (param or catchAll)
                // child, we can just look up the next child node and continue
                // to walk down the tree
                if !n.wildChild {
                    c := path[0]
                    for i := 0; i < len(n.indices); i++ {
                        if c == n.indices[i] {
                            n = n.children[i]
                            continue walk
                        }
                    }
                    // Nothing found.
                    // We can recommend to redirect to the same URL without a
                    // trailing slash if a leaf exists for that path.
                    tsr = (path == "/" && n.handlers != nil)
                    return
                }
                // handle wildcard child
                n = n.children[0]
                switch n.nType {
                case param:
                    // find param end (either '/' or path end)
                    end := 0
                    for end < len(path) && path[end] != '/' {
                        end++
                    }
                    // save param value
                    if cap(p) < int(n.maxParams) {
                        p = make(Params, 0, n.maxParams)
                    }
                    i := len(p)
                    p = p[:i+1] // expand slice within preallocated capacity
                    p[i].Key = n.path[1:]
                    val := path[:end]
                    if unescape {
                        var err error
                        if p[i].Value, err = url.QueryUnescape(val); err != nil {
                            p[i].Value = val // fallback, in case of error
                        }
                    } else {
                        p[i].Value = val
                    }
                    // we need to go deeper!
                    if end < len(path) {
                        if len(n.children) > 0 {
                            path = path[end:]
                            n = n.children[0]
                            continue walk
                        }
                        // ... but we can't
                        tsr = (len(path) == end+1)
                        return
                    }
                    if handlers = n.handlers; handlers != nil {
                        return
                    }
                    if len(n.children) == 1 {
                        // No handle found. Check if a handle for this path + a
                        // trailing slash exists for TSR recommendation
                        n = n.children[0]
                        tsr = (n.path == "/" && n.handlers != nil)
                    }
                    return
                case catchAll:
                    // save param value
                    if cap(p) < int(n.maxParams) {
                        p = make(Params, 0, n.maxParams)
                    }
                    i := len(p)
                    p = p[:i+1] // expand slice within preallocated capacity
                    p[i].Key = n.path[2:]
                    if unescape {
                        var err error
                        if p[i].Value, err = url.QueryUnescape(path); err != nil {
                            p[i].Value = path // fallback, in case of error
                        }
                    } else {
                        p[i].Value = path
                    }
                    handlers = n.handlers
                    return
                default:
                    panic("invalid node type")
                }
            }
        } else if path == n.path {
            // We should have reached the node containing the handle.
            // Check if this node has a handle registered.
            if handlers = n.handlers; handlers != nil {
                return
            }
            if path == "/" && n.wildChild && n.nType != root {
                tsr = true
                return
            }
            // No handle found. Check if a handle for this path + a
            // trailing slash exists for trailing slash recommendation
            for i := 0; i < len(n.indices); i++ {
                if n.indices[i] == '/' {
                    n = n.children[i]
                    tsr = (len(n.path) == 1 && n.handlers != nil) ||
                        (n.nType == catchAll && n.children[0].handlers != nil)
                    return
                }
            }
            return
        }
        // Nothing found. We can recommend to redirect to the same URL with an
        // extra trailing slash if a leaf exists for that path
        tsr = (path == "/") ||
            (len(n.path) == len(path)+1 && n.path[len(path)] == '/' &&
                path == n.path[:len(n.path)-1] && n.handlers != nil)
        return
    }
}
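The `tsr` (trailing slash redirect) result is the least obvious part of getValue. The sketch below shows only the idea and deliberately swaps the tree for a flat set of registered paths; `lookup` and the `routes` map are hypothetical and do not reproduce gin's edge cases.

```go
package main

import (
	"fmt"
	"strings"
)

// lookup is a hypothetical, map-based stand-in for the tree walk. found
// reports an exact match; tsr reports that the same path with the trailing
// slash added or removed is registered, so a redirect could be recommended.
func lookup(routes map[string]bool, path string) (found, tsr bool) {
	if routes[path] {
		return true, false
	}
	if strings.HasSuffix(path, "/") {
		// Try the path without its trailing slash (but never strip "/").
		tsr = len(path) > 1 && routes[path[:len(path)-1]]
	} else {
		// Try the path with an extra trailing slash.
		tsr = routes[path+"/"]
	}
	return false, tsr
}

func main() {
	routes := map[string]bool{"/search/": true, "/about": true}

	fmt.Println(lookup(routes, "/search/")) // true false
	fmt.Println(lookup(routes, "/search"))  // false true
	fmt.Println(lookup(routes, "/about/"))  // false true
	fmt.Println(lookup(routes, "/nope"))    // false false
}
```

In the tree version the same question is answered without a second lookup, by inspecting the node where the walk stopped and, where needed, its `/` child.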
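People often ask what data structures and algorithms are good for, since they never seem to come up at work. The code above is a good counterexample. It pays to keep an eye on the underlying principles, and backend developers in particular will benefit from reading framework source code.
Reposted from: michaelyou.github.io/2018/02/10/…
Author: Youmai
Thanks for reading. Comments, shares, and corrections are welcome.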