Algorithms: the trie tree

Date: 2019-11-07

Tags: algorithms, trie tree


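Introduction

I have recently been working on ASTR, and algorithms are task 1 of that project, so I have started working through LeetCode problems again. Compared with other ACM-style problem sets, LeetCode sits at a beginner-to-intermediate level: it leans toward what is useful on the job and plays down the more advanced mathematics.

Here are some of my thoughts on practicing LeetCode. Many people feel that algorithms are of little use in day-to-day work, along the lines of "build rockets in the interview, tighten screws on the job." But are they really of so little use? I don't think so, provided you can see the mathematical beauty in them. First, a good algorithm can greatly improve the running efficiency of your program, especially in certain extreme cases. Second, studying these algorithms sharpens our logical thinking. Third, it does help a lot in interviews.

My approach to LeetCode is:

Pick a specific problem (usually prioritizing those with the most net upvotes).
Solve it, and study the underlying principle it abstracts.
Generalize, and apply that principle to the whole class of similar problems.

Note: to be clear, grinding problems is not the goal, and being able to solve one particular problem is certainly not the end result. The real goal is to learn the principle behind each problem, to be able to understand and solve the whole class of such problems, and to apply it in your own work.

For example:

Pick a problem: LeetCode 5, Longest Palindromic Substring.
This problem brings to mind two classic problems: the longest common substring problem and the longest common subsequence problem. From there, connect to the general approach to the longest common substring problem, the generalized suffix tree, and then narrow down to the trie tree.
Search for related problems and solve them.

This post is the first in a series on trie-related algorithms.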

Trie tree

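First, the word "trie" comes from "retrieval" and is usually pronounced /ˈtraɪ/ (as "try").

In computer science, a trie, also called digital tree, radix tree or prefix tree, is a kind of search tree: an ordered tree data structure used to store a dynamic set or associative array where the keys are usually strings. (Strictly speaking, a radix tree is a compressed variant of a trie, so the two terms are not fully interchangeable.)

A trie tree is also known as a dictionary tree or prefix tree. As the name suggests, it can be used to look up strings.

Let's start with an example.

Given the set of strings cat, cash, app, apple, aply, ok, we build a trie from them:

[Figure: the trie built from these six words, with $ marking the end of each word]

From this we can read off the characteristics of a trie:

A trie uses edges to represent letters.
Words with a common prefix share the nodes of that prefix. It follows that when words contain only lowercase letters, each node has at most 26 children.
The root of the tree is empty.
The end of each word is marked with a special character (such as the $ in the figure above). In code, a separate boolean field on the node can indicate whether a word ends there.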
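As a rough illustration of the properties above, the sketch below builds the example trie as nested plain objects. The helper name buildTrie and the use of '$' as an object key for the end marker are assumptions made for this illustration only.

```javascript
// Hypothetical helper: build a trie as nested objects, one key per edge letter.
function buildTrie(words) {
  const root = {}; // the root holds no letter of its own
  for (const word of words) {
    let node = root;
    for (const ch of word) {
      if (!(ch in node)) node[ch] = {}; // add an edge only when the prefix is not shared yet
      node = node[ch];
    }
    node['$'] = true; // end-of-word marker
  }
  return root;
}

const exampleTrie = buildTrie(['cat', 'cash', 'app', 'apple', 'aply', 'ok']);

// Shared prefixes are stored once: "cat" and "cash" both hang off the c -> a path.
console.log(Object.keys(exampleTrie));     // [ 'c', 'a', 'o' ]
console.log(Object.keys(exampleTrie.c.a)); // [ 't', 's' ]
console.log(exampleTrie.a.p.p['$']);       // true, because "app" is a word
```

Note that "ap" has no '$' marker, so it is a prefix in this trie but not a stored word.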

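Basic operations

The two simplest operations are insert and search.

insert: insert a new word

As the figure suggests, scan the new word from left to right. If the current letter does not yet appear under the current node, insert it; otherwise follow the existing edge down the trie. Then move on to the next letter of the word.

Question 1: where does a letter go? There are two numbering schemes. The first numbers nodes in the order they are inserted, so the same letter may receive different numbers in different places.

The second scheme: since each node has at most 26 children, we can number them 0-25 in alphabetical order, so the same letter always receives the same number.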
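To make the two numbering schemes concrete, here is a minimal array-simulated sketch. It assumes lowercase a-z input, and the names ArrayTrie, next, and isEnd are hypothetical, chosen only for this illustration. Node ids are handed out in insertion order (the first scheme), while the child slot inside each node is the letter's alphabetical index 0-25 (the second scheme).

```javascript
const A = 'a'.charCodeAt(0);

class ArrayTrie {
  constructor() {
    this.next = [new Array(26).fill(0)]; // next[node][letter] = child node id, 0 means "no child"
    this.isEnd = [false];                // isEnd[node] = whether a word ends at that node
  }

  insert(word) {
    let node = 0; // node 0 is the empty root
    for (const ch of word) {
      const c = ch.charCodeAt(0) - A; // second scheme: same letter, same index
      if (this.next[node][c] === 0) {
        this.next.push(new Array(26).fill(0));
        this.isEnd.push(false);
        this.next[node][c] = this.next.length - 1; // first scheme: ids in insertion order
      }
      node = this.next[node][c];
    }
    this.isEnd[node] = true;
  }

  search(word) {
    let node = 0;
    for (const ch of word) {
      node = this.next[node][ch.charCodeAt(0) - A];
      if (node === 0) return false; // the path breaks here
    }
    return this.isEnd[node];
  }
}

const arrayTrie = new ArrayTrie();
['cat', 'cash'].forEach((w) => arrayTrie.insert(w));

// "cat" creates nodes 1 (c), 2 (a), 3 (t); "cash" reuses 1 and 2, then adds 4 (s), 5 (h).
console.log(arrayTrie.next[2]['t'.charCodeAt(0) - A]); // 3
console.log(arrayTrie.next[2]['s'.charCodeAt(0) - A]); // 4
console.log(arrayTrie.search('cash'), arrayTrie.search('ca')); // true false
```

Because id 0 is reserved for the root, it can double as the "no child" value, which keeps each node a flat row of 26 integers.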

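Generally, there are two ways to implement this data structure:

Array simulation
A class

Although the second approach generally uses more space, I prefer it. For example, it makes the following queries more convenient:

Checking whether a word exists in the trie. We only need to add a property to each node to indicate this.
Counting how many times a prefix occurs. Again, we only need to add a property to each node.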
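The two queries above can be sketched with one extra property each. The layout below is an assumption about how such properties might look, not the code used later in this post: every node carries a count of the insertions passing through it and an isEnd flag. The names CountingNode, CountingTrie, find, contains, and countPrefix are hypothetical.

```javascript
class CountingNode {
  constructor() {
    this.children = new Map(); // letter -> CountingNode
    this.count = 0;            // number of insertions whose path passes through this node
    this.isEnd = false;        // true when an inserted word ends here
  }
}

class CountingTrie {
  constructor() {
    this.root = new CountingNode();
  }

  insert(word) {
    let node = this.root;
    for (const ch of word) {
      if (!node.children.has(ch)) node.children.set(ch, new CountingNode());
      node = node.children.get(ch);
      node.count += 1; // inserting the same word twice counts twice
    }
    node.isEnd = true;
  }

  // Walk down the path for s; returns the node reached, or null when the path breaks.
  find(s) {
    let node = this.root;
    for (const ch of s) {
      node = node.children.get(ch);
      if (node === undefined) return null;
    }
    return node;
  }

  contains(word) {
    const node = this.find(word);
    return node !== null && node.isEnd;
  }

  // The root's count is never incremented, so the empty prefix reports 0 in this sketch.
  countPrefix(prefix) {
    const node = this.find(prefix);
    return node === null ? 0 : node.count;
  }
}

const counting = new CountingTrie();
['cat', 'cash', 'app', 'apple', 'aply', 'ok'].forEach((w) => counting.insert(w));

console.log(counting.countPrefix('ap'));  // 3 (app, apple, aply)
console.log(counting.countPrefix('app')); // 2 (app, apple)
console.log(counting.contains('ca'));     // false, "ca" is only a prefix
```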

With that, let's look at the actual code (JavaScript version):

var TrieNode = function() {
    this.isEnd = false;         // true when a word ends at this node
    this.links = new Array(26); // child slots for 'a'..'z'; empty slots read as undefined
};
TrieNode.prototype.containsKey = function(ch) { // whether this node has a child for the character
    return this.links[ch.charCodeAt(0) - 'a'.charCodeAt(0)] !== undefined;
};
TrieNode.prototype.get = function(ch) { // get the child node for the character
    return this.links[ch.charCodeAt(0) - 'a'.charCodeAt(0)];
};
TrieNode.prototype.put = function(ch, node) { // attach a child node for the character
    this.links[ch.charCodeAt(0) - 'a'.charCodeAt(0)] = node;
};
TrieNode.prototype.setEnd = function() { // mark this node as the end of a word
    this.isEnd = true;
};

/** Initialize your data structure here. */
var Trie = function() {
   this.root = new TrieNode();
};

/**
 * Inserts a word into the trie.
 * @param {string} word
 * @return {void}
 */
Trie.prototype.insert = function(word) {
    let node = this.root;
    for (let i = 0; i < word.length; i++) {
        let currentChar = word[i];
        if(!node.containsKey(currentChar)) {
            node.put(currentChar, new TrieNode());
        }
        node = node.get(currentChar);
    }
    node.setEnd();
};

/**
 * Returns if the word is in the trie.
 * @param {string} word
 * @return {boolean}
 */
Trie.prototype.search = function(word) {
    let node = this.root;
    for (let i = 0; i < word.length; i++) {
        let currentChar = word[i];
        if(node.containsKey(currentChar)) {
            node = node.get(currentChar);
        } else {
            return false;
        }
    }
    return node.isEnd; // isEnd is a boolean property, not a method
};

/**
 * Returns if there is any word in the trie that starts with the given prefix.
 * @param {string} prefix
 * @return {boolean}
 */
Trie.prototype.startsWith = function(prefix) {
    let node = this.root;
    for (let i = 0; i < prefix.length; i++) {
        let currentChar = prefix[i];
        if(node.containsKey(currentChar)) {
            node = node.get(currentChar);
        } else {
            return false;
        }
    }
    return true;    
};

/** Example usage. TrieNode assumes lowercase a-z, so the words are lowercased here. */
var obj = new Trie();
let words = ["trie", "insert", "search", "search", "startswith", "insert", "search"];
for (let i = 0; i < words.length; i++) {
    obj.insert(words[i]);
}

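Clearly, the version above is fairly engineering-oriented and complete. The code can be simplified further by modelling the nodes with plain objects: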

var Trie = function() {
   this.root = {};
};

/**
 * Inserts a word into the trie.
 * @param {string} word
 * @return {void}
 */
Trie.prototype.insert = function(word) {
    let node = this.root;
    for (let ch of word) {
      if (!(ch in node)) node[ch] = {}
      node = node[ch]
    }
    node['$'] = true // marks the end of a word
};

/**
 * Returns if the word is in the trie.
 * @param {string} word
 * @return {boolean}
 */
Trie.prototype.search = function(word) {
    let node = this.root;
    for(let ch of word) {
      if(ch in node) node = node[ch]
      else return false
    }
    return node['$'] === true;
};

/**
 * Returns if there is any word in the trie that starts with the given prefix.
 * @param {string} prefix
 * @return {boolean}
 */
Trie.prototype.startsWith = function(prefix) {
    let node = this.root;
    for(let ch of prefix) {
      if(ch in node) node = node[ch]
      else return false
    }
    return true; 
};
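To see how search and startsWith differ in practice, here is a small self-contained check. It restates the object-based idea above as a compact ES6 class and runs it on the word set from the earlier example. The class name ObjectTrie and the helper walk are assumptions made for this sketch, and it further assumes that no word contains the '$' character, since that key is reserved for the end marker.

```javascript
class ObjectTrie {
  constructor() {
    this.root = {};
  }

  insert(word) {
    let node = this.root;
    for (const ch of word) {
      if (!(ch in node)) node[ch] = {};
      node = node[ch];
    }
    node['$'] = true; // end-of-word marker
  }

  // Follow s from the root; returns the node reached, or null if the path breaks.
  walk(s) {
    let node = this.root;
    for (const ch of s) {
      if (!(ch in node)) return null;
      node = node[ch];
    }
    return node;
  }

  search(word) {
    const node = this.walk(word);
    return node !== null && node['$'] === true;
  }

  startsWith(prefix) {
    return this.walk(prefix) !== null;
  }
}

const demo = new ObjectTrie();
['cat', 'cash', 'app', 'apple', 'aply', 'ok'].forEach((w) => demo.insert(w));

console.log(demo.search('app'));      // true, "app" was inserted as a word
console.log(demo.search('appl'));     // false, "appl" is only a prefix of "apple"
console.log(demo.startsWith('appl')); // true
console.log(demo.startsWith('b'));    // false, no word starts with "b"
```

Both operations share the same walk and differ only in whether they check the end marker, which is why the two loops in the code above look almost identical.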