Algorithms (4th Edition), Chapter 5.2: Tries

Date: 2019-12-07

Algorithms Fourth Edition
Written By Robert Sedgewick & Kevin Wayne
Translated By 謝路雲
Chapter 5 Section 2: Tries

A search hit takes time proportional to the length of the key being searched for
A search miss only needs to examine a few characters

Tries

Trie API

Basic properties

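Each link corresponds to one character
Each node may hold a value
- Has a value: the string spelled out along the path from the root to this node is a key in the table.
- No value: that string is not a key and has no associated value. The node exists only as part of the path to longer keys, which keeps search simple.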

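Searching for a string
- Hit: the node reached has a value (a link leading to that node is not enough; the node itself must hold a value!)
- Miss: the node reached has no value, or a null link is encountered along the way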

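Node representation
- Each node has R links, one per possible character
- Keys are stored implicitly: a key is defined by the path of links from the root, not written into any node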
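
The hit/miss rule above can be illustrated with a minimal sketch. `MiniTrie` and its iterative `put`/`get` are names chosen here for illustration, not the book's code, and the sketch assumes keys contain only extended-ASCII characters (code points below 256).

```java
// Illustrative sketch only: an R-way trie node and the hit/miss rule.
class MiniTrie {
    private static final int R = 256;          // assumed alphabet: extended ASCII

    private static class Node {
        Integer val;                           // null means "no key ends here"
        Node[] next = new Node[R];             // one link per possible character
    }

    private final Node root = new Node();

    // Walk down from the root, creating one node per character as needed.
    void put(String key, int val) {
        Node x = root;
        for (int d = 0; d < key.length(); d++) {
            char c = key.charAt(d);
            if (x.next[c] == null) x.next[c] = new Node();
            x = x.next[c];
        }
        x.val = val;                           // the key itself is never stored
    }

    // Returns the value on a hit, or null on a miss.
    Integer get(String key) {
        Node x = root;
        for (int d = 0; d < key.length(); d++) {
            x = x.next[key.charAt(d)];
            if (x == null) return null;        // miss: ran into a null link
        }
        return x.val;                          // null here is also a miss: node exists, no value
    }
}
```

With only "shells" inserted, `get("she")` is a miss even though every node on the path s-h-e exists, because the node reached for "she" holds no value.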

TrieST code

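import edu.princeton.cs.algs4.Queue; // the book's FIFO queue (enqueue(), Iterable)

public class TrieST<Value> {
    private static final int R = 256; // radix (extended ASCII)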
    private Node root; // root of trie

    private static class Node {
        private Object val;
        private Node[] next = new Node[R];
    }

    public Value get(String key) {
        Node x = get(root, key, 0);
        if (x == null)
            return null;
        return (Value) x.val;
    }
    
    // Return the node associated with key in the subtrie rooted at x (null if there is none).
    private Node get(Node x, String key, int d) { 
        if (x == null)
            return null;
        if (d == key.length())
            return x;
        char c = key.charAt(d); // Use dth key char to identify subtrie.
        return get(x.next[c], key, d + 1);
    }

    public void put(String key, Value val) {
        root = put(root, key, val, 0);
    }
    
    // Change value associated with key if in subtrie rooted at x.
    private Node put(Node x, String key, Value val, int d) { // an elegant use of recursion
        if (x == null)
            x = new Node();
        if (d == key.length()) {
            x.val = val;
            return x;
        }
        char c = key.charAt(d); // Use dth key char to identify subtrie.
        x.next[c] = put(x.next[c], key, val, d + 1); // the key trick: relink the child on the way back up
        return x;
    }

    public Iterable<String> keys() {
        return keysWithPrefix("");
    }

    public Iterable<String> keysWithPrefix(String pre) {
        Queue<String> q = new Queue<String>();
        collect(get(root, pre, 0), pre, q);
        return q;
    }

    private void collect(Node x, String pre, Queue<String> q) {
        if (x == null)
            return;
        if (x.val != null)
            q.enqueue(pre);
        for (char c = 0; c < R; c++) // very inefficient: tries all R links at every node, and most are null
            collect(x.next[c], pre + c, q);
    }
}

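Method keys(): returns a Queue containing every key
Method keysWithPrefix(String pre): returns a Queue containing every key that starts with the given string pre
Method collect(): the recursive helper that performs the traversal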
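
A short sketch of how keys() and keysWithPrefix() behave. It is a variation on the listing above rather than the book's code: `PrefixTrie` is a hypothetical name, `java.util.List` stands in for the book's Queue, and collect() reuses a single StringBuilder and skips null links instead of building `pre + c` for all R links.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; assumes extended-ASCII keys.
class PrefixTrie {
    private static final int R = 256;

    private static class Node {
        Object val;
        Node[] next = new Node[R];
    }

    private Node root;

    void put(String key, Object val) { root = put(root, key, val, 0); }

    private Node put(Node x, String key, Object val, int d) {
        if (x == null) x = new Node();
        if (d == key.length()) { x.val = val; return x; }
        char c = key.charAt(d);
        x.next[c] = put(x.next[c], key, val, d + 1);
        return x;
    }

    // Node reached by following key from x, or null if the path breaks off.
    private Node get(Node x, String key, int d) {
        if (x == null) return null;
        if (d == key.length()) return x;
        return get(x.next[key.charAt(d)], key, d + 1);
    }

    List<String> keys() { return keysWithPrefix(""); }

    List<String> keysWithPrefix(String pre) {
        List<String> out = new ArrayList<>();
        collect(get(root, pre, 0), new StringBuilder(pre), out);
        return out;
    }

    // Depth-first traversal; characters are visited in increasing order,
    // so keys come out in sorted order.
    private void collect(Node x, StringBuilder pre, List<String> out) {
        if (x == null) return;
        if (x.val != null) out.add(pre.toString());
        for (char c = 0; c < R; c++) {
            if (x.next[c] == null) continue;      // skip empty links
            pre.append(c);
            collect(x.next[c], pre, out);
            pre.deleteCharAt(pre.length() - 1);   // backtrack
        }
    }
}
```

The loop still inspects all R links of every visited node; the saving is only in avoiding a new string per link.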

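Wildcard matching

Structural difference
- Without wildcards: keysWithPrefix(); get(); collect();
- With wildcards: keysThatMatch(); collect();
Why does the structure differ from before? Because get() would have to be rewritten to handle wildcards, and that rewritten logic is merged directly into collect().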

public Iterable<String> keysThatMatch(String pat) {
        Queue<String> q = new Queue<String>();
        collect(root, "", pat, q);
        return q;
    }

    private void collect(Node x, String pre, String pat, Queue<String> q) {
        int d = pre.length(); 
        // begin: adapted from the original get() and collect()
        if (x == null) // as in both get() and collect()
            return;
        if (d == pat.length() && x.val != null) // as in collect(): enqueue a key, but only at full pattern length
            q.enqueue(pre);
        if (d == pat.length()) // as in get(): stop at the end of the pattern
            return; 
        // end: adapted from the original get() and collect()
        
        char next = pat.charAt(d);
        for (char c = 0; c < R; c++)
            if (next == '.' || next == c) // a '.' matches any character, so every link must be followed
                collect(x.next[c], pre + c, pat, q); // recursion does the work again
    }

// For comparison, the original get() whose logic is folded into collect() above:
private Node get(Node x, String key, int d) { 
        if (x == null)
            return null;
        if (d == key.length())
            return x;
        char c = key.charAt(d); // Use dth key char to identify subtrie.
        return get(x.next[c], key, d + 1);
    }
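
A small sketch of wildcard matching in use. `WildcardTrie`, `add` and the boolean `isKey` flag are illustrative choices, not from the book, and the sketch assumes extended-ASCII keys and patterns. It also differs from the listing above in one respect: for a literal pattern character it follows the single matching link instead of looping over all R links.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: '.' in the pattern matches any single character.
class WildcardTrie {
    private static final int R = 256;

    private static class Node {
        boolean isKey;
        Node[] next = new Node[R];
    }

    private final Node root = new Node();

    void add(String key) {
        Node x = root;
        for (int d = 0; d < key.length(); d++) {
            char c = key.charAt(d);
            if (x.next[c] == null) x.next[c] = new Node();
            x = x.next[c];
        }
        x.isKey = true;
    }

    List<String> keysThatMatch(String pat) {
        List<String> out = new ArrayList<>();
        collect(root, "", pat, out);
        return out;
    }

    private void collect(Node x, String pre, String pat, List<String> out) {
        if (x == null) return;
        int d = pre.length();
        if (d == pat.length()) {               // whole pattern consumed
            if (x.isKey) out.add(pre);
            return;
        }
        char next = pat.charAt(d);
        if (next == '.') {
            for (char c = 0; c < R; c++)       // wildcard: try every link
                collect(x.next[c], pre + c, pat, out);
        } else {
            collect(x.next[next], pre + next, pat, out);  // literal: one link
        }
    }
}
```

Only keys whose length equals the pattern length can match, so with the keys "she", "the", "shells", "sea" and "by", the pattern ".he" yields "she" and "the".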

Longest prefix

public String longestPrefixOf(String s) {
        int length = search(root, s, 0, 0);
        return s.substring(0, length);
    }

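    // d is the index of the character being examined; length is the length of the longest key found so far that is a prefix of s
    private int search(Node x, String s, int d, int length) {
        if (x == null) // ran off the end of the trie
            return length;
        if (x.val != null) // the first d characters of s form a key: update length
            length = d;
        if (d == s.length()) // reached the end of the string (this check must come after the update above)
            return length;
        char c = s.charAt(d);
        return search(x.next[c], s, d + 1, length); // recurse on the next character
    }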
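
A worked example of longestPrefixOf(), written as a loop rather than recursion to make the order of the checks visible. `LongestPrefixTrie` and `add` are hypothetical names used only for this sketch, which assumes extended-ASCII input.

```java
// Illustrative sketch only: iterative longest-prefix search.
class LongestPrefixTrie {
    private static final int R = 256;

    private static class Node {
        boolean isKey;
        Node[] next = new Node[R];
    }

    private final Node root = new Node();

    void add(String key) {
        Node x = root;
        for (int d = 0; d < key.length(); d++) {
            char c = key.charAt(d);
            if (x.next[c] == null) x.next[c] = new Node();
            x = x.next[c];
        }
        x.isKey = true;
    }

    // Longest key in the trie that is a prefix of s ("" if there is none).
    String longestPrefixOf(String s) {
        int length = 0;
        Node x = root;
        for (int d = 0; x != null; d++) {
            if (x.isKey) length = d;           // first d characters of s form a key
            if (d == s.length()) break;        // checked after the update, as in search()
            x = x.next[s.charAt(d)];           // null here ends the loop
        }
        return s.substring(0, length);
    }
}
```

With the keys "she", "shells" and "by", the query "shellsort" passes the key "she" at depth 3, then "shells" at depth 6, and stops at the missing link for 'o', so the answer is "shells". The query "shell" ends on a node without a value, so the answer falls back to "she".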

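Deletion

Once again recursion makes this neat: find the node for the key and set its value to null.
If that node still has a non-null link
- clearing its value is all that is needed
If all of its links are null
- remove the node
- check whether the parent has a value or any non-null link
  - if it does, stop
  - if it does not, remove the parent too and run the same check on its parent
    ... and so on up toward the root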

public void delete(String key) {
        root = delete(root, key, 0);
    }

    private Node delete(Node x, String key, int d) {
        if (x == null)
            return null;
        if (d == key.length()) // found the key's node: clear its value
            x.val = null;
        else {
            char c = key.charAt(d);
            x.next[c] = delete(x.next[c], key, d + 1); // recurse, then relink to whatever is left of the subtrie
        }
        if (x.val != null) // a node with a value stays in the trie whether or not it has links
            return x;
        for (char c = 0; c < R; c++) // no value: keep the node only if some link is non-null
            if (x.next[c] != null)
                return x; 
        return null; // no value and no links: returning null makes the caller drop this node
    }
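
The pruning behaviour can be checked by counting nodes before and after a deletion. In this sketch `PruningTrie`, `contains` and `nodeCount` are helper names introduced purely for illustration; the delete logic follows the listing above, and keys are assumed to be extended ASCII.

```java
// Illustrative sketch only: delete() removes nodes that no longer lead to any key.
class PruningTrie {
    private static final int R = 256;

    private static class Node {
        Object val;
        Node[] next = new Node[R];
    }

    private Node root;

    void put(String key, Object val) { root = put(root, key, val, 0); }

    private Node put(Node x, String key, Object val, int d) {
        if (x == null) x = new Node();
        if (d == key.length()) { x.val = val; return x; }
        char c = key.charAt(d);
        x.next[c] = put(x.next[c], key, val, d + 1);
        return x;
    }

    boolean contains(String key) {
        Node x = root;
        for (int d = 0; x != null && d < key.length(); d++)
            x = x.next[key.charAt(d)];
        return x != null && x.val != null;
    }

    void delete(String key) { root = delete(root, key, 0); }

    private Node delete(Node x, String key, int d) {
        if (x == null) return null;
        if (d == key.length()) x.val = null;
        else {
            char c = key.charAt(d);
            x.next[c] = delete(x.next[c], key, d + 1);
        }
        if (x.val != null) return x;                 // still a key: keep
        for (Node child : x.next)
            if (child != null) return x;             // still on a path: keep
        return null;                                 // dead end: prune
    }

    // Number of nodes currently allocated, including the root.
    int nodeCount() { return count(root); }

    private int count(Node x) {
        if (x == null) return 0;
        int n = 1;
        for (Node child : x.next) n += count(child);
        return n;
    }
}
```

After inserting "she" and "shells" the trie has 7 nodes (the root plus s, h, e, l, l, s). Deleting "shells" prunes the three trailing nodes and stops at the node for "she", which still has a value, leaving 4. Deleting "she" afterwards prunes everything, including the root.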

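Complexity

For a trie built from N random keys over an alphabet of size R:

Time:
- Search and insert, worst case: at most (key length + 1) array accesses
- Search miss, average: about log_R N nodes examined (so the cost of a search miss does not depend on the key length)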

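Space
- At most proportional to the product of R and the total number of characters in all keys
- Links
  - The total number of links in a trie is between RN and RNw, where w is the average key length
  - When all keys are short, the number of links is close to RN
  - When all keys are long, the number of links is close to RNw
  - Reducing R saves a great deal of space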
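
To make the effect of R concrete, here is a sketch of a trie whose alphabet is configurable. `AlphabetTrie`, its `radix`/`base` parameters and `linkCount` are assumptions made for this example, not part of the book's API; the index of a character is simply `c - base`.

```java
// Illustrative sketch only: every node allocates `radix` links, so the total
// number of links is radix * (number of nodes).
class AlphabetTrie {
    private static class Node {
        boolean isKey;
        final Node[] next;
        Node(int radix) { next = new Node[radix]; }
    }

    private final int radix;      // alphabet size R
    private final char base;      // first character of the alphabet
    private final Node root;
    private int nodes = 1;        // the root

    AlphabetTrie(int radix, char base) {
        this.radix = radix;
        this.base = base;
        this.root = new Node(radix);
    }

    void add(String key) {
        Node x = root;
        for (int d = 0; d < key.length(); d++) {
            int i = key.charAt(d) - base;
            if (i < 0 || i >= radix)
                throw new IllegalArgumentException("character outside alphabet: " + key.charAt(d));
            if (x.next[i] == null) { x.next[i] = new Node(radix); nodes++; }
            x = x.next[i];
        }
        x.isKey = true;
    }

    boolean contains(String key) {
        Node x = root;
        for (int d = 0; d < key.length(); d++) {
            int i = key.charAt(d) - base;
            if (i < 0 || i >= radix) return false;
            x = x.next[i];
            if (x == null) return false;
        }
        return x.isKey;
    }

    int linkCount() { return nodes * radix; }
}
```

For the two keys "she" and "shells" the trie has 7 nodes. With R = 256 that is 7 × 256 = 1792 links, which lies between RN = 512 and RNw = 256 × 2 × 4.5 = 2304. With a lowercase-only alphabet (R = 26, base 'a') the same keys need 7 × 26 = 182 links.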

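Ternary search tries (TSTs)

Goal: avoid the heavy space cost of R-way tries
Each node holds one character, three links (less than, equal to, greater than), and possibly a value
Characters are stored explicitly in the nodes

TST code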

public class TST<Value> {
    private Node root; // root of trie

    private class Node {
        char c; // character
        Node left, mid, right; // left, middle, and right subtries
        Value val; // value associated with string
    }

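    public Value get(String key) {
        if (key == null || key.length() == 0) // the recursive get() calls key.charAt(0), so reject empty keys up front
            throw new IllegalArgumentException("key must have length >= 1");
        Node x = get(root, key, 0);
        if (x == null)
            return null;
        return x.val;
    }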

    private Node get(Node x, String key, int d) {
        if (x == null)
            return null;
        char c = key.charAt(d);
        if (c < x.c)
            return get(x.left, key, d);
        else if (c > x.c)
            return get(x.right, key, d);
        else if (d < key.length() - 1)
            return get(x.mid, key, d + 1);
        else
            return x;
    }

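    public void put(String key, Value val) {
        if (key == null || key.length() == 0) // the recursive put() calls key.charAt(0), so reject empty keys up front
            throw new IllegalArgumentException("key must have length >= 1");
        root = put(root, key, val, 0);
    }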

    private Node put(Node x, String key, Value val, int d) {
        char c = key.charAt(d);
        if (x == null) {
            x = new Node();
            x.c = c;
        }
        if (c < x.c)
            x.left = put(x.left, key, val, d);
        else if (c > x.c)
            x.right = put(x.right, key, val, d);
        else if (d < key.length() - 1)
            x.mid = put(x.mid, key, val, d + 1);
        else
            x.val = val;
        return x;
    }
}
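
A usage sketch for the TST, with a node counter added to show the space behaviour. `CountingTST`, `nodeCount` and `linkCount` are hypothetical additions for this example; get() is written as a loop, while put() follows the recursive listing above.

```java
// Illustrative sketch only: a TST that counts the nodes it allocates.
class CountingTST<V> {
    private class Node {
        char c;
        Node left, mid, right;
        V val;
    }

    private Node root;
    private int nodes;

    int nodeCount() { return nodes; }
    int linkCount() { return 3 * nodes; }     // three links per node, whatever the alphabet

    V get(String key) {
        requireNonEmpty(key);
        Node x = root;
        int d = 0;
        while (x != null) {
            char c = key.charAt(d);
            if (c < x.c) x = x.left;
            else if (c > x.c) x = x.right;
            else if (d < key.length() - 1) { x = x.mid; d++; }   // matched: advance in the key
            else return x.val;                                   // last character matched
        }
        return null;
    }

    void put(String key, V val) {
        requireNonEmpty(key);
        root = put(root, key, val, 0);
    }

    private Node put(Node x, String key, V val, int d) {
        char c = key.charAt(d);
        if (x == null) { x = new Node(); x.c = c; nodes++; }
        if (c < x.c) x.left = put(x.left, key, val, d);
        else if (c > x.c) x.right = put(x.right, key, val, d);
        else if (d < key.length() - 1) x.mid = put(x.mid, key, val, d + 1);
        else x.val = val;
        return x;
    }

    private static void requireNonEmpty(String key) {
        if (key == null || key.isEmpty())
            throw new IllegalArgumentException("key must have length >= 1");
    }
}
```

Inserting "she", "shells" and "sea" in that order allocates 8 nodes: s, h, e for the first key, l, l, s for the second, and e (hanging off the left link of h) and a for the third. That is 24 links in total, compared with 9 nodes × 256 links for the same keys in the R-way trie with R = 256.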
