Java Container Source Code Analysis: LinkedHashMap

Date: 2019-12-07


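Like HashMap, LinkedHashMap is an implementation of the Map interface, based on a hash table combined with a linked list. In fact, LinkedHashMap is a subclass of HashMap that extends it with a doubly linked list. Whereas HashMap's iterators visit entries in no particular order, LinkedHashMap offers a predictable iteration order: entries are iterated either in insertion order or in access order.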

public class LinkedHashMap<K,V>
 extends HashMap<K,V>
 implements Map<K,V>

As the class declaration shows, LinkedHashMap does extend HashMap, so basic operations such as hash computation, resizing, and lookup behave the same as in the parent class HashMap.

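Unlike HashMap, however, LinkedHashMap supports iterating its elements in insertion order or in access order. Insertion order is the order in which entries were added to the map; updating the value associated with an existing key does not affect it. Access order arranges all entries from least recently accessed to most recently accessed; both reads and writes affect it, but operations on the collection views (entrySet(), keySet(), values()) do not. Access order makes it easy to build an LRU (least-recently-used) cache on top of LinkedHashMap; a simple example is given later.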
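The difference between the two orderings can be seen in a small sketch. The class name `OrderingDemo` and the helper `keysAfterOps` are made up for illustration; only the three-argument `LinkedHashMap` constructor comes from the JDK.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class OrderingDemo {
    // Hypothetical helper: runs the same operations in either mode
    // and returns the keys in iteration order.
    static List<String> keysAfterOps(boolean accessOrder) {
        Map<String, Integer> map = new LinkedHashMap<>(16, 0.75f, accessOrder);
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        map.put("a", 10); // update the value of an existing key
        map.get("b");     // read an existing key
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        System.out.println(keysAfterOps(false)); // insertion order: [a, b, c]
        System.out.println(keysAfterOps(true));  // access order:    [c, a, b]
    }
}
```

In insertion order neither the update nor the read moves anything. In access order each touched entry moves to the end, so the least recently accessed key ("c") ends up first.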

LinkedHashMap can iterate in insertion order or access order because it adds a doubly linked list on top of HashMap. The rest of this article walks briefly through the JDK 8 source code.

Underlying structure

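Because LinkedHashMap is a subclass of HashMap, the members of HashMap, such as the underlying table array, are also present in LinkedHashMap and are not covered again here. We focus instead on how the node type changes in LinkedHashMap.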

/**
 * HashMap.Node subclass for normal LinkedHashMap entries.
 */
// LinkedHashMap.Entry extends HashMap.Node,
// and HashMap.TreeNode in turn extends LinkedHashMap.Entry
static class Entry<K,V> extends HashMap.Node<K,V> {
    // Adds before and after on top of the parent class,
    // which already has next.
    // before/after link the doubly linked list; all elements of the hash table form one such list.
    // next links the singly linked list inside a bucket.
    Entry<K,V> before, after;
    // Constructor
    Entry(int hash, K key, V value, Node<K,V> next) {
        // Parent class constructor
        super(hash, key, value, next);
    }
}

private static final long serialVersionUID = 3801124242820219131L;

/**
 * The head (eldest) of the doubly linked list.
 */
// head is the head of the doubly linked list
transient LinkedHashMap.Entry<K,V> head;

/**
 * The tail (youngest) of the doubly linked list.
 */
// tail is the tail of the doubly linked list
transient LinkedHashMap.Entry<K,V> tail;

/**
 * The iteration ordering method for this linked hash map: <tt>true</tt>
 * for access-order, <tt>false</tt> for insertion-order.
 *
 * @serial
 */
// Iteration order: true for access order, false for insertion order.
// Access order runs from least-recently accessed to most-recently accessed,
// which is well suited to building LRU caches.
final boolean accessOrder;

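To implement the doubly linked list, LinkedHashMap's node type adds before/after references on top of the parent class, and the map stores the head and tail of the list in head and tail. It also adds a flag that records whether the iteration order is insertion order or access order.

Because HashMap's nodes carry a next reference, the elements of each bucket can be viewed as a singly linked list. LinkedHashMap keeps this per-bucket list as well, but it is managed by the parent class; LinkedHashMap itself only manages the doubly linked list. Every element of a LinkedHashMap is chained into that single doubly linked list.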

Doubly linked list

To simplify operations on the doubly linked list, LinkedHashMap provides the linkNodeLast and transferLinks methods:

// link at the end of list
// Links the new node p to the end of the doubly linked list
private void linkNodeLast(LinkedHashMap.Entry<K,V> p) {
    LinkedHashMap.Entry<K,V> last = tail;
    tail = p;
    if (last == null)
        // The list was empty, so p is also the head
        head = p;
    else {
        // Update the before and after references
        p.before = last;
        last.after = p;
    }
}

// apply src's links to dst
// Applies src's links to dst, i.e. dst takes over src's position in the doubly linked list
private void transferLinks(LinkedHashMap.Entry<K,V> src,
                           LinkedHashMap.Entry<K,V> dst) {
    // Point dst's predecessor and successor at those of src
    LinkedHashMap.Entry<K,V> b = dst.before = src.before;
    LinkedHashMap.Entry<K,V> a = dst.after = src.after;
    // Redirect the links that used to point at src so they point at dst
    if (b == null)
        head = dst;
    else
        b.after = dst;
    if (a == null)
        tail = dst;
    else
        a.before = dst;
}
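To see the two helpers in isolation, here is a minimal sketch that copies their logic onto a simplified list. `LinkSketch` and its `Node` class are hypothetical stand-ins for illustration only; they carry no hashing and are not the JDK types.

```java
import java.util.ArrayList;
import java.util.List;

public class LinkSketch {
    // Hypothetical, simplified stand-in for LinkedHashMap.Entry (no hash, key, or value).
    static final class Node {
        final String name;
        Node before, after;
        Node(String name) { this.name = name; }
    }

    Node head, tail;

    // Same logic as linkNodeLast: append p at the tail.
    void linkNodeLast(Node p) {
        Node last = tail;
        tail = p;
        if (last == null) {
            head = p;
        } else {
            p.before = last;
            last.after = p;
        }
    }

    // Same logic as transferLinks: dst takes over src's position.
    void transferLinks(Node src, Node dst) {
        Node b = dst.before = src.before;
        Node a = dst.after = src.after;
        if (b == null) head = dst; else b.after = dst;
        if (a == null) tail = dst; else a.before = dst;
    }

    // Walks the list from head to tail and collects the node names.
    List<String> names() {
        List<String> out = new ArrayList<>();
        for (Node n = head; n != null; n = n.after) out.add(n.name);
        return out;
    }

    static List<String> demo() {
        LinkSketch list = new LinkSketch();
        Node a = new Node("a"), b = new Node("b"), c = new Node("c");
        list.linkNodeLast(a);
        list.linkNodeLast(b);
        list.linkNodeLast(c);
        list.transferLinks(b, new Node("B")); // "B" replaces "b" in place
        return list.names();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [a, B, c]
    }
}
```

Note that transferLinks does not touch src's own references; the replaced node is simply no longer reachable from the list.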

LinkedHashMap overrides the parent's node-creation methods so that, after a node is created, linkNodeLast links it to the end of the doubly linked list:

// Overrides the parent method
Node<K,V> newNode(int hash, K key, V value, Node<K,V> e) {
    // Create the node
    LinkedHashMap.Entry<K,V> p =
        new LinkedHashMap.Entry<K,V>(hash, key, value, e);
    // Link the node to the end of the doubly linked list
    linkNodeLast(p);
    return p;
}

// Overrides the parent method
// Creates a TreeNode
TreeNode<K,V> newTreeNode(int hash, K key, V value, Node<K,V> next) {
    TreeNode<K,V> p = new TreeNode<K,V>(hash, key, value, next);
    // Link the new node to the end of the doubly linked list
    linkNodeLast(p);
    return p;
}

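As we know, the elements in a single HashMap bucket may be converted between a singly linked list and a red-black tree. The same holds for LinkedHashMap, except that each conversion must also call transferLinks to update the links in the doubly linked list: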

// Overrides the parent method
// For conversion from TreeNodes to plain nodes
// Converts a TreeNode into a plain node
Node<K,V> replacementNode(Node<K,V> p, Node<K,V> next) {
    LinkedHashMap.Entry<K,V> q = (LinkedHashMap.Entry<K,V>)p; // the TreeNode
    // Create a new plain node from the TreeNode's data
    LinkedHashMap.Entry<K,V> t =
        new LinkedHashMap.Entry<K,V>(q.hash, q.key, q.value, next);
    // Replace the TreeNode with the new plain node in the doubly linked list
    transferLinks(q, t);
    return t;
}

// Overrides the parent method
// For treeifyBin
// Converts a plain node into a TreeNode
TreeNode<K,V> replacementTreeNode(Node<K,V> p, Node<K,V> next) {
    LinkedHashMap.Entry<K,V> q = (LinkedHashMap.Entry<K,V>)p;
    TreeNode<K,V> t = new TreeNode<K,V>(q.hash, q.key, q.value, next);
    transferLinks(q, t);
    return t;
}
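One observable consequence is that iteration order survives heavy hash collisions. The sketch below uses a hypothetical `Key` class whose instances all share one hash code, so every entry lands in the same bucket. With enough entries a JDK 8 style HashMap would be expected to convert that bucket to a tree, although the sketch cannot observe the conversion itself; it only checks that the iteration order is still the insertion order.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CollidingKeysDemo {
    // Hypothetical key type: every instance has the same hash code,
    // so all keys collide into one bucket.
    static final class Key implements Comparable<Key> {
        final int id;
        Key(int id) { this.id = id; }
        @Override public int hashCode() { return 42; }
        @Override public boolean equals(Object o) {
            return o instanceof Key && ((Key) o).id == id;
        }
        @Override public int compareTo(Key other) {
            return Integer.compare(id, other.id);
        }
    }

    // Inserts ids n, n-1, ..., 1 and reports whether iteration
    // returns them in exactly that (insertion) order.
    static boolean orderPreserved(int n) {
        Map<Key, String> map = new LinkedHashMap<>();
        List<Integer> expected = new ArrayList<>();
        for (int id = n; id >= 1; id--) {
            map.put(new Key(id), "v" + id);
            expected.add(id);
        }
        List<Integer> actual = new ArrayList<>();
        for (Key k : map.keySet()) actual.add(k.id);
        return actual.equals(expected);
    }

    public static void main(String[] args) {
        System.out.println(orderPreserved(32)); // true
    }
}
```

The keys are inserted in descending order on purpose: if the order came from the tree or the bucket rather than the doubly linked list, the result would not match.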

How are insertion order and access order maintained?

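In LinkedHashMap, every entry is chained into one doubly linked list. As the code in the previous section shows, each newly created node is linked to the end of that list. Traversing the list from head to tail therefore yields insertion order: the head is the earliest inserted node and the tail is the most recently inserted one. How, then, is access order implemented?

When we analyzed the HashMap source code earlier, we saw calls to afterNodeAccess, afterNodeInsertion, and afterNodeRemoval in the insert/update, lookup, and remove paths, although in HashMap these methods do nothing. They exist precisely so that LinkedHashMap can override them. Here are their declarations in HashMap: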

// Callbacks to allow LinkedHashMap post-actions
void afterNodeAccess(Node<K,V> p) { }
void afterNodeInsertion(boolean evict) { }
void afterNodeRemoval(Node<K,V> p) { }

LinkedHashMap overrides these methods:

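// Callback after a node is removed
void afterNodeRemoval(Node<K,V> e) { // unlink
    // When a node is removed, the links in the doubly linked list must be adjusted too
    LinkedHashMap.Entry<K,V> p =
        (LinkedHashMap.Entry<K,V>)e, b = p.before, a = p.after;
    p.before = p.after = null;
    if (b == null)
        head = a;
    else
        b.after = a;
    if (a == null)
        tail = b;
    else
        a.before = b;
}

// Callback after a node is inserted
// When called during construction, evict is false
void afterNodeInsertion(boolean evict) { // possibly remove eldest
    LinkedHashMap.Entry<K,V> first;
    // first is the head element, which is also the eldest element:
    // in insertion order, the earliest inserted element;
    // in access order, the least recently accessed element.
    // By default removeEldestEntry(first) always returns false, i.e. the eldest element is not removed.
    // For a fixed-capacity cache, override removeEldestEntry(first).
    if (evict && (first = head) != null && removeEldestEntry(first)) {
        // Reached only when: not called during construction,
        // the head element is not null,
        // and the eldest element should be removed.
        // With LinkedHashMap's default removeEldestEntry, this branch is never entered.
        K key = first.key;
        removeNode(hash(key), key, null, false, true);
    }
}

// Callback after a node is accessed
void afterNodeAccess(Node<K,V> e) { // move node to last
    LinkedHashMap.Entry<K,V> last;
    // In access order, if the node is not already the tail,
    // move it to the tail of the doubly linked list
    if (accessOrder && (last = tail) != e) {
        // p: current node, b: predecessor, a: successor
        LinkedHashMap.Entry<K,V> p =
            (LinkedHashMap.Entry<K,V>)e, b = p.before, a = p.after;
        p.after = null; // p becomes the tail, so it has no successor
        if (b == null)
            head = a; // p was the head; its successor becomes the new head
        else
            b.after = a; // p was not the head; link predecessor to successor
        if (a != null)
            a.before = b;
        else
            last = b; // p was the tail; should not happen given the tail != e check above
        if (last == null)
            head = p;
        else {
            // Place p after the old tail
            p.before = last;
            last.after = p;
        }
        tail = p; // update tail
        ++modCount; // counts as a structural modification
    }
}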

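After a node is inserted, removed, or accessed, the corresponding callback is invoked. In afterNodeAccess, if the LinkedHashMap is in access order and the accessed node is not the tail, the node is moved to the tail of the doubly linked list. In access order, therefore, the tail is the most recently accessed node and the head is the least recently accessed one.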
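The `++modCount` at the end of afterNodeAccess has a practical side effect worth illustrating: in access order, a plain get counts as a structural modification, so calling it while iterating can make the fail-fast iterator throw. A minimal sketch, where the class and method names are made up for illustration:

```java
import java.util.ConcurrentModificationException;
import java.util.LinkedHashMap;
import java.util.Map;

public class AccessOrderIterationDemo {
    // Hypothetical helper: returns true if calling get() while iterating
    // over keySet() triggers a ConcurrentModificationException.
    static boolean getDuringIterationFails(boolean accessOrder) {
        Map<String, Integer> map = new LinkedHashMap<>(16, 0.75f, accessOrder);
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        try {
            for (String key : map.keySet()) {
                // In access order this relinks the entry and bumps modCount.
                map.get(key);
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(getDuringIterationFails(false)); // false
        System.out.println(getDuringIterationFails(true));  // true
    }
}
```

When values are needed during iteration over an access-ordered map, iterating over entrySet() and reading entry.getValue() avoids the problem, since the collection views do not affect the order.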

In afterNodeInsertion, if removeEldestEntry(first) returns true, the head node is removed. To implement a fixed-capacity Map, subclass LinkedHashMap and override removeEldestEntry. In LinkedHashMap itself, this method always returns false.

// Decides whether the eldest entry should be removed
// The default implementation always returns false
protected boolean removeEldestEntry(Map.Entry<K,V> eldest) {
    return false;
}
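As a small illustration of this hook in insertion-order mode, the sketch below builds a map that keeps only the most recently inserted entries. `BoundedMapDemo`, `boundedMap`, and `maxEntries` are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class BoundedMapDemo {
    // Hypothetical helper: an insertion-ordered map that holds at most
    // maxEntries entries and drops the eldest one when it overflows.
    static <K, V> Map<K, V> boundedMap(final int maxEntries) {
        return new LinkedHashMap<K, V>() {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                // Called from afterNodeInsertion after each new mapping is added.
                return size() > maxEntries;
            }
        };
    }

    static List<Integer> demo() {
        Map<Integer, String> map = boundedMap(3);
        for (int i = 1; i <= 5; i++) {
            map.put(i, "v" + i);
        }
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [3, 4, 5]
    }
}
```

Because the check runs after the new entry has been linked, the map briefly holds maxEntries + 1 entries before the head is removed.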

In HashMap, putVal and removeNode invoke the corresponding callbacks, but get does not, so LinkedHashMap overrides get:

public V get(Object key) {
    Node<K,V> e;
    if ((e = getNode(hash(key), key)) == null)
        return null;
    // Access order
    if (accessOrder)
        // After the access, invoke the callback to adjust the doubly linked list
        afterNodeAccess(e);
    return e.value;
}

Traversal and iterators

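Because all nodes of a LinkedHashMap are in one doubly linked list, every entry can be visited by walking that list. In HashMap, by contrast, visiting every entry requires walking the singly linked list of each bucket in turn. In terms of time complexity, iterating a LinkedHashMap is O(size()), whereas iterating a HashMap is O(capacity + size()).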

// All nodes are chained in the doubly linked list, so the iterator can simply follow its links.
// The list is ordered by insertion order or access order.
// Compared with HashMap, iteration is more efficient: O(size()).
// HashMap iteration is O(capacity + size()).
abstract class LinkedHashIterator {
    LinkedHashMap.Entry<K,V> next;
    LinkedHashMap.Entry<K,V> current;
    int expectedModCount;

    LinkedHashIterator() {
        next = head;
        expectedModCount = modCount;
        current = null;
    }

    public final boolean hasNext() {
        return next != null;
    }

    final LinkedHashMap.Entry<K,V> nextNode() {
        LinkedHashMap.Entry<K,V> e = next;
        if (modCount != expectedModCount)
            throw new ConcurrentModificationException();
        if (e == null)
            throw new NoSuchElementException();
        current = e;
        next = e.after;
        return e;
    }

    public final void remove() {
        Node<K,V> p = current;
        if (p == null)
            throw new IllegalStateException();
        if (modCount != expectedModCount)
            throw new ConcurrentModificationException();
        current = null;
        K key = p.key;
        removeNode(hash(key), key, null, false, false);
        expectedModCount = modCount;
    }
}

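As shown, traversal follows each node's after reference. Iteration thus runs from the head of the doubly linked list to its tail, without visiting empty slots the way HashMap does.

containsValue and internalWriteEntries also traverse the doubly linked list.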

public boolean containsValue(Object value) {
    // Traverse via the doubly linked list
    for (LinkedHashMap.Entry<K,V> e = head; e != null; e = e.after) {
        V v = e.value;
        if (v == value || (value != null && value.equals(v)))
            return true;
    }
    return false;
}

// Overrides the parent method
// Serialization. Called only from writeObject, to ensure compatible ordering.
void internalWriteEntries(java.io.ObjectOutputStream s) throws IOException {
    // Changes how elements are traversed: walk the doubly linked list
    for (LinkedHashMap.Entry<K,V> e = head; e != null; e = e.after) {
        s.writeObject(e.key);
        s.writeObject(e.value);
    }
}
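Since entries are written in linked-list order, a serialization round trip should preserve the iteration order. The following sketch checks this with in-memory streams; `SerializationOrderDemo` and `roundTripKeys` are names invented for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SerializationOrderDemo {
    // Hypothetical helper: serializes a LinkedHashMap to a byte array,
    // reads it back, and returns the keys of the copy in iteration order.
    static List<String> roundTripKeys() {
        LinkedHashMap<String, Integer> original = new LinkedHashMap<>();
        original.put("c", 3);
        original.put("a", 1);
        original.put("b", 2);
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                @SuppressWarnings("unchecked")
                Map<String, Integer> copy = (Map<String, Integer>) in.readObject();
                return new ArrayList<>(copy.keySet());
            }
        } catch (IOException | ClassNotFoundException e) {
            // Not expected for in-memory streams; rethrown unchecked to keep the sketch short.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTripKeys()); // [c, a, b]
    }
}
```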

Implementing an LRU cache with LinkedHashMap

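LinkedHashMap's access order makes it convenient to implement an LRU cache. In access-order mode, the tail node is the most recently accessed node and the head node is the least recently accessed one. When deciding which cache entry to evict, it is therefore enough to remove the head node.

The linked list is unbounded, however, while a cache is usually resource-constrained, so how do we decide when to evict the least recently accessed entry? As analyzed above, afterNodeInsertion calls removeEldestEntry to decide whether to remove the eldest node. We can therefore subclass LinkedHashMap and override removeEldestEntry so that it returns true when the number of entries exceeds the cache capacity.

The following code implements an LRU cache based on LinkedHashMap: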

import java.util.LinkedHashMap;
import java.util.Map;

public class CacheImpl<K,V> {
    private final Map<K, V> cache;
    private final int capacity;

    public enum POLICY {
        LRU, FIFO
    }

    public CacheImpl(int cap, POLICY policy) {
        this.capacity = cap;
        // accessOrder = true gives LRU eviction; false gives FIFO (insertion order)
        cache = new LinkedHashMap<K, V>(cap, 0.75f, policy == POLICY.LRU) {
            // Remove the eldest entry once the capacity is exceeded
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > capacity;
            }
        };
    }

    public V get(K key) {
        // Returns null when the key is absent
        return cache.get(key);
    }

    public void set(K key, V val) {
        cache.put(key, val);
    }

    public void printKV() {
        System.out.println("key value in cache");
        for (Map.Entry<K,V> entry : cache.entrySet()) {
            System.out.println(entry.getKey() + ":" + entry.getValue());
        }
    }

    public static void main(String[] args) {
        CacheImpl<Integer, String> cache = new CacheImpl<>(5, POLICY.LRU);

        cache.set(1, "first");
        cache.set(2, "second");
        cache.set(3, "third");
        cache.set(4, "fourth");
        cache.set(5, "fifth");
        cache.printKV();

        cache.get(1);
        cache.get(2);
        cache.printKV();

        cache.set(6, "sixth");
        cache.printKV();
    }
}
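To make the expected behavior of that scenario concrete, the sketch below replays the same sequence of operations on a bare access-ordered LinkedHashMap with capacity 5 and records the key order after each phase. `LruTraceDemo` and `trace` are hypothetical names, and the values are shortened to "v1" through "v6".

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class LruTraceDemo {
    // Hypothetical helper: returns the key order after each of the three phases.
    static List<List<Integer>> trace() {
        final int capacity = 5;
        Map<Integer, String> cache =
            new LinkedHashMap<Integer, String>(capacity, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<Integer, String> eldest) {
                    return size() > capacity;
                }
            };
        List<List<Integer>> snapshots = new ArrayList<>();

        for (int i = 1; i <= 5; i++) {
            cache.put(i, "v" + i);
        }
        snapshots.add(new ArrayList<>(cache.keySet())); // [1, 2, 3, 4, 5]

        cache.get(1); // 1 moves to the tail
        cache.get(2); // 2 moves to the tail
        snapshots.add(new ArrayList<>(cache.keySet())); // [3, 4, 5, 1, 2]

        cache.put(6, "v6"); // exceeds the capacity, so the head (3) is evicted
        snapshots.add(new ArrayList<>(cache.keySet())); // [4, 5, 1, 2, 6]

        return snapshots;
    }

    public static void main(String[] args) {
        for (List<Integer> keys : trace()) {
            System.out.println(keys);
        }
    }
}
```

Key 3 is evicted rather than key 1 because the two reads moved keys 1 and 2 to the tail, leaving 3 as the least recently accessed entry.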

Summary

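This article gave a brief analysis of the source code and implementation of LinkedHashMap in JDK 8. LinkedHashMap extends HashMap and adds a doubly linked list on top of its basic structure. Each entry therefore carries two extra references (before and after), and LinkedHashMap uses more memory than HashMap. LinkedHashMap still uses HashMap's hash table built from a bucket array, per-bucket singly linked lists, and red-black trees, and it behaves the same as HashMap in hash computation, bucket lookup, and resizing. LinkedHashMap likewise permits null keys and null values.

Because the doubly linked list chains all entries together, an important feature of LinkedHashMap is that it can iterate over all entries in insertion order or in access order, which is quite different from HashMap's unordered iteration. Where ordering matters, LinkedHashMap should be used in place of HashMap.

Thanks to the doubly linked list, traversal can run directly over the list, so its time complexity is independent of the capacity and depends only on the current number of entries. In this respect it is somewhat more efficient than HashMap.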