Java Concurrency Analysis: ConcurrentHashMap

Date: 2019-11-11


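In an earlier post (http://www.javashuo.com/article/p-ycazzgxt-b.html) I gave a brief tour of the commonly used List and Map collections but left out ConcurrentHashMap. It is complicated enough to deserve an article of its own, so it gets a full treatment here.

HashMap, Hashtable and ConcurrentHashMap are closely related in Java, so I will start with HashMap and Hashtable and then work up to ConcurrentHashMap. That way the reasons behind its design should be easier to follow.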

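1.HashMap (JDK 1.7)

1.1 HashMap data structure

In JDK 1.7, HashMap is implemented as an array of buckets in which each bucket holds a singly linked list of entries.

HashMap has a default initial capacity of 16 and a default load factor of 0.75. A resize takes every entry out of the old table and re-inserts it into a larger one, and each entry is inserted at the head of its new bucket's list (head insertion). This is harmless in a single thread, but under concurrent access it is not thread-safe.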

1.2 HashMap resize analysis

The code that moves the entries during a resize is as follows:

void transfer(Entry[] newTable) {
    Entry[] src = table;
    int newCapacity = newTable.length;
    for (int j = 0; j < src.length; j++) {
        Entry<K,V> e = src[j];
        if (e != null) {
            src[j] = null;
            do {
                Entry<K,V> next = e.next;
                int i = indexFor(e.hash, newCapacity);
                e.next = newTable[i];
                newTable[i] = e;
                e = next;
            } while (e != null);
        }
    }
}

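The four statements inside the do-while loop that relink each entry are the important part. They are extracted and numbered below so that the following text can refer to them. Let's see what happens when they run.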

Entry<K,V> next = e.next;  ①
e.next = newTable[i];      ②
newTable[i] = e;           ③
e = next;                  ④

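Here is a concrete example.

Resizing in a single thread:

Take a map whose table has length 2. The bucket at index 1 holds a list of three entries with keys 3, 5 and 9, all of which landed there because key mod 2 = 1. Initially e points to 3 and next points to e's successor, 5. Resizing doubles the table, so indexes are now computed as key mod 4: 3 goes to index 3, while 5 and 9 go to index 1. A single thread proceeds as follows:

(1) After ①②③: 3 has been moved to newTable[3] and its next is null.

(2) After ④①: e points to 5 and next points to 9.

(3) The loop continues in the same way. Because of head insertion, 9 is placed in front of 5, so newTable[1] ends up as 9 → 5 with 5's next set to null. The order of 5 and 9 has been reversed, and the resize is complete.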
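The single-threaded steps above can be replayed in a few lines. This is only a sketch: Entry here is a hypothetical stand-in for the JDK class, and key % capacity stands in for indexFor(hash, capacity).

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the head-insertion transfer loop on the 3 -> 5 -> 9 example.
class HeadInsertDemo {
    // Hypothetical minimal entry: just a key and a next pointer.
    static final class Entry {
        final int key;
        Entry next;
        Entry(int key, Entry next) { this.key = key; this.next = next; }
    }

    // Moves every entry of src into a new table of the given capacity.
    static Entry[] transfer(Entry[] src, int newCapacity) {
        Entry[] newTable = new Entry[newCapacity];
        for (int j = 0; j < src.length; j++) {
            Entry e = src[j];
            src[j] = null;
            while (e != null) {
                Entry next = e.next;          // statement 1
                int i = e.key % newCapacity;  // assumption: index = key mod capacity
                e.next = newTable[i];         // statement 2
                newTable[i] = e;              // statement 3
                e = next;                     // statement 4
            }
        }
        return newTable;
    }

    // Collects the keys of one bucket in list order.
    static List<Integer> keysInBucket(Entry[] table, int index) {
        List<Integer> keys = new ArrayList<>();
        for (Entry e = table[index]; e != null; e = e.next) keys.add(e.key);
        return keys;
    }

    // The table from the example: length 2, bucket 1 holds 3 -> 5 -> 9.
    static Entry[] sampleTable() {
        Entry[] table = new Entry[2];
        table[1] = new Entry(3, new Entry(5, new Entry(9, null)));
        return table;
    }

    public static void main(String[] args) {
        Entry[] resized = transfer(sampleTable(), 4);
        System.out.println("bucket 1: " + keysInBucket(resized, 1)); // [9, 5]
        System.out.println("bucket 3: " + keysInBucket(resized, 3)); // [3]
    }
}
```

Bucket 1 comes out as 9 followed by 5, which is the reversal described in step (3).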

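Resizing with multiple threads:

Start from the same map, and let threads A and B resize it at the same time. Each thread allocates its own newTable, but both relink the same entry objects.

(1) Both threads enter the loop for bucket 1 and execute ①, so each holds e = 3 and next = 5.

(2) Thread A runs ②③④ for entry 3, placing it in its own newTable[3], and then runs ① again. A now holds e = 5 and next = 9, and at this point it is suspended.

(3) Thread B runs to completion. It still holds next = 5 from step (1), so it walks the whole chain and produces the same layout as the single-threaded case: 3 in newTable[3], and 9 → 5 in newTable[1]. In particular, 9's next now points to 5 and 5's next is null.

(4) Thread A wakes up. Its local variables e and next still have the values from before it was suspended, but the entries they refer to have been relinked by B.

② e.next = newTable[i]; e is 5 and i is 1. A's newTable[1] is still empty, so 5's next becomes null.
③ newTable[i] = e; A's newTable[1] now points to 5.
④ e = next; e points to 9.
① next = e.next; B set 9's next to 5, so next is 5 instead of null.

(5) Thread A keeps looping.

②③ for entry 9: 9's next becomes 5 and newTable[1] points to 9.
④①: e points to 5, and next is 5's next, which is null.
②③ for entry 5: 5's next becomes newTable[1], which is 9, and newTable[1] points to 5.

Now 5's next is 9 and 9's next is 5. Thread A's loop ends because next is null, but the list contains a cycle, so any later get() or put() that walks this bucket without finding its key spins forever.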
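A race like this is hard to reproduce with real threads, so the sketch below replays one unlucky interleaving deterministically in a single thread. Mover is a hypothetical helper that holds what one thread would hold (its private newTable plus the locals e and next), and key % capacity again stands in for indexFor. In this schedule both threads read the first entry before either relinks it, and thread A is paused while holding e = 5 and next = 9.

```java
// Deterministic replay of two interleaved resizes over the list 3 -> 5 -> 9.
class ResizeRaceDemo {
    // Hypothetical minimal entry shared by both simulated threads.
    static final class Entry {
        final int key;
        Entry next;
        Entry(int key, Entry next) { this.key = key; this.next = next; }
    }

    // State of one simulated thread inside the transfer loop.
    static final class Mover {
        final Entry[] newTable = new Entry[4];
        Entry e, next;
        Mover(Entry head) { e = head; }

        void readNext() { next = e.next; }          // statement 1

        void relink() {                             // statements 2, 3, 4
            int i = e.key % newTable.length;
            e.next = newTable[i];
            newTable[i] = e;
            e = next;
        }

        // Finishes the loop, starting just after statement 1.
        void runToEnd() {
            relink();
            while (e != null) { readNext(); relink(); }
        }
    }

    // Floyd's cycle detection.
    static boolean hasCycle(Entry head) {
        Entry slow = head, fast = head;
        while (fast != null && fast.next != null) {
            slow = slow.next;
            fast = fast.next.next;
            if (slow == fast) return true;
        }
        return false;
    }

    // Returns true if bucket 1 of thread A's table ends up cyclic.
    static boolean simulate() {
        Entry head = new Entry(3, new Entry(5, new Entry(9, null)));
        Mover a = new Mover(head);
        Mover b = new Mover(head);
        a.readNext();   // A: e = 3, next = 5
        b.readNext();   // B: e = 3, next = 5
        a.relink();     // A moves 3, now e = 5
        a.readNext();   // A: e = 5, next = 9, then A is "suspended"
        b.runToEnd();   // B finishes: its bucket 1 is 9 -> 5
        a.runToEnd();   // A resumes with stale e and next
        return hasCycle(a.newTable[1]);
    }

    public static void main(String[] args) {
        System.out.println("cycle in bucket 1: " + simulate()); // true
    }
}
```

With real threads the outcome depends on timing, and some interleavings lose entries instead of forming a cycle.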

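1.3 When HashMap resizes

Now that we know how a resize works, when does HashMap actually resize?

As mentioned above, the default initial capacity is 16 and the default load factor is 0.75. Together they give a threshold of 16 * 0.75 = 12: once put() has stored more than 12 entries, the table is resized, and the new capacity is 16 * 2 = 32. A simple example: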

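import java.util.HashMap;
import java.util.Map;

public class ResizeDemo {
    public static void main(String[] args) {
        // 12 puts: exactly at the default threshold, so no resize is expected
        Map<Integer, Integer> map = new HashMap<>();
        for (int i = 0; i < 12; i++) {
            map.put(i, i + 1);
        }
    }
}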

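This is a very simple series of put() calls. Set a breakpoint in the resize code and run the program under a debugger: it finishes without ever hitting the breakpoint (the run is not shown here). Now change the loop condition to i < 13 and debug again: the breakpoint is hit when the 13th key-value pair is put.

So a resize is triggered once the number of key-value pairs exceeds 12.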
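The threshold arithmetic is easy to tabulate. The helper below is a simplified sketch, assuming the threshold is just capacity times load factor truncated to an int and ignoring the clamp the JDK applies near the maximum capacity.

```java
// Resize thresholds for successive table capacities at load factor 0.75.
class ResizeThresholdDemo {
    // Simplified: capacity * loadFactor, truncated to int.
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        int capacity = 16;
        for (int step = 0; step < 4; step++) {
            System.out.println("capacity " + capacity
                    + " -> threshold " + threshold(capacity, 0.75f));
            capacity *= 2; // each resize doubles the table
        }
        // capacity 16 -> threshold 12
        // capacity 32 -> threshold 24
        // capacity 64 -> threshold 48
        // capacity 128 -> threshold 96
    }
}
```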

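2.HashMap (JDK 1.8)

HashMap was optimized in JDK 1.8. In 1.7, when many keys collide, one bucket's list can grow very long while the others stay short, and lookups in that bucket degrade to a linear scan. In 1.8, once a bucket's list grows beyond 8 nodes (and the table has at least 64 slots; a smaller table is resized instead), the list is converted to a red-black tree. A red-black tree is a self-balancing binary search tree with O(log n) lookups; TreeSet and TreeMap in the Java collections, as well as Linux virtual memory management, are built on red-black trees. Red-black trees are not covered here; see http://www.360doc.com/content/18/0904/19/25944647_783893127.shtml . Now let's look at the HashMap source.

Source of the put() path: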

    /**
     * Implements Map.put and related methods
     *
     * @param hash hash for key
     * @param key the key
     * @param value the value to put
     * @param onlyIfAbsent if true, don't change existing value
     * @param evict if false, the table is in creation mode.
     * @return previous value, or null if none
     */
    final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
        Node<K,V>[] tab; Node<K,V> p; int n, i;
        if ((tab = table) == null || (n = tab.length) == 0)
            n = (tab = resize()).length;
        if ((p = tab[i = (n - 1) & hash]) == null)
            tab[i] = newNode(hash, key, value, null);
        else {
            Node<K,V> e; K k;
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                e = p;
            else if (p instanceof TreeNode)
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
            else {
                for (int binCount = 0; ; ++binCount) {
                    if ((e = p.next) == null) {
                        p.next = newNode(hash, key, value, null);
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                            treeifyBin(tab, hash);
                        break;
                    }
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        break;
                    p = e;
                }
            }
            if (e != null) { // existing mapping for key
                V oldValue = e.value;
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;
                afterNodeAccess(e);
                return oldValue;
            }
        }
        ++modCount;
        if (++size > threshold)
            resize();
        afterNodeInsertion(evict);
        return null;
    }

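The call to treeifyBin(tab, hash) in the code above is where a list that grows beyond 8 nodes is converted to a red-black tree. TREEIFY_THRESHOLD has the value 8.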

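3.Hashtable is thread-safe but slow

Hashtable works on much the same principle as HashMap (both 1.7 and 1.8). There are four differences:

(1) Hashtable is thread-safe; HashMap is not.

(2) Hashtable does not allow null keys or values; HashMap does.

(3) Hashtable starts with a capacity of 11 and a default load factor of 0.75, and grows to twice its old capacity plus 1. HashMap starts with a capacity of 16 and a default load factor of 0.75, and grows to twice its old capacity.

(4) Hashtable computes the bucket index by taking the key's hashCode modulo the table length directly. HashMap first rehashes the key's hashCode to spread the bits more evenly and then reduces the result to an index in the table.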
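The two index computations from point (4) can be compared side by side. This is a sketch rather than the JDK code: hashtableIndex follows the expression visible in Hashtable.put, and hashMapIndex uses the JDK 1.8 form of the rehash (XOR of the high 16 bits into the low 16); JDK 1.7 used a different mixing function. The method names are my own.

```java
// Bucket index computation: Hashtable (modulo) versus HashMap (spread + mask).
class BucketIndexDemo {
    // Hashtable: clear the sign bit, then take the remainder.
    static int hashtableIndex(int hashCode, int tableLength) {
        return (hashCode & 0x7FFFFFFF) % tableLength;
    }

    // HashMap (JDK 1.8 style): fold the high bits into the low bits, then mask.
    // tableLength is assumed to be a power of two.
    static int hashMapIndex(int hashCode, int tableLength) {
        int h = hashCode ^ (hashCode >>> 16);
        return (tableLength - 1) & h;
    }

    // Checks that masking equals modulo for non-negative hashes up to 9999.
    static boolean maskEqualsModulo(int tableLength) {
        for (int h = 0; h < 10_000; h++) {
            if (((tableLength - 1) & h) != h % tableLength) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(hashtableIndex(-1, 11));    // 1
        System.out.println(hashMapIndex(17, 16));      // 1
        // 0x10000 has only a high bit set; the spread keeps it out of bucket 0.
        System.out.println(hashMapIndex(0x10000, 16)); // 1
        System.out.println(maskEqualsModulo(16));      // true
        System.out.println(maskEqualsModulo(11));      // false
    }
}
```

The last two lines show why HashMap insists on power-of-two lengths: the mask only agrees with the modulo when the length is a power of two.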

Thread safety is achieved with the synchronized keyword. Here is part of the Hashtable source:

public synchronized int size() {
    return count;    
}

public synchronized V put(K key, V value) {
    // Make sure the value is not null
    if (value == null) {
        throw new NullPointerException();
    }

    // Makes sure the key is not already in the hashtable.
    Entry tab[] = table;
    int hash = key.hashCode();
    int index = (hash & 0x7FFFFFFF) % tab.length;
    for (Entry<K,V> e = tab[index] ; e != null ; e = e.next) {
        if ((e.hash == hash) && e.key.equals(key)) {
            V old = e.value;
            e.value = value;
            return old;
        }
    }

    modCount++;
    if (count >= threshold) {
        // Rehash the table if the threshold is exceeded
        rehash();

        tab = table;
        index = (hash & 0x7FFFFFFF) % tab.length;
    }

    // Creates the new entry.
    Entry<K,V> e = tab[index];
    tab[index] = new Entry<K,V>(hash, key, value, e);
    count++;
    return null;
}

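As the source shows, many of Hashtable's methods are declared synchronized. During a resize, therefore, no two threads can be restructuring the table at the same time, and the infinite loop cannot occur. My previous article (http://www.javashuo.com/article/p-arfmgdvg-r.html) covered synchronized in detail: it lets only one thread at a time run the code guarded by the lock. That is why Hashtable is thread-safe but slow.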
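As a small check of the thread-safety claim, the sketch below lets several threads put disjoint key ranges into one Hashtable and then reads the size. The class and method names are made up for the example. Every put is synchronized, so no update is lost even though the table resizes several times along the way.

```java
import java.util.Hashtable;
import java.util.Map;

// Several threads fill one shared map; with Hashtable no put is lost.
class HashtableThreadsDemo {
    // Starts `threads` workers that each insert `perThread` distinct keys,
    // waits for all of them, and returns the final size of the map.
    static int fillConcurrently(Map<Integer, Integer> map, int threads, int perThread) {
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int base = t * perThread;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) map.put(base + i, i);
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(ex);
            }
        }
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println(fillConcurrently(new Hashtable<>(), 4, 10_000)); // 40000
    }
}
```

Passing a plain HashMap to the same method gives no such guarantee, which is the point of this section.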

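HashMap allows at most one entry with a null key, but any number of entries may have a null value. get() returns null both when the key is not in the table and when the key is present but mapped to null. If needed, use containsKey() to tell the two cases apart.

Why do Hashtable and ConcurrentHashMap reject null keys and values while HashMap allows them?

The usual explanation is this: ConcurrentHashMap and Hashtable are both meant for concurrent use. If get(k) returned null, you could not tell whether the key was put with a null value or was never mapped at all. With a non-concurrent HashMap you can settle the question by calling containsKey(key). With a concurrent map, another thread may have modified the map between the calls to m.containsKey(key) and m.get(key), so the check is unreliable.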
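The difference in null handling is easy to see in code. In this short sketch, rejectsNullValue is a helper name made up for the example.

```java
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Null values: allowed by HashMap, rejected by Hashtable and ConcurrentHashMap.
class NullValueDemo {
    // Returns true if the map throws NullPointerException for a null value.
    static boolean rejectsNullValue(Map<String, String> map) {
        try {
            map.put("k", null);
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        Map<String, String> plain = new HashMap<>();
        plain.put("present", null);
        // get() cannot tell "mapped to null" from "not mapped" ...
        System.out.println(plain.get("present"));         // null
        System.out.println(plain.get("absent"));          // null
        // ... but containsKey() can.
        System.out.println(plain.containsKey("present")); // true
        System.out.println(plain.containsKey("absent"));  // false

        System.out.println(rejectsNullValue(new Hashtable<>()));         // true
        System.out.println(rejectsNullValue(new ConcurrentHashMap<>())); // true
    }
}
```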

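4.ConcurrentHashMap, the better choice

In multithreaded Java code, a shared HashMap can end up in an infinite loop or lose updates, and Hashtable is slow. ConcurrentHashMap stays thread-safe while allowing much more concurrency, so it is usually the best choice in this situation.

Internally, ConcurrentHashMap is a fairly complex data structure, and its design changed between JDK versions.

In JDK 1.7, ConcurrentHashMap consists of an array of Segments, each of which is a small hash table of its own, built like the 1.7 HashMap described above. Segment extends ReentrantLock, so a write locks only the one Segment that owns the key, and threads working on different Segments do not block each other. This scheme is known as segment locking, or lock striping.

In JDK 1.8 the Segment design was dropped. The map is a single Node array whose bins are linked lists or red-black trees, like the 1.8 HashMap. An insert into an empty bin uses CAS, and any other update locks only the first node of the bin with synchronized. The rest of this article looks at the 1.8 ConcurrentHashMap.

The 1.8 source still declares a Segment class that extends ReentrantLock, but only as a stripped-down stub kept for serialization compatibility: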

 /**
     * Stripped-down version of helper class used in previous version,
     * declared for the sake of serialization compatibility
     */
    static class Segment<K,V> extends ReentrantLock implements Serializable {
        private static final long serialVersionUID = 2249069246763182397L;
        final float loadFactor;
        Segment(float lf) { this.loadFactor = lf; }
    }
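To make the segment-locking idea concrete, here is a minimal sketch of lock striping. StripedMap and Stripe are hypothetical names and this is not the JDK implementation: each stripe is simply a ReentrantLock guarding its own small HashMap, and a key's hash picks the stripe.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration of lock striping; not the JDK code.
class StripedMap<K, V> {
    // One stripe: a lock plus the entries it guards.
    private static final class Stripe<K, V> extends ReentrantLock {
        final Map<K, V> entries = new HashMap<>();
    }

    private final Stripe<K, V>[] stripes;

    @SuppressWarnings("unchecked")
    StripedMap(int stripeCount) {
        stripes = (Stripe<K, V>[]) new Stripe[stripeCount];
        for (int i = 0; i < stripeCount; i++) stripes[i] = new Stripe<>();
    }

    private Stripe<K, V> stripeFor(Object key) {
        return stripes[(key.hashCode() & 0x7FFFFFFF) % stripes.length];
    }

    // Locks only the stripe that owns the key.
    V put(K key, V value) {
        Stripe<K, V> s = stripeFor(key);
        s.lock();
        try { return s.entries.put(key, value); }
        finally { s.unlock(); }
    }

    V get(K key) {
        Stripe<K, V> s = stripeFor(key);
        s.lock();
        try { return s.entries.get(key); }
        finally { s.unlock(); }
    }

    // Sums the stripes one at a time; not an atomic snapshot under writes.
    int size() {
        int total = 0;
        for (Stripe<K, V> s : stripes) {
            s.lock();
            try { total += s.entries.size(); }
            finally { s.unlock(); }
        }
        return total;
    }
}

class StripedMapDemo {
    // Four threads insert disjoint key ranges; returns the final size.
    static int run() {
        StripedMap<Integer, Integer> map = new StripedMap<>(16);
        Thread[] workers = new Thread[4];
        for (int t = 0; t < workers.length; t++) {
            final int base = t * 1000;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < 1000; i++) map.put(base + i, i);
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try { w.join(); }
            catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(ex);
            }
        }
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println(run()); // 4000
    }
}
```

Threads whose keys fall into different stripes never wait for each other, which is where the extra throughput over a single global lock comes from.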

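Some of the default constants:

    /*
     * The largest possible table capacity, 1 << 30, i.e. 2 to the power of 30.
     * Notes:
     * 1. The bucket index is computed as (length - 1) & hash, which spreads
     *    entries evenly only when length is a power of two.
     * 2. Because the capacity must be a power of two, it is adjusted with the
     *    shift operator <<, which is also cheap to compute. Java arrays are
     *    indexed by int, so the capacity must be a power of two that fits in
     *    an int, and the largest such value is 1 << 30.
     */
    private static final int MAXIMUM_CAPACITY = 1 << 30;

    /*
     * The default initial table capacity. Must be a power of 2 (i.e., at least 1)
     * and at most MAXIMUM_CAPACITY.
     */
    private static final int DEFAULT_CAPACITY = 16;

    /*
     * The largest possible (non-power-of-two) array size, needed by toArray and
     * related methods. Integer.MAX_VALUE is 0x7fffffff; 8 is subtracted because
     * some VMs reserve header words in an array.
     */
    static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

    /*
     * The default concurrency level for this table. In JDK 1.7 this was the
     * default number of Segments. In JDK 1.8 it is unused and is defined only
     * for compatibility with previous versions of the class.
     */
    private static final int DEFAULT_CONCURRENCY_LEVEL = 16;

    /*
     * The default load factor. With the default capacity, the resize threshold
     * is 16 * 0.75 = 12 entries.
     */
    private static final float LOAD_FACTOR = 0.75f;

    /*
     * The bin count threshold for using a tree rather than a list: a bin whose
     * list reaches 8 nodes is converted to a red-black tree.
     */
    static final int TREEIFY_THRESHOLD = 8;

    /*
     * During a resize, a tree bin that is left with 6 or fewer nodes is
     * converted back to a linked list.
     */
    static final int UNTREEIFY_THRESHOLD = 6;

    /*
     * The smallest table capacity for which bins may be treeified. If the table
     * has fewer than 64 slots, it is resized instead of converting the bin. This
     * avoids needless conversions early on, when several entries happen to land
     * in the same bin of a small table.
     */
    static final int MIN_TREEIFY_CAPACITY = 64;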
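The power-of-two rule and the 1 << 30 ceiling from the comments above can be checked with a few lines. The tableSizeFor below is my own sketch of "round up to a power of two", written with Integer.highestOneBit; the JDK has a private method with a similar purpose, but this is not its code.

```java
// Rounding a requested capacity up to a power of two, capped at 1 << 30.
class TableSizeDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Smallest power of two >= requested, clamped to [1, MAXIMUM_CAPACITY].
    static int tableSizeFor(int requested) {
        if (requested <= 1) return 1;
        if (requested >= MAXIMUM_CAPACITY) return MAXIMUM_CAPACITY;
        return Integer.highestOneBit(requested - 1) << 1;
    }

    public static void main(String[] args) {
        System.out.println(tableSizeFor(16));  // 16
        System.out.println(tableSizeFor(17));  // 32
        System.out.println(tableSizeFor(100)); // 128
        // One more doubling would overflow int, which is why 1 << 30 is the cap.
        System.out.println((1 << 31) < 0);     // true
    }
}
```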

ConcurrentHashMap works much like HashMap. Some of its important classes are shown below.

Node:

static class Node<K,V> implements Map.Entry<K,V> {
        final int hash;
        final K key;
        volatile V val;
        volatile Node<K,V> next;

        Node(int hash, K key, V val, Node<K,V> next) {
            this.hash = hash;
            this.key = key;
            this.val = val;
            this.next = next;
        }
        // ... (getKey, getValue, hashCode, equals, find, etc. omitted)
    }

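Node is the basic entry class and the base class of the tree nodes. It holds the key, the value, the hash and the next pointer; val and next are declared volatile so that readers can traverse a bin without locking.

TreeNode: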

static final class TreeNode<K,V> extends Node<K,V> {
        TreeNode<K,V> parent;  // red-black tree links
        TreeNode<K,V> left;
        TreeNode<K,V> right;
        TreeNode<K,V> prev;    // needed to unlink next upon deletion
        boolean red;

        TreeNode(int hash, K key, V val, Node<K,V> next,
                 TreeNode<K,V> parent) {
            super(hash, key, val, next);
            this.parent = parent;
        }

        Node<K,V> find(int h, Object k) {
            return findTreeNode(h, k, null);
        }

        /**
         * Returns the TreeNode (or null if not found) for the given key
         * starting at given root.
         */
        final TreeNode<K,V> findTreeNode(int h, Object k, Class<?> kc) {
            if (k != null) {
                TreeNode<K,V> p = this;
                do  {
                    int ph, dir; K pk; TreeNode<K,V> q;
                    TreeNode<K,V> pl = p.left, pr = p.right;
                    if ((ph = p.hash) > h)
                        p = pl;
                    else if (ph < h)
                        p = pr;
                    else if ((pk = p.key) == k || (pk != null && k.equals(pk)))
                        return p;
                    else if (pl == null)
                        p = pr;
                    else if (pr == null)
                        p = pl;
                    else if ((kc != null ||
                              (kc = comparableClassFor(k)) != null) &&
                             (dir = compareComparables(kc, k, pk)) != 0)
                        p = (dir < 0) ? pl : pr;
                    else if ((q = pr.findTreeNode(h, k, kc)) != null)
                        return q;
                    else
                        p = pl;
                } while (p != null);
            }
            return null;
        }
    }

TreeNode describes a node of a red-black tree bin. Its main method returns the TreeNode for a given key.
Now the put method:

final V putVal(K key, V value, boolean onlyIfAbsent) {
        if (key == null || value == null) throw new NullPointerException();
        int hash = spread(key.hashCode());
        int binCount = 0;
        
        for (Node<K,V>[] tab = table;;) {
            // Lazily initialize the table
            Node<K,V> f; int n, i, fh;
            if (tab == null || (n = tab.length) == 0)
                tab = initTable();
            // Locate the bin; if it is empty, create a Node holding the value and CAS it into place
            else if ((f = tabAt(tab, i = (n - 1) & hash)) == null) {
                if (casTabAt(tab, i, null,
                             new Node<K,V>(hash, key, value, null)))
                    break;                   // no lock when adding to empty bin
            }
            else if ((fh = f.hash) == MOVED)
                tab = helpTransfer(tab, f);
            else {
                V oldVal = null;
                synchronized (f) {
                    if (tabAt(tab, i) == f) {
                        if (fh >= 0) {
                            binCount = 1;
                            // Insert the node into the linked list
                            for (Node<K,V> e = f;; ++binCount) {
                                K ek;
                                if (e.hash == hash &&
                                    ((ek = e.key) == key ||
                                     (ek != null && key.equals(ek)))) {
                                    oldVal = e.val;
                                    if (!onlyIfAbsent)
                                        e.val = value;
                                    break;
                                }
                                Node<K,V> pred = e;
                                if ((e = e.next) == null) {
                                    pred.next = new Node<K,V>(hash, key,
                                                              value, null);
                                    break;
                                }
                            }
                        }
                        // Insert the node into the red-black tree
                        else if (f instanceof TreeBin) {
                            Node<K,V> p;
                            binCount = 2;
                            if ((p = ((TreeBin<K,V>)f).putTreeVal(hash, key,
                                                           value)) != null) {
                                oldVal = p.val;
                                if (!onlyIfAbsent)
                                    p.val = value;
                            }
                        }
                    }
                }
                if (binCount != 0) {
                    if (binCount >= TREEIFY_THRESHOLD)
                        // If the bin holds at least 8 nodes, convert it to a red-black tree
                        treeifyBin(tab, i);
                    if (oldVal != null)
                        return oldVal;
                    break;
                }
            }
        }
        addCount(1L, binCount);
        return null;
    }
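The two key moves in putVal, a CAS into an empty bin and a synchronized block on the bin's first node otherwise, can be isolated in a much smaller class. The sketch below is a hypothetical fixed-size map (TinyCasMap) with no resizing, no tree bins and no removal; it uses AtomicReferenceArray in place of the JDK's internal tabAt/casTabAt helpers.

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

// Hypothetical fixed-size map sketching the put path; not the JDK code.
class TinyCasMap<K, V> {
    private static final class Node<K, V> {
        final int hash;
        final K key;
        volatile V val;
        volatile Node<K, V> next;
        Node(int hash, K key, V val) { this.hash = hash; this.key = key; this.val = val; }
    }

    private final AtomicReferenceArray<Node<K, V>> table;

    // capacity is assumed to be a power of two.
    TinyCasMap(int capacity) { table = new AtomicReferenceArray<>(capacity); }

    private static int spread(int h) { return (h ^ (h >>> 16)) & 0x7FFFFFFF; }

    V put(K key, V value) {
        if (key == null || value == null) throw new NullPointerException();
        int hash = spread(key.hashCode());
        int i = (table.length() - 1) & hash;
        for (;;) {
            Node<K, V> f = table.get(i);
            if (f == null) {
                // Empty bin: publish the node with CAS, no lock needed.
                if (table.compareAndSet(i, null, new Node<>(hash, key, value))) return null;
                // Another thread won the race; loop and take the locked path.
            } else {
                synchronized (f) {                    // lock only this bin's head
                    if (table.get(i) != f) continue;  // head changed; retry
                    for (Node<K, V> e = f; ; e = e.next) {
                        if (e.hash == hash && key.equals(e.key)) {
                            V old = e.val;
                            e.val = value;
                            return old;
                        }
                        if (e.next == null) {
                            e.next = new Node<>(hash, key, value);
                            return null;
                        }
                    }
                }
            }
        }
    }

    // Lock-free read: relies on the volatile val and next fields.
    V get(Object key) {
        int hash = spread(key.hashCode());
        for (Node<K, V> e = table.get((table.length() - 1) & hash); e != null; e = e.next) {
            if (e.hash == hash && key.equals(e.key)) return e.val;
        }
        return null;
    }
}

class TinyCasMapDemo {
    // Four threads insert disjoint keys; returns how many keys are found afterwards.
    static int run() {
        TinyCasMap<Integer, Integer> map = new TinyCasMap<>(64);
        Thread[] workers = new Thread[4];
        for (int t = 0; t < workers.length; t++) {
            final int base = t * 1000;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < 1000; i++) map.put(base + i, i);
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try { w.join(); }
            catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(ex);
            }
        }
        int found = 0;
        for (int k = 0; k < 4000; k++) if (map.get(k) != null) found++;
        return found;
    }

    public static void main(String[] args) {
        System.out.println(run()); // 4000
    }
}
```

Compared with the striped design, the lock here is as fine-grained as a single bin, and an insert into an empty bin takes no lock at all.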