ConcurrentHashMap Analysis

Date: 2019-11-17

As the name suggests, this is a concurrent HashMap. Let's start with its data structure, as found in the Segment-based JDK 7 implementation:

A ConcurrentHashMap is made up of multiple Segments, and each Segment internally holds an array of HashEntry objects (the key-value pairs), much like the Entry array in HashMap.

The basic fields of ConcurrentHashMap, as seen in the code:

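// Segment mask: used to compute the index into segments for a key
final int segmentMask;
// Segment shift: used to compute the index into segments for a key
final int segmentShift;
// The segments array; each segment is itself backed by a HashEntry array. Having multiple segments is what raises the concurrency level
final Segment<K,V>[] segments;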

Now the important Segment data structure:

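/**
 * Extends ReentrantLock, so each segment is its own lock and guards its own state.
 * A segment behaves like a small HashMap.
 */
static final class Segment<K,V> extends ReentrantLock implements Serializable {
    // the table holding the elements
    transient volatile HashEntry<K,V>[] table;
    // number of elements
    transient int count;
    // threshold at which the table is resized
    transient int threshold;
    // load factor, 0.75 by default
    final float loadFactor;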
    ...
}

Let's begin with how a ConcurrentHashMap is initialized:

public ConcurrentHashMap(int initialCapacity,
                             float loadFactor, int concurrencyLevel) {
    if (!(loadFactor > 0) || initialCapacity < 0 || concurrencyLevel <= 0)
            throw new IllegalArgumentException();
    if (concurrencyLevel > MAX_SEGMENTS) // concurrency level: default 16, maximum 65536
            concurrencyLevel = MAX_SEGMENTS;
    // Find power-of-two sizes best matching arguments
    int sshift = 0;
    int ssize = 1; // size of the segments array: the smallest power of two >= concurrencyLevel
    while (ssize < concurrencyLevel) { // find the smallest power of two >= concurrencyLevel
         ++sshift;
         ssize <<= 1;
    }
    this.segmentShift = 32 - sshift; // segment shift; 32 because a hash code is an int (4 bytes, 32 bits). Used to compute a key's segment index
    this.segmentMask = ssize - 1; // segment mask: ssize - 1 = 2^sshift - 1, similar in spirit to a subnet mask. With the default ssize of 16 the mask is binary 1111. Used to compute a key's segment index
    if (initialCapacity > MAXIMUM_CAPACITY) // initial capacity of the whole map (not of the segments array), default 16
            initialCapacity = MAXIMUM_CAPACITY;
    int c = initialCapacity / ssize;
    if (c * ssize < initialCapacity)
        ++c;
    int cap = MIN_SEGMENT_TABLE_CAPACITY; // size of each segment's table: at least 2, and always a power of two
    while (cap < c)
        cap <<= 1;
    // create segments and segments[0]
    Segment<K,V> s0 =
        new Segment<K,V>(loadFactor, (int)(cap * loadFactor),
                         (HashEntry<K,V>[])new HashEntry[cap]); // create segments[0], used later as the prototype for the other segments
    Segment<K,V>[] ss = (Segment<K,V>[])new Segment[ssize]; // create the segments array
    UNSAFE.putOrderedObject(ss, SBASE, s0); // ordered write of segments[0]
    this.segments = ss;
}
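
To make the sizing arithmetic concrete, here is a small standalone sketch that repeats the same loops outside the JDK. The class name SegmentSizing and the method sizing are made up for illustration, and the MAX_SEGMENTS and MAXIMUM_CAPACITY clamps are left out for brevity.

```java
import java.util.Arrays;

// Illustrative sketch only: repeats the sizing arithmetic of the constructor above.
final class SegmentSizing {
    static final int MIN_SEGMENT_TABLE_CAPACITY = 2;

    /** Returns {ssize, segmentShift, segmentMask, cap} for the given arguments. */
    static int[] sizing(int initialCapacity, int concurrencyLevel) {
        int sshift = 0;
        int ssize = 1;
        while (ssize < concurrencyLevel) { // smallest power of two >= concurrencyLevel
            ++sshift;
            ssize <<= 1;
        }
        int segmentShift = 32 - sshift;    // how far to shift so only the top sshift bits remain
        int segmentMask = ssize - 1;       // low sshift bits all set
        int c = initialCapacity / ssize;   // entries per segment, rounded up below
        if (c * ssize < initialCapacity)
            ++c;
        int cap = MIN_SEGMENT_TABLE_CAPACITY;
        while (cap < c)                    // per-segment table size, a power of two
            cap <<= 1;
        return new int[] {ssize, segmentShift, segmentMask, cap};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(sizing(16, 16)));  // [16, 28, 15, 2]
        System.out.println(Arrays.toString(sizing(100, 10))); // [16, 28, 15, 8]
    }
}
```

With the default arguments (16, 16) there are 16 segments, each starting with a table of length 2. Asking for a concurrency level of 10 still yields 16 segments, because the segment count is rounded up to a power of two.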

Next, the implementations of the basic operations put(), get(), remove() and size().

The put operation:

public V put(K key, V value) {
    Segment<K,V> s;
    if (value == null)
        throw new NullPointerException(); // neither key nor value may be null
    int hash = hash(key); // compute the key's hash
    int j = (hash >>> segmentShift) & segmentMask; // segment index for the key; the mask keeps j within the bounds of segments
    if ((s = (Segment<K,V>)UNSAFE.getObject(segments, (j << SSHIFT) + SBASE)) == null) // the segment does not exist yet
        s = ensureSegment(j); // create the segment
    return s.put(key, hash, value, false);
}

The hash computation differs from the one in HashMap:

private int hash(Object k) {
     int h = hashSeed;

     if ((0 != h) && (k instanceof String)) {
         return sun.misc.Hashing.stringHash32((String) k);
     }

     h ^= k.hashCode();
     h += (h <<  15) ^ 0xffffcd7d;
     h ^= (h >>> 10);
     h += (h <<   3);
     h ^= (h >>>  6);
     h += (h <<   2) + (h << 14);
     return h ^ (h >>> 16);
}
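
The sketch below shows how one hash value is used twice: its high bits pick the segment and its low bits pick the bucket inside that segment. It copies the bit-mixing steps from hash() above under the assumption that hashSeed is 0 (so the String special case is skipped). The class and method names are illustrative and not part of the JDK.

```java
// Illustrative sketch: the bit-mixing is copied from hash() above, assuming hashSeed == 0.
final class SegmentIndexDemo {
    static int spread(int hashCode) {
        int h = hashCode;              // hashSeed assumed to be 0, so h = 0 ^ hashCode
        h += (h <<  15) ^ 0xffffcd7d;
        h ^= (h >>> 10);
        h += (h <<   3);
        h ^= (h >>>  6);
        h += (h <<   2) + (h << 14);
        return h ^ (h >>> 16);
    }

    // High bits of the hash select the segment.
    static int segmentIndex(int hash, int segmentShift, int segmentMask) {
        return (hash >>> segmentShift) & segmentMask;
    }

    // Low bits of the same hash select the bucket within the segment's table.
    static int tableIndex(int hash, int tableLength) {
        return (tableLength - 1) & hash;
    }

    public static void main(String[] args) {
        int segmentShift = 28, segmentMask = 15; // values for the default 16 segments
        for (String key : new String[] {"alpha", "beta", "gamma"}) {
            int hash = spread(key.hashCode());
            System.out.printf("%s -> segment %d, bucket %d%n",
                    key, segmentIndex(hash, segmentShift, segmentMask), tableIndex(hash, 2));
        }
    }
}
```

Using different ends of the hash for the two lookups keeps the segment choice and the bucket choice from being correlated, which is one reason the hash is mixed so thoroughly first.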

The ensureSegment method:

private Segment<K,V> ensureSegment(int k) {
    final Segment<K,V>[] ss = this.segments;
    long u = (k << SSHIFT) + SBASE; // raw offset
    Segment<K,V> seg;
    if ((seg = (Segment<K,V>)UNSAFE.getObjectVolatile(ss, u)) == null) {
         Segment<K,V> proto = ss[0]; // use segment 0 as prototype
         int cap = proto.table.length;
         float lf = proto.loadFactor;
         int threshold = (int)(cap * lf);
         HashEntry<K,V>[] tab = (HashEntry<K,V>[])new HashEntry[cap];
         if ((seg = (Segment<K,V>)UNSAFE.getObjectVolatile(ss, u))
               == null) { // recheck
                Segment<K,V> s = new Segment<K,V>(lf, threshold, tab); // create the new segment
                while ((seg = (Segment<K,V>)UNSAFE.getObjectVolatile(ss, u))
                       == null) {
                    if (UNSAFE.compareAndSwapObject(ss, u, null, seg = s))
                        break;
                }
          }
    }
    return seg;
}
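
The lazy creation pattern in ensureSegment (volatile read, build a candidate, then publish it with CAS in a loop) can be reproduced without Unsafe. The sketch below swaps in java.util.concurrent.atomic.AtomicReferenceArray for the raw-offset Unsafe calls; LazySlots and its members are hypothetical names used only for this illustration.

```java
import java.util.concurrent.atomic.AtomicReferenceArray;
import java.util.function.Supplier;

// Illustrative sketch: the same "create lazily, publish with CAS" idea as ensureSegment,
// using AtomicReferenceArray instead of Unsafe.
final class LazySlots<T> {
    private final AtomicReferenceArray<T> slots;
    private final Supplier<T> factory;

    LazySlots(int size, Supplier<T> factory) {
        this.slots = new AtomicReferenceArray<>(size);
        this.factory = factory;
    }

    T ensure(int k) {
        T seg = slots.get(k);                       // volatile read
        if (seg == null) {
            T candidate = factory.get();            // build the candidate without holding any lock
            while ((seg = slots.get(k)) == null) {  // re-read: another thread may have won already
                if (slots.compareAndSet(k, null, candidate)) {
                    seg = candidate;                // our candidate was published
                    break;
                }
            }
        }
        return seg;                                 // either ours or the winner's instance
    }
}
```

If two threads race, both may build a candidate, but only one CAS succeeds and every caller ends up with the same published instance. The loser's candidate is simply discarded.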

Next, the put method of Segment:

final V put(K key, int hash, V value, boolean onlyIfAbsent) {
    HashEntry<K,V> node = tryLock() ? null : // lock acquired right away: node stays null
          scanAndLockForPut(key, hash, value); // lock not acquired: locate the key and possibly pre-build the node while waiting
    V oldValue;
    try {
        HashEntry<K,V>[] tab = table;
        int index = (tab.length - 1) & hash; // table index derived from the key's hash
        HashEntry<K,V> first = entryAt(tab, index); // first HashEntry of the matching bucket
        for (HashEntry<K,V> e = first;;) {
            if (e != null) { // still entries left in the bucket chain
                K k;
                if ((k = e.key) == key ||
                    (e.hash == hash && key.equals(k))) { // keys are equal
                    oldValue = e.value;
                    if (!onlyIfAbsent) { // overwrite the old value
                        e.value = value;
                        ++modCount;
                    }
                    break;
                }
                e = e.next;
            } else { // reached the end of the chain without a match: insert
                if (node != null) // the node was already created while waiting for the lock
                    node.setNext(first); // link it in front of the bucket chain
                else // create the node now
                    node = new HashEntry<K,V>(hash, key, value, first);
                int c = count + 1;
                if (c > threshold && tab.length < MAXIMUM_CAPACITY)
                    rehash(node); // resize
                else
                    setEntryAt(tab, index, node); // store the new HashEntry at table[index]
                ++modCount;
                count = c;
                oldValue = null;
                break;
            }
        }
    } finally {
        unlock(); // release the segment lock
    }
    return oldValue;
}

It is also worth looking at scanAndLockForPut(), which runs while the thread waits for the lock:

private HashEntry<K,V> scanAndLockForPut(K key, int hash, V value) {
    HashEntry<K,V> first = entryForHash(this, hash); // first node of the bucket chain for this hash
    HashEntry<K,V> e = first;
    HashEntry<K,V> node = null;
    int retries = -1; // negative while locating node
    while (!tryLock()) { // lock still not acquired: keep scanning / preparing the new node
        HashEntry<K,V> f; // to recheck first below
        if (retries < 0) {
            if (e == null) { // end of chain reached (or empty bucket): key not present
                if (node == null) // speculatively create the new node
                    node = new HashEntry<K,V>(hash, key, value, null);
                retries = 0;
            } else if (key.equals(e.key)) // found an equal key: stop scanning
                retries = 0;
            else // move on to the next node
                e = e.next;
        } else if (++retries > MAX_SCAN_RETRIES) { // too many attempts: block on lock(). The limit is 64 on multi-core machines, otherwise 1
            lock();
            break;
        } else if ((retries & 1) == 0 &&
                   (f = entryForHash(this, hash)) != first) { // the node was created or the key was found, but another thread may have changed the bucket head meanwhile (inserted a node hashing to the same bucket, or removed the head)
            e = first = f; // re-traverse if entry changed
            retries = -1;
        }
    }
    return node;
}

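This mirrors the traversal inside put. The idea is that a thread that cannot get the lock yet should do as much useful work as possible ahead of time (locating the key, pre-creating the node) instead of idling.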
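
Stripped of the hash-table details, the locking strategy is "spin on tryLock() a bounded number of times while doing preparatory work, then fall back to a blocking lock()". The following is a minimal sketch of that pattern with a plain ReentrantLock. SpinThenLock and lockWithPrework are hypothetical names, and the re-check of the bucket head is omitted.

```java
import java.util.concurrent.locks.ReentrantLock;

// Illustrative sketch of the bounded-spin-then-block pattern used by scanAndLockForPut.
final class SpinThenLock {
    // Same rule as in the article: 64 on multi-core machines, otherwise 1.
    static final int MAX_SCAN_RETRIES =
            Runtime.getRuntime().availableProcessors() > 1 ? 64 : 1;

    /**
     * Acquires the lock. While the lock is unavailable, runs prework once,
     * then keeps retrying until the retry budget is used up and finally blocks.
     * Returns the number of retries counted after the prework ran.
     */
    static int lockWithPrework(ReentrantLock lock, Runnable prework) {
        int retries = 0;
        boolean prepared = false;
        while (!lock.tryLock()) {
            if (!prepared) {
                prework.run();          // useful work done while waiting
                prepared = true;
            } else if (++retries > MAX_SCAN_RETRIES) {
                lock.lock();            // give up spinning and block
                break;
            }
        }
        return retries;                 // the lock is held on return
    }
}
```

A caller pairs it with the usual try/finally: call lockWithPrework(lock, work), do the update, then unlock() in the finally block. When the lock is free, tryLock() succeeds on the first attempt and the prework never runs, which matches put() passing null as the node in the uncontended case.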

Finally, the resizing step, rehash:

private void rehash(HashEntry<K,V> node) {
      HashEntry<K,V>[] oldTable = table;
      int oldCapacity = oldTable.length;
      int newCapacity = oldCapacity << 1; // double the capacity
      threshold = (int)(newCapacity * loadFactor); // new threshold
      HashEntry<K,V>[] newTable =
            (HashEntry<K,V>[]) new HashEntry[newCapacity];
      int sizeMask = newCapacity - 1; // table mask
      for (int i = 0; i < oldCapacity ; i++) {
          HashEntry<K,V> e = oldTable[i];
          if (e != null) {
                HashEntry<K,V> next = e.next;
                int idx = e.hash & sizeMask;
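                if (next == null)   // single node in this bucket: move it straight into the new table
                    newTable[idx] = e;
                else { // more than one node: reuse the trailing run of nodes that share a new index
                    HashEntry<K,V> lastRun = e;
                    int lastIdx = idx;
                    for (HashEntry<K,V> last = next; // find lastRun, the start of the trailing run that maps to a single new index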
                         last != null;
                         last = last.next) {
                         int k = last.hash & sizeMask;
                         if (k != lastIdx) {
                             lastIdx = k;
                             lastRun = last;
                         }
                    }
                    newTable[lastIdx] = lastRun; // move the trailing run over as a whole, without copying
                    // clone the remaining nodes, i.e. those in front of lastRun
                    for (HashEntry<K,V> p = e; p != lastRun; p = p.next) {
                         V v = p.value;
                         int h = p.hash;
                         int k = h & sizeMask;
                         HashEntry<K,V> n = newTable[k];
                         newTable[k] = new HashEntry<K,V>(h, p.key, v, n);
                    }
                }
          }
      }
      int nodeIndex = node.hash & sizeMask; // add the new node
      node.setNext(newTable[nodeIndex]);
      newTable[nodeIndex] = node;
      table = newTable;
}
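
The lastRun trick is easier to see on a toy chain. The sketch below uses a stripped-down node type (only a hash and a next pointer) and finds the start of the trailing run exactly as the inner loop of rehash does. LastRunDemo, Node, chain and cloneCount are illustrative names, not JDK code.

```java
// Illustrative sketch of the lastRun search in rehash, on a toy linked list.
final class LastRunDemo {
    static final class Node {
        final int hash;
        final Node next;
        Node(int hash, Node next) { this.hash = hash; this.next = next; }
    }

    // Builds a chain whose nodes appear in the given order.
    static Node chain(int... hashes) {
        Node head = null;
        for (int i = hashes.length - 1; i >= 0; i--)
            head = new Node(hashes[i], head);
        return head;
    }

    // First node of the trailing run whose members all map to the same new index.
    static Node lastRun(Node head, int sizeMask) {
        Node lastRun = head;
        int lastIdx = head.hash & sizeMask;
        for (Node p = head.next; p != null; p = p.next) {
            int k = p.hash & sizeMask;
            if (k != lastIdx) {   // index changed: a new run starts here
                lastIdx = k;
                lastRun = p;
            }
        }
        return lastRun;
    }

    // Number of nodes in front of lastRun, i.e. the ones rehash has to clone.
    static int cloneCount(Node head, int sizeMask) {
        Node stop = lastRun(head, sizeMask);
        int n = 0;
        for (Node p = head; p != stop; p = p.next)
            n++;
        return n;
    }

    public static void main(String[] args) {
        // Old table length 4, new length 8 (sizeMask 7). All five hashes sat in old bucket 1.
        Node head = chain(1, 5, 9, 13, 21);       // new indices: 1, 5, 1, 5, 5
        System.out.println(lastRun(head, 7).hash); // 13
        System.out.println(cloneCount(head, 7));   // 3
    }
}
```

In this example the nodes with hashes 13 and 21 both land in new bucket 5 and are moved as-is, while the three nodes in front of them are cloned. Because the old nodes are never modified, a reader that is still walking the old chain keeps seeing a consistent list.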

That covers put in brief. On to the get method.

The get operation is fairly easy to follow:

public V get(Object key) {
    Segment<K,V> s; // manually integrate access methods to reduce overhead
    HashEntry<K,V>[] tab;
    int h = hash(key); // compute the hash from the key, using the same hash(Object) as put
    long u = (((h >>> segmentShift) & segmentMask) << SSHIFT) + SBASE; // raw memory offset of the segment that owns the key
    if ((s = (Segment<K,V>)UNSAFE.getObjectVolatile(segments, u)) != null &&
         (tab = s.table) != null) {
         for (HashEntry<K,V> e = (HashEntry<K,V>) UNSAFE.getObjectVolatile // first node of the key's bucket in segment.table
                     (tab, ((long)(((tab.length - 1) & h)) << TSHIFT) + TBASE);
             e != null; e = e.next) { // walk the bucket chain, compare, return on match
             K k;
             if ((k = e.key) == key || (e.hash == h && key.equals(k)))
                  return e.value;
         }
    }
    return null;
}

The remove operation:

public V remove(Object key) {
     int hash = hash(key); // compute the hash, using the same hash(Object) as put
     Segment<K,V> s = segmentForHash(hash); // locate the segment
     return s == null ? null : s.remove(key, hash, null);
}

private Segment<K,V> segmentForHash(int h) {
     long u = (((h >>> segmentShift) & segmentMask) << SSHIFT) + SBASE;
     return (Segment<K,V>) UNSAFE.getObjectVolatile(segments, u);
}

The remove method of Segment is essentially a linked-list deletion:

final V remove(Object key, int hash, Object value) {
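     if (!tryLock()) // try to take the lock without blocking
         scanAndLock(key, hash); // otherwise scan the bucket while retrying, then block until the lock is held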
     V oldValue = null;
     try {
         HashEntry<K,V>[] tab = table;
         int index = (tab.length - 1) & hash; // table index of the element, from its hash
         HashEntry<K,V> e = entryAt(tab, index); // head of that bucket
         HashEntry<K,V> pred = null;
         while (e != null) {
              K k;
              HashEntry<K,V> next = e.next;
              if ((k = e.key) == key ||
                     (e.hash == hash && key.equals(k))) {
                 V v = e.value;
                 if (value == null || value == v || value.equals(v)) {
                       if (pred == null) // the node to remove is the bucket head
                            setEntryAt(tab, index, next);
                       else  // link the predecessor directly to the successor of the removed node
                            pred.setNext(next);
                            pred.setNext(next);
                       ++modCount;
                       --count;
                       oldValue = v;
                  }
                  break;
              }
              pred = e;
              e = next;
           }
     } finally {
         unlock(); // release the lock
     }
     return oldValue;
}

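Last comes the size operation, which is a bit trickier because many threads may be adding to or removing from the segments at the same moment. Here is how size is computed: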

public int size() {
        // Try a few times to get accurate count. On failure due to
        // continuous async changes in table, resort to locking.
        final Segment<K,V>[] segments = this.segments;
        int size;
        boolean overflow; // true if size overflows 32 bits
        long sum;         // sum of modCounts
        long last = 0L;   // previous sum
        int retries = -1; // first iteration isn't retry
        try {
            for (;;) {
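                if (retries++ == RETRIES_BEFORE_LOCK) { // before locking every segment, first try without locks (optimistically assuming no concurrent writes); RETRIES_BEFORE_LOCK is 2
                    for (int j = 0; j < segments.length; ++j)
                        ensureSegment(j).lock(); // lock every segment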
                }
                sum = 0L;
                size = 0;
                overflow = false;
                for (int j = 0; j < segments.length; ++j) {
                    Segment<K,V> seg = segmentAt(segments, j);
                    if (seg != null) {
                        sum += seg.modCount;
                        int c = seg.count;
                        if (c < 0 || (size += c) < 0)
                            overflow = true;
                    }
                }
                if (sum == last)
                    break;
                last = sum;
            }
        } finally {
            if (retries > RETRIES_BEFORE_LOCK) { // the lock-free attempts failed and the segments were locked, so unlock them
                for (int j = 0; j < segments.length; ++j)
                    segmentAt(segments, j).unlock();
            }
        }
        return overflow ? Integer.MAX_VALUE : size;
}
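
The strategy in size() is: sum the per-segment counts without locking, accept the result once two consecutive passes see the same total modCount, and lock everything only if that keeps failing. Below is a simplified sketch of the same idea on a striped counter. StripedCounter and Stripe are hypothetical names, the fields are declared volatile to keep the sketch simple (the JDK code handles visibility differently), and overflow handling is omitted.

```java
import java.util.concurrent.locks.ReentrantLock;

// Illustrative sketch of "optimistic passes first, lock all stripes as a last resort".
final class StripedCounter {
    static final int RETRIES_BEFORE_LOCK = 2;

    static final class Stripe extends ReentrantLock {
        volatile int count;     // volatile here is a simplification for this sketch
        volatile int modCount;
    }

    private final Stripe[] stripes;

    StripedCounter(int n) {
        stripes = new Stripe[n];
        for (int i = 0; i < n; i++)
            stripes[i] = new Stripe();
    }

    // Changes one stripe under its own lock, as a segment would.
    void add(int stripe, int delta) {
        Stripe s = stripes[stripe];
        s.lock();
        try {
            s.count += delta;
            s.modCount++;
        } finally {
            s.unlock();
        }
    }

    int size() {
        int size = 0;
        long last = 0L;         // modCount sum seen by the previous pass
        int retries = -1;       // first iteration isn't a retry
        try {
            for (;;) {
                if (retries++ == RETRIES_BEFORE_LOCK)
                    for (Stripe s : stripes) s.lock();   // fall back to locking everything
                long sum = 0L;
                size = 0;
                for (Stripe s : stripes) {
                    sum += s.modCount;
                    size += s.count;
                }
                if (sum == last)  // nothing changed between two passes: result is stable
                    break;
                last = sum;
            }
        } finally {
            if (retries > RETRIES_BEFORE_LOCK)
                for (Stripe s : stripes) s.unlock();
        }
        return size;
    }
}
```

Because last starts at 0, the first pass can only be accepted when nothing has ever been modified. Otherwise at least two passes are needed, and the all-stripes lock is taken only after the unlocked passes keep disagreeing.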

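That covers the basic operations of ConcurrentHashMap, which are quite interesting. You may have noticed the many UNSAFE calls along the way. sun.misc.Unsafe is an internal JDK class rather than part of the public API and, as the name warns, it is not safe to use casually. The JDK itself relies on it in many places because its low-level memory and CAS operations are faster than the ordinary alternatives; related articles cover it in more detail. So how does ConcurrentHashMap actually perform under concurrency? I ran a simple performance test comparing ConcurrentHashMap with Hashtable: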

With 5 threads inserting 1,000,000 objects, ConcurrentHashMap outperformed Hashtable, and the gap grows as the number of threads and the amount of data increase.
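
The test code itself is not included in the post. A rough sketch of such a comparison might look like the following; it is an assumption about the setup and not the author's actual benchmark. MapInsertBench and timeInserts are made-up names, there is no warm-up, and thread start-up is included in the timing, so treat the numbers as indicative only (a harness such as JMH would give more trustworthy results).

```java
import java.util.Hashtable;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Rough sketch of a multi-threaded insert comparison; not a rigorous benchmark.
final class MapInsertBench {
    /** Inserts `total` distinct keys using `threads` threads and returns the elapsed nanoseconds. */
    static long timeInserts(Map<Integer, Integer> map, int threads, int total) {
        int perThread = total / threads;
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch done = new CountDownLatch(threads);
        long start = System.nanoTime();
        try {
            for (int t = 0; t < threads; t++) {
                final int base = t * perThread;     // each thread writes its own key range
                pool.execute(() -> {
                    for (int i = 0; i < perThread; i++)
                        map.put(base + i, i);
                    done.countDown();
                });
            }
            done.await();                           // wait until every worker has finished
            return System.nanoTime() - start;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int threads = 5, total = 1_000_000;         // the parameters mentioned in the article
        long chm = timeInserts(new ConcurrentHashMap<>(), threads, total);
        long ht = timeInserts(new Hashtable<>(), threads, total);
        System.out.printf("ConcurrentHashMap: %d ms%n", chm / 1_000_000);
        System.out.printf("Hashtable:         %d ms%n", ht / 1_000_000);
    }
}
```

Hashtable synchronizes every call on a single monitor, so all five writers queue behind one lock, whereas the segmented map lets writers that land in different segments proceed in parallel.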

Corrections are welcome.
