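Using HashMap from multiple threads can cause problems such as infinite loops and 100% CPU usage (in JDK 1.7 and earlier, a concurrent resize can link a bucket's list into a cycle), so HashMap should not be shared between threads without external synchronization. Another collection class, Hashtable, is thread-safe, but it achieves this with coarse-grained synchronized locking, so it performs poorly under contention. For multithreaded (concurrent) scenarios, ConcurrentHashMap is the recommended choice.

[Figure: class diagram of ConcurrentHashMap]

As the class diagram shows, ConcurrentHashMap also implements the Map interface, so it can usually replace HashMap directly without changes to business code. One difference to watch for: unlike HashMap, it rejects null keys and null values.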
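To make the "drop-in replacement" point concrete, here is a minimal sketch in which code written against the Map interface is handed a ConcurrentHashMap and filled from several threads. The class name SharedMapDemo and the helper fillConcurrently are made up for this illustration and do not come from the JDK.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class SharedMapDemo {
    // Hypothetical helper: each thread writes its own disjoint key range into the shared map.
    static int fillConcurrently(final Map<Integer, Integer> map, int threads, final int perThread) {
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int base = t * perThread;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    map.put(base + i, i);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return map.size();
    }

    public static void main(String[] args) {
        // The only change from single-threaded code: new HashMap<>() becomes new ConcurrentHashMap<>().
        Map<Integer, Integer> map = new ConcurrentHashMap<>();
        System.out.println(fillConcurrently(map, 4, 10000)); // prints 40000
    }
}
```

Passing a plain HashMap to the same helper compiles just as well, which is exactly why the mistake is easy to make; the result is then undefined.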
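Hashtable is slow because all threads compete for a single lock (Hashtable bluntly locks the entire storage structure). ConcurrentHashMap improves on exactly this point by locking in segments to reduce contention. Its underlying storage is no longer a single hash table (array) as in HashMap; instead it uses an array of Segment objects to partition the data. Segment extends ReentrantLock, so each Segment is itself a reentrant lock. Each Segment is roughly equivalent to a HashMap: it stores data in a hash table whose buckets are linked lists, and its internals follow essentially the same design as HashMap, except that methods such as put and remove acquire the lock. The benefit of segment locking is that two threads operating on different Segments do not affect or wait for each other, which improves performance.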
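The segment-locking idea can be sketched in a few lines. This is a deliberately simplified illustration, not the JDK implementation: StripedMap and Stripe are hypothetical names, each stripe wraps a plain HashMap, and get also takes the lock (the real class reads without locking by relying on volatile fields).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

class StripedMap<K, V> {
    // Each stripe plays the role of a Segment: it is a lock and owns a small hash table.
    private static final class Stripe<K, V> extends ReentrantLock {
        final Map<K, V> table = new HashMap<K, V>();
    }

    private final Stripe<K, V>[] stripes;

    @SuppressWarnings("unchecked")
    StripedMap(int stripeCount) {
        stripes = (Stripe<K, V>[]) new Stripe[stripeCount];
        for (int i = 0; i < stripeCount; i++) {
            stripes[i] = new Stripe<K, V>();
        }
    }

    // Picking the stripe only reads the (never reassigned) array, so no lock is needed here.
    private Stripe<K, V> stripeFor(Object key) {
        return stripes[(key.hashCode() & 0x7fffffff) % stripes.length];
    }

    V put(K key, V value) {
        Stripe<K, V> s = stripeFor(key);
        s.lock(); // only this stripe is locked; other stripes stay available
        try {
            return s.table.put(key, value);
        } finally {
            s.unlock();
        }
    }

    V get(Object key) {
        Stripe<K, V> s = stripeFor(key);
        s.lock(); // simplification: HashMap is not safe for unlocked concurrent reads
        try {
            return s.table.get(key);
        } finally {
            s.unlock();
        }
    }

    public static void main(String[] args) {
        StripedMap<String, Integer> map = new StripedMap<String, Integer>(16);
        map.put("a", 1);
        System.out.println(map.get("a")); // prints 1
    }
}
```

Two threads that hit different stripes never contend, which is the whole benefit over a single map-wide lock.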
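The Segment array itself is not locked. When an element is added to ConcurrentHashMap, the hash computed from the key is used to locate the Segment; this step does not modify any segment's contents, so it needs no lock (a Segment that has not been created yet is installed by ensureSegment with a CAS rather than a lock). Operations on the data inside a particular Segment, however, do require that Segment's lock. The walkthrough below uses the JDK 1.7 ConcurrentHashMap source.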
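As a small worked example of locating a Segment, the sketch below evaluates the same index expressions that appear in the JDK 1.7 source. The class name SegmentIndexDemo is made up, the sample hash value is arbitrary, and segmentShift = 28 with segmentMask = 15 assumes the default of 16 segments.

```java
class SegmentIndexDemo {
    // Same expression as in ConcurrentHashMap.put(): the high bits of the hash select the Segment.
    static int segmentIndex(int hash, int segmentShift, int segmentMask) {
        return (hash >>> segmentShift) & segmentMask;
    }

    // Same expression as in Segment.put(): the low bits select the bucket inside the Segment's table.
    static int bucketIndex(int hash, int tableLength) {
        return (tableLength - 1) & hash;
    }

    public static void main(String[] args) {
        int segmentShift = 28;  // 32 - 4, assuming 16 segments
        int segmentMask = 15;   // 16 - 1
        int hash = 0xA1B2C3D4;  // arbitrary sample; the real value comes from hash(key)
        System.out.println(segmentIndex(hash, segmentShift, segmentMask)); // prints 10
        System.out.println(bucketIndex(hash, 2));                          // prints 0
    }
}
```

Using the high bits for the Segment and the low bits for the bucket keeps the two choices from being driven by the same bits of the hash.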
The ConcurrentHashMap implementation involves several inner classes. Here is a brief outline:
    static final class HashEntry<K,V> {
        final int hash;
        final K key;
        volatile V value;
        volatile HashEntry<K,V> next;
        // ... ...
    }

    // Method bodies are omitted here; only the fields and signatures are listed.
    static final class Segment<K,V> extends ReentrantLock implements Serializable {
        static final int MAX_SCAN_RETRIES =
            Runtime.getRuntime().availableProcessors() > 1 ? 64 : 1;
        transient volatile HashEntry<K,V>[] table;
        transient int count;
        transient int modCount;
        transient int threshold;
        final float loadFactor;

        Segment(float lf, int threshold, HashEntry<K,V>[] tab) {}
        final V put(K key, int hash, V value, boolean onlyIfAbsent) {}
        @SuppressWarnings("unchecked")
        private void rehash(HashEntry<K,V> node) {}
        private HashEntry<K,V> scanAndLockForPut(K key, int hash, V value) {}
        private void scanAndLock(Object key, int hash) {}
        final V remove(Object key, int hash, Object value) {}
        final boolean replace(K key, int hash, V oldValue, V newValue) {}
        final V replace(K key, int hash, V value) {}
        final void clear() {}
    }
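In ConcurrentHashMap the partitioning is implemented by the Segment array. Each Segment stores its data in a hash table (an array), and each bucket is a linked list of HashEntry nodes (the same as in HashMap).
The sections below walk through several of the main methods of ConcurrentHashMap.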
    public ConcurrentHashMap(int initialCapacity,
                             float loadFactor, int concurrencyLevel) {
        if (!(loadFactor > 0) || initialCapacity < 0 || concurrencyLevel <= 0)
            throw new IllegalArgumentException();
        if (concurrencyLevel > MAX_SEGMENTS)
            concurrencyLevel = MAX_SEGMENTS;
        // Find power-of-two sizes best matching arguments
        int sshift = 0;
        int ssize = 1;
        // Find the smallest power of two that is >= concurrencyLevel
        while (ssize < concurrencyLevel) {
            ++sshift;
            ssize <<= 1;
        }
        this.segmentShift = 32 - sshift;
        this.segmentMask = ssize - 1;
        if (initialCapacity > MAXIMUM_CAPACITY)
            initialCapacity = MAXIMUM_CAPACITY;
        int c = initialCapacity / ssize;
        if (c * ssize < initialCapacity)
            ++c;
        // Per-segment capacity: the smallest power of two >= c,
        // but never less than MIN_SEGMENT_TABLE_CAPACITY
        int cap = MIN_SEGMENT_TABLE_CAPACITY;
        while (cap < c)
            cap <<= 1;
        // create segments and segments[0]
        Segment<K,V> s0 =
            new Segment<K,V>(loadFactor, (int)(cap * loadFactor),
                             (HashEntry<K,V>[])new HashEntry[cap]);
        Segment<K,V>[] ss = (Segment<K,V>[])new Segment[ssize];
        UNSAFE.putOrderedObject(ss, SBASE, s0); // ordered write of segments[0]
        this.segments = ss;
    }
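Unlike HashMap, this class's constructor takes an extra concurrencyLevel parameter, which mainly controls the number of segments. The class's other constructors all delegate to this one, so they are not covered here. The no-argument constructor uses the defaults initialCapacity=16, loadFactor=0.75f, and concurrencyLevel=16. The constructor initializes the number of segments, the capacity of each segment's table, the Segment array, and the first Segment (segments[0]); the remaining Segments are created on demand.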
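The sizing arithmetic in the constructor is easier to follow with concrete numbers. The sketch below repeats just that arithmetic; SegmentSizingDemo and sizing are hypothetical names, the clamping against MAX_SEGMENTS and MAXIMUM_CAPACITY is left out for brevity, and MIN_SEGMENT_TABLE_CAPACITY is taken to be 2 as in the JDK 1.7 source.

```java
import java.util.Arrays;

class SegmentSizingDemo {
    static final int MIN_SEGMENT_TABLE_CAPACITY = 2; // value used by the JDK 1.7 source

    // Returns {ssize, sshift, segmentShift, segmentMask, cap} for the given arguments.
    static int[] sizing(int initialCapacity, int concurrencyLevel) {
        int sshift = 0;
        int ssize = 1;
        while (ssize < concurrencyLevel) { // smallest power of two >= concurrencyLevel
            ++sshift;
            ssize <<= 1;
        }
        int c = initialCapacity / ssize;   // entries per segment, rounded up
        if (c * ssize < initialCapacity) {
            ++c;
        }
        int cap = MIN_SEGMENT_TABLE_CAPACITY;
        while (cap < c) {                  // smallest power of two >= c
            cap <<= 1;
        }
        return new int[] { ssize, sshift, 32 - sshift, ssize - 1, cap };
    }

    public static void main(String[] args) {
        // Defaults: 16 segments, segmentShift 28, segmentMask 15, 2 buckets per segment.
        System.out.println(Arrays.toString(sizing(16, 16)));  // prints [16, 4, 28, 15, 2]
        // concurrencyLevel 10 is rounded up to 16 segments; 100 / 16 rounds up to 7, then to 8.
        System.out.println(Arrays.toString(sizing(100, 10))); // prints [16, 4, 28, 15, 8]
    }
}
```

Note that with the defaults each Segment starts with a table of only 2 buckets, so the map begins with 32 buckets in total rather than 16.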
    public boolean isEmpty() {
        long sum = 0L;
        final Segment<K,V>[] segments = this.segments;
        for (int j = 0; j < segments.length; ++j) {
            Segment<K,V> seg = segmentAt(segments, j);
            if (seg != null) {
                if (seg.count != 0)
                    return false;
                sum += seg.modCount;
            }
        }
        if (sum != 0L) { // recheck unless no modifications
            for (int j = 0; j < segments.length; ++j) {
                Segment<K,V> seg = segmentAt(segments, j);
                if (seg != null) {
                    if (seg.count != 0)
                        return false;
                    sum -= seg.modCount;
                }
            }
            if (sum != 0L)
                return false;
        }
        return true;
    }

    public int size() {
        // Try a few times to get accurate count. On failure due to
        // continuous async changes in table, resort to locking.
        final Segment<K,V>[] segments = this.segments;
        int size;
        boolean overflow; // true if size overflows 32 bits
        long sum;         // sum of modCounts
        long last = 0L;   // previous sum
        int retries = -1; // first iteration isn't retry
        try {
            for (;;) {
                if (retries++ == RETRIES_BEFORE_LOCK) {
                    for (int j = 0; j < segments.length; ++j)
                        ensureSegment(j).lock(); // force creation
                }
                sum = 0L;
                size = 0;
                overflow = false;
                for (int j = 0; j < segments.length; ++j) {
                    Segment<K,V> seg = segmentAt(segments, j);
                    if (seg != null) {
                        sum += seg.modCount;
                        int c = seg.count;
                        if (c < 0 || (size += c) < 0)
                            overflow = true;
                    }
                }
                if (sum == last)
                    break;
                last = sum;
            }
        } finally {
            if (retries > RETRIES_BEFORE_LOCK) {
                for (int j = 0; j < segments.length; ++j)
                    segmentAt(segments, j).unlock();
            }
        }
        return overflow ? Integer.MAX_VALUE : size;
    }
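Both methods follow the same idea: walk every Segment and read its element count. Because Segments can change while the method runs (elements being added or removed), both methods also sum each Segment's modCount to detect concurrent modification. isEmpty never locks: it makes a second pass and returns false if the modCount total changed between the two passes. size first tries without locking, repeating the pass until two consecutive passes see the same modCount total; only if that keeps failing (after RETRIES_BEFORE_LOCK retries) does it lock every Segment, count again, and release the locks in the finally block. These methods are therefore more expensive than their HashMap counterparts (the use cases differ, so the comparison is not very meaningful anyway).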
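The "try without locks first, lock everything only as a last resort" pattern in size() can be isolated in a small sketch. Everything here is a simplified stand-in: OptimisticSizeDemo, Seg, add and newSegments are made-up names, overflow handling is omitted, and RETRIES_BEFORE_LOCK is assumed to be 2.

```java
import java.util.concurrent.locks.ReentrantLock;

class OptimisticSizeDemo {
    static final int RETRIES_BEFORE_LOCK = 2; // assumed value for this sketch

    // Stand-in for a Segment: a lock plus an element count and a modification counter.
    static final class Seg extends ReentrantLock {
        volatile int count;
        volatile int modCount;

        void add() {
            lock();
            try {
                count = count + 1;
                modCount = modCount + 1;
            } finally {
                unlock();
            }
        }
    }

    static Seg[] newSegments(int n) {
        Seg[] segs = new Seg[n];
        for (int i = 0; i < n; i++) {
            segs[i] = new Seg();
        }
        return segs;
    }

    static int size(Seg[] segs) {
        int size;
        long last = 0L;   // modCount total seen by the previous pass
        int retries = -1; // the first pass is not a retry
        try {
            for (;;) {
                if (retries++ == RETRIES_BEFORE_LOCK) {
                    for (Seg s : segs) s.lock(); // give up on optimism: freeze every segment
                }
                long sum = 0L;
                size = 0;
                for (Seg s : segs) {
                    sum += s.modCount;
                    size += s.count;
                }
                if (sum == last) break; // nothing changed since the previous pass
                last = sum;
            }
        } finally {
            if (retries > RETRIES_BEFORE_LOCK) {
                for (Seg s : segs) s.unlock();
            }
        }
        return size;
    }

    public static void main(String[] args) {
        Seg[] segs = newSegments(3);
        segs[0].add();
        segs[0].add();
        segs[2].add();
        System.out.println(size(segs)); // prints 3
    }
}
```

In the quiet case shown in main, the second pass matches the first and the method returns without ever taking a lock.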
    public V put(K key, V value) {
        Segment<K,V> s;
        if (value == null)
            throw new NullPointerException();
        int hash = hash(key);
        int j = (hash >>> segmentShift) & segmentMask;
        if ((s = (Segment<K,V>)UNSAFE.getObject          // nonvolatile; recheck
             (segments, (j << SSHIFT) + SBASE)) == null) //  in ensureSegment
            s = ensureSegment(j);
        return s.put(key, hash, value, false);
    }

    public V putIfAbsent(K key, V value) {
        Segment<K,V> s;
        if (value == null)
            throw new NullPointerException();
        int hash = hash(key);
        int j = (hash >>> segmentShift) & segmentMask;
        if ((s = (Segment<K,V>)UNSAFE.getObject
             (segments, (j << SSHIFT) + SBASE)) == null)
            s = ensureSegment(j);
        return s.put(key, hash, value, true);
    }

    static final class Segment<K,V> extends ReentrantLock implements Serializable {
        final V put(K key, int hash, V value, boolean onlyIfAbsent) {
            HashEntry<K,V> node = tryLock() ? null :
                scanAndLockForPut(key, hash, value);
            V oldValue;
            try {
                HashEntry<K,V>[] tab = table;
                int index = (tab.length - 1) & hash;
                HashEntry<K,V> first = entryAt(tab, index);
                for (HashEntry<K,V> e = first;;) {
                    if (e != null) {
                        K k;
                        if ((k = e.key) == key ||
                            (e.hash == hash && key.equals(k))) {
                            oldValue = e.value;
                            if (!onlyIfAbsent) {
                                e.value = value;
                                ++modCount;
                            }
                            break;
                        }
                        e = e.next;
                    }
                    else {
                        if (node != null)
                            node.setNext(first);
                        else
                            node = new HashEntry<K,V>(hash, key, value, first);
                        int c = count + 1;
                        if (c > threshold && tab.length < MAXIMUM_CAPACITY)
                            rehash(node);
                        else
                            setEntryAt(tab, index, node);
                        ++modCount;
                        count = c;
                        oldValue = null;
                        break;
                    }
                }
            } finally {
                unlock();
            }
            return oldValue;
        }
    }
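The put logic is nested fairly deeply, but it is not complicated if you already know the HashMap source. ConcurrentHashMap.put itself only uses the hash to find the target Segment, which needs no lock; the actual insertion is done by Segment.put. Compared with HashMap.put, the main addition is locking (this class is designed for multithreaded use, after all): Segment.put acquires the lock with tryLock(), falls back to scanAndLockForPut if that fails, and releases the lock in a finally block.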
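From the caller's side, the behavior visible in the source above (the null check, the returned oldValue, and the onlyIfAbsent flag) looks like this. PutSemanticsDemo and describe are hypothetical names used only for this illustration.

```java
import java.util.concurrent.ConcurrentHashMap;

class PutSemanticsDemo {
    static String describe() {
        ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<>();
        Integer first = map.put("a", 1);         // no previous mapping, so null is returned
        Integer second = map.put("a", 2);        // returns the old value 1 and overwrites it
        Integer third = map.putIfAbsent("a", 3); // returns the existing 2 and leaves it in place
        boolean rejected = false;
        try {
            map.put("b", null);                  // null values are refused up front
        } catch (NullPointerException e) {
            rejected = true;
        }
        return first + "," + second + "," + third + "," + map.get("a") + "," + rejected;
    }

    public static void main(String[] args) {
        System.out.println(describe()); // prints null,1,2,2,true
    }
}
```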
    public boolean containsKey(Object key) {
        Segment<K,V> s; // same as get() except no need for volatile value read
        HashEntry<K,V>[] tab;
        int h = hash(key);
        long u = (((h >>> segmentShift) & segmentMask) << SSHIFT) + SBASE;
        if ((s = (Segment<K,V>)UNSAFE.getObjectVolatile(segments, u)) != null &&
            (tab = s.table) != null) {
            for (HashEntry<K,V> e = (HashEntry<K,V>) UNSAFE.getObjectVolatile
                     (tab, ((long)(((tab.length - 1) & h)) << TSHIFT) + TBASE);
                 e != null; e = e.next) {
                K k;
                if ((k = e.key) == key || (e.hash == h && key.equals(k)))
                    return true;
            }
        }
        return false;
    }
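containsKey is a plain lookup. Unlike size, which may end up locking every Segment, it never takes a lock. There is indeed no need for one: if the goal is to add an element only when the key is not already present, use putIfAbsent instead of checking containsKey first.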
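The reason to prefer putIfAbsent over a containsKey check followed by put is that the check and the insert happen as one atomic step. In the sketch below, several threads race to register the same key and exactly one of them sees null as the return value. PutIfAbsentRaceDemo and countWinners are made-up names for this illustration.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class PutIfAbsentRaceDemo {
    // Starts the given number of threads, all trying to install a value for the same key,
    // and returns how many of them actually performed the insertion.
    static int countWinners(int threads) {
        final ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<>();
        final AtomicInteger winners = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                // A null return means this call created the mapping.
                if (map.putIfAbsent("config", id) == null) {
                    winners.incrementAndGet();
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return winners.get();
    }

    public static void main(String[] args) {
        System.out.println(countWinners(8)); // prints 1
    }
}
```

With a separate containsKey check followed by put, two threads could both see the key as missing and both write, so the later write would silently replace the earlier one.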
    public V get(Object key) {
        Segment<K,V> s; // manually integrate access methods to reduce overhead
        HashEntry<K,V>[] tab;
        int h = hash(key);
        long u = (((h >>> segmentShift) & segmentMask) << SSHIFT) + SBASE;
        if ((s = (Segment<K,V>)UNSAFE.getObjectVolatile(segments, u)) != null &&
            (tab = s.table) != null) {
            for (HashEntry<K,V> e = (HashEntry<K,V>) UNSAFE.getObjectVolatile
                     (tab, ((long)(((tab.length - 1) & h)) << TSHIFT) + TBASE);
                 e != null; e = e.next) {
                K k;
                if ((k = e.key) == key || (e.hash == h && key.equals(k)))
                    return e.value;
            }
        }
        return null;
    }

    public V remove(Object key) {
        int hash = hash(key);
        Segment<K,V> s = segmentForHash(hash);
        return s == null ? null : s.remove(key, hash, null);
    }

    static final class Segment<K,V> extends ReentrantLock implements Serializable {
        final V remove(Object key, int hash, Object value) {
            if (!tryLock())
                scanAndLock(key, hash);
            V oldValue = null;
            try {
                HashEntry<K,V>[] tab = table;
                int index = (tab.length - 1) & hash;
                HashEntry<K,V> e = entryAt(tab, index);
                HashEntry<K,V> pred = null;
                while (e != null) {
                    K k;
                    HashEntry<K,V> next = e.next;
                    if ((k = e.key) == key ||
                        (e.hash == hash && key.equals(k))) {
                        V v = e.value;
                        if (value == null || value == v || value.equals(v)) {
                            if (pred == null)
                                setEntryAt(tab, index, next);
                            else
                                pred.setNext(next);
                            ++modCount;
                            --count;
                            oldValue = v;
                        }
                        break;
                    }
                    pred = e;
                    e = next;
                }
            } finally {
                unlock();
            }
            return oldValue;
        }
    }
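The value parameter of Segment.remove is what distinguishes an unconditional removal (value == null, as passed by remove(key) above) from a conditional one. Seen through the public API, assuming the two-argument remove(key, value) of ConcurrentHashMap, the difference looks like this; RemoveDemo and describe are hypothetical names.

```java
import java.util.concurrent.ConcurrentHashMap;

class RemoveDemo {
    static String describe() {
        ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<>();
        map.put("a", 1);
        boolean wrong = map.remove("a", 2); // value does not match, nothing is removed
        boolean right = map.remove("a", 1); // value matches, the entry is removed
        map.put("b", 5);
        Integer removed = map.remove("b");  // unconditional removal returns the old value
        Integer missing = map.remove("b");  // key is gone, so null is returned
        return wrong + "," + right + "," + removed + "," + missing + "," + map.isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(describe()); // prints false,true,5,null,true
    }
}
```

The conditional form is useful when a thread should only remove the entry it believes is there, without clobbering a value that another thread has since replaced.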
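I got lazy here. I have been reading source code every day lately and my head is spinning, so this post ends here (rather abruptly). The main goal is to understand how the implementation works; I will fill in the details later.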