1. Getting Started
Again and again, until you master it. Back when we first touched the java.util package, we all went and read ArrayList, and probably HashMap too (it's a must-ask interview topic, after all). But while we may "know" them, we don't necessarily "understand" them. To get a deeper grasp, I decided to read through them once more and write it up as a blog post, to get closer to real understanding. (The best way to learn something is to teach it to someone else.)
We know the JDK has reached version 11 in production by now, but since my work still runs on JDK 8, I'll use the JDK 8 source for this analysis.
As the classic case of trading space for time, this data structure offers read and write efficiency of O(1) plus O(len(chain)) to walk a bucket's chain. It is the hash table introduced in any basic data-structures course.
It has the following structure.
Structurally, the outer layer is an array; each array slot stores concrete Node elements, and Node is the basic element of a singly linked list.
Its O(1) lookup comes from using the key's hash value (an int that identifies a Java object, produced by hashCode; note it is an int, not a long). The array index is computed as (hash & (len - 1)), and then the chain at that slot is traversed (O(len - 1)) to find the Node whose key matches and return its value.
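The index computation is easy to try out. A tiny sketch (the hash values are arbitrary examples): because len is a power of two, `hash & (len - 1)` is equivalent to a non-negative modulo.

```java
public class IndexDemo {
    public static void main(String[] args) {
        int len = 16; // HashMap table lengths are always powers of two
        for (int hash : new int[]{5, 21, 106079}) {
            // masking with len - 1 keeps only the low bits == hash mod len
            System.out.println((hash & (len - 1)) + " == " + Math.floorMod(hash, len));
        }
    }
}
```

Note that 5 and 21 both map to index 5, which is exactly the collision case the chain exists for.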
For certain reasons (laziness) I won't read the 1.7 code from the IDE; instead I'll lean on this fellow's post: http://www.javashuo.com/article/p-gicrzvyg-er.html
JDK 1.7's HashMap is easy to understand: on put it scans the chain for the key, replaces the value if found, otherwise links the new node at the end (tail insertion). On resize, it walks the array and rehashes every element of every chain into the enlarged array using head insertion, so elements that land on the same slot end up in reverse order.
Now let's get into the 1.8 HashMap source.
To understand a class, start with where it comes from: the constructors.
Tip: in IDEA, open View → Tool Windows → Structure to get a bird's-eye view of the class.
Open the HashMap source and the left panel shows its fields, constructors, and methods.
We see four constructors. The no-arg constructor merely sets the load factor to its default (0.75).
The constructor taking initialCapacity and the one taking initialCapacity and loadFactor do the same work.
Breakdown: (1) if the initial capacity is negative, an IllegalArgumentException is thrown;
(2) if the initial capacity exceeds the maximum capacity, the maximum is used instead;
(3) a load factor that is negative or NaN (front-end folks see NaN all the time) also throws IllegalArgumentException;
(4) the load factor is stored in this object's loadFactor field, and the threshold is computed and stored in its threshold field.
The last constructor loads the default load factor and copies the entries of the given Map into this instance's array storage.
In JDK 1.8's HashMap one invariant holds: the array length is always a power of two, so when you supply an initial capacity, it computes the power of two nearest to (at or above) that number.
And the computation is rather brilliant (and sneaky).
Here we meet a bit operation (>>>), the unsigned right shift: every bit is shifted right and the vacated high bits are filled with zeros (e.g. 0100 >>> 1 = 0010). The goal of the algorithm is to produce a value with a single 1 bit just above cap's highest set bit and zeros elsewhere (e.g. 01011 → 10000).
Taking 01011 as an example:
(1) 01011 - 1 = 01010
(2) 01010 | 00101 = 01111
(3) 01111 | 00011 = 01111
(4) 01111 | 00000 = 01111
(5) ... (the remaining shifts contribute only zeros)
(6) In the return statement: if n < 0, return 1 (the case where cap was 0 to begin with); otherwise, if n is not the maximum value, return n + 1 (10000). That yields the power of two closest to (at or above) cap.
As for why the shifts stop at 16: an int is only 4 bytes (32 bits, i.e. 32 zeros and ones), with one bit used as the sign, so Integer.MAX_VALUE = 2^31 - 1; shifts of 1 + 2 + 4 + 8 + 16 = 31 already smear the highest bit across every lower position.
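For reference, the method being described is HashMap.tableSizeFor in JDK 8; a self-contained copy with a small driver:

```java
public class TableSizeForDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // The JDK 8 rounding trick discussed above: smear the highest
    // set bit of cap - 1 downward, then add 1.
    static int tableSizeFor(int cap) {
        int n = cap - 1;
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

    public static void main(String[] args) {
        System.out.println(tableSizeFor(11)); // 16
        System.out.println(tableSizeFor(16)); // 16 (already a power of two)
        System.out.println(tableSizeFor(0));  // 1  (the n < 0 branch)
    }
}
```

The `cap - 1` up front is what keeps an exact power of two (like 16) from being doubled needlessly.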
Next we move on to the put method.
```java
/**
 * Associates the specified value with the specified key in this map.
 * If the map previously contained a mapping for the key, the old
 * value is replaced.
 *
 * @param key key with which the specified value is to be associated
 * @param value value to be associated with the specified key
 * @return the previous value associated with <tt>key</tt>, or
 *         <tt>null</tt> if there was no mapping for <tt>key</tt>.
 *         (A <tt>null</tt> return can also indicate that the map
 *         previously associated <tt>null</tt> with <tt>key</tt>.)
 */
public V put(K key, V value) {
    return putVal(hash(key), key, value, false, true);
}

/**
 * Implements Map.put and related methods
 *
 * @param hash hash for key
 * @param key the key
 * @param value the value to put
 * @param onlyIfAbsent if true, don't change existing value
 * @param evict if false, the table is in creation mode.
 * @return previous value, or null if none
 */
final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
               boolean evict) {
    Node<K,V>[] tab; Node<K,V> p; int n, i;
    if ((tab = table) == null || (n = tab.length) == 0)
        n = (tab = resize()).length;
    if ((p = tab[i = (n - 1) & hash]) == null)
        tab[i] = newNode(hash, key, value, null);
    else {
        Node<K,V> e; K k;
        if (p.hash == hash &&
            ((k = p.key) == key || (key != null && key.equals(k))))
            e = p;
        else if (p instanceof TreeNode)
            e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
        else {
            for (int binCount = 0; ; ++binCount) {
                if ((e = p.next) == null) {
                    p.next = newNode(hash, key, value, null);
                    if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                        treeifyBin(tab, hash);
                    break;
                }
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    break;
                p = e;
            }
        }
        if (e != null) { // existing mapping for key
            V oldValue = e.value;
            if (!onlyIfAbsent || oldValue == null)
                e.value = value;
            afterNodeAccess(e);
            return oldValue;
        }
    }
    ++modCount;
    if (++size > threshold)
        resize();
    afterNodeInsertion(evict);
    return null;
}
```
As we'd expect, the main idea is: find the slot → handle collisions → tail-insert. Let's look closer.
The familiar put(k, v) form actually calls putVal. If onlyIfAbsent is true, an existing node's value is left unchanged; I won't dwell on evict (the comment says it takes effect when false, signalling table-creation mode).
The first line of putVal declares the working variables: tab (reference to the backing object array), p (the element at the computed slot), n (the table length), i (the slot index).
(1) The first if condition says: if table (the held object array) is null or has length 0, i.e. not yet initialized, it triggers a resize() and takes the returned length. That length n is the rounded-up power of two of initialCapacity if you passed one, else the default of 16.
(2) Good, with the first condition out of the way, another if follows immediately: compute the table slot and check whether it's empty. If it is, easy: wrap the k, v pair in a new Node and store it directly.
The other branch of this if, the else, handles a hash collision, i.e. the slot already holds chain nodes. It is careful here: it first checks whether the head node equals the incoming key (same hash, and keys equal by == or equals); if so, that array element is saved into e for later use.
The next branch: if this node is the root of a tree, it enters the red-black-tree insertion (I won't read into that step).
The last case walks the chain comparing each element; whichever node we break on becomes the e operated on afterwards. One extra wrinkle: if the chain length reaches the threshold of 8, the chain is converted to a red-black tree.
The next step settles up: if e is non-null and onlyIfAbsent permits overwriting, e's value is changed to the incoming value and the old value (oldValue) is returned.
The last few lines bump modCount, then check whether size exceeds threshold; if it does, resize.
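These return values and the onlyIfAbsent flag are visible from the public API (putIfAbsent is the onlyIfAbsent=true path); a quick demo against java.util.HashMap:

```java
import java.util.HashMap;

public class PutDemo {
    public static void main(String[] args) {
        HashMap<String, Integer> m = new HashMap<>();
        System.out.println(m.put("a", 1));         // null: no previous mapping
        System.out.println(m.put("a", 2));         // 1: old value returned, then replaced
        System.out.println(m.putIfAbsent("a", 3)); // 2: existing value is kept
        System.out.println(m.get("a"));            // 2
    }
}
```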
That completes the walkthrough of put.
Next we enter another crucial method, resize, since it is HashMap's key process: expansion.
```java
/**
 * Initializes or doubles table size.  If null, allocates in
 * accord with initial capacity target held in field threshold.
 * Otherwise, because we are using power-of-two expansion, the
 * elements from each bin must either stay at same index, or move
 * with a power of two offset in the new table.
 *
 * @return the table
 */
final Node<K,V>[] resize() {
    Node<K,V>[] oldTab = table;
    int oldCap = (oldTab == null) ? 0 : oldTab.length;
    int oldThr = threshold;
    int newCap, newThr = 0;
    if (oldCap > 0) {
        if (oldCap >= MAXIMUM_CAPACITY) {
            threshold = Integer.MAX_VALUE;
            return oldTab;
        }
        else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                 oldCap >= DEFAULT_INITIAL_CAPACITY)
            newThr = oldThr << 1; // double threshold
    }
    else if (oldThr > 0) // initial capacity was placed in threshold
        newCap = oldThr;
    else {               // zero initial threshold signifies using defaults
        newCap = DEFAULT_INITIAL_CAPACITY;
        newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
    }
    if (newThr == 0) {
        float ft = (float)newCap * loadFactor;
        newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                  (int)ft : Integer.MAX_VALUE);
    }
    threshold = newThr;
    @SuppressWarnings({"rawtypes","unchecked"})
    Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
    table = newTab;
    if (oldTab != null) {
        for (int j = 0; j < oldCap; ++j) {
            Node<K,V> e;
            if ((e = oldTab[j]) != null) {
                oldTab[j] = null;
                if (e.next == null)
                    newTab[e.hash & (newCap - 1)] = e;
                else if (e instanceof TreeNode)
                    ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
                else { // preserve order
                    Node<K,V> loHead = null, loTail = null;
                    Node<K,V> hiHead = null, hiTail = null;
                    Node<K,V> next;
                    do {
                        next = e.next;
                        if ((e.hash & oldCap) == 0) {
                            if (loTail == null)
                                loHead = e;
                            else
                                loTail.next = e;
                            loTail = e;
                        }
                        else {
                            if (hiTail == null)
                                hiHead = e;
                            else
                                hiTail.next = e;
                            hiTail = e;
                        }
                    } while ((e = next) != null);
                    if (loTail != null) {
                        loTail.next = null;
                        newTab[j] = loHead;
                    }
                    if (hiTail != null) {
                        hiTail.next = null;
                        newTab[j + oldCap] = hiHead;
                    }
                }
            }
        }
    }
    return newTab;
}
```
(1) Before resizing, we gather some material (the old table reference, the old array length, the old resize threshold, etc.);
(2) A normal expansion enters the first branch. There, if the capacity has already hit MAXIMUM_CAPACITY, it returns directly without resizing, with threshold set to Integer.MAX_VALUE; otherwise the capacity is doubled with a left shift, and only when the old capacity was at least DEFAULT (i.e. 16) is newThr set to twice the old value;
(3) The second branch takes threshold as the new capacity; this is the case where a nonzero initialCapacity was passed in (it was parked in threshold);
(4) The last branch is the case where no capacity was supplied: use the default capacity and threshold;
(5) The following if only fires when none of the branches above set newThr, in which case newThr is recomputed from newCap and the load factor;
(6) Next it simply allocates a new array and assigns it over the old reference;
(7) If the old array exists (i.e. this is not the initial resize), the elements of the original array must be remapped;
The method is to walk the original array: an isolated node is assigned directly, a red-black tree gets tree handling, and a chain gets some special treatment (a move so slick it threw my back out);
Before explaining this part I consulted other people's write-ups; broadly they say this method sends each element of a chain either to the same index in the new array, or to the same index plus the old array's length;
So I ran an experiment;
Computing a few numbers as a test case shows that not every length obeys this rule.
```java
int[] hashcode = {5, 17, 23, 33};
int old = 10;                      // deliberately NOT a power of two
for (int h : hashcode) {
    System.out.println(h & (old - 1));
}
System.out.println("rehash");
old = old << 1;
for (int h : hashcode) {
    System.out.println(h & (old - 1));
}
```
Only powers of two admit this strategy, because a power of two occupies exactly one 1 bit in binary. An example makes it clearer. Suppose the original length is 16 (10000); expansion gives 16 << 1 = 32 (100000). The index is really computed as h & (len - 1), so the index is decided entirely by the bits below the length's highest bit. For instance 5 (00101) & 15 (01111) = 00101 and 21 (10101) & 15 (01111) = 00101: they collide and necessarily share a chain. After growing 16 → 32, 5 (00101) & 31 (011111) = 00101, while 21 (10101) & 31 (011111) = 10101, which is 16 more than before. The reason: after expansion, the mask gains one more 1 bit; if the original hash has a 1 in that position, the index grows by exactly the old array length.
So the first thing it does is AND each hash with the old array length to test whether that bit is 0: if 0 the element stays put, if 1 it moves by the old array length. It builds the two chains completely and only then assigns them into the new array slots, so the resulting elements are no longer reversed;
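With a power-of-two length the rule from the experiment above does hold; a minimal check (hash values chosen so they all collide at index 5 when the capacity is 16):

```java
public class SplitDemo {
    public static void main(String[] args) {
        int oldCap = 16;
        int[] hashes = {5, 21, 37, 53}; // all map to index 5 under oldCap
        for (int h : hashes) {
            int oldIdx = h & (oldCap - 1);
            int newIdx = h & (2 * oldCap - 1);
            // (h & oldCap) == 0 → stays at oldIdx; otherwise moves to oldIdx + oldCap
            int predicted = ((h & oldCap) == 0) ? oldIdx : oldIdx + oldCap;
            System.out.println(h + ": " + oldIdx + " -> " + newIdx
                               + " (predicted " + predicted + ")");
        }
    }
}
```

5 and 37 stay at index 5; 21 and 53 move to 5 + 16 = 21, exactly the lo/hi split resize performs.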
Is it an improvement over 1.7? Well, one reverses the chain and the other preserves its order; the complexity hasn't really changed. (But slick it is.)
With that, the hardest-to-understand parts of HashMap are all covered.
Good. Let's ride the momentum into CHM, ConcurrentHashMap (Hashtable isn't worth reading). (But this one is a tough chew.)
(I read it for ages; what stumped me most are all these bit tricks, and my grasp of sizeCtl is still only partial. Hard!)
Let's forget 1.7's Segment and look at the 1.8 implementation. Compared with 1.7's segmented locking, 1.8 makes the lock granularity much finer, down to taking the object lock on a bin's head node.
Do the constructors still need reading? Let's skim one.
```java
/**
 * Creates a new, empty map with an initial table size
 * accommodating the specified number of elements without the need
 * to dynamically resize.
 *
 * @param initialCapacity The implementation performs internal
 * sizing to accommodate this many elements.
 * @throws IllegalArgumentException if the initial capacity of
 * elements is negative
 */
public ConcurrentHashMap(int initialCapacity) {
    if (initialCapacity < 0)
        throw new IllegalArgumentException();
    int cap = ((initialCapacity >= (MAXIMUM_CAPACITY >>> 1)) ?
               MAXIMUM_CAPACITY :
               tableSizeFor(initialCapacity + (initialCapacity >>> 1) + 1));
    this.sizeCtl = cap;
}
```
As always it validates the init parameter and converts it to the nearest power of two (note it sizes for initialCapacity * 1.5 + 1 before rounding up); the actual array allocation still happens in putVal.
putVal is a hard read, because it involves multithreaded scenarios (imagination only goes so far).
```java
/**
 * Maps the specified key to the specified value in this table.
 * Neither the key nor the value can be null.
 *
 * <p>The value can be retrieved by calling the {@code get} method
 * with a key that is equal to the original key.
 *
 * @param key key with which the specified value is to be associated
 * @param value value to be associated with the specified key
 * @return the previous value associated with {@code key}, or
 *         {@code null} if there was no mapping for {@code key}
 * @throws NullPointerException if the specified key or value is null
 */
public V put(K key, V value) {
    return putVal(key, value, false);
}

/** Implementation for put and putIfAbsent */
final V putVal(K key, V value, boolean onlyIfAbsent) {
    if (key == null || value == null) throw new NullPointerException();
    int hash = spread(key.hashCode());
    int binCount = 0;
    for (Node<K,V>[] tab = table;;) {
        Node<K,V> f; int n, i, fh;
        if (tab == null || (n = tab.length) == 0)
            tab = initTable();
        else if ((f = tabAt(tab, i = (n - 1) & hash)) == null) {
            if (casTabAt(tab, i, null,
                         new Node<K,V>(hash, key, value, null)))
                break;                   // no lock when adding to empty bin
        }
        else if ((fh = f.hash) == MOVED)
            tab = helpTransfer(tab, f);
        else {
            V oldVal = null;
            synchronized (f) {
                if (tabAt(tab, i) == f) {
                    if (fh >= 0) {
                        binCount = 1;
                        for (Node<K,V> e = f;; ++binCount) {
                            K ek;
                            if (e.hash == hash &&
                                ((ek = e.key) == key ||
                                 (ek != null && key.equals(ek)))) {
                                oldVal = e.val;
                                if (!onlyIfAbsent)
                                    e.val = value;
                                break;
                            }
                            Node<K,V> pred = e;
                            if ((e = e.next) == null) {
                                pred.next = new Node<K,V>(hash, key,
                                                          value, null);
                                break;
                            }
                        }
                    }
                    else if (f instanceof TreeBin) {
                        Node<K,V> p;
                        binCount = 2;
                        if ((p = ((TreeBin<K,V>)f).putTreeVal(hash, key,
                                                              value)) != null) {
                            oldVal = p.val;
                            if (!onlyIfAbsent)
                                p.val = value;
                        }
                    }
                }
            }
            if (binCount != 0) {
                if (binCount >= TREEIFY_THRESHOLD)
                    treeifyBin(tab, i);
                if (oldVal != null)
                    return oldVal;
                break;
            }
        }
    }
    addCount(1L, binCount);
    return null;
}
```
Before reading this code, we need the basics of spin CAS.
Spin CAS simply means looping on CAS until it succeeds, a pattern used heavily in AQS. CAS itself is an optimistic-locking technique: an update only lands when its expected value equals the value currently in memory.
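A minimal spin-CAS in plain java.util.concurrent terms (using AtomicInteger rather than CHM's Unsafe calls, but the same idea: re-read, attempt, retry on failure):

```java
import java.util.concurrent.atomic.AtomicInteger;

public class SpinCasDemo {
    public static void main(String[] args) {
        AtomicInteger counter = new AtomicInteger(0);
        int prev;
        do {
            prev = counter.get();                        // read current value
        } while (!counter.compareAndSet(prev, prev + 1)); // retry if someone raced us
        System.out.println(counter.get()); // 1
    }
}
```

Under contention another thread may change the value between the `get` and the `compareAndSet`; the CAS then fails and the loop simply reads and tries again, without ever blocking.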
Right then. putVal opens by checking that the key and value are non-null;
Then it computes the hash via spread() (I didn't fully parse this bit trick; for the record, it XORs the high 16 bits into the low 16 and masks off the sign bit, so ordinary nodes always carry non-negative hashes, leaving negative values free for control nodes).
Next it enters the spin loop, re-reading table into the working variable tab on every pass. The cases again divide into: (1) the table is uninitialized; (2) the computed slot is empty; (3) a resize is in progress; (4) a chain (or red-black tree) has formed.
First, the initTable method:
```java
/**
 * Initializes table, using the size recorded in sizeCtl.
 */
private final Node<K,V>[] initTable() {
    Node<K,V>[] tab; int sc;
    while ((tab = table) == null || tab.length == 0) {
        if ((sc = sizeCtl) < 0)
            Thread.yield(); // lost initialization race; just spin
        else if (U.compareAndSwapInt(this, SIZECTL, sc, -1)) {
            try {
                if ((tab = table) == null || tab.length == 0) {
                    int n = (sc > 0) ? sc : DEFAULT_CAPACITY;
                    @SuppressWarnings("unchecked")
                    Node<K,V>[] nt = (Node<K,V>[])new Node<?,?>[n];
                    table = tab = nt;
                    sc = n - (n >>> 2);
                }
            } finally {
                sizeCtl = sc;
            }
            break;
        }
    }
    return tab;
}
```
It's the spin-CAS idea again. Initialization only needs one thread: if sizeCtl is observed below 0, some thread is already initializing, so yield the CPU with Thread.yield(). If you are the initializing thread, you first CAS sizeCtl to -1. At this point sizeCtl holds the power-of-two capacity set at construction; it is copied into the thread-local sc, an array of length n = sc (or the default) is created and assigned to table. Then sc = n - (n >>> 2), which is just n - n/4 = 0.75n, and this will serve as the threshold for the next resize; finally sc is written back into sizeCtl. (There's a double-checked-locking flavour in here too.) On success it returns tab;
```java
static final <K,V> boolean casTabAt(Node<K,V>[] tab, int i,
                                    Node<K,V> c, Node<K,V> v) {
    return U.compareAndSwapObject(tab, ((long)i << ASHIFT) + ABASE, c, v);
}
```
The second case: if tabAt finds the corresponding slot null (note the atomic read of the array element here), it CASes the new node in. About the bit arithmetic: the array's payload starts ABASE bytes past the array object's base address, and ABASE is obtained through an Unsafe call; the computation of ASHIFT is a little twisty;
```java
ABASE = U.arrayBaseOffset(ak);
int scale = U.arrayIndexScale(ak);
if ((scale & (scale - 1)) != 0)
    throw new Error("data type scale not a power of two");
ASHIFT = 31 - Integer.numberOfLeadingZeros(scale);
```
From the computation above, ASHIFT is really the power-of-two exponent of the element size; in binary, multiplying by a power of two is just a left shift by that many bits. So base address + header padding + (index shifted into bytes) locates the concrete element, and the CAS lands there;
The third case: while CHM is resizing, migrated slots are replaced by a special node whose hash is -1 (MOVED); hitting one sends the thread into the help-with-resize method. We'll analyze the resize itself when we reach transfer;
And the final case: a double-checked lock (synchronize on the head node, then re-check tabAt). From here on it's just like HashMap: either append at the tail, or match a node by hash and key and update its value; a tree node gets tree handling;
After the insertion there is binCount, the chain (or tree) length at this array slot; if it reaches the threshold TREEIFY_THRESHOLD, treeifyBin performs a resize or a tree conversion. (Resize triggered by a single over-long bin, of all things!)
```java
/* ---------------- Conversion from/to TreeBins -------------- */

/**
 * Replaces all linked nodes in bin at given index unless table is
 * too small, in which case resizes instead.
 */
private final void treeifyBin(Node<K,V>[] tab, int index) {
    Node<K,V> b; int n, sc;
    if (tab != null) {
        if ((n = tab.length) < MIN_TREEIFY_CAPACITY)
            tryPresize(n << 1);
        else if ((b = tabAt(tab, index)) != null && b.hash >= 0) {
            synchronized (b) {
                if (tabAt(tab, index) == b) {
                    TreeNode<K,V> hd = null, tl = null;
                    for (Node<K,V> e = b; e != null; e = e.next) {
                        TreeNode<K,V> p =
                            new TreeNode<K,V>(e.hash, e.key, e.val,
                                              null, null);
                        if ((p.prev = tl) == null)
                            hd = p;
                        else
                            tl.next = p;
                        tl = p;
                    }
                    setTabAt(tab, index, new TreeBin<K,V>(hd));
                }
            }
        }
    }
}
```
As the code shows: if the array length is below 64, it resizes instead; otherwise it converts the bin into a tree structure;
The last method inside putVal is addCount;
```java
/**
 * Adds to count, and if table is too small and not already
 * resizing, initiates transfer. If already resizing, helps
 * perform transfer if work is available.  Rechecks occupancy
 * after a transfer to see if another resize is already needed
 * because resizings are lagging additions.
 *
 * @param x the count to add
 * @param check if <0, don't check resize, if <= 1 only check if uncontended
 */
private final void addCount(long x, int check) {
    CounterCell[] as; long b, s;
    if ((as = counterCells) != null ||
        !U.compareAndSwapLong(this, BASECOUNT, b = baseCount, s = b + x)) {
        CounterCell a; long v; int m;
        boolean uncontended = true;
        if (as == null || (m = as.length - 1) < 0 ||
            (a = as[ThreadLocalRandom.getProbe() & m]) == null ||
            !(uncontended =
              U.compareAndSwapLong(a, CELLVALUE, v = a.value, v + x))) {
            fullAddCount(x, uncontended);
            return;
        }
        if (check <= 1)
            return;
        s = sumCount();
    }
    if (check >= 0) {
        Node<K,V>[] tab, nt; int n, sc;
        while (s >= (long)(sc = sizeCtl) && (tab = table) != null &&
               (n = tab.length) < MAXIMUM_CAPACITY) {
            int rs = resizeStamp(n);
            if (sc < 0) {
                if ((sc >>> RESIZE_STAMP_SHIFT) != rs || sc == rs + 1 ||
                    sc == rs + MAX_RESIZERS || (nt = nextTable) == null ||
                    transferIndex <= 0)
                    break;
                if (U.compareAndSwapInt(this, SIZECTL, sc, sc + 1))
                    transfer(tab, nt);
            }
            else if (U.compareAndSwapInt(this, SIZECTL, sc,
                                         (rs << RESIZE_STAMP_SHIFT) + 2))
                transfer(tab, null);
            s = sumCount();
        }
    }
}
```
Through a pile of fiddly operations this method tallies the total element count, then decides whether a resize is needed. (Exhausting to read; it's all sizeCtl state transitions.)
From this we can see that every insertion calls addCount to decide whether to expand, so the resize trigger lives essentially in addCount; of course, as we saw with the tree-conversion check, treeifyBin can also trigger one, but only while length < 64.
Two more methods and we'll have a clear outline of CHM.
```java
/**
 * Tries to presize table to accommodate the given number of elements.
 *
 * @param size number of elements (doesn't need to be perfectly accurate)
 */
private final void tryPresize(int size) {
    int c = (size >= (MAXIMUM_CAPACITY >>> 1)) ? MAXIMUM_CAPACITY :
        tableSizeFor(size + (size >>> 1) + 1);
    int sc;
    while ((sc = sizeCtl) >= 0) {
        Node<K,V>[] tab = table; int n;
        if (tab == null || (n = tab.length) == 0) {
            n = (sc > c) ? sc : c;
            if (U.compareAndSwapInt(this, SIZECTL, sc, -1)) {
                try {
                    if (table == tab) {
                        @SuppressWarnings("unchecked")
                        Node<K,V>[] nt = (Node<K,V>[])new Node<?,?>[n];
                        table = nt;
                        sc = n - (n >>> 2);
                    }
                } finally {
                    sizeCtl = sc;
                }
            }
        }
        else if (c <= sc || n >= MAXIMUM_CAPACITY)
            break;
        else if (tab == table) {
            int rs = resizeStamp(n);
            if (sc < 0) {
                Node<K,V>[] nt;
                if ((sc >>> RESIZE_STAMP_SHIFT) != rs || sc == rs + 1 ||
                    sc == rs + MAX_RESIZERS || (nt = nextTable) == null ||
                    transferIndex <= 0)
                    break;
                if (U.compareAndSwapInt(this, SIZECTL, sc, sc + 1))
                    transfer(tab, nt);
            }
            else if (U.compareAndSwapInt(this, SIZECTL, sc,
                                         (rs << RESIZE_STAMP_SHIFT) + 2))
                transfer(tab, null);
        }
    }
}
```
```java
/**
 * Moves and/or copies the nodes in each bin to new table. See
 * above for explanation.
 */
private final void transfer(Node<K,V>[] tab, Node<K,V>[] nextTab) {
    int n = tab.length, stride;
    if ((stride = (NCPU > 1) ? (n >>> 3) / NCPU : n) < MIN_TRANSFER_STRIDE)
        stride = MIN_TRANSFER_STRIDE; // subdivide range
    if (nextTab == null) {            // initiating
        try {
            @SuppressWarnings("unchecked")
            Node<K,V>[] nt = (Node<K,V>[])new Node<?,?>[n << 1];
            nextTab = nt;
        } catch (Throwable ex) {      // try to cope with OOME
            sizeCtl = Integer.MAX_VALUE;
            return;
        }
        nextTable = nextTab;
        transferIndex = n;
    }
    int nextn = nextTab.length;
    ForwardingNode<K,V> fwd = new ForwardingNode<K,V>(nextTab);
    boolean advance = true;
    boolean finishing = false; // to ensure sweep before committing nextTab
    for (int i = 0, bound = 0;;) {
        Node<K,V> f; int fh;
        while (advance) {
            int nextIndex, nextBound;
            if (--i >= bound || finishing)
                advance = false;
            else if ((nextIndex = transferIndex) <= 0) {
                i = -1;
                advance = false;
            }
            else if (U.compareAndSwapInt
                     (this, TRANSFERINDEX, nextIndex,
                      nextBound = (nextIndex > stride ?
                                   nextIndex - stride : 0))) {
                bound = nextBound;
                i = nextIndex - 1;
                advance = false;
            }
        }
        if (i < 0 || i >= n || i + n >= nextn) {
            int sc;
            if (finishing) {
                nextTable = null;
                table = nextTab;
                sizeCtl = (n << 1) - (n >>> 1);
                return;
            }
            if (U.compareAndSwapInt(this, SIZECTL, sc = sizeCtl, sc - 1)) {
                if ((sc - 2) != resizeStamp(n) << RESIZE_STAMP_SHIFT)
                    return;
                finishing = advance = true;
                i = n; // recheck before commit
            }
        }
        else if ((f = tabAt(tab, i)) == null)
            advance = casTabAt(tab, i, null, fwd);
        else if ((fh = f.hash) == MOVED)
            advance = true; // already processed
        else {
            synchronized (f) {
                if (tabAt(tab, i) == f) {
                    Node<K,V> ln, hn;
                    if (fh >= 0) {
                        int runBit = fh & n;
                        Node<K,V> lastRun = f;
                        for (Node<K,V> p = f.next; p != null; p = p.next) {
                            int b = p.hash & n;
                            if (b != runBit) {
                                runBit = b;
                                lastRun = p;
                            }
                        }
                        if (runBit == 0) {
                            ln = lastRun;
                            hn = null;
                        }
                        else {
                            hn = lastRun;
                            ln = null;
                        }
                        for (Node<K,V> p = f; p != lastRun; p = p.next) {
                            int ph = p.hash; K pk = p.key; V pv = p.val;
                            if ((ph & n) == 0)
                                ln = new Node<K,V>(ph, pk, pv, ln);
                            else
                                hn = new Node<K,V>(ph, pk, pv, hn);
                        }
                        setTabAt(nextTab, i, ln);
                        setTabAt(nextTab, i + n, hn);
                        setTabAt(tab, i, fwd);
                        advance = true;
                    }
                    else if (f instanceof TreeBin) {
                        TreeBin<K,V> t = (TreeBin<K,V>)f;
                        TreeNode<K,V> lo = null, loTail = null;
                        TreeNode<K,V> hi = null, hiTail = null;
                        int lc = 0, hc = 0;
                        for (Node<K,V> e = t.first; e != null; e = e.next) {
                            int h = e.hash;
                            TreeNode<K,V> p = new TreeNode<K,V>
                                (h, e.key, e.val, null, null);
                            if ((h & n) == 0) {
                                if ((p.prev = loTail) == null)
                                    lo = p;
                                else
                                    loTail.next = p;
                                loTail = p;
                                ++lc;
                            }
                            else {
                                if ((p.prev = hiTail) == null)
                                    hi = p;
                                else
                                    hiTail.next = p;
                                hiTail = p;
                                ++hc;
                            }
                        }
                        ln = (lc <= UNTREEIFY_THRESHOLD) ? untreeify(lo) :
                            (hc != 0) ? new TreeBin<K,V>(lo) : t;
                        hn = (hc <= UNTREEIFY_THRESHOLD) ? untreeify(hi) :
                            (lc != 0) ? new TreeBin<K,V>(hi) : t;
                        setTabAt(nextTab, i, ln);
                        setTabAt(nextTab, i + n, hn);
                        setTabAt(tab, i, fwd);
                        advance = true;
                    }
                }
            }
        }
    }
}
```
The first method, tryPresize, is called from the tree-conversion path when length < 64, expanding n to n << 1. The while loop checks that sizeCtl is non-negative before doing anything; inside, there's yet another init-style branch (the table may still be null); then the relationship between c and sc decides whether to stop; the last part has two cases, depending on whether a nextTable exists, and if there is no resize in progress it just returns. (I ran out of patience for the full sizeCtl state machine.)
These two call sites are what tell transfer whether it needs to allocate a new nextTable;
Now the highlight, the transfer method. Its data-migration scheme: each thread claims an interval of slots, working from high indexes down. Say the length is 64 and, by default, a thread claims 16 slots at a time: the shared variable starts at 64; one claim moves it to 48, meaning that thread will process slots 48-63; a thread that finds it at 0 can claim nothing and stands down. The placeholder means a slot's migration is finished; a put that sees hash == -1 comes over to help with the resize. bound is the lower bound of a thread's claimed interval;
The amount of data each thread handles is computed from the number of CPUs;
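The stride computation at the top of transfer can be sketched like this (the helper name `stride` is mine; MIN_TRANSFER_STRIDE is 16 in JDK 8):

```java
public class StrideDemo {
    static final int MIN_TRANSFER_STRIDE = 16;

    // Mirrors: (stride = (NCPU > 1) ? (n >>> 3) / NCPU : n) < MIN_TRANSFER_STRIDE
    static int stride(int n, int ncpu) {
        int s = (ncpu > 1) ? (n >>> 3) / ncpu : n;
        return Math.max(s, MIN_TRANSFER_STRIDE); // never claim fewer than 16 slots
    }

    public static void main(String[] args) {
        System.out.println(stride(1024, 8));  // (1024/8)/8 = 16
        System.out.println(stride(64, 8));    // 1, clamped up to 16
        System.out.println(stride(65536, 4)); // (65536/8)/4 = 2048
    }
}
```

So a table is divided into at most n/8/NCPU chunks, and small tables are migrated by a single claim.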
總的來講,仍是之前的思路,&那個位數是0仍是1,是原位仍是+n,作完以後,將原來數組設置爲ForwardingNode。作完以後,正在put的就知道你在擴容了,就不會往老數組寫入數組了,保證了一致性;
In conclusion, 1.8's CHM performs insertion, expansion and the rest via spin CAS, and coordinates the phases by reading sizeCtl. How is consistency guaranteed? Without a resize it's clear-cut: the head-node object lock or CAS resolves conflicts neatly. The hardest part to picture is a resize racing an insert. Imagine: a newTable exists and migration is running; you hold bin a (locked); the resizer keeps requesting that lock, and once it gets it, migrates the data to slot a or a + n of the new array and marks the old slot MOVED (-1). Meanwhile the spinning putVal, still holding the old table reference, reads that slot, sees MOVED, realizes a resize is underway, goes over to help, and eventually obtains the post-resize tab; then it takes the lock again, double-checks, and only then inserts into the chain.
Which brings us to LinkedHashMap. As we can see, it is a subclass of HashMap: besides the overall array-plus-chains layout, its elements are additionally linked to each other in a doubly linked list, which preserves insertion order;
```java
public class LinkedHashMap<K,V> extends HashMap<K,V> implements Map<K,V>
```
```java
/**
 * HashMap.Node subclass for normal LinkedHashMap entries.
 */
static class Entry<K,V> extends HashMap.Node<K,V> {
    Entry<K,V> before, after;
    Entry(int hash, K key, V value, Node<K,V> next) {
        super(hash, key, value, next);
    }
}
```
Great extensibility: by overriding a handful of methods it changes the structure on top of the base. Once HashMap is understood, this data structure becomes easy too; the main difference is in iteration.
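The insertion-order guarantee is easy to see from the outside:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class OrderDemo {
    public static void main(String[] args) {
        Map<String, Integer> m = new LinkedHashMap<>();
        m.put("c", 3);
        m.put("a", 1);
        m.put("b", 2);
        // iteration follows insertion order, not hash order
        System.out.println(m.keySet()); // [c, a, b]
    }
}
```

(A plain HashMap makes no ordering promise at all; LinkedHashMap also offers an accessOrder constructor flag that reorders on get, the basis for LRU caches.)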
First, our own Map will have this shape: a Node class for the elements, and a Map class holding the Node[] array.
```java
public class DataStructure<K,V> implements Serializable {

    private volatile Node<K,V>[] tab;
    private final int DEFAULT_CAPACITY = 1 << 4;

    public DataStructure() {
        tab = new Node[DEFAULT_CAPACITY];
    }
}
```
```java
class Node<K,V> {
    private K key;
    private V value;
    private Node next;
    private int hash;

    Node() {
    }

    Node(K key, V value, Node next, int hash) {
        this.key = key;
        this.value = value;
        this.next = next;
        this.hash = hash;
    }

    public K getKey() { return key; }
    public void setKey(K key) { this.key = key; }
    public V getValue() { return value; }
    public void setValue(V value) { this.value = value; }
    public Node getNext() { return next; }
    public void setNext(Node next) { this.next = next; }
    public int getHash() { return hash; }
    public void setHash(int hash) { this.hash = hash; }
}
```
And the three methods that keep this data structure going: put, get, and resize;
```java
public abstract class BaseStructure<K,V> {

    class Node<K,V> {
        private K key;
        private V value;
        private Node next;
        private int hash;

        Node() {
        }

        Node(K key, V value, Node next, int hash) {
            this.key = key;
            this.value = value;
            this.next = next;
            this.hash = hash;
        }

        public K getKey() { return key; }
        public void setKey(K key) { this.key = key; }
        public V getValue() { return value; }
        public void setValue(V value) { this.value = value; }
        public Node getNext() { return next; }
        public void setNext(Node next) { this.next = next; }
        public int getHash() { return hash; }
        public void setHash(int hash) { this.hash = hash; }
    }

    public abstract void put(int hash, K key, V value);

    public abstract V get(int hash, K key);

    public abstract void resize();
}
```
And the class that actually implements it. (I've fixed a few bugs from my first draft: resize must happen before the index is computed, the duplicate-key check has to cover the tail node too, and the chain split must test each node's own hash rather than the head's.)
```java
import java.io.Serializable;

public class DataStructure<K,V> extends BaseStructure<K,V> implements Serializable {

    private Node<K,V>[] tab;
    private static final int DEFAULT_CAPACITY = 1 << 2;
    private final double factor = 0.75f;
    private int size;

    @SuppressWarnings("unchecked")
    public DataStructure() {
        tab = new Node[DEFAULT_CAPACITY];
    }

    public void put(int hash, K key, V value) {
        if (key == null) {
            throw new NullPointerException();
        }
        // rough expansion check FIRST, so the index below is computed
        // against the final table (computing it before resize was a bug)
        if (size + 1 > (tab.length * factor)) {
            resize();
        }
        int len = tab.length;
        int index = hash & (len - 1);
        Node<K,V> e = tab[index];
        if (e == null) {
            tab[index] = new Node<>(key, value, null, hash);
            size++;
        } else {
            while (true) {
                // compare every node, including the tail (the old loop
                // skipped the last node's key)
                if (e.getHash() == hash && e.getKey().equals(key)) {
                    e.setValue(value);  // existing key: replace value
                    return;
                }
                if (e.getNext() == null) {
                    break;              // e is now the tail
                }
                e = e.getNext();
            }
            e.setNext(new Node<>(key, value, null, hash)); // tail insertion
            size++;
        }
    }

    public V get(int hash, K key) {
        if (key == null) {
            throw new NullPointerException();
        }
        Node<K,V> node = tab[hash & (tab.length - 1)];
        while (node != null) {
            if (node.getHash() == hash && node.getKey().equals(key)) {
                return node.getValue();
            }
            node = node.getNext();
        }
        return null;
    }

    @Override
    @SuppressWarnings("unchecked")
    public void resize() {
        int len = tab.length;
        int newLen = len << 1;
        Node<K,V>[] newTab = new Node[newLen];
        for (int i = 0; i < len; i++) {
            Node<K,V> node = tab[i];
            if (node == null) {
                continue;
            } else if (node.getNext() == null) {
                newTab[node.getHash() & (newLen - 1)] = node;
            } else {
                // split the chain into lo (stays at i) and hi (moves to i + len)
                Node<K,V> loHead = null, loTail = null, hiHead = null, hiTail = null;
                Node<K,V> e = node;
                Node<K,V> next;
                do {
                    next = e.getNext();
                    e.setNext(null);
                    if ((e.getHash() & len) == 0) { // test e's bit, not the head's
                        if (loHead == null) {
                            loHead = loTail = e;
                        } else {
                            loTail.setNext(e);
                            loTail = e;
                        }
                    } else {
                        if (hiHead == null) {
                            hiHead = hiTail = e;
                        } else {
                            hiTail.setNext(e);
                            hiTail = e;
                        }
                    }
                } while ((e = next) != null);
                newTab[i] = loHead;
                newTab[i + len] = hiHead;
            }
        }
        tab = newTab;
    }
}
```