1. Translation of the HashMap Javadoc
```java
/**
 * Hash table based implementation of the <tt>Map</tt> interface.  This
 * implementation provides all of the optional map operations, and permits
 * <tt>null</tt> values and the <tt>null</tt> key.  (The <tt>HashMap</tt>
 * class is roughly equivalent to <tt>Hashtable</tt>, except that it is
 * unsynchronized and permits nulls.)  This class makes no guarantees as to
 * the order of the map; in particular, it does not guarantee that the order
 * will remain constant over time.
 *
 * <p>This implementation provides constant-time performance for the basic
 * operations (<tt>get</tt> and <tt>put</tt>), assuming the hash function
 * disperses the elements properly among the buckets.  Iteration over
 * collection views requires time proportional to the "capacity" of the
 * <tt>HashMap</tt> instance (the number of buckets) plus its size (the number
 * of key-value mappings).  Thus, it's very important not to set the initial
 * capacity too high (or the load factor too low) if iteration performance is
 * important.
 *
 * <p>An instance of <tt>HashMap</tt> has two parameters that affect its
 * performance: <i>initial capacity</i> and <i>load factor</i>.  The
 * <i>capacity</i> is the number of buckets in the hash table, and the initial
 * capacity is simply the capacity at the time the hash table is created.  The
 * <i>load factor</i> is a measure of how full the hash table is allowed to
 * get before its capacity is automatically increased.  When the number of
 * entries in the hash table exceeds the product of the load factor and the
 * current capacity, the hash table is <i>rehashed</i> (that is, internal data
 * structures are rebuilt) so that the hash table has approximately twice the
 * number of buckets.
 *
 * <p>As a general rule, the default load factor (.75) offers a good
 * tradeoff between time and space costs.  Higher values decrease the
 * space overhead but increase the lookup cost (reflected in most of
 * the operations of the <tt>HashMap</tt> class, including
 * <tt>get</tt> and <tt>put</tt>).  The expected number of entries in
 * the map and its load factor should be taken into account when
 * setting its initial capacity, so as to minimize the number of
 * rehash operations.  If the initial capacity is greater than the
 * maximum number of entries divided by the load factor, no rehash
 * operations will ever occur.
 *
 * <p>If many mappings are to be stored in a <tt>HashMap</tt>
 * instance, creating it with a sufficiently large capacity will allow
 * the mappings to be stored more efficiently than letting it perform
 * automatic rehashing as needed to grow the table.  Note that using
 * many keys with the same {@code hashCode()} is a sure way to slow
 * down performance of any hash table.  To ameliorate impact, when keys
 * are {@link Comparable}, this class may use comparison order among
 * keys to help break ties.
 *
 * <p><strong>Note that this implementation is not synchronized.</strong>
 * If multiple threads access a hash map concurrently, and at least one of
 * the threads modifies the map structurally, it <i>must</i> be
 * synchronized externally.  (A structural modification is any operation
 * that adds or deletes one or more mappings; merely changing the value
 * associated with a key that an instance already contains is not a
 * structural modification.)  This is typically accomplished by
 * synchronizing on some object that naturally encapsulates the map.
 *
 * If no such object exists, the map should be "wrapped" using the
 * {@link Collections#synchronizedMap Collections.synchronizedMap}
 * method.  This is best done at creation time, to prevent accidental
 * unsynchronized access to the map:<pre>
 *   Map m = Collections.synchronizedMap(new HashMap(...));</pre>
 *
 * <p>The iterators returned by all of this class's "collection view methods"
 * are <i>fail-fast</i>: if the map is structurally modified at any time after
 * the iterator is created, in any way except through the iterator's own
 * <tt>remove</tt> method, the iterator will throw a
 * {@link ConcurrentModificationException}.  Thus, in the face of concurrent
 * modification, the iterator fails quickly and cleanly, rather than risking
 * arbitrary, non-deterministic behavior at an undetermined time in the
 * future.
 *
 * <p>Note that the fail-fast behavior of an iterator cannot be guaranteed
 * as it is, generally speaking, impossible to make any hard guarantees in the
 * presence of unsynchronized concurrent modification.  Fail-fast iterators
 * throw <tt>ConcurrentModificationException</tt> on a best-effort basis.
 * Therefore, it would be wrong to write a program that depended on this
 * exception for its correctness: <i>the fail-fast behavior of iterators
 * should be used only to detect bugs.</i>
 *
 * <p>This class is a member of the
 * <a href="{@docRoot}/../technotes/guides/collections/index.html">
 * Java Collections Framework</a>.
 *
 * @param <K> the type of keys maintained by this map
 * @param <V> the type of mapped values
 *
 * @author  Doug Lea
 * @author  Josh Bloch
 * @author  Arthur van Hoff
 * @author  Neal Gafter
 * @see     Object#hashCode()
 * @see     Collection
 * @see     Map
 * @see     TreeMap
 * @see     Hashtable
 * @since   1.2
 */
```

In summary: HashMap is an unsynchronized, null-permitting counterpart to Hashtable that makes no ordering guarantees, not even that the order stays stable over time. Given a hash function that spreads keys well across the buckets, `get` and `put` run in constant time, while iterating a collection view costs time proportional to the capacity (bucket count) plus the size (number of mappings), so an oversized initial capacity or an undersized load factor hurts iteration. Two parameters govern performance: the initial capacity and the load factor, which says how full the table may get before it is rehashed into roughly twice as many buckets. The default load factor of 0.75 is a good time/space tradeoff; choosing an initial capacity greater than the expected number of entries divided by the load factor means no rehash ever occurs, which is cheaper than letting the table grow on demand. Many keys sharing one `hashCode()` will slow any hash table down, but for `Comparable` keys the class uses comparison order to break ties and soften the impact. The class is not thread-safe: any structural modification (adding or removing mappings, as opposed to merely updating an existing key's value) under concurrent access requires external synchronization, typically by synchronizing on an object that encapsulates the map, or by wrapping it with `Collections.synchronizedMap` at creation time. Finally, the iterators returned by the collection views are fail-fast on a best-effort basis only: they throw `ConcurrentModificationException` when they detect a concurrent structural modification, but this must be used only to find bugs, never relied upon for program correctness.
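The sizing rule above ("no rehash if the initial capacity is greater than the maximum number of entries divided by the load factor") can be made concrete. The sketch below is mine, not JDK code; the class name `CapacityDemo` and helper `sizeFor` are illustrative:

```java
public class CapacityDemo {
    // Pick an initial capacity strictly greater than
    // expectedEntries / loadFactor, so no rehash ever occurs.
    public static int sizeFor(int expectedEntries, float loadFactor) {
        return (int) (expectedEntries / loadFactor) + 1;
    }

    public static void main(String[] args) {
        // To hold 1000 mappings at the default 0.75 load factor,
        // request at least 1334 buckets up front instead of letting
        // the table grow (and rehash) from the default 16.
        System.out.println(sizeFor(1000, 0.75f)); // 1334
        java.util.Map<Integer, Integer> m =
                new java.util.HashMap<>(sizeFor(1000, 0.75f));
        for (int i = 0; i < 1000; i++) m.put(i, i);
        System.out.println(m.size()); // 1000
    }
}
```

(The `+ 1` covers exact divisions such as 768 / 0.75 = 1024, where equality alone would still trigger one resize.)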
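The recommended wrapping pattern from the Javadoc looks like this in practice; note that compound actions and iteration over a `synchronizedMap` still require manual locking on the wrapper object (the class name here is illustrative):

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class SyncMapDemo {
    public static Map<String, Integer> wrap() {
        // Wrap at creation time, as the Javadoc advises, so no code path
        // ever sees the raw unsynchronized HashMap.
        Map<String, Integer> m = Collections.synchronizedMap(new HashMap<>());
        m.put("a", 1);
        return m;
    }

    public static void main(String[] args) {
        Map<String, Integer> m = wrap();
        // Iteration is NOT atomic under the wrapper's per-call locking,
        // so it must be guarded explicitly:
        synchronized (m) {
            for (Map.Entry<String, Integer> e : m.entrySet()) {
                System.out.println(e.getKey() + "=" + e.getValue());
            }
        }
    }
}
```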
2. The HashMap put Source Code
Compared with Hashtable, HashMap first computes a better hash for the inserted key by folding the high 16 bits into the hash up front. In the linked-list stage HashMap appends at the tail, whereas Hashtable inserts at the head. And once a bin passes a threshold, HashMap converts its list into a red-black tree, bounding worst-case lookup, update, and removal at O(log N).
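A small experiment shows why the extra XOR matters. With the default table size of 16 the power-of-two index mask keeps only the low 4 bits, so two hashes that differ only in their high bits collide; spreading the high half down separates them. The class and method names below are illustrative, mirroring `HashMap.hash()`:

```java
public class HashSpreadDemo {
    // Same transform as HashMap.hash(): XOR the high 16 bits into the low 16.
    public static int spread(int h) {
        return h ^ (h >>> 16);
    }

    public static void main(String[] args) {
        int mask = 16 - 1; // default table size 16 -> mask of the low 4 bits
        int h1 = 0x00010001, h2 = 0x00020001; // differ only in bits 16-17
        // Plain masking ignores the high bits: both land in bucket 1.
        System.out.println((h1 & mask) == (h2 & mask));                   // true
        // After spreading, the high bits reach the index: buckets 0 and 3.
        System.out.println((spread(h1) & mask) == (spread(h2) & mask));   // false
    }
}
```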
```java
public V put(K key, V value) {
    return putVal(hash(key), key, value, false, true);
}

/**
 * Computes key.hashCode() and spreads (XORs) higher bits of hash
 * to lower.  Because the table uses power-of-two masking, sets of
 * hashes that vary only in bits above the current mask will
 * always collide.  (Among known examples are sets of Float keys
 * holding consecutive whole numbers in small tables.)  So we
 * apply a transform that spreads the impact of higher bits
 * downward.  There is a tradeoff between speed, utility, and
 * quality of bit-spreading.  Because many common sets of hashes
 * are already reasonably distributed (so don't benefit from
 * spreading), and because we use trees to handle large sets of
 * collisions in bins, we just XOR some shifted bits in the
 * cheapest possible way to reduce systematic lossage, as well as
 * to incorporate impact of the highest bits that would otherwise
 * never be used in index calculations because of table bounds.
 */
// An int has 32 bits. Since the table length is capped at 2^30 (and in
// practice grows to that size only after a very long time), the
// power-of-two index mask only ever sees the low bits, so the high 16
// bits would otherwise never influence the bucket index. XOR-ing them
// into the low 16 bits folds their effect in as cheaply as possible.
// Note that >>> is the unsigned (zero-filling) right shift.
static final int hash(Object key) {
    int h;
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}

/** Stores the key-value mapping. */
final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
               boolean evict) {
    Node<K,V>[] tab; Node<K,V> p; int n, i;
    if ((tab = table) == null || (n = tab.length) == 0)
        // Lazily initialize the table (default capacity 16)
        n = (tab = resize()).length;
    // Index into the table with (n - 1) & hash (n - 1 because indices are
    // zero-based); if the bucket is empty, create the node directly
    if ((p = tab[i = (n - 1) & hash]) == null)
        // New node with this key and value, next pointer null
        tab[i] = newNode(hash, key, value, null);
    else {
        // The bucket is occupied: resolve the collision
        Node<K,V> e; K k;
        // Same hash and equal key: remember this node as the one to update
        if (p.hash == hash &&
            ((k = p.key) == key || (key != null && key.equals(k))))
            e = p;
        // The bucket already holds a tree: insert into the red-black tree
        else if (p instanceof TreeNode)
            e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
        else {
            // Different key in the same bucket: walk the linked list
            for (int binCount = 0; ; ++binCount) {
                if ((e = p.next) == null) {
                    // Reached the tail: append a new node
                    p.next = newNode(hash, key, value, null);
                    // If the list length now reaches the threshold
                    // (default 8), convert the bin to a tree
                    if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                        treeifyBin(tab, hash);
                    break;
                }
                // Found an existing node with the same key
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    break;
                // Keep walking
                p = e;
            }
        }
        // The key already exists in the table: overwrite the value.
        // A pure value update is not a structural modification, so we
        // return here without touching modCount below
        if (e != null) { // existing mapping for key
            V oldValue = e.value;
            if (!onlyIfAbsent || oldValue == null)
                e.value = value;
            afterNodeAccess(e);
            return oldValue;
        }
    }
    // Count the structural modification
    ++modCount;
    // Rebuild the table once the size exceeds the threshold
    if (++size > threshold)
        resize();
    afterNodeInsertion(evict);
    return null;
}

/**
 * Replaces all linked nodes in bin at index for given hash unless
 * table is too small, in which case resizes instead.
 * (Converts the bin selected by the given hash into a tree.)
 */
final void treeifyBin(Node<K,V>[] tab, int hash) {
    int n, index; Node<K,V> e;
    // If the table is smaller than MIN_TREEIFY_CAPACITY (64, at least
    // 4 x TREEIFY_THRESHOLD), resize instead of treeifying, to avoid
    // conflicts between table growth and treeification
    if (tab == null || (n = tab.length) < MIN_TREEIFY_CAPACITY)
        resize();
    else if ((e = tab[index = (n - 1) & hash]) != null) {
        TreeNode<K,V> hd = null, tl = null;
        do {
            // Convert each list node into a tree node
            TreeNode<K,V> p = replacementTreeNode(e, null);
            if (tl == null)
                // The first node visited becomes the head
                hd = p;
            else {
                // Link the nodes into a doubly linked list
                p.prev = tl;
                tl.next = p;
            }
            tl = p;
        } while ((e = e.next) != null);
        if ((tab[index] = hd) != null)
            // Perform the actual tree conversion
            hd.treeify(tab);
    }
}

/**
 * Forms tree of the nodes linked from this node.
 */
final void treeify(Node<K,V>[] tab) {
    TreeNode<K,V> root = null;
    // Walk the list in order
    for (TreeNode<K,V> x = this, next; x != null; x = next) {
        // Remember the next node, then clear this node's subtrees
        next = (TreeNode<K,V>)x.next;
        x.left = x.right = null;
        if (root == null) {
            // The first node becomes the root: no parent, and a root
            // must be black
            x.parent = null;
            x.red = false;
            root = x;
        }
        else {
            K k = x.key;
            int h = x.hash;
            Class<?> kc = null;
            // Descend from the root to find the insertion point
            for (TreeNode<K,V> p = root;;) {
                int dir, ph; // ph is p's hash
                K pk = p.key;
                // A larger stored hash means descend left; smaller, right
                if ((ph = p.hash) > h)
                    dir = -1;
                else if (ph < h)
                    dir = 1;
                // Equal hashes: fall back to Comparable order, then to a
                // deterministic tie-breaking rule
                else if ((kc == null &&
                          (kc = comparableClassFor(k)) == null) ||
                         (dir = compareComparables(kc, k, pk)) == 0)
                    dir = tieBreakOrder(k, pk);

                TreeNode<K,V> xp = p;
                if ((p = (dir <= 0) ? p.left : p.right) == null) {
                    x.parent = xp;
                    if (dir <= 0)
                        xp.left = x;
                    else
                        xp.right = x;
                    // Up to this point this is plain binary search tree
                    // insertion. Inserting may break the red-black
                    // invariants, so the heavy lifting follows:
                    // balanceInsertion takes the root and the new node
                    // and returns the (possibly new) root
                    root = balanceInsertion(root, x);
                    break;
                }
            }
        }
    }
    moveRootToFront(tab, root);
}

// Left rotation
static <K,V> TreeNode<K,V> rotateLeft(TreeNode<K,V> root,
                                      TreeNode<K,V> p) {
    TreeNode<K,V> r, pp, rl;
    // Only rotate when p and its right child both exist
    if (p != null && (r = p.right) != null) {
        // Detach r's left subtree and hang it as p's right subtree,
        // freeing r to be lifted above p; rl must also be re-parented to p
        if ((rl = p.right = r.left) != null)
            rl.parent = p;
        // Attach r to p's old parent; if there is none, r becomes the
        // root and must be colored black
        if ((pp = r.parent = p.parent) == null)
            (root = r).red = false;
        // If p was its parent's left child, r takes that slot
        else if (pp.left == p)
            pp.left = r;
        // Otherwise r takes the right slot
        else
            pp.right = r;
        // Finally p sinks to become r's left child; the steps above
        // simply exchanged parent and subtree links
        r.left = p;
        p.parent = r;
    }
    return root;
}

// Right rotation
static <K,V> TreeNode<K,V> rotateRight(TreeNode<K,V> root,
                                       TreeNode<K,V> p) {
    TreeNode<K,V> l, pp, lr;
    // Only rotate when p and its left child both exist
    if (p != null && (l = p.left) != null) {
        // Detach l's right subtree and hang it as p's left subtree,
        // re-parenting it to p
        if ((lr = p.left = l.right) != null)
            lr.parent = p;
        // Attach l to p's old parent; if there is none, l becomes the
        // root and must be colored black
        if ((pp = l.parent = p.parent) == null)
            (root = l).red = false;
        // If p was its parent's right child, l takes that slot
        else if (pp.right == p)
            pp.right = l;
        // Otherwise l takes the left slot
        else
            pp.left = l;
        // p sinks to become l's right child
        l.right = p;
        p.parent = l;
    }
    return root;
}

/* Red-black tree invariants:
 * (1) every node is either red or black;
 * (2) the root is black;
 * (3) every leaf (the NIL/null sentinels) is black;
 * (4) both children of a red node are black;
 * (5) every path from a node down to its descendant leaves contains
 *     the same number of black nodes.
 */
static <K,V> TreeNode<K,V> balanceInsertion(TreeNode<K,V> root,
                                            TreeNode<K,V> x) {
    // A newly inserted node starts out red
    x.red = true;
    // From left to right: xp is x's parent, xpp its grandparent, and
    // xppl/xppr the grandparent's left/right children (one of which is
    // x's uncle)
    for (TreeNode<K,V> xp, xpp, xppl, xppr;;) {
        // No parent: x is the root; color it black and return it
        if ((xp = x.parent) == null) {
            x.red = false;
            return x;
        }
        // A black parent can absorb a red child without violating any
        // invariant, so return immediately. Likewise if the parent has
        // no parent it is the (black) root, and a red child is fine
        else if (!xp.red || (xpp = xp.parent) == null)
            return root;
        // Past these filters: x is red, its parent is red, and the
        // grandparent exists. First the case where the parent is the
        // grandparent's left child
        if (xp == (xppl = xpp.left)) {
            // 1. The uncle exists and is red (the parent is red too):
            // invariant (4) is violated. Recolor parent and uncle black
            // and the grandparent red, then continue fixing up with the
            // grandparent playing the role of the inserted node
            if ((xppr = xpp.right) != null && xppr.red) {
                xppr.red = false;
                xp.red = false;
                xpp.red = true;
                x = xpp;
            }
            // 2. The uncle is null or black while the parent is red
            else {
                // x is a right child: left-rotate around the parent
                // first; after the rotation the old parent has become
                // x's left child, so x is rebound to the old parent
                if (x == xp.right) {
                    root = rotateLeft(root, x = xp);
                    xpp = (xp = x.parent) == null ? null : xp.parent;
                }
                // x is (now) a left child: blacken the parent, redden
                // the grandparent, and right-rotate around the
                // grandparent
                if (xp != null) {
                    xp.red = false;
                    if (xpp != null) {
                        xpp.red = true;
                        root = rotateRight(root, xpp);
                    }
                }
            }
        }
        // Mirror image: the parent is the grandparent's right child
        else {
            if (xppl != null && xppl.red) {
                xppl.red = false;
                xp.red = false;
                xpp.red = true;
                x = xpp;
            }
            else {
                if (x == xp.left) {
                    root = rotateRight(root, x = xp);
                    xpp = (xp = x.parent) == null ? null : xp.parent;
                }
                if (xp != null) {
                    xp.red = false;
                    if (xpp != null) {
                        xpp.red = true;
                        root = rotateLeft(root, xpp);
                    }
                }
            }
        }
    }
}

// Initializes the table, or doubles its capacity
final Node<K,V>[] resize() {
    // Keep a reference to the old table
    Node<K,V>[] oldTab = table;
    // On first use the table is null, so oldCap is 0
    int oldCap = (oldTab == null) ? 0 : oldTab.length;
    // Save the old threshold
    int oldThr = threshold;
    // New capacity and threshold start at 0
    int newCap, newThr = 0;
    if (oldCap > 0) {
        // Already at the maximum capacity (2^30): pin the threshold at
        // Integer.MAX_VALUE and return the old table unchanged
        if (oldCap >= MAXIMUM_CAPACITY) {
            threshold = Integer.MAX_VALUE;
            return oldTab;
        }
        // If doubling stays below the maximum and the old capacity is at
        // least the default initial capacity, double the threshold too
        else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                 oldCap >= DEFAULT_INITIAL_CAPACITY)
            newThr = oldThr << 1; // double threshold
    }
    // Table not yet allocated but a capacity was requested through a
    // constructor: that capacity was parked in threshold
    else if (oldThr > 0) // initial capacity was placed in threshold
        newCap = oldThr;
    else { // zero initial threshold signifies using defaults
        newCap = DEFAULT_INITIAL_CAPACITY;
        // Important detail: a HashMap built without an explicit load
        // factor uses 0.75, and every later threshold derives from it
        newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
    }
    // Threshold still unset (the HashMap(int) / HashMap(int, float)
    // construction paths): compute it from capacity and load factor
    if (newThr == 0) {
        float ft = (float)newCap * loadFactor;
        newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                  (int)ft : Integer.MAX_VALUE);
    }
    // Publish the new threshold
    threshold = newThr;
    // Allocate the new Node array and publish it as this map's table
    @SuppressWarnings({"rawtypes","unchecked"})
    Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
    table = newTab;
    // If this is a grow rather than the initial allocation, move the
    // existing nodes over
    if (oldTab != null) {
        for (int j = 0; j < oldCap; ++j) {
            Node<K,V> e;
            if ((e = oldTab[j]) != null) {
                oldTab[j] = null;
                if (e.next == null)
                    // A single node: recompute its slot against the new
                    // power-of-two mask
                    newTab[e.hash & (newCap - 1)] = e;
                else if (e instanceof TreeNode)
                    // A tree bin: split it between the two new slots
                    ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
                else { // preserve order
                    // Split the list while keeping the original order.
                    // How? Into a "lo" list that stays at index j and a
                    // "hi" list that moves to index j + oldCap
                    Node<K,V> loHead = null, loTail = null;
                    Node<K,V> hiHead = null, hiTail = null;
                    Node<K,V> next;
                    do {
                        next = e.next;
                        // AND with the old capacity itself; note this
                        // differs from lookups, which mask with
                        // capacity - 1. A zero result means the node
                        // belongs in the low half
                        if ((e.hash & oldCap) == 0) {
                            if (loTail == null)
                                loHead = e;
                            else
                                loTail.next = e;
                            loTail = e;
                        }
                        else {
                            if (hiTail == null)
                                hiHead = e;
                            else
                                hiTail.next = e;
                            hiTail = e;
                        }
                    } while ((e = next) != null);
                    // The lo list keeps its original index
                    if (loTail != null) {
                        loTail.next = null;
                        newTab[j] = loHead;
                    }
                    // The hi list moves up by the old capacity
                    if (hiTail != null) {
                        hiTail.next = null;
                        newTab[j + oldCap] = hiHead;
                    }
                }
            }
        }
    }
    return newTab;
}
```
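The lo/hi split in `resize()` can be checked in isolation. Because the capacity doubles, a node's new index differs from its old one only in the single bit that `oldCap` itself represents, which is exactly what `(e.hash & oldCap)` reads. The helper below is illustrative, not JDK code:

```java
public class ResizeSplitDemo {
    // During resize, each node in old bucket j moves to j (the "lo" list)
    // if the oldCap bit of its hash is 0, or to j + oldCap (the "hi"
    // list) if that bit is 1.
    public static int newIndex(int hash, int oldCap, int j) {
        return (hash & oldCap) == 0 ? j : j + oldCap;
    }

    public static void main(String[] args) {
        int oldCap = 16, newCap = 32;
        int h1 = 0b0_0101; // bit 4 clear -> stays at index 5
        int h2 = 0b1_0101; // bit 4 set   -> moves to 5 + 16 = 21
        int j = h1 & (oldCap - 1); // both hash to bucket 5 in the old table
        System.out.println(newIndex(h1, oldCap, j)); // 5
        System.out.println(newIndex(h2, oldCap, j)); // 21
        // Consistent with recomputing from scratch against the new mask:
        System.out.println(h2 & (newCap - 1));       // 21
    }
}
```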
The red-black tree balancing code is best analyzed alongside the underlying theory; the JDK implementation is quite subtle. For the theoretical analysis following Introduction to Algorithms, see:
https://www.cnblogs.com/Anker/archive/2013/01/30/2882773.html
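As a companion to the rotation code above, here is a stripped-down left rotation over a minimal node type. It mirrors the pointer surgery of `TreeNode.rotateLeft` but omits the color handling; the `Node` and `RotateDemo` names are illustrative:

```java
public class RotateDemo {
    static class Node {
        int key; Node left, right, parent;
        Node(int k) { key = k; }
    }

    // Left-rotate around p: p's right child r is lifted into p's place,
    // r's left subtree becomes p's right subtree, and p becomes r's
    // left child.
    static Node rotateLeft(Node root, Node p) {
        if (p == null || p.right == null)
            return root; // nothing to rotate
        Node r = p.right;
        Node rl = p.right = r.left;    // move r's left subtree under p
        if (rl != null) rl.parent = p;
        Node pp = r.parent = p.parent; // attach r to p's old parent
        if (pp == null) root = r;      // p was the root; r replaces it
        else if (pp.left == p) pp.left = r;
        else pp.right = r;
        r.left = p;                    // p sinks to r's left
        p.parent = r;
        return root;
    }

    public static void main(String[] args) {
        // Build the right-leaning chain 1 -> 2 -> 3, then rotate at 1:
        //   1              2
        //    \     ->     / \
        //     2          1   3
        //      \
        //       3
        Node a = new Node(1), b = new Node(2), c = new Node(3);
        a.right = b; b.parent = a;
        b.right = c; c.parent = b;
        Node root = rotateLeft(a, a);
        System.out.println(root.key);       // 2
        System.out.println(root.left.key);  // 1
        System.out.println(root.right.key); // 3
    }
}
```

The right rotation is the exact mirror image, swapping every `left` and `right`.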