HashMap Internals (Source Code Analysis)

Date: 2019-11-07


1. Data structure
HashMap stores its data in an array of Entry objects.
Code excerpt:

Java code

transient Entry[] table; // HashMap member field that holds the data
static class Entry<K,V> implements Map.Entry<K,V> { // inner class Entry
    final K key;
    V value;
    Entry<K,V> next; // next entry in the same bucket's chain
    final int hash;
    /**
     * Creates new entry.
     */
    Entry(int h, K k, V v, Entry<K,V> n) {
        value = v;
        next = n;
        key = k;
        hash = h;
    }
    // remaining methods omitted
}

Consider how the entries are laid out internally after the following code runs:
Map map = new HashMap();
map.put("key1","value1");
map.put("key2","value2");
map.put("key3","value3");

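When put runs, it computes the index into the table array from the key's hash. If two keys map to the same index, the newly inserted entry goes at the head of that bucket's Entry chain. The source of put is explained in detail below.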
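
The head-insertion behaviour described above can be sketched with a toy table. This is a minimal illustration, not the JDK code: the class ToyChainedTable and its methods are hypothetical, and it skips duplicate-key checks and resizing. It relies on the fact that the strings "Aa" and "BB" share the same hashCode, so they are guaranteed to collide.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy class for illustration; not the JDK implementation.
class ToyChainedTable {
    // One node in a bucket's singly linked chain.
    static final class Node {
        final String key;
        final String value;
        final Node next;

        Node(String key, String value, Node next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    private final Node[] table = new Node[16];

    // Maps a hash to a bucket; valid because the length is a power of two.
    static int indexFor(int hash, int length) {
        return hash & (length - 1);
    }

    // Inserts at the head of the bucket's chain (no duplicate check, no resize).
    void put(String key, String value) {
        int i = indexFor(key.hashCode(), table.length);
        table[i] = new Node(key, value, table[i]);
    }

    // Returns the keys stored in one bucket, head of the chain first.
    List<String> keysInBucket(int i) {
        List<String> keys = new ArrayList<>();
        for (Node n = table[i]; n != null; n = n.next) {
            keys.add(n.key);
        }
        return keys;
    }

    public static void main(String[] args) {
        ToyChainedTable t = new ToyChainedTable();
        t.put("Aa", "first");   // "Aa".hashCode() == 2112
        t.put("BB", "second");  // "BB".hashCode() == 2112 as well
        int bucket = indexFor("Aa".hashCode(), 16);
        System.out.println(t.keysInBucket(bucket)); // prints [BB, Aa]
    }
}
```

The later insertion ("BB") appears first in the chain, which matches the rule that a new entry becomes the head of its bucket.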

2. Fields and constructors

Java code

static final int DEFAULT_INITIAL_CAPACITY = 16; // default initial capacity, used by the no-argument constructor
static final int MAXIMUM_CAPACITY = 1 << 30; // maximum capacity, 1073741824
static final float DEFAULT_LOAD_FACTOR = 0.75f; // default load factor, used when none is passed to a constructor
transient Entry[] table; // Entry array holding the key-value pairs
transient int size; // number of key-value mappings in the map
int threshold; // threshold = (int)(capacity * loadFactor); once size reaches it, put triggers a resize (see put below)
final float loadFactor; // load factor in effect for this map
transient volatile int modCount; // number of times this HashMap has been structurally modified, as in ArrayList

Constructor 1:

Java code

public HashMap() {
    this.loadFactor = DEFAULT_LOAD_FACTOR;
    threshold = (int)(DEFAULT_INITIAL_CAPACITY * DEFAULT_LOAD_FACTOR);
    table = new Entry[DEFAULT_INITIAL_CAPACITY];
    init();
}

This uses the default capacity and the default load factor.
Constructor 2:

Java code

public HashMap(int initialCapacity) {
    this(initialCapacity, DEFAULT_LOAD_FACTOR);
}

This uses the caller's initial capacity and the default load factor.
Constructor 3:

Java code

public HashMap(int initialCapacity, float loadFactor) {
    if (initialCapacity < 0)
        throw new IllegalArgumentException("Illegal initial capacity: " +
                                           initialCapacity);
    if (initialCapacity > MAXIMUM_CAPACITY)
        initialCapacity = MAXIMUM_CAPACITY;
    if (loadFactor <= 0 || Float.isNaN(loadFactor))
        throw new IllegalArgumentException("Illegal load factor: " +
                                           loadFactor);
    // Find a power of 2 >= initialCapacity
    int capacity = 1;
    while (capacity < initialCapacity)
        capacity <<= 1;
    this.loadFactor = loadFactor;
    threshold = (int)(capacity * loadFactor);
    table = new Entry[capacity];
    init();
}

Here the caller supplies both the initial capacity and the load factor. These two values are the key to tuning HashMap performance, because they determine how often the table has to be resized.
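A HashMap's capacity is always a power of two. If the requested value is not a power of two, it is rounded up to the smallest power of two greater than or equal to it. For example, passing 130 yields a capacity of 256. This loop does the rounding:

Java code

while (capacity < initialCapacity)
    capacity <<= 1;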
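
To check the rounding rule on a few inputs, the loop can be lifted into a standalone helper. This is a sketch: the class CapacityRounding and the method roundUp are hypothetical names, and the cap at MAXIMUM_CAPACITY mirrors the check the constructor performs before the loop.

```java
// Illustrative helper; the class and method names are hypothetical.
class CapacityRounding {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Returns the smallest power of two >= initialCapacity, capped at MAXIMUM_CAPACITY.
    static int roundUp(int initialCapacity) {
        if (initialCapacity > MAXIMUM_CAPACITY) {
            initialCapacity = MAXIMUM_CAPACITY; // same cap the constructor applies
        }
        int capacity = 1;
        while (capacity < initialCapacity) {
            capacity <<= 1;
        }
        return capacity;
    }

    public static void main(String[] args) {
        int[] requested = {1, 16, 17, 130, 1000};
        for (int r : requested) {
            System.out.println(r + " -> " + roundUp(r));
        }
        // prints: 1 -> 1, 16 -> 16, 17 -> 32, 130 -> 256, 1000 -> 1024 (one per line)
    }
}
```

A value that is already a power of two, such as 16, is left unchanged.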

Constructor 4:

Java code

public HashMap(Map<? extends K, ? extends V> m) {
    this(Math.max((int) (m.size() / DEFAULT_LOAD_FACTOR) + 1,
                  DEFAULT_INITIAL_CAPACITY), DEFAULT_LOAD_FACTOR); // size the new map to fit m
    putAllForCreate(m); // copy the entries
}
private void putAllForCreate(Map<? extends K, ? extends V> m) {
    for (Iterator<? extends Map.Entry<? extends K, ? extends V>> i = m.entrySet().iterator(); i.hasNext(); ) { // iterate over the entry set
        Map.Entry<? extends K, ? extends V> e = i.next();
        putForCreate(e.getKey(), e.getValue());
    }
}

This initializes the new map from the map passed in.

3. Key methods
1) put

Java code

public V put(K key, V value) {
    if (key == null)
        return putForNullKey(value); // handled separately; a null key always goes in table[0]
    int hash = hash(key.hashCode()); // compute the key's hash; the hash method is covered in the performance section
    int i = indexFor(hash, table.length); // hash & (length - 1) gives the array index
    for (Entry<K,V> e = table[i]; e != null; e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
            V oldValue = e.value;
            e.value = value;
            e.recordAccess(this);
            return oldValue;
        }
    } // if the key is already present, replace its value and return the old one
    modCount++;
    addEntry(hash, key, value, i);
    return null;
}
void addEntry(int hash, K key, V value, int bucketIndex) {
    Entry<K,V> e = table[bucketIndex];
    table[bucketIndex] = new Entry<K,V>(hash, key, value, e); // the new entry becomes the head; any existing chain hangs off its next field
    if (size++ >= threshold) // once size has reached the threshold, double the table
        resize(2 * table.length);
}
void resize(int newCapacity) {
    Entry[] oldTable = table;
    int oldCapacity = oldTable.length;
    if (oldCapacity == MAXIMUM_CAPACITY) {
        threshold = Integer.MAX_VALUE;
        return;
    }
    Entry[] newTable = new Entry[newCapacity];
    transfer(newTable); // move the entries to the new table; each index is recomputed from the stored hash against the new length, so positions may change and entries spread over the larger table
    table = newTable;
    threshold = (int)(newCapacity * loadFactor);
}
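
The effect of a resize on bucket positions can be seen by evaluating the index formula for the old and new table lengths. The sketch below assumes indexFor is hash & (length - 1), as the comment in put states; the class name ResizeIndexDemo and the sample hash values are made up for illustration.

```java
// Illustrative sketch; the class name and sample hashes are hypothetical.
class ResizeIndexDemo {
    // Same formula the put comment describes: hash & (length - 1).
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        int[] hashes = {5, 21, 37, 53}; // all share the low four bits 0101
        for (int h : hashes) {
            System.out.println("hash " + h + ": index "
                    + indexFor(h, 16) + " -> " + indexFor(h, 32));
        }
        // prints:
        // hash 5: index 5 -> 5
        // hash 21: index 5 -> 21
        // hash 37: index 5 -> 5
        // hash 53: index 5 -> 21
    }
}
```

All four hashes share bucket 5 in a table of 16. After doubling to 32, one more bit of the hash takes part in the index, so each entry either stays at 5 or moves to 5 + 16 = 21, and the chain is split in two.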

2) get

Java code

public V get(Object key) {
    if (key == null)
        return getForNullKey();
    int hash = hash(key.hashCode());
    for (Entry<K,V> e = table[indexFor(hash, table.length)]; // locate the bucket, then walk its chain
         e != null;
         e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
            return e.value; // found: return the value
    }
    return null; // otherwise return null
}
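
The lookup condition in get combines a hash comparison with equals, which is what lets several keys share one bucket and still resolve correctly. The following sketch exercises that through the public java.util.HashMap API. The Key class is hypothetical and deliberately returns a constant hashCode so that every instance collides; real key classes should not do this.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch; CollidingKeyDemo and Key are hypothetical names.
class CollidingKeyDemo {
    // Key whose hashCode is deliberately constant, so all instances share a bucket.
    static final class Key {
        final int id;

        Key(int id) {
            this.id = id;
        }

        @Override
        public int hashCode() {
            return 42; // forces every Key into the same bucket
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Key && ((Key) o).id == id;
        }
    }

    static Map<Key, String> build() {
        Map<Key, String> map = new HashMap<>();
        map.put(new Key(1), "one");
        map.put(new Key(2), "two");
        map.put(new Key(3), "three");
        return map;
    }

    public static void main(String[] args) {
        Map<Key, String> map = build();
        // A fresh but equal Key instance still finds the right value via equals.
        System.out.println(map.get(new Key(2))); // prints two
        System.out.println(map.get(new Key(9))); // prints null
    }
}
```

Lookups stay correct under total collision, but each one has to walk the shared bucket, which is the performance cost discussed in the section on hash.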

3) remove

Java code

public V remove(Object key) {
    Entry<K,V> e = removeEntryForKey(key);
    return (e == null ? null : e.value);
}
final Entry<K,V> removeEntryForKey(Object key) {
    int hash = (key == null) ? 0 : hash(key.hashCode());
    int i = indexFor(hash, table.length);
    Entry<K,V> prev = table[i];
    Entry<K,V> e = prev;
    while (e != null) { // keep going until the end of the Entry chain
        Entry<K,V> next = e.next;
        Object k;
        if (e.hash == hash &&
            ((k = e.key) == key || (key != null && key.equals(k)))) {
            modCount++;
            size--;
            if (prev == e) // e is the first entry: point table[i] at e.next
                table[i] = next;
            else // otherwise point the previous entry's next at e.next
                prev.next = next;
            e.recordRemoval(this);
            return e;
        }
        prev = e; // no match: move on to the next entry
        e = next;
    }
    return e;
}

4) Iterating over a HashMap

Java code

Map map = new HashMap();
map.put("key1","value1");
map.put("key2","value2");
map.put("key3", "value3");
for(Iterator it = map.entrySet().iterator();it.hasNext();){
    Map.Entry e = (Map.Entry)it.next();
    System.out.println(e.getKey() + "=" + e.getValue());
}
System.out.println("-----------------------------------------");
for(Iterator it = map.keySet().iterator();it.hasNext();){
    Object key = it.next();
    Object value = map.get(key);
    System.out.println(key+"="+value);
}
System.out.println("-----------------------------------------");
System.out.println(map.values());

Output (HashMap does not guarantee iteration order, so the order may differ on other JDK versions):
key3=value3
key2=value2
key1=value1
-----------------------------------------
key3=value3
key2=value2
key1=value1
-----------------------------------------
[value3, value2, value1]
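
On Java 5 and later, the same entry-set traversal is usually written with generics and the enhanced for loop, which removes the casts. The sketch below is one way to do it; the class IterationDemo and the method describe are hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch; the class and method names are hypothetical.
class IterationDemo {
    // Formats each mapping as "key=value", in the map's own iteration order.
    static List<String> describe(Map<String, String> map) {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, String> e : map.entrySet()) {
            lines.add(e.getKey() + "=" + e.getValue());
        }
        return lines;
    }

    public static void main(String[] args) {
        Map<String, String> map = new HashMap<>();
        map.put("key1", "value1");
        map.put("key2", "value2");
        map.put("key3", "value3");
        for (String line : describe(map)) {
            System.out.println(line);
        }
    }
}
```

Iterating over entrySet also avoids the extra map.get(key) lookup that the keySet loop performs for every key.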

4. Performance
1) hash
If many keys keep landing on the same array index, every read and write has to walk that bucket's Entry chain, which inevitably hurts performance.
HashMap therefore applies its own hash method on top of hashCode, to compensate for key classes whose hashCode is poorly distributed.

Java code

public V put(K key, V value) {
    ...
    int hash = hash(key.hashCode());
    ...
}
static int hash(int h) {
    // This function ensures that hashCodes that differ only by
    // constant multiples at each bit position have a bounded
    // number of collisions (approximately 8 at default load factor).
    h ^= (h >>> 20) ^ (h >>> 12);
    return h ^ (h >>> 7) ^ (h >>> 4);
}
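
A small experiment shows why the extra mixing matters. Because the index uses only the low bits of the hash, hash codes that differ only in their high bits all land in one bucket unless those high bits are folded down first. The sketch below copies the hash function from the excerpt; the class HashSpreadDemo, the method distinctBuckets, and the sample hash codes (i << 16) are assumptions chosen for illustration.

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative sketch; the class name and sample hash codes are hypothetical.
class HashSpreadDemo {
    // Supplemental hash copied from the excerpt above.
    static int hash(int h) {
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    // Counts how many distinct buckets (table length 16) eight sample hash codes occupy.
    static int distinctBuckets(boolean spread) {
        Set<Integer> buckets = new HashSet<>();
        for (int i = 1; i <= 8; i++) {
            int hashCode = i << 16; // differs from the others only in high bits
            int h = spread ? hash(hashCode) : hashCode;
            buckets.add(indexFor(h, 16));
        }
        return buckets.size();
    }

    public static void main(String[] args) {
        System.out.println("raw hashCode:    " + distinctBuckets(false) + " bucket(s)");
        System.out.println("after hash(int): " + distinctBuckets(true) + " bucket(s)");
    }
}
```

With the raw hash codes all eight keys share a single bucket, while the mixed values are distributed over several buckets.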

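2) Initial capacity and load factor
A put may trigger a resize, and resizing is expensive. If the size of the map is known in advance, choose a suitable initial capacity and load factor so that the HashMap does not have to resize repeatedly.

Java code

void addEntry(int hash, K key, V value, int bucketIndex) {
    Entry<K,V> e = table[bucketIndex];
    table[bucketIndex] = new Entry<K,V>(hash, key, value, e);
    if (size++ >= threshold)
        resize(2 * table.length);
}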
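
To get a feel for the saving, the resize rule in addEntry can be simulated without building a real map. This is a sketch under stated assumptions: it models only the capacity and threshold arithmetic from the excerpts above, not java.util.HashMap itself (newer JDKs differ in detail), and the names PresizeDemo, capacityFor, and countResizes are hypothetical. The sizing formula is the one constructor 4 uses.

```java
// Illustrative simulation of the resize rule; all names are hypothetical.
class PresizeDemo {
    // Capacity to request for an expected number of entries (formula from constructor 4).
    static int capacityFor(int expectedSize, float loadFactor) {
        return (int) (expectedSize / loadFactor) + 1;
    }

    // Rounds up to a power of two, as constructor 3 does (small inputs assumed).
    static int roundUp(int initialCapacity) {
        int capacity = 1;
        while (capacity < initialCapacity) {
            capacity <<= 1;
        }
        return capacity;
    }

    // Counts how many times the table would double while inserting distinct keys.
    static int countResizes(int initialCapacity, float loadFactor, int inserts) {
        int capacity = roundUp(initialCapacity);
        int threshold = (int) (capacity * loadFactor);
        int size = 0;
        int resizes = 0;
        for (int k = 0; k < inserts; k++) {
            if (size++ >= threshold) { // same condition as addEntry
                capacity *= 2;
                threshold = (int) (capacity * loadFactor);
                resizes++;
            }
        }
        return resizes;
    }

    public static void main(String[] args) {
        int n = 1000;
        System.out.println(countResizes(16, 0.75f, n));                    // prints 7
        System.out.println(countResizes(capacityFor(n, 0.75f), 0.75f, n)); // prints 0
    }
}
```

Starting from the default capacity of 16, inserting 1000 entries doubles the table seven times. Requesting capacityFor(1000, 0.75f) = 1334 up front, which rounds to 2048, avoids every one of those resizes.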

5. Differences from Hashtable
1) HashMap allows both null keys and null values; Hashtable allows neither.
Excerpt from HashMap's put method:

Java code

if (key == null)
    return putForNullKey(value);

Excerpt from Hashtable's put method:

Java code

if (value == null) {
    throw new NullPointerException();
}
Entry tab[] = table;
int hash = key.hashCode(); // a null key fails here with a NullPointerException

2) HashMap is not thread-safe; Hashtable is.
Hashtable's put and get methods are both synchronized.
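
The null-handling difference is easy to confirm against the public java.util API. The following is a minimal sketch; the class NullHandlingDemo and the helper throwsNpe are hypothetical names introduced for the demonstration.

```java
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

// Illustrative sketch; the class and helper names are hypothetical.
class NullHandlingDemo {
    // Runs the action and reports whether it threw a NullPointerException.
    static boolean throwsNpe(Runnable action) {
        try {
            action.run();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Map<String, String> hashMap = new HashMap<>();
        hashMap.put(null, "value for null key"); // allowed
        hashMap.put("k", null);                  // allowed
        System.out.println(hashMap.get(null));        // prints value for null key
        System.out.println(hashMap.containsKey("k")); // prints true

        Map<String, String> hashtable = new Hashtable<>();
        System.out.println(throwsNpe(() -> hashtable.put(null, "v"))); // prints true
        System.out.println(throwsNpe(() -> hashtable.put("k", null))); // prints true
    }
}
```

Because HashMap can store a null value, get returning null does not prove the key is absent; containsKey distinguishes the two cases.
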
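
6. ConcurrentHashMap
ConcurrentHashMap is a thread-safe HashMap implementation written by Doug Lea. In JDK 7 and earlier it splits the map into 16 Segments by default, which reduces lock contention, and it locks on writes but not on reads, which shortens the time locks are held. Together these give an elegant answer to heavy lock contention under high concurrency. Interested readers can see another post by the author: http://frank1234.iteye.com/blog/2162490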
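
From the caller's side, the thread safety shows up as atomic per-key updates. The sketch below uses only the public API (merge requires Java 8 or later) and says nothing about the Segment internals; the class ConcurrentCounterDemo, the method countConcurrently, and the key "hits" are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;

// Illustrative sketch; the class, method, and key names are hypothetical.
class ConcurrentCounterDemo {
    // Several threads increment one counter; returns the final count.
    static int countConcurrently(int threads, int incrementsPerThread) {
        ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    counts.merge("hits", 1, Integer::sum); // atomic read-modify-write
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join(); // wait for every worker to finish
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return counts.getOrDefault("hits", 0);
    }

    public static void main(String[] args) {
        System.out.println(countConcurrently(4, 10000)); // prints 40000
    }
}
```

The same loop written against a plain HashMap with get followed by put could lose updates, since nothing makes the two steps atomic.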
