The hash algorithm of HashMap (repost)

Date: 2019-11-21

Tags: hashmap, hash, algorithm


The algorithm HashMap uses to locate a slot in its hash table:

int hash = hash(key.hashCode());  
int i = indexFor(hash, table.length);

The source code of indexFor and hash is as follows:

/** 
 * Applies a supplemental hash function to a given hashCode, which 
 * defends against poor quality hash functions.  This is critical 
 * because HashMap uses power-of-two length hash tables, that 
 * otherwise encounter collisions for hashCodes that do not differ 
 * in lower bits. Note: Null keys always map to hash 0, thus index 0. 
 */  
static int hash(int h) {  
    // This function ensures that hashCodes that differ only by  
    // constant multiples at each bit position have a bounded  
    // number of collisions (approximately 8 at default load factor).  
    h ^= (h >>> 20) ^ (h >>> 12);  
    return h ^ (h >>> 7) ^ (h >>> 4);  
}  
  
/** 
 * Returns index for hash code h. 
 */  
static int indexFor(int h, int length) {  
    return h & (length-1);  
}
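
To make the doc comment's point concrete, the sketch below reuses hash and indexFor from the listing above and feeds them three hash codes that differ only above the low four bits. The class name SpreadDemo and the sample values are made up for illustration; they do not come from the original post.

```java
class SpreadDemo {
    // Copied from the listing above.
    static int hash(int h) {
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    // Copied from the listing above.
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        int length = 16; // default table length, a power of two
        int[] samples = {0x10000, 0x20000, 0x30000}; // illustrative values with identical low bits
        for (int hashCode : samples) {
            int raw = indexFor(hashCode, length);         // mask applied to the raw hashCode
            int mixed = indexFor(hash(hashCode), length); // mask applied after the supplemental hash
            System.out.printf("0x%05X raw=%d mixed=%d%n", hashCode, raw, mixed);
        }
    }
}
```

Masking the raw values puts all three in bucket 0, because the mask keeps only the low four bits. After the supplemental hash they land in buckets 1, 2 and 3.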

Now let's analyze the hash algorithm:

h ^= (h >>> 20) ^ (h >>> 12);  
return h ^ (h >>> 7) ^ (h >>> 4);

Suppose key.hashCode() is 0x7FFFFFFF and table.length is the default value, 16.

Running the algorithm above on this input gives i=15.
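
The intermediate values behind that result can be reproduced with a short trace. This is a minimal sketch: the class HashTrace and the method trace are hypothetical names introduced here, and the two mixing steps are copied from the hash method quoted earlier.

```java
class HashTrace {
    // Returns {value after the first mixing step, final hash, table index}.
    static int[] trace(int hashCode, int length) {
        int step1 = hashCode ^ (hashCode >>> 20) ^ (hashCode >>> 12);
        int step2 = step1 ^ (step1 >>> 7) ^ (step1 >>> 4);
        int index = step2 & (length - 1); // same as indexFor(step2, length)
        return new int[] {step1, step2, index};
    }

    public static void main(String[] args) {
        int[] t = trace(0x7FFFFFFF, 16);
        System.out.printf("after step 1: 0x%08X%n", t[0]);
        System.out.printf("after step 2: 0x%08X%n", t[1]);
        System.out.printf("i = %d%n", t[2]);
    }
}
```

For 0x7FFFFFFF the first step yields 0x7FF807FF, the second yields 0x78F8778F, and masking with 15 keeps the low hex digit F, so i=15.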

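The per-digit expressions below track the hexadecimal digits of the hashCode, numbered 1 (lowest) to 8 (highest). Shifts by 20, 12 and 4 bits move whole digits (5, 3 and 1 digits respectively), so to keep the analysis at digit level, h>>>7 is treated as h>>>8 (2 digits).

With that substitution, after h^(h>>>8)^(h>>>4) each digit of the result is the XOR of the following original digits: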

8=8

7=7^8

6=6^7^8

5=5^8^7^6

4=4^7^6^5^8

3=3^8^6^5^8^4^7

2=2^7^5^4^7^3^8^6

1=1^6^4^3^8^6^2^7^5

In digits 1, 2 and 3 of the result, one original digit appears twice in the XOR and therefore cancels:

3=3^8^6^5^8^4^7 -> 3^6^5^4^7

2=2^7^5^4^7^3^8^6 -> 2^5^4^3^8^6

1=1^6^4^3^8^6^2^7^5 -> 1^4^3^8^2^7^5
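
The digit bookkeeping above can be checked mechanically. The sketch below is an illustration, not part of HashMap: the class DigitTrace and its helpers are hypothetical. It represents each result digit as a bitmask of the original digits that feed it and applies the same digit-level shifts (5, 3, then 2 and 1), so a digit that appears twice cancels automatically.

```java
class DigitTrace {
    static final int DIGITS = 8;

    // Shift right by k hex digits: digit i receives what was in digit i + k.
    static int[] shift(int[] m, int k) {
        int[] out = new int[DIGITS];
        for (int i = 0; i + k < DIGITS; i++) out[i] = m[i + k];
        return out;
    }

    static int[] xor(int[] a, int[] b, int[] c) {
        int[] out = new int[DIGITS];
        for (int i = 0; i < DIGITS; i++) out[i] = a[i] ^ b[i] ^ c[i];
        return out;
    }

    // Entry i holds a mask whose bit d is set if original digit d + 1 feeds result digit i + 1.
    static int[] contributions() {
        int[] h = new int[DIGITS];
        for (int i = 0; i < DIGITS; i++) h[i] = 1 << i;
        int[] step1 = xor(h, shift(h, 5), shift(h, 3));       // h ^ (h>>>20) ^ (h>>>12)
        return xor(step1, shift(step1, 2), shift(step1, 1));  // h ^ (h>>>8) ^ (h>>>4)
    }

    public static void main(String[] args) {
        int[] r = contributions();
        for (int i = DIGITS - 1; i >= 0; i--) {
            StringBuilder sb = new StringBuilder((i + 1) + "=");
            String sep = "";
            for (int d = 0; d < DIGITS; d++) {
                if ((r[i] & (1 << d)) != 0) { sb.append(sep).append(d + 1); sep = "^"; }
            }
            System.out.println(sb);
        }
    }
}
```

The last line printed is 1=1^2^3^4^5^7^8, the same set as the simplified expression for digit 1: with an 8-bit shift, digit 6 drops out of the lowest digit.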

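The algorithm uses (h>>>7) rather than (h>>>8), presumably to deal with the repeated digits that cancel in digits 1, 2 and 3. With the 7-bit shift, all eight digits of the original hashCode take part in the XOR for the lowest digit, so with table.length at its default of 16, a change in almost any bit of the hashCode is reflected in the final hash table index. Below the sign bit, the only exception is the highest bit of the 3rd digit (bit 11, counting from 0): 0x7FFFF7FF also gives i=15. The sign bit (bit 31) does not reach the low four bits of the result either.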
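
One way to check this claim is to flip each of the 32 bits in turn and see whether the index changes. The sketch below does that for the real shift of 7 and for a hypothetical variant that shifts by 8; the class BitSensitivity, the shift parameter and the method silentBits are assumptions made for this illustration and are not HashMap code.

```java
import java.util.ArrayList;
import java.util.List;

class BitSensitivity {
    // hash() from the listing above, with the second-step shift made configurable:
    // 7 is what HashMap uses, 8 is the hypothetical alternative discussed in the text.
    static int hash(int h, int shift) {
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> shift) ^ (h >>> 4);
    }

    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    // Returns the bit positions (0 = lowest) whose flip leaves the index unchanged.
    static List<Integer> silentBits(int base, int shift, int length) {
        List<Integer> silent = new ArrayList<>();
        int original = indexFor(hash(base, shift), length);
        for (int bit = 0; bit < 32; bit++) {
            int flipped = base ^ (1 << bit);
            if (indexFor(hash(flipped, shift), length) == original) {
                silent.add(bit);
            }
        }
        return silent;
    }

    public static void main(String[] args) {
        System.out.println("shift 7: " + silentBits(0x7FFFFFFF, 7, 16));
        System.out.println("shift 8: " + silentBits(0x7FFFFFFF, 8, 16));
    }
}
```

Because the function is built only from shifts and XORs, the answer does not depend on the base value. With a shift of 7 the unaffected bits are 11 and 31. With a shift of 8 they are bits 20 to 23, which is the whole 6th hex digit.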
