Java Collections: the HashMap Perturbation Function

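The previous article, on the internal structure of HashMap, mentioned that HashMap has a perturbation function that helps decide which slot of the array an element falls into. A concrete example follows.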

public V get(Object key) {
    Node<K,V> e;
    return (e = getNode(hash(key), key)) == null ? null : e.value;
}

static final int hash(Object key) {
    int h;
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}

final Node<K,V> getNode(int hash, Object key) {
    Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
    if ((tab = table) != null && (n = tab.length) > 0 &&
        (first = tab[(n - 1) & hash]) != null) {
        
        ...
    }
    return null;
}

Explanation

How does the get method determine a key's position in the array? It first calls hash(key), then uses tab[(n - 1) & hash] to locate the slot.

hash(key)

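What does (h = key.hashCode()) ^ (h >>> 16) mean?

h >>> 16 shifts the binary representation of the hashCode right by 16 bits (an unsigned shift, so the vacated high bits are filled with 0).

^ is bitwise XOR: the two binary values are compared bit by bit, and each result bit is 1 if the two bits differ and 0 otherwise.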
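The two operators can be checked in isolation. The following is a minimal sketch; the class name ShiftXorDemo, the helper highHalf and the sample value 0xABCD1234 are chosen only for illustration and do not come from the HashMap source.

```java
public class ShiftXorDemo {
    // Unsigned right shift: moves the high 16 bits into the low 16 bits,
    // filling the vacated high bits with 0 regardless of the sign.
    static int highHalf(int h) {
        return h >>> 16;
    }

    public static void main(String[] args) {
        int h = 0xABCD1234; // illustrative value; negative when read as a signed int

        System.out.println(Integer.toHexString(highHalf(h))); // abcd
        // The signed shift >> would copy the sign bit instead:
        System.out.println(Integer.toHexString(h >> 16));     // ffffabcd

        // XOR: a result bit is 1 only where the two input bits differ.
        System.out.println(Integer.toBinaryString(0b1100 ^ 0b1010)); // 110
    }
}
```

Using >>> rather than >> matters for negative hash codes, since a signed shift would fill the upper half with 1s.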

For example:

Mixing the high bits with the low bits increases randomness.
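To see why the mixing helps, consider two hash codes that differ only above bit 15. The sketch below uses made-up values (0x10000 and 0x20000) and a hypothetical helper named spread that mirrors the XOR step of hash(key). Without the mixing, both values would land in the same slot of a 16-element array.

```java
public class SpreadDemo {
    // Same mixing step as in hash(key): fold the high 16 bits into the low 16 bits.
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    public static void main(String[] args) {
        int n = 16;      // array length
        int a = 0x10000; // illustrative hash codes that differ only in the high bits
        int b = 0x20000;

        // Without mixing, only the low 4 bits count, and they are all 0 for both.
        System.out.println(((n - 1) & a) + " " + ((n - 1) & b));                 // 0 0

        // With mixing, the high bits influence the index.
        System.out.println(((n - 1) & spread(a)) + " " + ((n - 1) & spread(b))); // 1 2
    }
}
```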

 

tab[(n - 1) & hash]

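So what does (n - 1) & hash mean?

n - 1 is the length of the map's array minus 1.

& is bitwise AND: the two binary values are combined bit by bit, and each result bit is 1 only when both bits are 1, and 0 otherwise.

Continuing the example above, assume n is now the map's default length, 16.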

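In effect, this keeps only the last 4 bits and clears all the others. Converted to decimal, 0100 is 4, so the data is read from tab[4]. After one resize the array length grows to 32, and the position is then determined by the last 5 bits (32 is 100000 in binary, 31 is 11111).

This is also why the length of the map's array must be a power of two: a power of two minus 1 is, in binary, a run of trailing 1s, and only the length of that run changes with the power. Masking this way helps keep inserted data from piling up in the same slot, so that it is spread evenly across the array.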
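The masking step can be sketched as follows. The names IndexMaskDemo, indexFor and reachable are hypothetical, and the hash value 0b110100 is made up so that its last 4 bits are 0100. The sketch also shows what would go wrong with a length that is not a power of two: with n = 10 the mask is 1001, so only four slots can ever be selected.

```java
import java.util.TreeSet;

public class IndexMaskDemo {
    // Slot selection as in getNode: keep only the bits covered by n - 1.
    static int indexFor(int hash, int n) {
        return (n - 1) & hash;
    }

    // Collects every index that hashes 0..999 can reach for a given length.
    static TreeSet<Integer> reachable(int n) {
        TreeSet<Integer> seen = new TreeSet<>();
        for (int h = 0; h < 1000; h++) {
            seen.add(indexFor(h, n));
        }
        return seen;
    }

    public static void main(String[] args) {
        int hash = 0b110100; // illustrative value, 52 in decimal

        System.out.println(indexFor(hash, 16));      // 4  (last 4 bits: 0100)
        System.out.println(indexFor(hash, 32));      // 20 (last 5 bits: 10100)

        // For a power-of-two length the mask gives the same result as a modulo.
        System.out.println(Math.floorMod(hash, 16)); // 4

        System.out.println(reachable(16).size());    // 16
        System.out.println(reachable(10));           // [0, 1, 8, 9]
    }
}
```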

http://vanillajava.blogspot.com/2015/09/an-introduction-to-optimising-hashing.html

The article linked above explains hashing strategies, and the probability of hash collisions, in detail.

 

Simulating a real scenario

Here is a simple simulation using some code I wrote myself:

String str1 = "abcd";
String str2 = "a";
String str3 = "cc";
String str4 = "d";

String[] table = new String[4];

System.out.println("str1 = " + str1);
System.out.println("str1.hashCode() = " + str1.hashCode());
System.out.println("str1.hashCode() >>> 16 = " + (str1.hashCode() >>> 16));
System.out.println("str1.hashCode() ^ (str1.hashCode() >>> 16) = " + ((str1.hashCode()) ^ (str1.hashCode() >>> 16)));
System.out.println("table.length - 1 = " + (table.length - 1));
System.out.println("(table.length - 1) & hash1 = " + ((table.length - 1) & ((str1.hashCode()) ^ (str1.hashCode() >>> 16))));
System.out.println();
System.out.println("str2 = " + str2);
System.out.println("str2.hashCode() = " + str2.hashCode());
System.out.println("str2.hashCode() >>> 16 = " + (str2.hashCode() >>> 16));
System.out.println("str2.hashCode() ^ (str2.hashCode() >>> 16) = " + ((str2.hashCode()) ^ (str2.hashCode() >>> 16)));
System.out.println("table.length - 1 = " + (table.length - 1));
System.out.println("(table.length - 1) & hash2 = " + ((table.length - 1) & ((str2.hashCode()) ^ (str2.hashCode() >>> 16))));
System.out.println();
System.out.println("str3 = " + str3);
System.out.println("str3.hashCode() = " + str3.hashCode());
System.out.println("str3.hashCode() >>> 16 = " + (str3.hashCode() >>> 16));
System.out.println("str3.hashCode() ^ (str3.hashCode() >>> 16) = " + ((str3.hashCode()) ^ (str3.hashCode() >>> 16)));
System.out.println("table.length - 1 = " + (table.length - 1));
System.out.println("(table.length - 1) & hash3 = " + ((table.length - 1) & ((str3.hashCode()) ^ (str3.hashCode() >>> 16))));
System.out.println();
System.out.println("str4 = " + str4);
System.out.println("str4.hashCode() = " + str4.hashCode());
System.out.println("str4.hashCode() >>> 16 = " + (str4.hashCode() >>> 16));
System.out.println("str4.hashCode() ^ (str4.hashCode() >>> 16) = " + ((str4.hashCode()) ^ (str4.hashCode() >>> 16)));
System.out.println("table.length - 1 = " + (table.length - 1));
System.out.println("(table.length - 1) & hash4 = " + ((table.length - 1) & ((str4.hashCode()) ^ (str4.hashCode() >>> 16))));


int hash1 = hash(str1);
int index1 = (table.length - 1) & hash1;
table[index1] = str1;

int hash2 = hash(str2);
int index2 = (table.length - 1) & hash2;
table[index2] = str2;

int hash3 = hash(str3);
int index3 = (table.length - 1) & hash3;
table[index3] = str3;

int hash4 = hash(str4);
int index4 = (table.length - 1) & hash4;
table[index4] = str4;

System.out.println(JSON.toJSONString(table));

This initializes an array of length 4 and uses the perturbation function to insert str1, str2, str3 and str4 into it.

The output is:

str1 = abcd
str1.hashCode() = 2987074
str1.hashCode() >>> 16 = 45
str1.hashCode() ^ (str1.hashCode() >>> 16) = 2987119
table.length - 1 = 3
(table.length - 1) & hash1 = 3

str2 = a
str2.hashCode() = 97
str2.hashCode() >>> 16 = 0
str2.hashCode() ^ (str2.hashCode() >>> 16) = 97
table.length - 1 = 3
(table.length - 1) & hash2 = 1

str3 = cc
str3.hashCode() = 3168
str3.hashCode() >>> 16 = 0
str3.hashCode() ^ (str3.hashCode() >>> 16) = 3168
table.length - 1 = 3
(table.length - 1) & hash3 = 0

str4 = d
str4.hashCode() = 100
str4.hashCode() >>> 16 = 0
str4.hashCode() ^ (str4.hashCode() >>> 16) = 100
table.length - 1 = 3
(table.length - 1) & hash4 = 0

["d","a",null,"abcd"]

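Because the values are printed to the console, they appear in decimal, although inside the computer everything is computed in binary. That does not affect the result: the strings "cc" and "d" both map to index 0 of the array. Here I simply overwrote the earlier value; in a HashMap, the colliding entries would instead be chained in a linked list at that slot.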
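The difference between overwriting and chaining can be sketched with a list per slot. This is a simplified illustration, not HashMap's actual implementation: the class ChainingDemo and the method fill are hypothetical, and each bucket is an ArrayList rather than HashMap's internal Node chain.

```java
import java.util.ArrayList;
import java.util.List;

public class ChainingDemo {
    // Same perturbation function as HashMap.hash(Object).
    static int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // Places each key in the bucket chosen by (n - 1) & hash,
    // appending on collision instead of overwriting.
    static List<List<String>> fill(int n, String... keys) {
        List<List<String>> buckets = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            buckets.add(new ArrayList<>());
        }
        for (String key : keys) {
            buckets.get((n - 1) & hash(key)).add(key);
        }
        return buckets;
    }

    public static void main(String[] args) {
        // Same four strings and table length as in the simulation.
        System.out.println(fill(4, "abcd", "a", "cc", "d"));
        // prints [[cc, d], [a], [], [abcd]]
    }
}
```

With chaining, "cc" and "d" both survive in bucket 0, whereas the plain array kept only "d".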
