【Java容器】HashMap使用方法及源碼分析

時間 2019-11-07

原文原文鏈接

HashMap容器

1 簡介

HashMap基於散列表實現了Map接口，提供了Map的全部可選操做，HashMap與Hashtable大體相同，區別在於HashMap不支持同步並且HashMap中存儲的鍵值均可覺得null。HashMap中不保證散列表的順序。java

當散列函數將元素正確地分散到各個桶之中的時候，HashMap中存取操做的時間複雜度都是O(1)。當HashMap實例的容量（capacity）爲M，存儲的鍵值對的數量（size）爲N時，遍歷HashMap的時間複雜度爲O(M+N)。node

影響一個HashMap實例的性能的兩個參數分別是：initial capacity（初始容量）和 load factor（負載因子）。容量表示哈希表中桶的個數，初始容量就是HashMap實例在構造化時初始化的容量；負載因子則控制哈希表在多滿的時候須要自動擴容。當哈希表中的鍵值對（entry）的數量超過當前容量與負載因子的乘積時，哈希表將會進行rehash操做，使哈希表的桶數增長一倍左右。算法

通常默認的負載因子取0.75，能夠較好地平衡時間與空間花費。太高的負載因子雖然節省空間，可是增長了訪問哈希表的時間消耗。在設置初始容量時須要考慮散列表中指望存放的鍵值對數量以及負載因子，減小rehash的次數。bootstrap

2 方法

2.1 構造方法

HashMap的構造方法共有四種重載形式，能夠在構造時指定HashMap的初始容量、負載因子或者使用已有的HashMap來初始化新的HashMap。數組

HashMap<String, Integer> map1 = new HashMap<>();    // 建立空的HashMap，默認容量16，默認負載因子0.75
System.out.println(map1);    // {}
        
int cap = 50;
HashMap<String, Integer> map2 = new HashMap<>(cap);    // 建立空的HashMap，初始化容量爲cap，默認負載因子0.75
System.out.println(map2);    // {}
        
float lf = 0.5f;
HashMap<String, Integer> map3 = new HashMap<>(cap, lf);    // 建立空的HashMap，初始化容量爲cap，默認負載因子lf
System.out.println(map3);    // {}
        
HashMap<String, Integer> map4 = new HashMap<>(map3);    // 使用另外一個HashMap來建立一個新的HashMap
System.out.println(map4);    // {}
System.out.println("map4 == map3? "+(map4 == map3));    // 不是同一個HashMap對象
System.out.println("map4.equals(map3)? "+(map4.equals(map3)));    // HashMap中的內容相等

2.2 添加元素

HashMap提供put(K key, V value) 、putAll(Map<? extends K, ? extends V> m)app

以及putIfAbsent(K key, V value)方法向HashMap添加單個鍵值對或添加指定HashMap中的全部鍵值對。ide

map1.put("a", 100);    // 添加{k:v}記錄
System.out.println("map1:"+map1);    // map1:{a=100}
        
map2.putAll(map1);    // 複製另外一個HashMap的全部鍵值對
System.out.println("map2:"+map2);    // map2:{a=100}
        
int ret1 = map1.putIfAbsent("a", 200);    // 嘗試添加記錄{k:v}，若k已存在且指向非null，返回當前HashMap的k所指對象
System.out.println("map1:"+map1+" return:"+ret1);    // map1:{a=100} return:100
Object ret2 = map1.putIfAbsent("b", 200);    // 若k不存在或指向null，添加{k:v}記錄，返回null
System.out.println("map1:"+map1+" return:"+ret2);    // map1:{a=100, b=200} return:null

2.3 刪除元素

HashMap提供重載的remove()方法刪除HashMap中的鍵值對。函數

int ret3 = map1.remove("a");    // 根據指定的鍵刪除一個鍵值對
System.out.println("map1:"+map1+" return:"+ret3);    // map1:{b=200} return:100
        
boolean ret4 = map1.remove("b", 300);    // 刪除指定的鍵值對，若k不指向v，則不刪除，返回是否成功刪除
System.out.println("map1:"+map1+" return:"+ret4);    // map1:{b=200} return:false
boolean ret5 = map1.remove("b", 200);
System.out.println("map1:"+map1+" return:"+ret5);    // map1:{} return:true

2.4 訪問元素

HashMap提供get(Object key)和getOrDefault(Object key, V defaultValue)獲取指定鍵對應的值，若HashMap中無該鍵的記錄，前者返回null後者返回默認值。源碼分析

int val1 = map2.get("a");    // 獲取指定鍵的值，若無記錄返回null
System.out.println("map2:"+map2+" a:"+val1);    // map2:{a=100} a:100
        
int val2 = map2.getOrDefault("b", 200);    // 獲取指定鍵的值，若無記錄則返回指定值
System.out.println("map2:"+map2+" b:"+val2);    // map2:{a=100} b:200

2.5 元素變動

HashMap提供重載的replace()方法，用於更新HashMap中指定的鍵值對。性能

int ret6 = map2.replace("a", 300);    // 更新HashMap中存在的某個鍵對應的值，返回更新前的值
System.out.println("map2:"+map2+" return:"+ret6);    // map2:{a=300} return:100
Object ret7 = map2.replace("c", 300);    // 更新HashMap中某個鍵的值，若無該鍵的記錄，直接返回null
System.out.println("map2:"+map2+" return:"+ret7);    // map2:{a=300} return:null
        
boolean ret8 = map2.replace("a", 300, 400);    // 更新HashMap中存在的指定鍵值對，返回操做是否成功
System.out.println("map2:"+map2+" return:"+ret8);    // map2:{a=400} return:true
boolean ret9 = map2.replace("a", 100, 200);    // 指定的鍵值對不對應
System.out.println("map2:"+map2+" return:"+ret9);    // map2:{a=400} return:false
boolean ret10 = map2.replace("b", 100, 200);    // 無指定的鍵
System.out.println("map2:"+map2+" return:"+ret10);    // map2:{a=400} return:false

// void replaceAll(BiFunction<? super K,? super V,? extends V> function)

2.6 HashMap操做

clear()方法用於刪除全部元素；

isEmpty()方法檢查HashMap是否爲空；

size()方法返回HashMap中鍵值對的數量；

containsKey(Object key)方法檢查HashMap中是否有指定的鍵的記錄；

containsKey(Object value)方法檢查HashMap中是否有指定的值的記錄；

boolean ret11 = map1.containsKey("a");    // 檢查HashMap中是否有指定的鍵的記錄
System.out.println("map1:"+map1+" return:"+ret11);    // map1:{a=100, b=200, c=100} return:true
boolean ret12 = map1.containsKey("d");
System.out.println("map1:"+map1+" return:"+ret12);    // map1:{a=100, b=200, c=100} return:false

boolean ret13 = map1.containsValue(100);    // 檢查HashMap中是否有指定的值的記錄
System.out.println("map1:"+map1+" return:"+ret13);    // map1:{a=100, b=200, c=100} return:true
boolean ret14 = map1.containsValue(500);
System.out.println("map1:"+map1+" return:"+ret14);    // map1:{a=100, b=200, c=100} return:false

entrySet()方法以Set的形式返回HashMap中的每一個鍵值對（Entry）；

Set ret15 = map1.entrySet();    // 以Set的形式返回HashMap中的每一個鍵值對（Entry）
System.out.println("map1:"+map1+" \nSet<Map.Entry>:"+ret15);
// map1:{a=100, b=200, c=100}
// Set<Map.Entry>:[a=100, b=200, c=100]

keySet()方法以Set的形式返回HashMap中的每一個鍵；

Set ret17 = map1.keySet();    // 以Set的形式返回HashMap中的每一個鍵
System.out.println("map1:"+map1+" \nSet<K>:"+ret17);
// map1:{a=100, b=200, c=100} 
// Set<K>:[a, b, c]

values()方法以Collection的形式返回HashMap中的值。

Collection ret19 = map1.values();    // 以Collection的形式返回HashMap中的值
System.out.println("map1:"+map1+" \nCollection<V>:"+ret19);
// map1:{a=100, b=200, c=100} 
// Collection<V>:[100, 200, 100]

3 源碼分析

版本：jdk 1.8

3.1 存儲結構

HashMap中使用一個數組存儲鍵值對（Node<K,V>），數組的大小默認爲16，且只能爲2的整數次冪。

/**
* The table, initialized on first use, and resized as
* necessary. When allocated, length is always a power of two.
* (We also tolerate length zero in some operations to allow
* bootstrapping mechanics that are currently not needed.)
*/
transient Node<K,V>[] table;

每一個Node<K,V>中記錄四個字段，

static class Node<K,V> implements Map.Entry<K,V> {
        final int hash;
        final K key;
        V value;
        Node<K,V> next;
        ...
}

3.2 put()方法

put(K key, V value)方法向HashMap中添加一個鍵值對，

/**
     * Associates the specified value with the specified key in this map.
     * If the map previously contained a mapping for the key, the old
     * value is replaced.
     * ...
     */
    public V put(K key, V value) {
        return putVal(hash(key), key, value, false, true);
    }

其會對輸入的key調用hash()方法，再調用putVal(int hash, K key, V value, boolean onlyIfAbsent, boolean evict)方法實現，

final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
        Node<K,V>[] tab; Node<K,V> p; int n, i;
        /*判斷是否須要擴容，若須要則擴容，最後獲得HashMap的當前容量*/
        if ((tab = table) == null || (n = tab.length) == 0)
            n = (tab = resize()).length;
        /*計算新添加條目在散列表的位置，若hash值沒發生衝突，直接添加新條目*/
        if ((p = tab[i = (n - 1) & hash]) == null)
            tab[i] = newNode(hash, key, value, null);
        /*hash值衝突*/
        else {
            Node<K,V> e; K k;
            /*若新添加的條目的key與已有的key相同，則覆蓋*/
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                e = p;
            /*若該桶中已是樹結構，添加條目到該樹中*/
            else if (p instanceof TreeNode)
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
            /*新條目與舊條目hash值相同，key不一樣的狀況（採用拉鍊法）*/
            else {
                for (int binCount = 0; ; ++binCount) {
                    if ((e = p.next) == null) {
                        p.next = newNode(hash, key, value, null);    // 新條目插在鏈表末端
                        /*若鏈表長度大於等於8時，轉換爲樹結構*/
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                            treeifyBin(tab, hash);
                        break;
                    }
                    /*若新添加的條目的key與鏈表中已有的key相同，則覆蓋*/
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        break;
                    p = e;
                }
            }
            if (e != null) { // existing mapping for key
                V oldValue = e.value;
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;
                afterNodeAccess(e);
                return oldValue;
            }
        }
        /*記錄操做次數*/
        ++modCount;
        /*添加條目後再檢查是否須要resize*/
        if (++size > threshold)
            resize();
        afterNodeInsertion(evict);
        return null;
    }

在插入新條目的時候，首先根據key計算出散列值，而後根據散列值肯定條目存放的桶的下標，

i = (n - 1) & hash    // i爲桶的索引,n爲桶的數量

若是該位置爲空，則直接存放新條目；若是不爲空，則在該位置使用鏈表記錄這些相同散列值的條目，當一個桶記錄的條目大於8時，改用紅黑樹記錄該桶中的元素。

3.3 get()方法

get(Object key)方法根據指定的key返回一個value，

public V get(Object key) {
        Node<K,V> e;
        return (e = getNode(hash(key), key)) == null ? null : e.value;
    }

調用了getNode(int hash, Object key)方法，獲取指定key的節點（Entry），

final Node<K,V> getNode(int hash, Object key) {
        Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
        if ((tab = table) != null && (n = tab.length) > 0 &&
            (first = tab[(n - 1) & hash]) != null) {
            if (first.hash == hash && // always check first node
                ((k = first.key) == key || (key != null && key.equals(k))))
                return first;
            if ((e = first.next) != null) {
                /*在紅黑樹結構中搜索節點*/
                if (first instanceof TreeNode)
                    return ((TreeNode<K,V>)first).getTreeNode(hash, key);
                /*在鏈表中搜索節點*/
                do {
                    /*分別比較散列值和key是否相同*/
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        return e;
                } while ((e = e.next) != null);
            }
        }
        return null;
    }

在尋找指定key的節點時，會先比較散列值是否相同，再比較key的值是否相同。只有key散列值相同且key相同纔是所要找的節點。

3.4 remove()方法

remove()方法，

@Override
    public boolean remove(Object key, Object value) {
        return removeNode(hash(key), key, value, true, true) != null;
    }

調用removeNode(int hash, Object key, Object value, boolean matchValue, boolean movable)方法，

final Node<K,V> removeNode(int hash, Object key, Object value,
                               boolean matchValue, boolean movable) {
        Node<K,V>[] tab; Node<K,V> p; int n, index;
        if ((tab = table) != null && (n = tab.length) > 0 &&
            (p = tab[index = (n - 1) & hash]) != null) {
            Node<K,V> node = null, e; K k; V v;
            /*找到待刪除的節點，並用node指向它*/
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                node = p;
            else if ((e = p.next) != null) {
                if (p instanceof TreeNode)
                    node = ((TreeNode<K,V>)p).getTreeNode(hash, key);
                else {
                    do {
                        if (e.hash == hash &&
                            ((k = e.key) == key ||
                             (key != null && key.equals(k)))) {
                            node = e;
                            break;
                        }
                        p = e;
                    } while ((e = e.next) != null);
                }
            }
            /**/
            if (node != null && (!matchValue || (v = node.value) == value ||
                                 (value != null && value.equals(v)))) {
                /*在紅黑樹中刪除該節點*/
                if (node instanceof TreeNode)
                    ((TreeNode<K,V>)node).removeTreeNode(this, tab, movable);
                /*該桶只有一個節點*/
                else if (node == p)
                    tab[index] = node.next;
                /*在鏈表中刪除該節點*/
                else
                    p.next = node.next;
                ++modCount;
                --size;
                afterNodeRemoval(node);
                return node;
            }
        }
        return null;
    }

3.5 計算桶下標

在諸如插入或者刪除的操做中都涉及到桶下標的計算

i = (n - 1) & hash    // n-1作hash的掩碼

其中的n表示桶的個數（2的整數次冪），hash由hash(Object key)方法計算，

static final int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

舉個例子：

以在一個容量爲16（1000B）的HashMap中，計算key爲字符串"hello"時，應該存放的桶下標。

第一步，計算字符串"hello"的hashCode，

"hello".hashCode() = 0000 0101 1110 1001 0001 1000 1101 0010

第二步，在hash(Object key)中，將hashCode的高位（16位）與低位（16位）異或，

h     = 0000 0101 1110 1001 0001 1000 1101 0010    // hashCode
h >>> 16 = 0000 0000 0000 0000 0000 0101 1110 1001    // hashCode無符號右移16位
  XOR     = 0000 0101 1110 1001 0001 1101 0011 1011    // 用於計算桶下標的散列值

第三步，將異或的結果做爲計算桶下標的hash值，

hash  = 0000 0101 1110 1001 0001 1101 0011 1011         // 散列值
n - 1 = 0000 0000 0000 0000 0000 0000 0000 1111 (15)    // 桶數-1
index = 0000 0000 0000 0000 0000 0000 0000 1011 (11)    // 桶下標

其中要注意的是，在第三步計算桶下標的時候，沒有直接使用hash%n這樣取餘的方式，由於取餘的方式複雜度較位運算要高。因爲hash算法均勻分佈的原則，做爲掩碼的二進制位全爲1，可使得求得的桶下標也是均勻的，所以規定table的容量應該是2的整數次冪。

而在第二步計算hash的時候將hashCode的高位與低位求異或，則是由於使用了掩碼的方式求下標，致使大部分狀況下只利用了hash值中低位的信息，容易產生hash衝突，所以將hashCode的高位信息經過這種形式引入到hash值中。

3.6 擴容

當HashMap中的節點數量達到臨界值時，會調用resize()方法對HashMap進行擴容，該方法也用於初始化HashMap中的table數組，

final Node<K,V>[] resize() {
        Node<K,V>[] oldTab = table;
        int oldCap = (oldTab == null) ? 0 : oldTab.length;
        int oldThr = threshold;
        int newCap, newThr = 0;
        /*擴容*/
        if (oldCap > 0) {
            /*已經達到最大容量，直接返回*/
            if (oldCap >= MAXIMUM_CAPACITY) {
                threshold = Integer.MAX_VALUE;
                return oldTab;
            }
            /*新容量爲舊容量的2倍(<<1)*/
            else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                     oldCap >= DEFAULT_INITIAL_CAPACITY)
                newThr = oldThr << 1; // double threshold
        }
        /*使用構造函數中建議的容量*/
        else if (oldThr > 0) // initial capacity was placed in threshold
            newCap = oldThr;
        /*默認初始化*/
        else {               // zero initial threshold signifies using defaults
            newCap = DEFAULT_INITIAL_CAPACITY;
            newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
        }
        /*更新擴容閾值*/
        if (newThr == 0) {
            float ft = (float)newCap * loadFactor;
            newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                      (int)ft : Integer.MAX_VALUE);
        }
        threshold = newThr;
        @SuppressWarnings({"rawtypes","unchecked"})
            Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];    // 構造新的table
        table = newTab;
        /*將舊table中的節點保存到新table*/
        if (oldTab != null) {
            for (int j = 0; j < oldCap; ++j) {
                Node<K,V> e;
                if ((e = oldTab[j]) != null) {
                    oldTab[j] = null;
                    /*桶中只有一個節點，直接放入新table*/
                    if (e.next == null)
                        newTab[e.hash & (newCap - 1)] = e;
                    /*桶中是紅黑樹結構，拆分放入新table*/
                    else if (e instanceof TreeNode)
                        ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
                    /*桶中是鏈表結構，放入新table時保留鏈表中的順序*/
                    else { // preserve order
                        Node<K,V> loHead = null, loTail = null;
                        Node<K,V> hiHead = null, hiTail = null;
                        Node<K,V> next;
                        do {
                            next = e.next;
                            if ((e.hash & oldCap) == 0) {
                                if (loTail == null)
                                    loHead = e;
                                else
                                    loTail.next = e;
                                loTail = e;
                            }
                            else {
                                if (hiTail == null)
                                    hiHead = e;
                                else
                                    hiTail.next = e;
                                hiTail = e;
                            }
                        } while ((e = next) != null);
                        if (loTail != null) {
                            loTail.next = null;
                            newTab[j] = loHead;
                        }
                        if (hiTail != null) {
                            hiTail.next = null;
                            newTab[j + oldCap] = hiHead;
                        }
                    }
                }
            }
        }
        return newTab;
    }

在進行擴容時，首先計算新table的容量，而後將遍歷舊table中的節點放到新table中，十分費時。其中涉及對節點在新table中的桶下標的計算，HashMap採起一個機制，下降從新計算桶下標的複雜度，

oldTable[index] --> newTable[index+oldCap]

舉個例子，

假設原數組容量oldCap爲 16，擴容以後newCap爲 32：

hash     = 110110    (54)    // 散列值
oldIndex = 000110    (6)     // 舊下標
oldCap   = 010000    (16)    // 舊table容量
newCap   = 100000    (32)    // 新table容量

按照求異或的算法，算出新下標爲

hash     = 110110    (54)    // 散列值
newCap-1 = 011111    (32)    // 新table容量-1
newIndex = 010110    (22)    // 新下標

根據公式算得新的下標爲

oldIndex + oldCap = 6 + 16 = 22    // 新下標

相關標籤/搜索

每日一句

每一个你不满意的现在，都有一个你没有努力的曾经。