Hashtable Source Code Analysis

Date: 2019-11-06


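The inheritance hierarchy of Hashtable is as follows: it extends Dictionary<K,V> and implements Map<K,V>, Cloneable, and java.io.Serializable.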

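Hashtable is a thread-safe key-value storage structure. Its storage layout is the same as HashMap's: an array of buckets in which colliding entries are chained in a linked list.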

1. Hashtable defines an array named table, of type Entry<K,V>, to store its data.

    /**
     * The hash table data.
     */
    private transient Entry<K,V>[] table;

The type Entry<K,V> is defined as follows:

    /**
     * Hashtable bucket collision list entry
     */
    private static class Entry<K,V> implements Map.Entry<K,V> {
        int hash;
        final K key;
        V value;
        Entry<K,V> next;
    }

From the definition of Entry<K,V>, each node actually stores four fields:

key: the key stored in the map

value: the value stored in the map

next: the next Entry node in the chain

hash: the hash value of the key

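The table can therefore be pictured as an array in which each slot points to the head of a linked list of Entry nodes.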
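
To make that layout concrete, here is a minimal sketch of a bucket array with chained nodes. The class BucketChainSketch, its Node type, and the keysInBucket helper are hypothetical names introduced for illustration; Node mirrors the four fields of Entry<K,V> but is not the JDK class.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of a bucket array with chained nodes; not JDK code. */
class BucketChainSketch {
    /** Mirrors the four fields of Hashtable's Entry<K,V>. */
    static final class Node<K, V> {
        final int hash;
        final K key;
        V value;
        Node<K, V> next;

        Node(int hash, K key, V value, Node<K, V> next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    /** Collects the keys of one bucket by following the next pointers. */
    static <K, V> List<K> keysInBucket(Node<K, V> head) {
        List<K> keys = new ArrayList<>();
        for (Node<K, V> n = head; n != null; n = n.next) {
            keys.add(n.key);
        }
        return keys;
    }

    public static void main(String[] args) {
        @SuppressWarnings("unchecked")
        Node<String, Integer>[] table = (Node<String, Integer>[]) new Node[3];
        // Two nodes placed in slot 1 by hand to show a collision chain.
        table[1] = new Node<>(1, "a", 10, null);
        table[1] = new Node<>(4, "d", 40, table[1]); // the new node becomes the head
        System.out.println(keysInBucket(table[1]));  // [d, a]
    }
}
```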

2. Hashtable defines a count field that holds the number of entries in the table.

    /**
     * The total number of entries in the hash table.
     */
    private transient int count;

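Because every method that modifies count is synchronized, count accurately reflects the number of entries in the Hashtable. (HashMap is not synchronized, so its size() can be inaccurate when the map is modified concurrently.)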

With an accurate count, size() and isEmpty() are straightforward.

    /**
     * Returns the number of keys in this hashtable.
     *
     * @return  the number of keys in this hashtable.
     */
    public synchronized int size() {
        return count;
    }

    /**
     * Tests if this hashtable maps no keys to values.
     *
     * @return  <code>true</code> if this hashtable maps no keys to values;
     *          <code>false</code> otherwise.
     */
    public synchronized boolean isEmpty() {
        return count == 0;
    }

Note: these methods are all declared synchronized.
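
The effect of synchronization on count can be observed with a small experiment. The sketch below is an illustration under assumptions: the class SynchronizedCountDemo and the helper fillConcurrently are hypothetical names, and the thread and key counts are arbitrary. Each thread writes its own key range, so the expected size is simply the number of threads times the keys per thread.

```java
import java.util.Hashtable;
import java.util.Map;

/** Hypothetical demo: concurrent puts into a Hashtable lose no updates to its size. */
class SynchronizedCountDemo {
    /** Fills the map from several threads with distinct keys and returns its final size. */
    static int fillConcurrently(Map<Integer, Integer> map, int threads, int perThread) {
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int base = t * perThread; // each thread writes its own key range
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    map.put(base + i, i);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return map.size();
    }

    public static void main(String[] args) {
        // put() is synchronized, so every increment of count is applied.
        System.out.println(fillConcurrently(new Hashtable<>(), 4, 1000)); // 4000
    }
}
```

Passing a plain HashMap to the same helper is unsafe and may report a smaller size or fail in other ways, which is the inaccuracy mentioned above.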

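Hashtable likewise defines the following fields:

threshold: the entry count at which the table is rehashed, normally (int)(capacity * loadFactor). The exception is at the upper bound: the value is capped at MAX_ARRAY_SIZE + 1, where MAX_ARRAY_SIZE is Integer.MAX_VALUE - 8, and once the table already has MAX_ARRAY_SIZE buckets, rehash() returns without growing it.

loadFactor: the load factor; the default is 0.75f

modCount: the number of structural modifications, used to make iterators fail-fast

The code is as follows: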

    /**
     * The table is rehashed when its size exceeds this threshold.  (The
     * value of this field is (int)(capacity * loadFactor).)
     *
     * @serial
     */
    private int threshold;

    /**
     * The load factor for the hashtable.
     *
     * @serial
     */
    private float loadFactor;

    /**
     * The number of times this Hashtable has been structurally modified
     * Structural modifications are those that change the number of entries in
     * the Hashtable or otherwise modify its internal structure (e.g.,
     * rehash).  This field is used to make iterators on Collection-views of
     * the Hashtable fail-fast.  (See ConcurrentModificationException).
     */
    private transient int modCount = 0;
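
The fail-fast behaviour that modCount enables can be seen through the public API. The following is a minimal sketch; the class FailFastDemo and the method failsFastAfterPut are hypothetical names, and the example is single-threaded so that the outcome does not depend on timing.

```java
import java.util.ConcurrentModificationException;
import java.util.Hashtable;
import java.util.Iterator;

/** Hypothetical demo of the fail-fast iterators on Hashtable's collection views. */
class FailFastDemo {
    /** Returns true if iterating after a structural change throws ConcurrentModificationException. */
    static boolean failsFastAfterPut() {
        Hashtable<String, Integer> table = new Hashtable<>();
        table.put("a", 1);
        table.put("b", 2);
        table.put("c", 3);

        Iterator<String> it = table.keySet().iterator();
        it.next();          // fine: nothing has changed since the iterator was created
        table.put("d", 4);  // adds an entry, which is a structural modification
        try {
            it.next();      // the iterator detects the modification
            return false;
        } catch (ConcurrentModificationException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(failsFastAfterPut()); // true
    }
}
```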

3. The default constructor of Hashtable is

    /**
     * Constructs a new, empty hashtable with a default initial capacity (11)
     * and load factor (0.75).
     */
    public Hashtable() {
        this(11, 0.75f);
    }

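Why is the initial capacity 11? A plausible reason is that Hashtable locates a bucket with the modulo operator, and a prime capacity such as 11 tends to spread keys more evenly under modulo; the 2n + 1 growth step in rehash() then keeps the capacity odd.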

Here, this(...) calls the following constructor:

    /**
     * Constructs a new, empty hashtable with the specified initial
     * capacity and the specified load factor.
     *
     * @param      initialCapacity   the initial capacity of the hashtable.
     * @param      loadFactor        the load factor of the hashtable.
     * @exception  IllegalArgumentException  if the initial capacity is less
     *             than zero, or if the load factor is nonpositive.
     */
    public Hashtable(int initialCapacity, float loadFactor) {
        if (initialCapacity < 0)
            throw new IllegalArgumentException("Illegal Capacity: "+
                                               initialCapacity);
        if (loadFactor <= 0 || Float.isNaN(loadFactor))
            throw new IllegalArgumentException("Illegal Load: "+loadFactor);

        if (initialCapacity==0)
            initialCapacity = 1;
        this.loadFactor = loadFactor;
        // Allocate the table array
        table = new Entry[initialCapacity];
        // Compute the threshold
        threshold = (int)Math.min(initialCapacity * loadFactor, MAX_ARRAY_SIZE + 1);
        initHashSeedAsNeeded(initialCapacity);
    }
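
As a worked example of the constructor's arithmetic and argument checks, consider the sketch below. ThresholdSketch, threshold, and rejects are hypothetical names; threshold copies the expression from the constructor above, while rejects calls the real java.util.Hashtable constructor.

```java
import java.util.Hashtable;

/** Hypothetical helper mirroring the constructor's threshold expression; not JDK code. */
class ThresholdSketch {
    // Value cited in the article: Integer.MAX_VALUE - 8.
    static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

    /** Same expression as in the constructor shown above. */
    static int threshold(int capacity, float loadFactor) {
        return (int) Math.min(capacity * loadFactor, MAX_ARRAY_SIZE + 1);
    }

    /** Returns true if the real constructor rejects the given arguments. */
    static boolean rejects(int capacity, float loadFactor) {
        try {
            new Hashtable<String, String>(capacity, loadFactor);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(threshold(11, 0.75f)); // 8, i.e. (int) 8.25
        System.out.println(rejects(-1, 0.75f));   // true: negative capacity
        System.out.println(rejects(11, 0f));      // true: non-positive load factor
        System.out.println(rejects(0, 0.75f));    // false: capacity 0 is bumped to 1
    }
}
```

With the defaults, the table is therefore rehashed when a new key is put while it already holds 8 entries.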

4. The hash() method

    private int hash(Object k) {
        // hashSeed will be zero if alternative hashing is disabled.
        return hashSeed ^ k.hashCode();
    }
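
Since hashCode() may return any int, the value produced by hash() can be negative. Every method in this article turns it into a bucket index with (hash & 0x7FFFFFFF) % tab.length. The sketch below isolates that expression; BucketIndexSketch and indexFor are hypothetical names, and the table length 11 is just the default capacity.

```java
/** Hypothetical helper isolating Hashtable's bucket index expression; not JDK code. */
class BucketIndexSketch {
    /** Clears the sign bit, then takes the remainder, so the result is in [0, tableLength). */
    static int indexFor(int hash, int tableLength) {
        return (hash & 0x7FFFFFFF) % tableLength;
    }

    public static void main(String[] args) {
        System.out.println(-7 % 11);                         // -7: a plain remainder can be negative
        System.out.println(indexFor(-7, 11));                // 6
        System.out.println(indexFor(Integer.MIN_VALUE, 11)); // 0: the sign bit is the only bit set
    }
}
```

Masking the sign bit also handles Integer.MIN_VALUE, for which Math.abs would still return a negative number.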

5. The put() method, declared synchronized

    /**
     * Maps the specified <code>key</code> to the specified
     * <code>value</code> in this hashtable. Neither the key nor the
     * value can be <code>null</code>. <p>
     *
     * The value can be retrieved by calling the <code>get</code> method
     * with a key that is equal to the original key.
     *
     * @param      key     the hashtable key
     * @param      value   the value
     * @return     the previous value of the specified key in this hashtable,
     *             or <code>null</code> if it did not have one
     * @exception  NullPointerException  if the key or value is
     *               <code>null</code>
     * @see     Object#equals(Object)
     * @see     #get(Object)
     */
    public synchronized V put(K key, V value) {
        // Make sure the value is not null
        if (value == null) {
            throw new NullPointerException();
        }

        // Makes sure the key is not already in the hashtable.
        Entry tab[] = table;
        int hash = hash(key);

        // HashMap, by contrast, computes a key's index as hash & (tab.length - 1)
        int index = (hash & 0x7FFFFFFF) % tab.length;
        // If the key is already present, update its value and return the old value
        for (Entry<K,V> e = tab[index] ; e != null ; e = e.next) {
            if ((e.hash == hash) && e.key.equals(key)) {
                V old = e.value;
                e.value = value;
                return old;
            }
        }

        modCount++;
        // If the number of entries has reached threshold, grow the table and rehash
        if (count >= threshold) {
            // Rehash the table if the threshold is exceeded
            rehash();

            tab = table;
            hash = hash(key);
            index = (hash & 0x7FFFFFFF) % tab.length;
        }

        // Creates the new entry. It is inserted at the head of the chain:
        // the current first node becomes the new node's next.
        Entry<K,V> e = tab[index];
        tab[index] = new Entry<>(hash, key, value, e);
        // Increment the entry count
        count++;
        return null;
    }
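
The return values and null handling of put() can be checked against the real class. This is a small usage sketch; PutDemo and throwsNpe are hypothetical names introduced for the example.

```java
import java.util.Hashtable;

/** Hypothetical usage demo for Hashtable.put(). */
class PutDemo {
    /** Returns true if running the action throws NullPointerException. */
    static boolean throwsNpe(Runnable action) {
        try {
            action.run();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Hashtable<String, Integer> table = new Hashtable<>();
        System.out.println(table.put("a", 1)); // null: the key was not present
        System.out.println(table.put("a", 2)); // 1: the previous value is returned
        System.out.println(table.size());      // 1: replacing a value adds no entry

        // A null value is rejected by the explicit check; a null key fails in hash(key).
        System.out.println(throwsNpe(() -> table.put("b", null))); // true
        System.out.println(throwsNpe(() -> table.put(null, 3)));   // true
    }
}
```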

6. rehash(): how does Hashtable rehash?

    /**
     * Increases the capacity of and internally reorganizes this
     * hashtable, in order to accommodate and access its entries more
     * efficiently.  This method is called automatically when the
     * number of keys in the hashtable exceeds this hashtable's capacity
     * and load factor.
     */
    protected void rehash() {
        int oldCapacity = table.length;
        Entry<K,V>[] oldMap = table;

        // overflow-conscious code
        // Shift oldCapacity left by one bit and add 1 (2n + 1); this may overflow
        int newCapacity = (oldCapacity << 1) + 1;
        if (newCapacity - MAX_ARRAY_SIZE > 0) {
            if (oldCapacity == MAX_ARRAY_SIZE)
                // Keep running with MAX_ARRAY_SIZE buckets
                return;
            newCapacity = MAX_ARRAY_SIZE;
        }
        // Create the new table with newCapacity slots
        Entry<K,V>[] newMap = new Entry[newCapacity];

        modCount++;
        threshold = (int)Math.min(newCapacity * loadFactor, MAX_ARRAY_SIZE + 1);
        boolean rehash = initHashSeedAsNeeded(newCapacity);

        // Point table at the new array, newMap
        table = newMap;
        // Walk oldMap and move every entry into newMap
        // oldMap has oldCapacity slots
        for (int i = oldCapacity ; i-- > 0 ;) {
            for (Entry<K,V> old = oldMap[i] ; old != null ; ) {
                Entry<K,V> e = old;
                old = old.next;

                if (rehash) {
                    e.hash = hash(e.key);
                }
                int index = (e.hash & 0x7FFFFFFF) % newCapacity;
                e.next = newMap[index];
                newMap[index] = e;
            }
        }
    }
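
The capacity arithmetic in rehash() is easier to follow in isolation. The sketch below copies only that computation; RehashGrowthSketch and nextCapacity are hypothetical names, and returning the old capacity stands in for the early return in the real method.

```java
/** Hypothetical helper mirroring the capacity computation in rehash(); not JDK code. */
class RehashGrowthSketch {
    // Value cited in the article: Integer.MAX_VALUE - 8.
    static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

    /** Returns 2n + 1, capped at MAX_ARRAY_SIZE, using the same overflow-conscious test. */
    static int nextCapacity(int oldCapacity) {
        int newCapacity = (oldCapacity << 1) + 1;
        // The subtraction form stays correct even when the shift above has overflowed.
        if (newCapacity - MAX_ARRAY_SIZE > 0) {
            if (oldCapacity == MAX_ARRAY_SIZE) {
                return oldCapacity; // already at the cap: the real method returns here
            }
            newCapacity = MAX_ARRAY_SIZE;
        }
        return newCapacity;
    }

    public static void main(String[] args) {
        int capacity = 11;
        for (int i = 0; i < 4; i++) {
            System.out.print(capacity + " ");
            capacity = nextCapacity(capacity);
        }
        System.out.println(); // prints: 11 23 47 95
    }
}
```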

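7. The get() method is fairly simple

(a). Compute the hash of the key

(b). Use the hash to compute index, the slot of table that holds the key; this locates the first node of the chain

(c). Walk the chain to find the element; return null if it is not found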

    /**
     * Returns the value to which the specified key is mapped,
     * or {@code null} if this map contains no mapping for the key.
     *
     * <p>More formally, if this map contains a mapping from a key
     * {@code k} to a value {@code v} such that {@code (key.equals(k))},
     * then this method returns {@code v}; otherwise it returns
     * {@code null}.  (There can be at most one such mapping.)
     *
     * @param key the key whose associated value is to be returned
     * @return the value to which the specified key is mapped, or
     *         {@code null} if this map contains no mapping for the key
     * @throws NullPointerException if the specified key is null
     * @see     #put(Object, Object)
     */
    public synchronized V get(Object key) {
        Entry tab[] = table;
        int hash = hash(key);
        int index = (hash & 0x7FFFFFFF) % tab.length;
        for (Entry<K,V> e = tab[index] ; e != null ; e = e.next) {
            if ((e.hash == hash) && e.key.equals(key)) {
                return e.value;
            }
        }
        return null;
    }
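
Steps (a) to (c) can be sketched on a stripped-down table. ChainLookupSketch, Node, insertAtHead, and lookup are hypothetical names; the sketch assumes a hash seed of zero and, unlike put(), does not check for duplicate keys when inserting.

```java
/** Hypothetical sketch of a chained lookup following steps (a) to (c); not JDK code. */
class ChainLookupSketch {
    static final class Node {
        final int hash;
        final String key;
        final String value;
        final Node next;

        Node(int hash, String key, String value, Node next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    /** Inserts at the head of the chain; no duplicate check, for brevity. */
    static void insertAtHead(Node[] table, String key, String value) {
        int hash = key.hashCode();
        int index = (hash & 0x7FFFFFFF) % table.length;
        table[index] = new Node(hash, key, value, table[index]);
    }

    static String lookup(Node[] table, String key) {
        int hash = key.hashCode();                            // (a) hash of the key
        int index = (hash & 0x7FFFFFFF) % table.length;       // (b) slot in the table
        for (Node n = table[index]; n != null; n = n.next) {  // (c) walk the chain
            if (n.hash == hash && n.key.equals(key)) {
                return n.value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Node[] table = new Node[3];
        insertAtHead(table, "a", "1"); // "a".hashCode() is 97, slot 1
        insertAtHead(table, "d", "4"); // "d".hashCode() is 100, also slot 1
        System.out.println(lookup(table, "a")); // 1
        System.out.println(lookup(table, "d")); // 4
        System.out.println(lookup(table, "x")); // null
    }
}
```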

8. remove(): finally, the remove() method

    /**
     * Removes the key (and its corresponding value) from this
     * hashtable. This method does nothing if the key is not in the hashtable.
     *
     * @param   key   the key that needs to be removed
     * @return  the value to which the key had been mapped in this hashtable,
     *          or <code>null</code> if the key did not have a mapping
     * @throws  NullPointerException  if the key is <code>null</code>
     */
    public synchronized V remove(Object key) {
        Entry tab[] = table;
        int hash = hash(key);
        int index = (hash & 0x7FFFFFFF) % tab.length;
        // As in get(), first locate the chain tab[index] for the key,
        // then walk the chain and unlink the entry
        for (Entry<K,V> e = tab[index], prev = null ; e != null ; prev = e, e = e.next) {
            // prev -> e: prev is the node before e; e is the node to remove
            if ((e.hash == hash) && e.key.equals(key)) {
                modCount++;
                if (prev != null) {
                    prev.next = e.next;
                } else {
                    tab[index] = e.next;
                }
                // Decrement count
                count--;
                V oldValue = e.value;
                e.value = null;
                return oldValue;
            }
        }
        return null;
    }
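
The unlinking step is the usual removal from a singly linked list with a trailing prev pointer. Here is a minimal sketch of just that step; UnlinkSketch, Node, remove, and render are hypothetical names, and the list stands in for a single bucket's chain.

```java
/** Hypothetical sketch of unlinking a node from one bucket's chain; not JDK code. */
class UnlinkSketch {
    static final class Node {
        final String key;
        Node next;

        Node(String key, Node next) {
            this.key = key;
            this.next = next;
        }
    }

    /** Removes the first node with the given key and returns the (possibly new) head. */
    static Node remove(Node head, String key) {
        for (Node e = head, prev = null; e != null; prev = e, e = e.next) {
            if (e.key.equals(key)) {
                if (prev != null) {
                    prev.next = e.next; // bypass e in the middle or at the end
                    return head;
                }
                return e.next;          // e was the head, so its successor takes over
            }
        }
        return head;                    // key not found: the chain is unchanged
    }

    /** Renders the chain as "a -> b -> c". */
    static String render(Node head) {
        StringBuilder sb = new StringBuilder();
        for (Node n = head; n != null; n = n.next) {
            if (sb.length() > 0) {
                sb.append(" -> ");
            }
            sb.append(n.key);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Node head = new Node("a", new Node("b", new Node("c", null)));
        head = remove(head, "b");
        System.out.println(render(head)); // a -> c
        head = remove(head, "a");
        System.out.println(render(head)); // c
    }
}
```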

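Hashtable achieves thread safety by declaring its methods synchronized, so every operation locks the whole table and throughput suffers under contention. In multithreaded code, ConcurrentHashMap is the recommended alternative.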
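
As a brief illustration of the alternative, the sketch below counts events from several threads with ConcurrentHashMap.merge, which applies the update atomically. ConcurrentCounterDemo, countConcurrently, and the key "hits" are hypothetical names chosen for the example.

```java
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical demo: an atomic per-key counter built on ConcurrentHashMap.merge. */
class ConcurrentCounterDemo {
    /** Increments one counter from several threads and returns its final value. */
    static int countConcurrently(int threads, int incrementsPerThread) {
        ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    counts.merge("hits", 1, Integer::sum); // atomic read-modify-write
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return counts.getOrDefault("hits", 0);
    }

    public static void main(String[] args) {
        System.out.println(countConcurrently(4, 1000)); // 4000
    }
}
```

A get() followed by a put() would not be safe here on either class, because the two calls are separate operations; merge performs the whole update as one atomic step.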