Interview Prep: HashSet Source Code Analysis

Date: 2019-12-07


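1. Recent interviews have all asked the same question: if two objects of the same class are stored in a HashSet, how do you make the set deduplicate them?

Override both the equals method and the hashCode method. By default, whether two objects are equal is decided by the equals method of the Object class, whose source is: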

public boolean equals(Object obj) {
        return (this == obj);
    }

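Because the default implementation only compares references (this == obj), two separate instances are never equal, even when all their fields match. So the first step is to override equals so that it compares contents.

That alone is not enough for a HashSet, so hashCode has to be overridden as well.

* Roughly speaking, hashCode returns an int for each object, and the default implementation in Object tries to return different values for different objects, somewhat like a per-object id (the Javadoc below does not promise uniqueness).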

/**
     * Returns a hash code value for the object. This method is
     * supported for the benefit of hash tables such as those provided by
     * {@link java.util.HashMap}.
     * <p>
     * The general contract of {@code hashCode} is:
     * <ul>
     * <li>Whenever it is invoked on the same object more than once during
     *     an execution of a Java application, the {@code hashCode} method
     *     must consistently return the same integer, provided no information
     *     used in {@code equals} comparisons on the object is modified.
     *     This integer need not remain consistent from one execution of an
     *     application to another execution of the same application.
     * <li>If two objects are equal according to the {@code equals(Object)}
     *     method, then calling the {@code hashCode} method on each of
     *     the two objects must produce the same integer result.
     * <li>It is <em>not</em> required that if two objects are unequal
     *     according to the {@link java.lang.Object#equals(java.lang.Object)}
     *     method, then calling the {@code hashCode} method on each of the
     *     two objects must produce distinct integer results.  However, the
     *     programmer should be aware that producing distinct integer results
     *     for unequal objects may improve the performance of hash tables.
     * </ul>
     * <p>
     * As much as is reasonably practical, the hashCode method defined by
     * class {@code Object} does return distinct integers for distinct
     * objects. (This is typically implemented by converting the internal
     * address of the object into an integer, but this implementation
     * technique is not required by the
     * Java&trade; programming language.)
     *
     * @return  a hash code value for this object.
     * @see     java.lang.Object#equals(java.lang.Object)
     * @see     java.lang.System#identityHashCode
     */
    public native int hashCode();
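To make the contract above concrete, here is a minimal sketch of the interview scenario. The classes User, PlainUser, and DedupDemo are hypothetical names invented for this illustration. User overrides both equals and hashCode, while PlainUser inherits the defaults from Object.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

final class DedupDemo {
    // Adds two User objects with identical fields and reports the set size.
    static int sizeWithOverrides() {
        Set<User> set = new HashSet<>();
        set.add(new User("Tom", 20));
        set.add(new User("Tom", 20));
        return set.size();
    }

    // Same experiment with a class that keeps Object's equals and hashCode.
    static int sizeWithoutOverrides() {
        Set<PlainUser> set = new HashSet<>();
        set.add(new PlainUser("Tom", 20));
        set.add(new PlainUser("Tom", 20));
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(sizeWithOverrides());    // prints 1
        System.out.println(sizeWithoutOverrides()); // prints 2
    }
}

// Hypothetical value class: equality is based on field contents.
final class User {
    private final String name;
    private final int age;

    User(String name, int age) {
        this.name = name;
        this.age = age;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (!(obj instanceof User)) return false;
        User other = (User) obj;
        return age == other.age && Objects.equals(name, other.name);
    }

    @Override
    public int hashCode() {
        // Equal objects must produce equal hash codes.
        return Objects.hash(name, age);
    }
}

// Hypothetical class with the same fields but no overrides:
// equals stays a reference comparison.
final class PlainUser {
    private final String name;
    private final int age;

    PlainUser(String name, int age) {
        this.name = name;
        this.age = age;
    }
}
```

The second set keeps both objects because the inherited equals only returns true for the very same reference.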

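2. Now that the interview question is clear, let's dig deeper into the source code

2.1 Under the hood, HashSet is a HashMap: the constructor simply creates a new HashMap, which explains the "Hash" in its name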

/**
 * Constructs a new, empty set; the backing <tt>HashMap</tt> instance has
 * default initial capacity (16) and load factor (0.75).
 */
 public HashSet() {
     map = new HashMap<>();
 }

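2.2 Next, what I care about is the add method. A set is there to store elements, but since it is backed by a HashMap, every entry needs both a key and a value, whereas HashSet's add only takes a single argument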

// Dummy value to associate with an Object in the backing Map
    private static final Object PRESENT = new Object();
/**
     * Adds the specified element to this set if it is not already present.
     * More formally, adds the specified element <tt>e</tt> to this set if
     * this set contains no element <tt>e2</tt> such that
     * <tt>(e==null&nbsp;?&nbsp;e2==null&nbsp;:&nbsp;e.equals(e2))</tt>.
     * If this set already contains the element, the call leaves the set
     * unchanged and returns <tt>false</tt>.
     *
     * @param e element to be added to this set
     * @return <tt>true</tt> if this set did not already contain the specified
     * element
     */
    public boolean add(E e) {
        return map.put(e, PRESENT)==null;
    }

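Reading the source makes it clear: the argument is stored as the map's key, which is what makes deduplication possible, and the value is simply a shared static Object constant.

In short, HashSet's deduplication is HashMap's deduplication of keys.

When HashMap executes put, it calls putVal.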
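The idea can be sketched with a deliberately simplified set. MiniSet below is a hypothetical class written for illustration, not the real java.util.HashSet; it only mirrors the add logic quoted above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified set that delegates to a HashMap.
final class MiniSet<E> {
    // Shared dummy value stored for every key, as in the quoted source.
    private static final Object PRESENT = new Object();

    private final Map<E, Object> map = new HashMap<>();

    // put returns the previous value for the key, or null if the key is new,
    // so add reports true only for elements that were not present.
    boolean add(E e) {
        return map.put(e, PRESENT) == null;
    }

    boolean contains(Object o) {
        return map.containsKey(o);
    }

    int size() {
        return map.size();
    }

    public static void main(String[] args) {
        MiniSet<String> set = new MiniSet<>();
        System.out.println(set.add("a")); // prints true
        System.out.println(set.add("a")); // prints false
        System.out.println(set.size());   // prints 1
    }
}
```

The second add returns false because put hands back the PRESENT object stored by the first call.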

/**
     * Associates the specified value with the specified key in this map.
     * If the map previously contained a mapping for the key, the old
     * value is replaced.
     *
     * @param key key with which the specified value is to be associated
     * @param value value to be associated with the specified key
     * @return the previous value associated with <tt>key</tt>, or
     *         <tt>null</tt> if there was no mapping for <tt>key</tt>.
     *         (A <tt>null</tt> return can also indicate that the map
     *         previously associated <tt>null</tt> with <tt>key</tt>.)
     */
    public V put(K key, V value) {
        return putVal(hash(key), key, value, false, true);
    }

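Here is the key point: the hash argument is hash(key), which calls the key's hashCode method (Object's version unless the class overrides it) and then XORs the high 16 bits into the low 16 bits.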

static final int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }
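As a small sketch of why the high bits are mixed in, the example below repeats the hash computation and the bucket index expression (n - 1) & hash that putVal uses. HashSpreadDemo, spread, and indexFor are hypothetical names chosen for this illustration, and the table length 16 is the default capacity mentioned earlier.

```java
final class HashSpreadDemo {
    // Same computation as the hash(Object) method quoted above.
    static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // Bucket index for a table whose length n is a power of two.
    static int indexFor(int hash, int n) {
        return (n - 1) & hash;
    }

    public static void main(String[] args) {
        int n = 16;
        Integer a = 0x10000; // Integer.hashCode() returns the int value itself
        Integer b = 0x20000;

        // The raw hash codes differ only in the high bits, so both map to bucket 0.
        System.out.println(indexFor(a.hashCode(), n)); // prints 0
        System.out.println(indexFor(b.hashCode(), n)); // prints 0

        // After spreading, the high bits influence the index.
        System.out.println(indexFor(spread(a), n)); // prints 1
        System.out.println(indexFor(spread(b), n)); // prints 2
    }
}
```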

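In putVal, an existing node is treated as holding the same key (e = p, or the break inside the loop) only when its hash equals the new hash and the keys are either the same reference or equal according to equals. In that case no node is added and put returns the old value, so HashSet.add returns false. To get deduplication, override both equals and hashCode. The two always go together: if one is overridden, the other must be overridden too. They can be written by hand or generated automatically, for example by an IDE. With both in place, HashSet (and therefore the HashMap key) deduplicates. When an element is added, the hashes are compared first, and equals is called only if they match, which is why both methods have to be overridden.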

/**
     * Implements Map.put and related methods
     *
     * @param hash hash for key
     * @param key the key
     * @param value the value to put
     * @param onlyIfAbsent if true, don't change existing value
     * @param evict if false, the table is in creation mode.
     * @return previous value, or null if none
     */
    final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
        Node<K,V>[] tab; Node<K,V> p; int n, i;
        if ((tab = table) == null || (n = tab.length) == 0)
            n = (tab = resize()).length;
        if ((p = tab[i = (n - 1) & hash]) == null)
            tab[i] = newNode(hash, key, value, null);
        else {
            Node<K,V> e; K k;
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                e = p;
            else if (p instanceof TreeNode)
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
            else {
                for (int binCount = 0; ; ++binCount) {
                    if ((e = p.next) == null) {
                        p.next = newNode(hash, key, value, null);
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                            treeifyBin(tab, hash);
                        break;
                    }
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        break;
                    p = e;
                }
            }
            if (e != null) { // existing mapping for key
                V oldValue = e.value;
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;
                afterNodeAccess(e);
                return oldValue;
            }
        }
        ++modCount;
        if (++size > threshold)
            resize();
        afterNodeInsertion(evict);
        return null;
    }

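hashCode plays the same role in HashMap and HashSet: it produces the hash value, which is then used to locate the slot in the underlying array. equals is what HashSet uses to detect duplicates: if objects A and B have the same hash and equals returns true, they are duplicates, and only one of them is kept.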
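The two-step check can be observed with keys that deliberately collide. In the sketch below, TwoStepCheckDemo, Colliding, and HashOnly are hypothetical classes made up for illustration. Both return the constant hash code 42, so every instance lands in the same bucket and the outcome depends only on equals.

```java
import java.util.HashSet;
import java.util.Set;

final class TwoStepCheckDemo {
    // Hypothetical key: constant hashCode, equals compares the id field.
    static final class Colliding {
        final int id;
        Colliding(int id) { this.id = id; }

        @Override
        public int hashCode() { return 42; }

        @Override
        public boolean equals(Object o) {
            return o instanceof Colliding && ((Colliding) o).id == id;
        }
    }

    // Hypothetical key: constant hashCode, equals inherited from Object.
    static final class HashOnly {
        final int id;
        HashOnly(int id) { this.id = id; }

        @Override
        public int hashCode() { return 42; }
    }

    static int collidingSize() {
        Set<Colliding> set = new HashSet<>();
        set.add(new Colliding(1));
        set.add(new Colliding(1)); // same hash, equals is true: not added
        set.add(new Colliding(2)); // same hash, equals is false: added
        return set.size();
    }

    static int hashOnlySize() {
        Set<HashOnly> set = new HashSet<>();
        set.add(new HashOnly(1));
        set.add(new HashOnly(1)); // same hash, but equals compares references: added
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(collidingSize()); // prints 2
        System.out.println(hashOnlySize());  // prints 2
    }
}
```

A constant hashCode satisfies the contract but puts every element in one bucket, so it is used here only to isolate the role of equals and is not a pattern to copy.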