ArrayMap Explained, with Source Code Analysis

Date: 2019-11-24


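1. Introduction

In "SparseArray Explained, with a Brief Source Code Analysis" we covered the basic usage, characteristics, and implementation of SparseArray. The same Android SDK utility package (android.util) contains another equally important data structure, ArrayMap. Its purpose is similar: when the data set is small, say a few hundred entries, it can replace HashMap to use memory more efficiently.

If you are interested in how HashMap is implemented, see "HashMap Explained, with Source Code Analysis". This article looks at how to use ArrayMap and how it works internally.

2. Source code walkthrough

1. Demo and a quick look

Before analyzing the code, let's start with a demo, and then use it to walk through the implementation.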

// Needs android.util.ArrayMap, android.util.Log, java.util.Map and java.util.Set.
ArrayMap<String,String> arrayMap = new ArrayMap<>();
arrayMap.put(null,"張大哥");
arrayMap.put("abcd","A大哥");
arrayMap.put("aabb","巴大哥");
arrayMap.put("aacc","牛大哥");
arrayMap.put("aadd","牛大哥");
// Same key as the second put(): this overwrites "A大哥".
arrayMap.put("abcd","B大哥");

Set<Map.Entry<String,String>> entries = arrayMap.entrySet();
for (Map.Entry<String,String> entry : entries) {
    Log.d(TAG, "arrayMapSample: key = " + entry.getKey() + ";value = " + entry.getValue());
}

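The code calls put() six times, yet only five entries are printed: the key "abcd" is inserted twice, so the second put() overwrites the first. Also note that null is allowed as a key. The output is as follows.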

arrayMapSample: key = null;value = 張大哥
arrayMapSample: key = aabb;value = 巴大哥
arrayMapSample: key = aacc;value = 牛大哥
arrayMapSample: key = aadd;value = 牛大哥
arrayMapSample: key = abcd;value = B大哥
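The order of this output follows from how ArrayMap keeps its entries sorted by hash code. The sketch below runs on a plain JVM with no Android dependency; the class DemoKeyOrder and its methods are hypothetical names introduced only for illustration. It sorts the demo keys by the hash that put() would use (0 for null, hashCode() otherwise), assuming iteration visits entries in ascending hash order.

```java
import java.util.Arrays;
import java.util.Comparator;

// Plain-JVM sketch: order the demo keys the way a hash-sorted array would.
final class DemoKeyOrder {
    // Mirrors put(): a null key uses hash 0, any other key uses hashCode().
    static int hashOf(String key) {
        return key == null ? 0 : key.hashCode();
    }

    // Returns a copy of the keys sorted by ascending hash.
    static String[] sortedByHash(String... keys) {
        String[] copy = keys.clone();
        Arrays.sort(copy, Comparator.comparingInt(DemoKeyOrder::hashOf));
        return copy;
    }

    public static void main(String[] args) {
        // Keys in the order the demo inserts them.
        String[] ordered = sortedByHash(null, "abcd", "aabb", "aacc", "aadd");
        for (String key : ordered) {
            System.out.println(key + " -> " + hashOf(key));
        }
    }
}
```

The sorted order is null, aabb, aacc, aadd, abcd, which matches the log output above even though "abcd" was inserted second.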

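With the Debug feature in Android Studio you can also take a quick look at how the map is stored in memory.

2. Source analysis

First, a quick look at ArrayMap's class diagram.

Unlike HashMap, ArrayMap implements the Map interface directly. It also stores key-value pairs differently: everything lives in plain arrays. mHashes stores the hash code of each key at position index, while mArray stores that key at index<<1 and its value at (index<<1) + 1, where index is the position of the hash code in mHashes. The parentheses matter, because in Java + binds tighter than <<. Put simply, keys sit in even slots and their values in the adjacent odd slots.

Initializing ArrayMap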
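The layout described above can be sketched with two ordinary arrays. This is a minimal illustration under stated assumptions, not the Android class: ParallelLayout, keySlot and valueSlot are hypothetical names, and the arrays are filled by hand.

```java
// Minimal sketch of the parallel-array layout; not the Android class itself.
final class ParallelLayout {
    // Slot of the key for the entry at `index` in the hash array.
    static int keySlot(int index) {
        return index << 1;
    }

    // Slot of the value; the parentheses matter because + binds tighter than <<.
    static int valueSlot(int index) {
        return (index << 1) + 1;
    }

    public static void main(String[] args) {
        // Two entries: a null key (hash 0) and "aabb".
        int[] hashes = {0, "aabb".hashCode()};
        Object[] array = {null, "張大哥", "aabb", "巴大哥"};
        for (int i = 0; i < hashes.length; i++) {
            System.out.println(hashes[i] + " -> " + array[keySlot(i)] + " = " + array[valueSlot(i)]);
        }
        // Without parentheses, 3 << 1 + 1 parses as 3 << (1 + 1).
        System.out.println((3 << 1 + 1) + " vs " + ((3 << 1) + 1));
    }
}
```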

/**
     * Create a new empty ArrayMap.  The default capacity of an array map is 0, and
     * will grow once items are added to it.
     */
    public ArrayMap() {
        this(0, false);
    }

    /**
     * Create a new ArrayMap with a given initial capacity.
     */
    public ArrayMap(int capacity) {
        this(capacity, false);
    }

    /** {@hide} */
    public ArrayMap(int capacity, boolean identityHashCode) {
        mIdentityHashCode = identityHashCode;

        // If this is immutable, use the sentinal EMPTY_IMMUTABLE_INTS
        // instance instead of the usual EmptyArray.INT. The reference
        // is checked later to see if the array is allowed to grow.
        if (capacity < 0) {
            mHashes = EMPTY_IMMUTABLE_INTS;
            mArray = EmptyArray.OBJECT;
        } else if (capacity == 0) {
            mHashes = EmptyArray.INT;
            mArray = EmptyArray.OBJECT;
        } else {
            allocArrays(capacity);
        }
        mSize = 0;
    }
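
All three overloads of the ArrayMap constructor are listed above. We normally use the default constructor, which means the default capacity is 0 and the arrays only grow once elements are inserted. The other parameter, identityHashCode, belongs to a hidden constructor and controls whether the hash comes from System.identityHashCode(key) or from key.hashCode(). The two are not equivalent in general: System.identityHashCode() returns what the default Object.hashCode() would return, whether or not the key's class overrides hashCode(). For classes that override it, such as String, the two values usually differ, and two equal strings are not guaranteed to share an identity hash. They coincide only for keys that do not override hashCode(). Both paths agree on null: System.identityHashCode(null) is 0, and put() also treats the hash of a null key as 0. Since the public constructors always pass false, application code always gets key.hashCode().

Inserting elements: put()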
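To make the difference between the two hash sources concrete, here is a small plain-JVM sketch. HashSourceDemo and hashFor are hypothetical names; hashFor only mirrors the hash selection at the top of put(), nothing else.

```java
// Sketch of the hash selection in put(); runs without Android.
final class HashSourceDemo {
    // Null keys hash to 0; otherwise pick the identity hash or hashCode().
    static int hashFor(Object key, boolean identityHashCode) {
        if (key == null) {
            return 0;
        }
        return identityHashCode ? System.identityHashCode(key) : key.hashCode();
    }

    public static void main(String[] args) {
        // Two distinct String objects with equal contents.
        String a = new String("abcd");
        String b = new String("abcd");
        // hashCode() depends on the contents, so these always match.
        System.out.println(hashFor(a, false) == hashFor(b, false));
        // Identity hashes belong to the objects; these typically differ,
        // although the JVM does not guarantee that.
        System.out.println(hashFor(a, true) == hashFor(b, true));
    }
}
```

With identity hashing, two equal but distinct String keys would typically land on different hash codes, so the map would treat them as unrelated entries during lookup.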

public V put(K key, V value) {
        final int osize = mSize;
        // 1. Compute the hash code and look up the index.
        final int hash;
        int index;
        if (key == null) {
            // A null key always uses hash 0.
            hash = 0;
            index = indexOfNull();
        } else {
            // Otherwise use key.hashCode(), or the identity hash if mIdentityHashCode is set.
            hash = mIdentityHashCode ? System.identityHashCode(key) : key.hashCode();
            index = indexOf(key, hash);
        }
        // 2. index >= 0 means an entry with the same hash code and an equal key exists: overwrite its value.
        if (index >= 0) {
            index = (index<<1) + 1;
            final V old = (V)mArray[index];
            mArray[index] = value;
            return old;
        }
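        // 3. If nothing was found, indexOf() or indexOfNull() returned a negative number:
        //    the bitwise complement (~) of the position where the entry should be inserted.
        //    Complementing it again recovers that insertion position.
        index = ~index;
        // 4. Check whether the arrays need to grow.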
        if (osize >= mHashes.length) {
            final int n = osize >= (BASE_SIZE*2) ? (osize+(osize>>1))
                    : (osize >= BASE_SIZE ? (BASE_SIZE*2) : BASE_SIZE);

            if (DEBUG) Log.d(TAG, "put: grow from " + mHashes.length + " to " + n);

            final int[] ohashes = mHashes;
            final Object[] oarray = mArray;
            // 5. Allocate the new arrays.
            allocArrays(n);

            if (CONCURRENT_MODIFICATION_EXCEPTIONS && osize != mSize) {
                throw new ConcurrentModificationException();
            }

            if (mHashes.length > 0) {
                if (DEBUG) Log.d(TAG, "put: copy 0-" + osize + " to 0");
                // Copy the existing data into the new arrays.
                System.arraycopy(ohashes, 0, mHashes, 0, ohashes.length);
                System.arraycopy(oarray, 0, mArray, 0, oarray.length);
            }
            // 6. Release the old arrays.
            freeArrays(ohashes, oarray, osize);
        }

        if (index < osize) {
            // 7. If index is inside the current size, shift everything from index to index + 1 to free the slot.
            if (DEBUG) Log.d(TAG, "put: move " + index + "-" + (osize-index)
                    + " to " + (index+1));
            System.arraycopy(mHashes, index, mHashes, index + 1, osize - index);
            System.arraycopy(mArray, index << 1, mArray, (index + 1) << 1, (mSize - index) << 1);
        }

        if (CONCURRENT_MODIFICATION_EXCEPTIONS) {
            if (osize != mSize || index >= mHashes.length) {
                throw new ConcurrentModificationException();
            }
        }
        // 8. Finally store the hash, the key, and the value at the computed index.
        mHashes[index] = hash;
        mArray[index<<1] = key;
        mArray[(index<<1)+1] = value;
        mSize++;
        return null;
    }
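The capacity chosen in step 4 can be isolated into a tiny function. The sketch below copies the expression from the listing; GrowthPolicy and grownCapacity are hypothetical names, and BASE_SIZE = 4 is an assumption, since the value of the constant does not appear in the listing.

```java
// Sketch of the growth rule used in step 4 of put().
final class GrowthPolicy {
    // Assumed value: the listing references BASE_SIZE but does not define it.
    static final int BASE_SIZE = 4;

    // New capacity when a map holding `osize` entries has run out of room.
    static int grownCapacity(int osize) {
        return osize >= (BASE_SIZE * 2) ? (osize + (osize >> 1))
                : (osize >= BASE_SIZE ? (BASE_SIZE * 2) : BASE_SIZE);
    }

    public static void main(String[] args) {
        // Successive capacities when entries are added one at a time.
        int capacity = 0;
        for (int i = 0; i < 5; i++) {
            capacity = grownCapacity(capacity);
            System.out.println(capacity);
        }
    }
}
```

Under that assumption the capacity goes from 0 to BASE_SIZE, then to BASE_SIZE*2, and after that grows by half of the current size each time.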
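
put() calls several other internal methods. From an algorithmic point of view, the details of growing, releasing the old arrays, and allocating new ones are not essential; it is enough to know that growing copies data, and that this costs performance. The methods that matter more for the algorithm are indexOfNull() and indexOf(). Their implementations are essentially the same, so only indexOf() is analyzed here.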

int indexOf(Object key, int hash) {
        final int N = mSize;

        // Important fast case: if nothing is in here, nothing to look for.
        if (N == 0) {
            return ~0;
        }

        int index = binarySearchHashes(mHashes, N, hash);

        // If the hash code wasn't found, then we have no entry for this key.
        if (index < 0) {
            return index;
        }

        // If the key at the returned index matches, that's what we want.
        if (key.equals(mArray[index<<1])) {
            return index;
        }

        // Search for a matching key after the index.
        int end;
        for (end = index + 1; end < N && mHashes[end] == hash; end++) {
            if (key.equals(mArray[end << 1])) return end;
        }

        // Search for a matching key before the index.
        for (int i = index - 1; i >= 0 && mHashes[i] == hash; i--) {
            if (key.equals(mArray[i << 1])) return i;
        }

        // Key not found -- return negative value indicating where a
        // new entry for this key should go.  We use the end of the
        // hash chain to reduce the number of array entries that will
        // need to be copied when inserting.
        return ~end;
    }


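The original comments are already quite detailed. Step by step:

(1) If the table is empty, return ~0 directly. Note that this is not 0 but -1, the largest negative int.

(2) Run a binary search over the mHashes array to find the index of the hash.

(3) If index < 0, the hash was not found; the negative value is returned as is.

(4) If index >= 0 and the key stored at index<<1 in mArray equals the key being looked up, it is the same key and the search is done.

(5) If the keys differ, only the hash codes match. Search forward and then backward through the neighbouring entries with the same hash, and return the index if a matching key turns up. If none does, the complement of end is returned, and end is the position where the new entry should be inserted.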
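The steps above can be sketched on plain arrays. This is an illustration under assumptions rather than the framework code: HashIndexSearch is a hypothetical class, and its binarySearch is a hand-written stand-in for the binarySearchHashes() helper that the listing calls but does not show. The keys "Aa", "BB" and "C#" are used because their String hash codes are all 2112.

```java
// Sketch of the lookup over a sorted hash array plus an interleaved key/value array.
final class HashIndexSearch {
    // Binary search over the first `size` hashes.
    // Returns the index of some match, or ~insertionPoint if the hash is absent.
    static int binarySearch(int[] hashes, int size, int hash) {
        int lo = 0;
        int hi = size - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (hashes[mid] < hash) {
                lo = mid + 1;
            } else if (hashes[mid] > hash) {
                hi = mid - 1;
            } else {
                return mid;
            }
        }
        return ~lo;
    }

    // Index of `key`, or the complement of the position where it should be inserted.
    static int indexOf(int[] hashes, Object[] array, int size, Object key, int hash) {
        if (size == 0) {
            return ~0;
        }
        int index = binarySearch(hashes, size, hash);
        if (index < 0) {
            return index; // hash not present at all
        }
        if (key.equals(array[index << 1])) {
            return index;
        }
        // Same hash, different key: scan forward through the run of equal hashes.
        int end;
        for (end = index + 1; end < size && hashes[end] == hash; end++) {
            if (key.equals(array[end << 1])) return end;
        }
        // Then scan backward.
        for (int i = index - 1; i >= 0 && hashes[i] == hash; i--) {
            if (key.equals(array[i << 1])) return i;
        }
        // Not found: insert at the end of the run.
        return ~end;
    }

    public static void main(String[] args) {
        int h = "Aa".hashCode(); // "BB" and "C#" share this hash code
        int[] hashes = {10, h, h, 3000};
        // Integer keys hash to their own value, so 10 and 3000 fit the sorted order.
        Object[] array = {10, "v10", "Aa", "v1", "BB", "v2", 3000, "v3000"};
        System.out.println(indexOf(hashes, array, 4, "BB", h));
        System.out.println(~indexOf(hashes, array, 4, "C#", h));
    }
}
```

A negative result carries information: complementing it gives the insertion position, which is exactly what put() does with index = ~index.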

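Now back to put(). Its implementation is annotated step by step in the listing above, so read through it if you want the details. A few conclusions follow from it:

(1) The mHashes array keeps all hash codes in ascending order.

(2) The index of a hash code in mHashes determines where the key and value are stored in mArray: at index << 1 and (index << 1) + 1 respectively. Put even more simply, at index * 2 and index * 2 + 1.

(3) Hash codes can collide, so how is that handled? Steps 3 and 7 above decide this. Step 3 yields the insertion index; for a colliding key, indexOf() returns the complement of the end of the run of equal hash codes, so the new entry goes right after the existing entries with the same hash. In step 7, index < osize means the insertion is not at the end, so every entry from index onward is shifted back by one to make room (this happens for any insertion in the middle, not only for collisions). As a result, equal hash codes always sit next to each other in mHashes, while each key and value gets its own slots in mArray. That is how hash collisions are resolved.

If these conclusions still feel a little dizzying, the figure below should make them clear.

In the figure, the hash codes at index == 0 and index == 1 are the same, which means key1 and key2 have the same hash code: a hash collision. As described above, the solution is that the hash code is stored twice, while each key-value pair is stored once at its own position.

The get() method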
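Steps 7 and 8 of put() can be tried out on hand-built arrays. In this sketch SortedInsert and insertAt are hypothetical names, and the arrays are assumed to already have one spare slot, so no growing is involved. It inserts "BB" after "Aa", which has the same String hash code.

```java
import java.util.Arrays;

// Sketch of the shift-and-store part of put() on plain arrays.
final class SortedInsert {
    // Inserts (hash, key, value) at `index`, shifting later entries right by one.
    // Assumes hashes.length > size, i.e. there is room for one more entry.
    static void insertAt(int[] hashes, Object[] array, int size, int index,
                         int hash, Object key, Object value) {
        if (index < size) {
            // One int per entry in the hash array, two slots per entry in the data array.
            System.arraycopy(hashes, index, hashes, index + 1, size - index);
            System.arraycopy(array, index << 1, array, (index + 1) << 1, (size - index) << 1);
        }
        hashes[index] = hash;
        array[index << 1] = key;
        array[(index << 1) + 1] = value;
    }

    public static void main(String[] args) {
        int h = "Aa".hashCode(); // same hash code as "BB"
        int[] hashes = {10, h, 3000, 0};
        Object[] array = {10, "a", "Aa", "b", 3000, "c", null, null};
        // Index 2 is the end of the run of entries whose hash equals h.
        insertAt(hashes, array, 3, 2, h, "BB", "d");
        System.out.println(Arrays.toString(hashes));
        System.out.println(Arrays.toString(array));
    }
}
```

After the call the two equal hash codes are adjacent in the hash array, and each key still has its own key and value slots.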

public V get(Object key) {
        final int index = indexOfKey(key);
        return index >= 0 ? (V)mArray[(index<<1)+1] : null;
    }
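
get() simply asks indexOfKey() for the index, and indexOfKey() in turn calls indexOfNull() or indexOf(), both analyzed above. If the returned index is >= 0, the key was definitely found, and by the layout rule described earlier the value sits at (index<<1) + 1 in mArray.

The remove() method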

public V remove(Object key) {
        final int index = indexOfKey(key);
        if (index >= 0) {
            return removeAt(index);
        }
        return null;
    }

remove() first uses indexOfKey() to compute the index and check whether the key exists. If it does, removeAt() is called to delete the corresponding hash code and key-value pair.

public V removeAt(int index) {
        final Object old = mArray[(index << 1) + 1];
        final int osize = mSize;
        final int nsize;
        // If size <= 1, the map is empty after removal. To save memory, mHashes and mArray are reset to empty arrays.
        if (osize <= 1) {
            // Now empty.
            if (DEBUG) Log.d(TAG, "remove: shrink from " + mHashes.length + " to 0");
            final int[] ohashes = mHashes;
            final Object[] oarray = mArray;
            mHashes = EmptyArray.INT;
            mArray = EmptyArray.OBJECT;
            freeArrays(ohashes, oarray, osize);
            nsize = 0;
        } else {
            // size > 1: the new size is size - 1.
            nsize = osize - 1;
            if (mHashes.length > (BASE_SIZE*2) && mSize < mHashes.length/3) {
                // If the condition above holds, the arrays are shrunk.
                // Shrunk enough to reduce size of arrays.  We don't allow it to
                // shrink smaller than (BASE_SIZE*2) to avoid flapping between
                // that and BASE_SIZE.
                final int n = osize > (BASE_SIZE*2) ? (osize + (osize>>1)) : (BASE_SIZE*2);

                if (DEBUG) Log.d(TAG, "remove: shrink from " + mHashes.length + " to " + n);

                final int[] ohashes = mHashes;
                final Object[] oarray = mArray;
                allocArrays(n);

                if (CONCURRENT_MODIFICATION_EXCEPTIONS && osize != mSize) {
                    throw new ConcurrentModificationException();
                }

                if (index > 0) {
                    if (DEBUG) Log.d(TAG, "remove: copy from 0-" + index + " to 0");
                    System.arraycopy(ohashes, 0, mHashes, 0, index);
                    System.arraycopy(oarray, 0, mArray, 0, index << 1);
                }
                if (index < nsize) {
                    if (DEBUG) Log.d(TAG, "remove: copy from " + (index+1) + "-" + nsize
                            + " to " + index);
                    System.arraycopy(ohashes, index + 1, mHashes, index, nsize - index);
                    System.arraycopy(oarray, (index + 1) << 1, mArray, index << 1,
                            (nsize - index) << 1);
                }
            } else {
                if (index < nsize) {
                    // If index is inside the new size, shift the following data forward by one.
                    if (DEBUG) Log.d(TAG, "remove: move " + (index+1) + "-" + nsize
                            + " to " + index);
                    System.arraycopy(mHashes, index + 1, mHashes, index, nsize - index);
                    System.arraycopy(mArray, (index + 1) << 1, mArray, index << 1,
                            (nsize - index) << 1);
                }
                // Then null out the last key and value slots.
                mArray[nsize << 1] = null;
                mArray[(nsize << 1) + 1] = null;
            }
        }
        if (CONCURRENT_MODIFICATION_EXCEPTIONS && osize != mSize) {
            throw new ConcurrentModificationException();
        }
        mSize = nsize;
        return (V)old;
    }

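In the common case, removing an entry only requires shifting everything after index one position toward index and then clearing the last slot. If, however, the length of mHashes is greater than BASE_SIZE*2 and the actual size is less than one third of that length, the arrays are shrunk. The shrunken arrays are never smaller than BASE_SIZE*2.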
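The shrink rule can be checked in isolation. The sketch below copies the two expressions from removeAt(); ShrinkPolicy and its method names are hypothetical, and BASE_SIZE = 4 is an assumption because the listing does not show the constant's value.

```java
// Sketch of the shrink decision made in removeAt().
final class ShrinkPolicy {
    // Assumed value: the listing references BASE_SIZE but does not define it.
    static final int BASE_SIZE = 4;

    // True when removing an entry triggers reallocation to smaller arrays.
    static boolean shouldShrink(int capacity, int size) {
        return capacity > (BASE_SIZE * 2) && size < capacity / 3;
    }

    // Capacity of the smaller arrays, never below BASE_SIZE * 2.
    static int shrunkCapacity(int osize) {
        return osize > (BASE_SIZE * 2) ? (osize + (osize >> 1)) : (BASE_SIZE * 2);
    }

    public static void main(String[] args) {
        // Capacity 18 holding 5 entries: below one third, so the arrays shrink.
        System.out.println(shouldShrink(18, 5) + " -> " + shrunkCapacity(5));
        // Capacity 8 never shrinks, however empty it is.
        System.out.println(shouldShrink(8, 1));
    }
}
```

Keeping the floor at BASE_SIZE*2 means a small map does not bounce between two capacities as entries are added and removed, which is the flapping the original source comment mentions.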

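3. Summary

The most important parts of ArrayMap are the implementations of put() and removeAt(); between them they realize essentially all of ArrayMap's key characteristics. To wrap up, here they are once more.

(1) The mHashes array keeps all hash codes in ascending order, and a lookup binary-searches it for the index of the hash code. This is the fundamental reason its get() is slower than HashMap's: a binary search takes O(log n) steps, whereas a hash table lookup is expected O(1).
(2) The index of a hash code in mHashes determines where the key and value are stored in mArray: at index << 1 and (index << 1) + 1 respectively, or more simply at index * 2 and index * 2 + 1.
(3) Hash codes can collide. Put simply, equal hash codes are stored side by side in mHashes, and each one's index is used to compute where its key-value pair lives in mArray.
(4) During a remove operation the arrays may, under certain conditions, be shrunk to save memory.

Finally, thank you for reading this article all the way through. My knowledge is limited, so if you spot mistakes or have questions, please leave a comment. If this write-up helped you, please give it a like to encourage me to keep writing. Thanks.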