SimpleArrayMap Source Code (Do You Still Only Use HashMap?)

Date: 2019-12-11

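This analysis of the SimpleArrayMap source code is based on version 23.3.0 of the support v4 library.
ArrayMap is mostly a matter of algorithms and its core idea is fairly simple, so this article is driven mainly by code and walks through each part of the implementation in detail.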

Why was ArrayMap introduced?

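On Android devices, the memory limit imposed on each app leads to OOM errors, which forces developers to pay attention to low-level data structures and to analyze how their app uses memory. HashMap is the data structure we reach for most often, but do we ever look at the details of its implementation, or at its strengths and weaknesses?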

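Briefly, when HashMap grows it doubles the space occupied by its current table, which is a considerable cost given how scarce memory is on Android. That is why ArrayMap was created. It was introduced in API 19, so the support library is what lets us use it on earlier versions. But why is there a SimpleArrayMap in addition to ArrayMap? This is a thoughtful touch from Google: when we use ArrayMap we may not need the standard Java collections API, so the library also offers an ArrayMap that is a pure implementation of the algorithm.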

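The collections API mentioned above is the biggest difference between SimpleArrayMap and the ArrayMap in the v4 package. The evidence is that ArrayMap extends SimpleArrayMap and also implements the Map interface. It does the latter mainly by bringing in the MapCollections class and using Map's Entry structure, so that the data in an ArrayMap can be traversed with an Iterator.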

Implementation idea

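A brief look at the idea is a necessary step before analyzing the source, so that we can read the code with questions in mind and check our expectations against it. As the saying goes, provisions go ahead of the troops: prepare well, get things in order, and think through the details as fully as possible before starting work. In the same way, before developing a project we should break it down into tasks, and mind maps and UML modeling tools are things we need to be fluent with.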

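Idea: SimpleArrayMap uses two arrays, one for the hash values and one for the keys and values. When the arrays hold at least 8 entries and must grow, they grow by only half of their current size, and arrays of size 4 and 8 are cached. The benefit is that as much of the array space as possible is actually used, which avoids wasting memory to some extent.
Data structure: two arrays are used. One is the hash array; the other is an array of twice that size which, for generality, is an Object array. Keys and values are stored in it alternately, with keys at even indices: 0 -> key1, 1 -> value1, 2 -> key2, 3 -> value2. The hash array holds the hash value of each key and is a sorted int array, so binary search is an efficient way to look up a key. The original post illustrates this layout with a figure.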

Data structure definition

1. Data structure

int[] mHashes;
Object[] mArray;
int mSize;

In this code, mHashes holds the hash values of the keys stored in mArray, and mArray is the array in which the map's keys and values are interleaved.
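To make the index arithmetic concrete, here is a minimal sketch of the two-array layout with three hard-coded entries. The class name LayoutSketch and its helpers keyAt and valueAt are hypothetical names used only for illustration; this is not the library class.

```java
// Illustrative only: a fixed three-entry layout, not the library implementation.
final class LayoutSketch {
    // Hash values sorted in ascending order; entry i owns HASHES[i].
    // "a", "b" and "c" hash to 97, 98 and 99 respectively.
    static final int[] HASHES = {97, 98, 99};

    // Keys sit at even slots (2 * i), values at odd slots (2 * i + 1).
    static final Object[] ARRAY = {"a", 1, "b", 2, "c", 3};

    static Object keyAt(int i) {
        return ARRAY[i << 1];
    }

    static Object valueAt(int i) {
        return ARRAY[(i << 1) + 1];
    }

    public static void main(String[] args) {
        for (int i = 0; i < HASHES.length; i++) {
            System.out.println(HASHES[i] + " -> " + keyAt(i) + " = " + valueAt(i));
        }
    }
}
```

Because entry i always occupies slots 2 * i and 2 * i + 1, a position found in the hash array translates directly into positions in the key/value array.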

2. Initialization

Default constructor (initial capacity 0)

/**
 * Create a new empty ArrayMap.  The default capacity of an array map is 0, and
 * will grow once items are added to it.
 */
public SimpleArrayMap() {
   mHashes = ContainerHelpers.EMPTY_INTS;
   mArray = ContainerHelpers.EMPTY_OBJECTS;
   mSize = 0;
}

Specifying an initial capacity

/**
 * Create a new ArrayMap with a given initial capacity.
 */
public SimpleArrayMap(int capacity) {
   if (capacity == 0) {
      mHashes = ContainerHelpers.EMPTY_INTS;
      mArray = ContainerHelpers.EMPTY_OBJECTS;
   } else {
      allocArrays(capacity);
   }
   mSize = 0;
}

Copying from another SimpleArrayMap

/**
 * Create a new ArrayMap with the mappings from the given ArrayMap.
 */
public SimpleArrayMap(SimpleArrayMap map) {
   this();
   if (map != null) {
      putAll(map);
   }
}

3. Clearing

/**
 * Make the array map empty.  All storage is released.
 */
public void clear() {
   if (mSize != 0) {
      freeArrays(mHashes, mArray, mSize);
      mHashes = ContainerHelpers.EMPTY_INTS;
      mArray = ContainerHelpers.EMPTY_OBJECTS;
      mSize = 0;
   }
}

The EMPTY_INTS and EMPTY_OBJECTS referenced in the code are simply the following two empty arrays:

static final int[] EMPTY_INTS = new int[0];

static final Object[] EMPTY_OBJECTS = new Object[0];

Algorithms

1. Storing data: put(key, value)

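Given the data structure defined above, storing an entry should work as follows: compute the hash value of the key, binary-search the hash array to locate the index for that hash, then use the index to shift entries as needed and store the key and value. Let's look at the source: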

/**
 * Add a new value to the array map.
 * @param key The key under which to store the value.  <b>Must not be null.</b>  If
 * this key already exists in the array, its value will be replaced.
 * @param value The value to store for the given key.
 * @return Returns the old value that was stored for the given key, or null if there
 * was no such key.
 */
public V put(K key, V value) {
   final int hash;
   int index;
   if (key == null) {
      // Look up the entry for a null key
      hash = 0;
      index = indexOfNull();
   } else {
      hash = key.hashCode();
      index = indexOf(key, hash);
   }
   if (index >= 0) {
      // The key already exists: overwrite the value and return the old one
      index = (index<<1) + 1;
      final V old = (V)mArray[index];
      mArray[index] = value;
      return old;
   }

   index = ~index;
   if (mSize >= mHashes.length) {
      // Out of capacity: allocate larger arrays and copy the entries over.
      final int n = mSize >= (BASE_SIZE*2) ? (mSize+(mSize>>1))
         : (mSize >= BASE_SIZE ? (BASE_SIZE*2) : BASE_SIZE);

      if (DEBUG) Log.d(TAG, "put: grow from " + mHashes.length + " to " + n);

      final int[] ohashes = mHashes;
      final Object[] oarray = mArray;
      allocArrays(n);

      if (mHashes.length > 0) {
         if (DEBUG) Log.d(TAG, "put: copy 0-" + mSize + " to 0");
         System.arraycopy(ohashes, 0, mHashes, 0, ohashes.length);
         System.arraycopy(oarray, 0, mArray, 0, oarray.length);
      }

      freeArrays(ohashes, oarray, mSize);
   }

   // Shift the entries from index onward one position to the right
   if (index < mSize) {
      if (DEBUG) Log.d(TAG, "put: move " + index + "-" + (mSize-index)
            + " to " + (index+1));
      System.arraycopy(mHashes, index, mHashes, index + 1, mSize - index);
      System.arraycopy(mArray, index << 1, mArray, (index + 1) << 1, (mSize - index) << 1);
   }

   // Store the hash at position index
   mHashes[index] = hash;
   // Store the key and value in the matching slots of mArray.
   mArray[index<<1] = key;
   mArray[(index<<1)+1] = value;
   mSize++;
   return null;
}

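The code shows that ArrayMap allows a null key and that keys cannot be duplicated.
As for capacity changes, mSize is compared with the length of the hash array; when mSize is greater than or equal to that length, the arrays are enlarged and their contents are copied into the new arrays. (Growth rule: when size is at least 8, the new capacity is size + size/2; when size is at least 4 but less than 8, it is 8; when size is less than 4, it is 4.)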
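The growth rule can be checked in isolation. The sketch below copies the expression from put() into a hypothetical helper named grownCapacity (the class and method names are assumptions for illustration) and prints the capacities reached when the arrays are refilled after each growth.

```java
// Illustrative only: the capacity expression from put(), extracted for experimentation.
final class GrowthPolicySketch {
    static final int BASE_SIZE = 4; // same value as the constant in the source

    // Capacity chosen when `size` entries already fill the arrays.
    static int grownCapacity(int size) {
        if (size >= BASE_SIZE * 2) {
            return size + (size >> 1); // grow by half once 8 or more entries are stored
        }
        return size >= BASE_SIZE ? BASE_SIZE * 2 : BASE_SIZE;
    }

    public static void main(String[] args) {
        StringBuilder steps = new StringBuilder();
        int capacity = 0;
        for (int i = 0; i < 6; i++) {
            capacity = grownCapacity(capacity); // assume the arrays are full each time
            steps.append(capacity).append(' ');
        }
        System.out.println(steps.toString().trim()); // prints: 4 8 12 18 27 40
    }
}
```

Compared with doubling, growing by half leaves less unused space after each resize, at the price of resizing somewhat more often.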

2. Retrieving data: get(key)

/**
 * Retrieve a value from the array.
 * @param key The key of the value to retrieve.
 * @return Returns the value associated with the given key,
 * or null if there is no such key.
 */
public V get(Object key) {
   final int index = indexOfKey(key);
   return index >= 0 ? (V)mArray[(index<<1)+1] : null;
}

Retrieving a value by key is very simple: find the index for the key, then return the value stored in mArray at position index * 2 + 1.

3. Removing data: remove(key)

/**
 * Remove an existing key from the array map.
 * @param key The key of the mapping to remove.
 * @return Returns the value that was stored under the key, or null if there
 * was no such key.
 */
public V remove(Object key) {
   final int index = indexOfKey(key);
   if (index >= 0) {
      return removeAt(index);
   }

   return null;
}

Removing by key first looks up the index for the key, then performs the removal through removeAt(int index).

/**
 * Remove the key/value mapping at the given index.
 * @param index The desired index, must be between 0 and {@link #size()}-1.
 * @return Returns the value that was stored at this index.
 */
public V removeAt(int index) {
   final Object old = mArray[(index << 1) + 1];
   if (mSize <= 1) {
      // Now empty.
      if (DEBUG) Log.d(TAG, "remove: shrink from " + mHashes.length + " to 0");
      freeArrays(mHashes, mArray, mSize);
      mHashes = ContainerHelpers.EMPTY_INTS;
      mArray = ContainerHelpers.EMPTY_OBJECTS;
      mSize = 0;
   } else {
      // When the arrays are much larger than needed, shrink them.
      if (mHashes.length > (BASE_SIZE*2) && mSize < mHashes.length/3) {
         // Shrunk enough to reduce size of arrays.  We don't allow it to
         // shrink smaller than (BASE_SIZE*2) to avoid flapping between
         // that and BASE_SIZE.
         final int n = mSize > (BASE_SIZE*2) ? (mSize + (mSize>>1)) : (BASE_SIZE*2);

         if (DEBUG) Log.d(TAG, "remove: shrink from " + mHashes.length + " to " + n);

         final int[] ohashes = mHashes;
         final Object[] oarray = mArray;
         allocArrays(n);

         mSize--;
         if (index > 0) {
            if (DEBUG) Log.d(TAG, "remove: copy from 0-" + index + " to 0");
            System.arraycopy(ohashes, 0, mHashes, 0, index);
            System.arraycopy(oarray, 0, mArray, 0, index << 1);
         }
         if (index < mSize) {
            if (DEBUG) Log.d(TAG, "remove: copy from " + (index+1) + "-" + mSize
                  + " to " + index);
            System.arraycopy(ohashes, index + 1, mHashes, index, mSize - index);
            System.arraycopy(oarray, (index + 1) << 1, mArray, index << 1,
                  (mSize - index) << 1);
         }
      } else {
         mSize--;
         if (index < mSize) {
            if (DEBUG) Log.d(TAG, "remove: move " + (index+1) + "-" + mSize
                  + " to " + index);
            System.arraycopy(mHashes, index + 1, mHashes, index, mSize - index);
            System.arraycopy(mArray, (index + 1) << 1, mArray, index << 1,
                  (mSize - index) << 1);
         }
         mArray[mSize << 1] = null;
         mArray[(mSize << 1) + 1] = null;
      }
   }
   return (V)old;
}

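Ignoring for now the check on the hash array's length (which mainly deals with shrinking and caching the arrays), look only at the main code, that is, the final else branch. It uses System.arraycopy to move the entries after index forward by one position in both the hash array and mArray, then sets the last key/value slots to null.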
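The shifting done in that final else branch can be sketched on its own. The helper below is a hypothetical standalone version (the class name RemoveShiftSketch and the explicit size parameter are assumptions); it skips the shrink path entirely and only shows the two arraycopy calls and the nulling of the freed slots.

```java
// Illustrative only: the non-shrinking removal path, applied to caller-supplied arrays.
final class RemoveShiftSketch {
    // Removes entry `index` from parallel arrays holding `size` entries; returns the new size.
    static int removeAt(int[] hashes, Object[] array, int size, int index) {
        int newSize = size - 1;
        if (index < newSize) {
            // Move the later hashes one slot left, and the later key/value pairs two slots left.
            System.arraycopy(hashes, index + 1, hashes, index, newSize - index);
            System.arraycopy(array, (index + 1) << 1, array, index << 1, (newSize - index) << 1);
        }
        // Clear the vacated pair so the old key and value can be garbage collected.
        // The stale int left in hashes[newSize] is harmless because it lies beyond newSize.
        array[newSize << 1] = null;
        array[(newSize << 1) + 1] = null;
        return newSize;
    }

    public static void main(String[] args) {
        int[] hashes = {1, 2, 3, 0};
        Object[] array = {"k1", "v1", "k2", "v2", "k3", "v3", null, null};
        int size = removeAt(hashes, array, 3, 0);
        System.out.println(size + " entries, first key is now " + array[0]);
    }
}
```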

4. indexOfKey(key)

indexOfKey appears in all of the code above. Here is how it is implemented:

/**
 * Returns the index of a key in the set.
 *
 * @param key The key to search for.
 * @return Returns the index of the key if it exists, else a negative integer.
 */
public int indexOfKey(Object key) {
   return key == null ? indexOfNull() : indexOf(key, key.hashCode());
}

This shows that a null key is allowed when looking up an index. When the key is not null, the lookup uses the key together with its hashCode.

int indexOf(Object key, int hash) {
   final int N = mSize;

   // Important fast case: if nothing is in here, nothing to look for.
   if (N == 0) {
      return ~0;
   }

   int index = ContainerHelpers.binarySearch(mHashes, N, hash);

   // If the hash code wasn't found, then we have no entry for this key.
   if (index < 0) {
      return index;
   }

   // If the key at the returned index matches, that's what we want.
   if (key.equals(mArray[index<<1])) {
      return index;
   }

   // Search for a matching key after the index.
   int end;
   for (end = index + 1; end < N && mHashes[end] == hash; end++) {
      if (key.equals(mArray[end << 1])) return end;
   }

   // Search for a matching key before the index.
   for (int i = index - 1; i >= 0 && mHashes[i] == hash; i--) {
      if (key.equals(mArray[i << 1])) return i;
   }

   // Key not found -- return negative value indicating where a
   // new entry for this key should go.  We use the end of the
   // hash chain to reduce the number of array entries that will
   // need to be copied when inserting.
   return ~end;
}

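The code first binary-searches the hash array to obtain an index, then compares the key stored at that index in mArray with the given key. If they are equal, the index is returned directly. If not, the entries after index that have the same hash are compared first, and if nothing matches, the entries before it are checked. So multiple keys with the same hash value are supported. What about multiple null keys?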
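A small experiment makes the collision handling visible. The strings "Aa", "BB" and "C#" all have the hash code 2112 in Java, so they form a run of equal hashes. The sketch below is a standalone approximation of indexOf that takes the arrays as parameters and uses java.util.Arrays.binarySearch in place of ContainerHelpers.binarySearch; the class name CollisionLookupSketch is hypothetical.

```java
// Illustrative only: lookup among keys that share one hash value.
final class CollisionLookupSketch {
    static int indexOf(int[] hashes, Object[] array, int size, Object key) {
        int hash = key.hashCode();
        int index = java.util.Arrays.binarySearch(hashes, 0, size, hash);
        if (index < 0) {
            return index; // hash absent: the result already encodes ~insertionPoint
        }
        if (key.equals(array[index << 1])) {
            return index;
        }
        // Walk forward through the run of equal hashes.
        int end;
        for (end = index + 1; end < size && hashes[end] == hash; end++) {
            if (key.equals(array[end << 1])) return end;
        }
        // Then walk backward through the same run.
        for (int i = index - 1; i >= 0 && hashes[i] == hash; i--) {
            if (key.equals(array[i << 1])) return i;
        }
        return ~end; // not found: insert at the end of the run
    }

    public static void main(String[] args) {
        int[] hashes = {"Aa".hashCode(), "BB".hashCode(), "Ab".hashCode()};
        Object[] array = {"Aa", 1, "BB", 2, "Ab", 3};
        System.out.println(indexOf(hashes, array, 3, "BB"));  // prints: 1
        System.out.println(~indexOf(hashes, array, 3, "C#")); // prints: 2
    }
}
```

A key whose hash is present but which is not itself stored comes back as a negative value, and applying ~ to it gives the slot just past the run of equal hashes, which is where put() would insert it.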

int indexOfNull() {
   final int N = mSize;

   // Important fast case: if nothing is in here, nothing to look for.
   if (N == 0) {
      return ~0;
   }

   int index = ContainerHelpers.binarySearch(mHashes, N, 0);

   // If the hash code wasn't found, then we have no entry for this key.
   if (index < 0) {
      return index;
   }

   // If the key at the returned index matches, that's what we want.
   if (null == mArray[index<<1]) {
      return index;
   }

   // Search for a matching key after the index.
   int end;
   for (end = index + 1; end < N && mHashes[end] == 0; end++) {
      if (null == mArray[end << 1]) return end;
   }

   // Search for a matching key before the index.
   for (int i = index - 1; i >= 0 && mHashes[i] == 0; i--) {
      if (null == mArray[i << 1]) return i;
   }

   // Key not found -- return negative value indicating where a
   // new entry for this key should go.  We use the end of the
   // hash chain to reduce the number of array entries that will
   // need to be copied when inserting.
   return ~end;
}

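As shown above, looking up a null key is very similar to looking up a non-null key: both scan the neighbouring entries that share the same hash, except that here the hash is always 0. Only one null key can exist in the arrays, though, because put looks up the key first, which guarantees that a null key is stored at most once.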

5. Binary search

To recap, here is the classic binary search algorithm that the code above relies on throughout:

// This is Arrays.binarySearch(), but doesn't do any argument validation.
static int binarySearch(int[] array, int size, int value) {
   int lo = 0;
   int hi = size - 1;

   while (lo <= hi) {
      int mid = (lo + hi) >>> 1;
      int midVal = array[mid];

      if (midVal < value) {
         lo = mid + 1;
      } else if (midVal > value) {
         hi = mid - 1;
      } else {
         return mid;  // value found
      }
   }
   return ~lo;  // value not present
}

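The code divides by 2 with a right shift; the three greater-than signs (>>>) denote the unsigned shift, which keeps the midpoint correct even when lo + hi overflows int.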
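Two details of this routine are easy to verify with a few lines: why the midpoint uses >>> rather than / 2, and how the negative return value encodes an insertion point. The class MidpointSketch and its method names are hypothetical; java.util.Arrays.binarySearch is used here because it follows the same return convention.

```java
// Illustrative only: midpoint overflow and the ~index convention.
final class MidpointSketch {
    // Signed division: wrong when lo + hi overflows int.
    static int signedMid(int lo, int hi) {
        return (lo + hi) / 2;
    }

    // Unsigned shift: treats the overflowed sum as an unsigned 32-bit value.
    static int unsignedMid(int lo, int hi) {
        return (lo + hi) >>> 1;
    }

    // Recovers the insertion point from a negative "not found" result.
    static int insertionPoint(int notFoundResult) {
        return ~notFoundResult;
    }

    public static void main(String[] args) {
        int lo = 1_500_000_000;
        int hi = 2_000_000_000;
        System.out.println(signedMid(lo, hi) < 0); // prints: true
        System.out.println(unsignedMid(lo, hi));   // prints: 1750000000

        int result = java.util.Arrays.binarySearch(new int[] {10, 20, 30}, 25);
        System.out.println(insertionPoint(result)); // prints: 2
    }
}
```

Array lengths in an ArrayMap never come close to these values, so the overflow is a matter of habit rather than necessity here, but the unsigned shift costs nothing.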

Cache implementation

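At this point we could more or less stop, but the source contains two curious arrays whose main purpose is to cache arrays of fixed sizes. The official explanation is that this avoids memory churn. Everything here is implemented with plain arrays, and when an array runs out of capacity a new one has to be created, which would otherwise waste the old one, so the cache is well worth having. Let's see how the two arrays work; readers who are not interested can skip this part. First, the data structure: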

1. Data structure

/**
 * The minimum amount by which the capacity of a ArrayMap will increase.
 * This is tuned to be relatively space-efficient.
 */
private static final int BASE_SIZE = 4;

/**
 * Maximum number of entries to have in array caches.
 */
private static final int CACHE_SIZE = 10;

/**
 * Caches of small array objects to avoid spamming garbage.  The cache
 * Object[] variable is a pointer to a linked list of array objects.
 * The first entry in the array is a pointer to the next array in the
 * list; the second entry is a pointer to the int[] hash code array for it.
 */
static Object[] mBaseCache;
static int mBaseCacheSize;
static Object[] mTwiceBaseCache;
static int mTwiceBaseCacheSize;

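There are two static Object arrays in the code, and each uses a linked list to cache arrays. That is, the static Object[] reference points to a cached array whose first element is a pointer to the next cached array and whose second element is the corresponding hash array; all other elements are null. As for the limits: BASE_SIZE (4) is the smallest capacity, and CACHE_SIZE (10) is the maximum number of arrays each cache may hold. mBaseCache stores arrays of capacity 4, and mTwiceBaseCache stores arrays of capacity 8. The original post illustrates this with a figure.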

2. Adding to the cache

private static void freeArrays(final int[] hashes, final Object[] array, final int size) {
   if (hashes.length == (BASE_SIZE*2)) {
      synchronized (ArrayMap.class) {
         if (mTwiceBaseCacheSize < CACHE_SIZE) {
            array[0] = mTwiceBaseCache;
            array[1] = hashes;
            for (int i=(size<<1)-1; i>=2; i--) {
               array[i] = null;
            }
            mTwiceBaseCache = array;
            mTwiceBaseCacheSize++;
            if (DEBUG) Log.d(TAG, "Storing 2x cache " + array
                  + " now have " + mTwiceBaseCacheSize + " entries");
         }
      }
   } else if (hashes.length == BASE_SIZE) {
      synchronized (ArrayMap.class) {
         if (mBaseCacheSize < CACHE_SIZE) {
            array[0] = mBaseCache;
            array[1] = hashes;
            for (int i=(size<<1)-1; i>=2; i--) {
               array[i] = null;
            }
            mBaseCache = array;
            mBaseCacheSize++;
            if (DEBUG) Log.d(TAG, "Storing 1x cache " + array
                  + " now have " + mBaseCacheSize + " entries");
         }
      }
   }
}

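This method is called mainly when an ArrayMap changes capacity. The code clears the array being released, except that its first element points to the previous head of the cache and its second element points to the hash array.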

3. Using cached arrays

private void allocArrays(final int size) {
   if (size == (BASE_SIZE*2)) {
      synchronized (ArrayMap.class) {
         if (mTwiceBaseCache != null) {
            final Object[] array = mTwiceBaseCache;
            mArray = array;
            mTwiceBaseCache = (Object[])array[0];
            mHashes = (int[])array[1];
            array[0] = array[1] = null;
            mTwiceBaseCacheSize--;
            if (DEBUG) Log.d(TAG, "Retrieving 2x cache " + mHashes
                  + " now have " + mTwiceBaseCacheSize + " entries");
            return;
         }
      }
   } else if (size == BASE_SIZE) {
      synchronized (ArrayMap.class) {
         if (mBaseCache != null) {
            final Object[] array = mBaseCache;
            mArray = array;
            mBaseCache = (Object[])array[0];
            mHashes = (int[])array[1];
            array[0] = array[1] = null;
            mBaseCacheSize--;
            if (DEBUG) Log.d(TAG, "Retrieving 1x cache " + mHashes
                  + " now have " + mBaseCacheSize + " entries");
            return;
         }
      }
   }

   mHashes = new int[size];
   mArray = new Object[size<<1];
}

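Here, when size matches a cached array size, that is, when it equals 4 or 8, an array can be taken from the cache. The main operation is moving the cache head (for example mBaseCache) to the array that array[0] points to; the hash array is array[1], and the current array is then ready for us to use.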
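The free/alloc pair can be reduced to a small pool to show the linked list that runs through slot 0 of each cached array. The sketch below models only the capacity-4 cache. The names ArrayPoolSketch, sCache, free and alloc are hypothetical, and unlike the source it clears the whole array with Arrays.fill and returns the pair instead of assigning fields.

```java
// Illustrative only: a one-size array pool threaded through the cached arrays themselves.
final class ArrayPoolSketch {
    static final int BASE_SIZE = 4;
    static final int CACHE_SIZE = 10;

    static Object[] sCache; // head of the linked list of cached Object[8] arrays
    static int sCacheSize;

    // Pushes a pair onto the pool: slot 0 links to the old head, slot 1 holds the int[].
    static synchronized void free(int[] hashes, Object[] array) {
        if (hashes.length != BASE_SIZE || sCacheSize >= CACHE_SIZE) {
            return; // wrong size or pool full: let the garbage collector take it
        }
        java.util.Arrays.fill(array, null); // drop references to old keys and values
        array[0] = sCache;
        array[1] = hashes;
        sCache = array;
        sCacheSize++;
    }

    // Pops a cached pair if one exists, otherwise allocates; returns {int[], Object[]}.
    static synchronized Object[] alloc() {
        if (sCache != null) {
            Object[] array = sCache;
            sCache = (Object[]) array[0];
            int[] hashes = (int[]) array[1];
            array[0] = array[1] = null;
            sCacheSize--;
            return new Object[] {hashes, array};
        }
        return new Object[] {new int[BASE_SIZE], new Object[BASE_SIZE << 1]};
    }

    public static void main(String[] args) {
        int[] hashes = new int[BASE_SIZE];
        Object[] array = new Object[BASE_SIZE << 1];
        free(hashes, array);
        Object[] pair = alloc();
        System.out.println(pair[0] == hashes && pair[1] == array); // prints: true
    }
}
```

Storing the link and the hash array inside the cached array itself means the pool needs no wrapper objects, which fits the goal of avoiding extra allocations.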

Summary

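SimpleArrayMap can be used in place of ArrayMap. The only difference is that it is implemented purely with arrays, whereas ArrayMap adds EntrySet and KeySet structures, which makes it convenient to traverse the data with an Iterator.
ArrayMap is suited to small amounts of data; because of the cost of its lookups and insertions, it is not a good fit for large collections. The author's suggestion is to stop using ArrayMap once the count goes past a hundred.
ArrayMap supports a null key, but the arrays can hold only one. Multiple keys with the same hash value are also allowed, though this is best avoided, because when binary search alone cannot find the key a linear scan follows. Keys must be unique and cannot be repeated.
Its main purpose is to avoid occupying a large amount of memory that cannot be fully used.
Arrays of capacity 4 and 8 are cached to prevent memory churn.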