SimpleArrayMap Source Code (Are You Still Only Using HashMap?)

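This analysis of the SimpleArrayMap source is based on support v4 version 23.3.0.
Because ArrayMap is mostly a matter of algorithms and its central idea is fairly simple, this article is code-driven and walks through each part of the implementation in detail.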

Why introduce ArrayMap?

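On Android devices, the per-app memory limit and the resulting OOM errors force developers to pay attention to low-level data structures and to analyze their app's memory usage. Among data structures, HashMap is the one we use most often, but do we pay attention to how it is implemented and what its strengths and weaknesses are?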

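Briefly: when HashMap expands, it doubles the space occupied by its current table, which is a very large cost given how scarce memory is on Android. That is why ArrayMap was created. It was introduced in API 19, so the support package comes in handy when we need compatibility with earlier versions. But why is there a SimpleArrayMap in addition to ArrayMap? Credit to Google for being considerate here: when we use ArrayMap we may not need the standard Java collections API, so they also give us an ArrayMap that is a pure implementation of the algorithm.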

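The collections API mentioned above is the biggest difference between SimpleArrayMap and the ArrayMap in the v4 package. The evidence is that ArrayMap extends SimpleArrayMap and also implements the Map interface. The main work is done by the MapCollections class, which uses Map's Entry structure so that the data in an ArrayMap can be traversed with an Iterator.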

Implementation idea

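A brief look at the idea is a necessary step before the source analysis, so that we can go in with questions and verify our assumptions. As the saying goes, supplies move before the troops do. Before doing anything, get the preparation done, put things in order, and think through the details as fully as possible. It is the same as breaking a project down into tasks before development begins, which is when mind maps and UML modeling tools become things we must be fluent with.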

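  • Idea: SimpleArrayMap uses two arrays to store the hash values and the key/value pairs. When the arrays already hold 8 or more entries and need to grow, they grow by only half of their current size, and arrays of size 4 and 8 are cached. The benefit is that as much of the array space as possible is actually in use, which avoids wasting memory to some degree.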

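  • Data layout: two arrays are used. One is the hash array; the other is the Array array, twice its size, declared as an Object array for generality. The Array array stores keys and values interleaved, with keys at even indices and values at odd indices: 0 -> key1, 1 -> value1, 2 -> key2, 3 -> value2. The hash array holds the hash value of each key and is a sorted int array, so binary search is the most efficient way to look up a key. See the figure below: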

SimpleArrayMap structure diagram
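The layout described above can be sketched in a few lines of plain Java. This is only an illustration under stated assumptions: the class name LayoutSketch and the helpers keySlot and valueSlot are hypothetical, and Arrays.binarySearch stands in for the library's own search helper.

```java
import java.util.Arrays;

// Hypothetical illustration class; not part of the support library.
class LayoutSketch {
    // Slot of the key for entry i in the interleaved array.
    static int keySlot(int i) { return i << 1; }

    // Slot of the value for entry i in the interleaved array.
    static int valueSlot(int i) { return (i << 1) + 1; }

    public static void main(String[] args) {
        // Three entries, already sorted by hash code.
        // "a".hashCode() == 97, "b".hashCode() == 98, "c".hashCode() == 99.
        int[] hashes = { 97, 98, 99 };
        // key0, value0, key1, value1, key2, value2
        Object[] array = { "a", 1, "b", 2, "c", 3 };

        int i = Arrays.binarySearch(hashes, "b".hashCode());
        System.out.println("entry index = " + i); // entry index = 1
        System.out.println("key = " + array[keySlot(i)]
                + ", value = " + array[valueSlot(i)]); // key = b, value = 2
    }
}
```

Entry i of the hash array always corresponds to slots 2i and 2i + 1 of the Object array, which is why the source is full of index << 1 expressions.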

Data structure definition

1. Data structure

int[] mHashes;
Object[] mArray;
int mSize;

In this code, mHashes holds the hash values of the keys stored in mArray, and mArray is a single array in which the map's keys and values are interleaved.

2. Initialization

  • Default constructor (initial capacity 0)

/**
 * Create a new empty ArrayMap.  The default capacity of an array map is 0, and
 * will grow once items are added to it.
 */
public SimpleArrayMap() {
   mHashes = ContainerHelpers.EMPTY_INTS;
   mArray = ContainerHelpers.EMPTY_OBJECTS;
   mSize = 0;
}

  • With a specified initial capacity

/**
 * Create a new ArrayMap with a given initial capacity.
 */
public SimpleArrayMap(int capacity) {
   if (capacity == 0) {
      mHashes = ContainerHelpers.EMPTY_INTS;
      mArray = ContainerHelpers.EMPTY_OBJECTS;
   } else {
      allocArrays(capacity);
   }
   mSize = 0;
}

  • Copying from another SimpleArrayMap

/**
 * Create a new ArrayMap with the mappings from the given ArrayMap.
 */
public SimpleArrayMap(SimpleArrayMap map) {
   this();
   if (map != null) {
      putAll(map);
   }
}

3. Release

/**
 * Make the array map empty.  All storage is released.
 */
public void clear() {
   if (mSize != 0) {
      freeArrays(mHashes, mArray, mSize);
      mHashes = ContainerHelpers.EMPTY_INTS;
      mArray = ContainerHelpers.EMPTY_OBJECTS;
      mSize = 0;
   }
}

The EMPTY_INTS and EMPTY_OBJECTS mentioned in the code are simply the following two empty arrays:

static final int[] EMPTY_INTS = new int[0];

static final Object[] EMPTY_OBJECTS = new Object[0];

Algorithms

1. Storing data: put(key, value)

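Given the data structure defined above, storing should work as follows: compute the hash of the key, binary-search the hash array to locate the index for that hash, then use the index to shift entries as needed and store the key and value. Let's look at the source: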

/**
 * Add a new value to the array map.
 * @param key The key under which to store the value.  <b>Must not be null.</b>  If
 * this key already exists in the array, its value will be replaced.
 * @param value The value to store for the given key.
 * @return Returns the old value that was stored for the given key, or null if there
 * was no such key.
 */
public V put(K key, V value) {
   final int hash;
   int index;
   if (key == null) {
      // Look up the null key
      hash = 0;
      index = indexOfNull();
   } else {
      hash = key.hashCode();
      index = indexOf(key, hash);
   }
   if (index >= 0) {
      // The key already exists: update the value and return the old one
      index = (index<<1) + 1;
      final V old = (V)mArray[index];
      mArray[index] = value;
      return old;
   }

   index = ~index;
   if (mSize >= mHashes.length) {
      // Not enough capacity: allocate new arrays to grow.
      final int n = mSize >= (BASE_SIZE*2) ? (mSize+(mSize>>1))
         : (mSize >= BASE_SIZE ? (BASE_SIZE*2) : BASE_SIZE);

      if (DEBUG) Log.d(TAG, "put: grow from " + mHashes.length + " to " + n);

      final int[] ohashes = mHashes;
      final Object[] oarray = mArray;
      allocArrays(n);

      if (mHashes.length > 0) {
         if (DEBUG) Log.d(TAG, "put: copy 0-" + mSize + " to 0");
         System.arraycopy(ohashes, 0, mHashes, 0, ohashes.length);
         System.arraycopy(oarray, 0, mArray, 0, oarray.length);
      }

      freeArrays(ohashes, oarray, mSize);
   }

   // Shift the entries from index onward one position to the right
   if (index < mSize) {
      if (DEBUG) Log.d(TAG, "put: move " + index + "-" + (mSize-index)
            + " to " + (index+1));
      System.arraycopy(mHashes, index, mHashes, index + 1, mSize - index);
      System.arraycopy(mArray, index << 1, mArray, (index + 1) << 1, (mSize - index) << 1);
   }

   // Store the hash at position index
   mHashes[index] = hash;
   // Store the key and value in the matching slots of the array.
   mArray[index<<1] = key;
   mArray[(index<<1)+1] = value;
   mSize++;
   return null;
}

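The code shows that arrayMap accepts a null key (even though the Javadoc says "Must not be null") and that keys cannot repeat.
As for capacity: mSize is compared with the length of the hash array, and when mSize is greater than or equal to it, the arrays are grown and their contents copied into the new arrays. (Growth rule: when size is at least 8, the new capacity is size + size/2; when size is at least 4 but less than 8, it is 8; when size is less than 4, it is 4.)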
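To make the growth rule concrete, here is a small sketch that reuses the capacity expression from put() and prints the sequence of capacities a map goes through as it fills up. The class name GrowthSketch and the method grownCapacity are hypothetical names for this illustration; only BASE_SIZE and the expression itself come from the source.

```java
// Hypothetical illustration class; not part of the support library.
class GrowthSketch {
    static final int BASE_SIZE = 4; // same value as in the source

    // Mirrors the expression put() uses to choose the new capacity.
    static int grownCapacity(int size) {
        return size >= (BASE_SIZE * 2) ? (size + (size >> 1))
                : (size >= BASE_SIZE ? (BASE_SIZE * 2) : BASE_SIZE);
    }

    public static void main(String[] args) {
        int capacity = 0;
        StringBuilder sb = new StringBuilder("0");
        // put() only grows when mSize has reached the capacity,
        // so each new capacity is computed from the previous one.
        for (int step = 0; step < 6; step++) {
            capacity = grownCapacity(capacity);
            sb.append(" -> ").append(capacity);
        }
        System.out.println(sb); // 0 -> 4 -> 8 -> 12 -> 18 -> 27 -> 40
    }
}
```

Past 8 entries each step adds roughly 50% rather than 100%, so less of the newly allocated space sits unused, at the price of resizing more often.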

2. Retrieving data: get(key)

/**
 * Retrieve a value from the array.
 * @param key The key of the value to retrieve.
 * @return Returns the value associated with the given key,
 * or null if there is no such key.
 */
public V get(Object key) {
   final int index = indexOfKey(key);
   return index >= 0 ? (V)mArray[(index<<1)+1] : null;
}

Retrieving by key is very simple: find the index for the key, then return the value at position index * 2 + 1 in the array.

3. Removing data: remove(key)

/**
 * Remove an existing key from the array map.
 * @param key The key of the mapping to remove.
 * @return Returns the value that was stored under the key, or null if there
 * was no such key.
 */
public V remove(Object key) {
   final int index = indexOfKey(key);
   if (index >= 0) {
      return removeAt(index);
   }

   return null;
}

Removal by key first finds the index for the key and then delegates to removeAt(int index).

/**
 * Remove the key/value mapping at the given index.
 * @param index The desired index, must be between 0 and {@link #size()}-1.
 * @return Returns the value that was stored at this index.
 */
public V removeAt(int index) {
   final Object old = mArray[(index << 1) + 1];
   if (mSize <= 1) {
      // Now empty.
      if (DEBUG) Log.d(TAG, "remove: shrink from " + mHashes.length + " to 0");
      freeArrays(mHashes, mArray, mSize);
      mHashes = ContainerHelpers.EMPTY_INTS;
      mArray = ContainerHelpers.EMPTY_OBJECTS;
      mSize = 0;
   } else {
      // Far more capacity than entries: move into smaller arrays.
      if (mHashes.length > (BASE_SIZE*2) && mSize < mHashes.length/3) {
         // Shrunk enough to reduce size of arrays.  We don't allow it to
         // shrink smaller than (BASE_SIZE*2) to avoid flapping between
         // that and BASE_SIZE.
         final int n = mSize > (BASE_SIZE*2) ? (mSize + (mSize>>1)) : (BASE_SIZE*2);

         if (DEBUG) Log.d(TAG, "remove: shrink from " + mHashes.length + " to " + n);

         final int[] ohashes = mHashes;
         final Object[] oarray = mArray;
         allocArrays(n);

         mSize--;
         if (index > 0) {
            if (DEBUG) Log.d(TAG, "remove: copy from 0-" + index + " to 0");
            System.arraycopy(ohashes, 0, mHashes, 0, index);
            System.arraycopy(oarray, 0, mArray, 0, index << 1);
         }
         if (index < mSize) {
            if (DEBUG) Log.d(TAG, "remove: copy from " + (index+1) + "-" + mSize
                  + " to " + index);
            System.arraycopy(ohashes, index + 1, mHashes, index, mSize - index);
            System.arraycopy(oarray, (index + 1) << 1, mArray, index << 1,
                  (mSize - index) << 1);
         }
      } else {
         mSize--;
         if (index < mSize) {
            if (DEBUG) Log.d(TAG, "remove: move " + (index+1) + "-" + mSize
                  + " to " + index);
            System.arraycopy(mHashes, index + 1, mHashes, index, mSize - index);
            System.arraycopy(mArray, (index + 1) << 1, mArray, index << 1,
                  (mSize - index) << 1);
         }
         mArray[mSize << 1] = null;
         mArray[(mSize << 1) + 1] = null;
      }
   }
   return (V)old;
}

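Set aside the check on the hash array's length for now (that branch shrinks the arrays once the capacity exceeds 8 and fewer than a third of the slots are in use) and look only at the main path, the final else. It uses System.arraycopy to move the entries after index one position forward in both the hash array and the array, then sets the key and value slots of the last entry to null.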
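The shift performed by that final else branch can be tried in isolation. The sketch below is a minimal stand-alone version that operates on arrays passed in as parameters; the class RemoveShiftSketch and its method signature are assumptions made for this example, not the library's API.

```java
import java.util.Arrays;

// Hypothetical illustration class; not part of the support library.
class RemoveShiftSketch {
    // Removes entry `index` from parallel arrays holding `size` entries
    // and returns the new size. No shrinking is attempted here.
    static int removeAt(int[] hashes, Object[] array, int size, int index) {
        size--;
        if (index < size) {
            // Move everything after index one entry to the left.
            System.arraycopy(hashes, index + 1, hashes, index, size - index);
            System.arraycopy(array, (index + 1) << 1, array, index << 1,
                    (size - index) << 1);
        }
        // Clear the vacated tail so the old key and value can be collected.
        array[size << 1] = null;
        array[(size << 1) + 1] = null;
        return size;
    }

    public static void main(String[] args) {
        int[] hashes = { 97, 98, 99, 0 };
        Object[] array = { "a", 1, "b", 2, "c", 3, null, null };
        int size = removeAt(hashes, array, 3, 1);
        // Prints: 2 [a, 1, c, 3, null, null, null, null]
        System.out.println(size + " " + Arrays.toString(array));
    }
}
```

Note that the stale hash left behind in the hash array is not cleared, as in the source: it lies beyond the size and is never read.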

4. indexOfKey(key)

indexOfKey appears in all of the code above, so let's see how it is implemented:

/**
 * Returns the index of a key in the set.
 *
 * @param key The key to search for.
 * @return Returns the index of the key if it exists, else a negative integer.
 */
public int indexOfKey(Object key) {
   return key == null ? indexOfNull() : indexOf(key, key.hashCode());
}

This shows that a null key is allowed in lookups. When the key is not null, the search uses the key together with its hashCode.

int indexOf(Object key, int hash) {
   final int N = mSize;

   // Important fast case: if nothing is in here, nothing to look for.
   if (N == 0) {
      return ~0;
   }

   int index = ContainerHelpers.binarySearch(mHashes, N, hash);

   // If the hash code wasn't found, then we have no entry for this key.
   if (index < 0) {
      return index;
   }

   // If the key at the returned index matches, that's what we want.
   if (key.equals(mArray[index<<1])) {
      return index;
   }

   // Search for a matching key after the index.
   int end;
   for (end = index + 1; end < N && mHashes[end] == hash; end++) {
      if (key.equals(mArray[end << 1])) return end;
   }

   // Search for a matching key before the index.
   for (int i = index - 1; i >= 0 && mHashes[i] == hash; i--) {
      if (key.equals(mArray[i << 1])) return i;
   }

   // Key not found -- return negative value indicating where a
   // new entry for this key should go.  We use the end of the
   // hash chain to reduce the number of array entries that will
   // need to be copied when inserting.
   return ~end;
}

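The code first binary-searches the hash array to get an index, then reads the key stored at that index in mArray and compares it with the requested key. If they are equal, the index is returned immediately. If not, it compares the entries after index that share the same hash and, failing that, the ones before it. So multiple keys with the same hash value are supported. What about multiple null keys?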
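The collision handling is easiest to see with real colliding keys. In Java the strings "Aa", "BB" and "C#" all have hash code 2112, so the sketch below stores two of them and searches for all three. The class CollisionSearchSketch is hypothetical, the arrays are passed as parameters for illustration, and java.util.Arrays.binarySearch is used here in place of ContainerHelpers.binarySearch.

```java
import java.util.Arrays;

// Hypothetical illustration class; not part of the support library.
class CollisionSearchSketch {
    // Same search strategy as indexOf(key, hash) in the source.
    static int indexOf(int[] hashes, Object[] array, int size, Object key, int hash) {
        if (size == 0) return ~0;
        int index = Arrays.binarySearch(hashes, 0, size, hash);
        if (index < 0) return index;                    // hash not present
        if (key.equals(array[index << 1])) return index;
        int end;
        // Scan forward through entries with the same hash.
        for (end = index + 1; end < size && hashes[end] == hash; end++) {
            if (key.equals(array[end << 1])) return end;
        }
        // Scan backward through entries with the same hash.
        for (int i = index - 1; i >= 0 && hashes[i] == hash; i--) {
            if (key.equals(array[i << 1])) return i;
        }
        return ~end;                                    // insertion point, encoded
    }

    public static void main(String[] args) {
        int[] hashes = { 97, 2112, 2112 };
        Object[] array = { "a", 1, "Aa", 2, "BB", 3 };
        System.out.println(indexOf(hashes, array, 3, "Aa", "Aa".hashCode()));  // 1
        System.out.println(indexOf(hashes, array, 3, "BB", "BB".hashCode()));  // 2
        // "C#" collides but is absent: decode the insertion point with ~.
        System.out.println(~indexOf(hashes, array, 3, "C#", "C#".hashCode())); // 3
    }
}
```

A missing key is reported as the bitwise complement of the end of its hash run, which is the index put() later decodes with index = ~index.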

int indexOfNull() {
   final int N = mSize;

   // Important fast case: if nothing is in here, nothing to look for.
   if (N == 0) {
      return ~0;
   }

   int index = ContainerHelpers.binarySearch(mHashes, N, 0);

   // If the hash code wasn't found, then we have no entry for this key.
   if (index < 0) {
      return index;
   }

   // If the key at the returned index matches, that's what we want.
   if (null == mArray[index<<1]) {
      return index;
   }

   // Search for a matching key after the index.
   int end;
   for (end = index + 1; end < N && mHashes[end] == 0; end++) {
      if (null == mArray[end << 1]) return end;
   }

   // Search for a matching key before the index.
   for (int i = index - 1; i >= 0 && mHashes[i] == 0; i--) {
      if (null == mArray[i << 1]) return i;
   }

   // Key not found -- return negative value indicating where a
   // new entry for this key should go.  We use the end of the
   // hash chain to reduce the number of array entries that will
   // need to be copied when inserting.
   return ~end;
}

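As shown, looking up a null key is very similar to looking up a non-null key: both scan the run of entries that share the same hash, which for a null key is always 0. Only one null key can exist in the arrays, though, because put looks the key up first, which guarantees a single null entry.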

5. Binary search

Here, as a refresher, is the classic binary search algorithm that the code above keeps using:

// This is Arrays.binarySearch(), but doesn't do any argument validation.
static int binarySearch(int[] array, int size, int value) {
   int lo = 0;
   int hi = size - 1;

   while (lo <= hi) {
      int mid = (lo + hi) >>> 1;
      int midVal = array[mid];

      if (midVal < value) {
         lo = mid + 1;
      } else if (midVal > value) {
         hi = mid - 1;
      } else {
         return mid;  // value found
      }
   }
   return ~lo;  // value not present
}

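The code divides by 2 with a right shift. The three greater-than signs denote an unsigned shift, which keeps the midpoint correct even if lo + hi overflows int.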
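Two details of this routine are worth trying out: the ~lo return value for a missing element, and the effect of >>> compared with plain division. The sketch below copies the method into a hypothetical class BinarySearchSketch so it can run on its own; the sample values are made up for the demonstration.

```java
// Hypothetical illustration class; not part of the support library.
class BinarySearchSketch {
    // Same algorithm as ContainerHelpers.binarySearch in the source.
    static int binarySearch(int[] array, int size, int value) {
        int lo = 0;
        int hi = size - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            int midVal = array[mid];
            if (midVal < value) {
                lo = mid + 1;
            } else if (midVal > value) {
                hi = mid - 1;
            } else {
                return mid; // value found
            }
        }
        return ~lo; // value not present: encoded insertion point
    }

    public static void main(String[] args) {
        int[] hashes = { 10, 20, 40 };
        System.out.println(binarySearch(hashes, 3, 20));  // 1
        int missing = binarySearch(hashes, 3, 30);
        System.out.println(missing);                      // -3
        System.out.println(~missing);                     // 2, where 30 would be inserted

        // Why >>> instead of / 2: the sum below overflows int.
        int lo = 1_500_000_000;
        int hi = 2_000_000_000;
        System.out.println((lo + hi) / 2);   // -397483648
        System.out.println((lo + hi) >>> 1); // 1750000000
    }
}
```

Because ~lo is always negative, callers can tell a hit from a miss with a single sign check and still recover the insertion position.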

Cache implementation

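We could almost stop here, but the source contains two curious arrays whose main purpose is to cache arrays of fixed sizes. The official rationale is to avoid memory churn: the implementation is pure arrays, so whenever capacity runs out a new array has to be allocated and the old one would otherwise go to waste. That makes the cache well worth having. Let's see how the two work; feel free to skip this part if it doesn't interest you. First, the data structure: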

1. Data structure

/**
 * The minimum amount by which the capacity of a ArrayMap will increase.
 * This is tuned to be relatively space-efficient.
 */
private static final int BASE_SIZE = 4;

/**
 * Maximum number of entries to have in array caches.
 */
private static final int CACHE_SIZE = 10;

/**
 * Caches of small array objects to avoid spamming garbage.  The cache
 * Object[] variable is a pointer to a linked list of array objects.
 * The first entry in the array is a pointer to the next array in the
 * list; the second entry is a pointer to the int[] hash code array for it.
 */
static Object[] mBaseCache;
static int mBaseCacheSize;
static Object[] mTwiceBaseCache;
static int mTwiceBaseCacheSize;

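The code has two static Object arrays, each of which caches arrays as a linked list. The static Object[] points to a cached array; that array's first element is a pointer to the next cached array, its second element is the corresponding hash array, and the remaining elements are null. The two caches, baseCache and twiceBaseCache, are governed by two constants: BASE_SIZE (4) determines which capacities are cached, and CACHE_SIZE (10) caps how many arrays each list may hold. baseCache stores arrays of capacity 4 and twiceBaseCache stores arrays of capacity 8. See the figure: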

SimpleArrayMap cache diagram

2. Adding arrays to the cache

private static void freeArrays(final int[] hashes, final Object[] array, final int size) {
   if (hashes.length == (BASE_SIZE*2)) {
      synchronized (ArrayMap.class) {
         if (mTwiceBaseCacheSize < CACHE_SIZE) {
            array[0] = mTwiceBaseCache;
            array[1] = hashes;
            for (int i=(size<<1)-1; i>=2; i--) {
               array[i] = null;
            }
            mTwiceBaseCache = array;
            mTwiceBaseCacheSize++;
            if (DEBUG) Log.d(TAG, "Storing 2x cache " + array
                  + " now have " + mTwiceBaseCacheSize + " entries");
         }
      }
   } else if (hashes.length == BASE_SIZE) {
      synchronized (ArrayMap.class) {
         if (mBaseCacheSize < CACHE_SIZE) {
            array[0] = mBaseCache;
            array[1] = hashes;
            for (int i=(size<<1)-1; i>=2; i--) {
               array[i] = null;
            }
            mBaseCache = array;
            mBaseCacheSize++;
            if (DEBUG) Log.d(TAG, "Storing 1x cache " + array
                  + " now have " + mBaseCacheSize + " entries");
         }
      }
   }
}

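This method is called when the ArrayMap's capacity changes. It clears the contents of the array being released, except that the first element points to the previous cache head and the second points to the hash array.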

3. Using cached arrays

private void allocArrays(final int size) {
   if (size == (BASE_SIZE*2)) {
      synchronized (ArrayMap.class) {
         if (mTwiceBaseCache != null) {
            final Object[] array = mTwiceBaseCache;
            mArray = array;
            mTwiceBaseCache = (Object[])array[0];
            mHashes = (int[])array[1];
            array[0] = array[1] = null;
            mTwiceBaseCacheSize--;
            if (DEBUG) Log.d(TAG, "Retrieving 2x cache " + mHashes
                  + " now have " + mTwiceBaseCacheSize + " entries");
            return;
         }
      }
   } else if (size == BASE_SIZE) {
      synchronized (ArrayMap.class) {
         if (mBaseCache != null) {
            final Object[] array = mBaseCache;
            mArray = array;
            mBaseCache = (Object[])array[0];
            mHashes = (int[])array[1];
            array[0] = array[1] = null;
            mBaseCacheSize--;
            if (DEBUG) Log.d(TAG, "Retrieving 1x cache " + mHashes
                  + " now have " + mBaseCacheSize + " entries");
            return;
         }
      }
   }

   mHashes = new int[size];
   mArray = new Object[size<<1];
}

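When size matches a cached array size, that is, 4 or 8, an array can be taken from the cache. The main operation is advancing the baseCache pointer to whatever array[0] points to; the hash array is array[1], and the current array is then ours to use.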
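The free and alloc halves of the cache fit together as a small intrusive linked list, which the following sketch reproduces for a single capacity. It is a simplified illustration under assumptions: the class ArrayPoolSketch, its alloc and free methods, and the fixed CAPACITY of 4 are invented for this example, and only one pool is modeled instead of the two in the source.

```java
import java.util.Arrays;

// Hypothetical illustration class; not part of the support library.
class ArrayPoolSketch {
    static final int CACHE_SIZE = 10; // maximum number of pooled arrays
    static final int CAPACITY = 4;    // the only capacity this sketch pools

    private static Object[] head;     // most recently pooled array, or null
    private static int cached;        // number of arrays in the list

    int[] hashes;
    Object[] array;

    // Take arrays from the pool if available, otherwise allocate new ones.
    void alloc() {
        synchronized (ArrayPoolSketch.class) {
            if (head != null) {
                Object[] a = head;
                array = a;
                head = (Object[]) a[0]; // next pooled array
                hashes = (int[]) a[1];  // its paired hash array
                a[0] = a[1] = null;
                cached--;
                return;
            }
        }
        hashes = new int[CAPACITY];
        array = new Object[CAPACITY << 1];
    }

    // Return the arrays to the pool unless it is already full.
    void free() {
        synchronized (ArrayPoolSketch.class) {
            if (cached < CACHE_SIZE) {
                Arrays.fill(array, null); // drop references to old keys and values
                array[0] = head;          // link to the previous head
                array[1] = hashes;        // keep the hash array alongside
                head = array;
                cached++;
            }
        }
        hashes = null;
        array = null;
    }

    public static void main(String[] args) {
        ArrayPoolSketch first = new ArrayPoolSketch();
        first.alloc();
        Object[] original = first.array;
        first.free();

        ArrayPoolSketch second = new ArrayPoolSketch();
        second.alloc();
        System.out.println(second.array == original); // true: the array was reused
    }
}
```

Storing the links inside the pooled arrays themselves means the cache needs no extra node objects, which fits the goal of avoiding garbage.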

Summary

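  • SimpleArrayMap can be used in place of ArrayMap. The only difference is that it is implemented purely with arrays, whereas ArrayMap adds the EntrySet and KeySet structures so that the data can be traversed with an Iterator.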

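  • ArrayMap suits small amounts of data; because of the cost of its lookups and insertions, it is a poor fit for large collections. The author's suggestion is to give up on ArrayMap once the count passes a hundred.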

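  • ArrayMap supports a null key, but only one null key can exist in the arrays. Multiple keys may share a hash value, though this is best avoided, because binary search alone then cannot settle the lookup and a linear scan follows. Keys themselves must be unique.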

  • The main goal is to avoid holding large amounts of memory that cannot be put to full use.

  • Arrays of capacity 4 and 8 are cached to prevent memory churn.

PS: Please cite the original link when reposting.
