How ArrayList Works Under the Hood

Date: 2019-11-18


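After three years on the job, it is time to write up and share a technology I use every day.

1. ArrayList overview:

ArrayList is backed by an array. It is a dynamic array whose capacity grows automatically, much like allocating memory dynamically in C and then enlarging it.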

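ArrayList is not thread-safe and should be used only where a single thread accesses it. In a multithreaded environment, consider wrapping it with Collections.synchronizedList(List l), which returns a synchronized view of the list, or use the CopyOnWriteArrayList class from the java.util.concurrent package.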
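A minimal sketch of the two thread-safe alternatives mentioned above. The class name ThreadSafeListDemo, its helper methods, and the thread and element counts are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical demo class, not part of the JDK.
class ThreadSafeListDemo {
    // Adds perThread elements from each of `threads` threads and returns the final size.
    static int fill(List<Integer> list, int threads, int perThread) {
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    list.add(i);
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return list.size();
    }

    // Every add() on the wrapper is synchronized, so no update is lost.
    static int fillSynchronized() {
        return fill(Collections.synchronizedList(new ArrayList<>()), 4, 1000);
    }

    // Every add() copies the backing array; best when reads far outnumber writes.
    static int fillCopyOnWrite() {
        return fill(new CopyOnWriteArrayList<>(), 4, 1000);
    }

    public static void main(String[] args) {
        System.out.println(fillSynchronized()); // 4000
        System.out.println(fillCopyOnWrite());  // 4000
    }
}
```

Running the same fill method against a plain ArrayList may lose elements or throw ArrayIndexOutOfBoundsException, which is why one of the two wrappers is needed.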

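ArrayList implements Serializable, so it can be serialized and transmitted in serialized form. It implements RandomAccess, a marker interface that signals fast random access, which in practice means fast access by index. It also implements Cloneable, so it can be cloned.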
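As a quick check of the three interfaces, the following sketch tests them with instanceof and shows that clone() produces an independent (shallow) copy. MarkerInterfaceDemo and its method names are hypothetical.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.RandomAccess;

// Hypothetical demo class, not part of the JDK.
class MarkerInterfaceDemo {
    // True when ArrayList implements all three interfaces discussed above.
    static boolean arrayListHasAllMarkers() {
        Object list = new ArrayList<String>();
        return list instanceof Serializable
                && list instanceof RandomAccess
                && list instanceof Cloneable;
    }

    // clone() copies the backing array, so structural changes to the copy
    // do not affect the original (the elements themselves are shared).
    static boolean cloneIsIndependent() {
        ArrayList<String> original = new ArrayList<>(Arrays.asList("a", "b"));
        @SuppressWarnings("unchecked")
        ArrayList<String> copy = (ArrayList<String>) original.clone();
        copy.add("c");
        return original.size() == 2 && copy.size() == 3;
    }

    public static void main(String[] args) {
        System.out.println(arrayListHasAllMarkers()); // true
        System.out.println(cloneIsIndependent());     // true
    }
}
```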

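Every ArrayList instance has a capacity: the size of the array used to store the list's elements. It is always at least as large as the list size. As elements are added to an ArrayList, its capacity grows automatically. Automatic growth means copying the data into a new array, so if the amount of data can be predicted, specify the capacity when constructing the ArrayList. Before adding a large number of elements, an application can also call ensureCapacity to increase the capacity of the ArrayList instance, which reduces the number of incremental reallocations.
Note that this implementation is not synchronized. If multiple threads access an ArrayList instance concurrently and at least one of them modifies the list structurally, the list must be synchronized externally.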
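The two ways of reserving capacity described above can be sketched as follows. PresizeDemo and its method names are hypothetical; the capacity itself is not observable through the public API, so the sketch only shows the calls.

```java
import java.util.ArrayList;

// Hypothetical demo class, not part of the JDK.
class PresizeDemo {
    // Capacity chosen at construction: the backing array is allocated once.
    static ArrayList<Integer> buildPresized(int n) {
        ArrayList<Integer> list = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            list.add(i);
        }
        return list;
    }

    // Capacity raised on an existing list before a bulk add.
    static ArrayList<Integer> buildWithEnsureCapacity(int n) {
        ArrayList<Integer> list = new ArrayList<>();
        list.ensureCapacity(n); // one reallocation instead of several
        for (int i = 0; i < n; i++) {
            list.add(i);
        }
        return list;
    }

    public static void main(String[] args) {
        System.out.println(buildPresized(1000).size());           // 1000
        System.out.println(buildWithEnsureCapacity(1000).size()); // 1000
    }
}
```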

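2. ArrayList implementation:

ArrayList implements the List interface and stores all of its elements in an array, so most of its operations are array operations. Let us walk through the ArrayList source code. The listings below appear to come from JDK 6; later JDKs differ in the details.

1) Private fields:

ArrayList defines only two private instance fields: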

    /** 
      * The array buffer into which the elements of the ArrayList are stored. 
      * The capacity of the ArrayList is the length of this array buffer. 
      */  
     private transient Object[] elementData;  
   
     /** 
      * The size of the ArrayList (the number of elements it contains). 
      * 
      * @serial 
      */  
     private int size;

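This is easy to follow: elementData stores the elements of the ArrayList, and size is the number of elements it contains.

One keyword needs explaining: transient.

Java serialization provides a mechanism for persisting object instances. When persisting an object, there may be a particular field that we do not want the serialization mechanism to save. To turn serialization off for a specific field, put the keyword transient in front of it.

That is a little abstract, so here is an example.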

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Not public, so both classes can live in TestTransient.java.
class UserInfo implements Serializable {
    private static final long serialVersionUID = 996890129747019948L;
    private String name;
    private transient String psw;

    public UserInfo(String name, String psw) {
        this.name = name;
        this.psw = psw;
    }

    @Override
    public String toString() {
        return "name=" + name + ", psw=" + psw;
    }
}

public class TestTransient {
    public static void main(String[] args) {
        UserInfo userInfo = new UserInfo("張三", "123456");
        System.out.println(userInfo);
        // Serialize: the field marked transient is not written.
        try (ObjectOutputStream out = new ObjectOutputStream(
                new FileOutputStream("UserInfo.out"))) {
            out.writeObject(userInfo);
        } catch (Exception e) {
            e.printStackTrace();
        }
        // Read the object back.
        try (ObjectInputStream in = new ObjectInputStream(
                new FileInputStream("UserInfo.out"))) {
            UserInfo readUserInfo = (UserInfo) in.readObject();
            // After deserialization, psw is null.
            System.out.println(readUserInfo);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

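A field marked transient is not saved when the object is serialized.

Now back to ArrayList.

2) Constructors:
ArrayList provides three constructors: one builds an empty list with a default initial capacity of 10, one builds an empty list with a specified initial capacity, and one builds a list containing the elements of a given collection, in the order returned by that collection's iterator.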

    // Constructor that takes an initial capacity.
    public ArrayList(int initialCapacity) {
        super();
        if (initialCapacity < 0)
            throw new IllegalArgumentException("Illegal Capacity: "+
                                               initialCapacity);
        // Allocate the backing array.
        this.elementData = new Object[initialCapacity];
    }

    // No-argument constructor. The default capacity is 10.
    public ArrayList() {
        this(10);
    }

    // Creates an ArrayList containing the elements of the given collection.
    public ArrayList(Collection<? extends E> c) {
        elementData = c.toArray();
        size = elementData.length;
        // c.toArray() may return an array whose type is not Object[].
        if (elementData.getClass() != Object[].class)
            elementData = Arrays.copyOf(elementData, size, Object[].class);
    }
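From the caller's side, the three constructors behave as in the sketch below. ConstructorDemo and its method names are hypothetical. Note that an initial capacity is not a size: a list built with capacity 100 is still empty.

```java
import java.util.ArrayList;
import java.util.Arrays;

// Hypothetical demo class, not part of the JDK.
class ConstructorDemo {
    // No-argument constructor: an empty list.
    static int sizeOfDefault() {
        return new ArrayList<String>().size();
    }

    // Capacity constructor: room for 100 elements, but still zero elements.
    static int sizeWithCapacity() {
        return new ArrayList<String>(100).size();
    }

    // Collection constructor: elements arrive in the collection's iteration order.
    static String copyOfCollection() {
        return new ArrayList<>(Arrays.asList("x", "y", "z")).toString();
    }

    // A negative capacity is rejected with IllegalArgumentException.
    static boolean negativeCapacityRejected() {
        try {
            new ArrayList<String>(-1);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(sizeOfDefault());            // 0
        System.out.println(sizeWithCapacity());         // 0
        System.out.println(copyOfCollection());         // [x, y, z]
        System.out.println(negativeCapacityRejected()); // true
    }
}
```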

3) Storing elements:

ArrayList provides the following methods for storing elements: set(int index, E element), add(E e), add(int index, E element), addAll(Collection<? extends E> c), and addAll(int index, Collection<? extends E> c). We go through them one by one:

// Replaces the element at the specified position with the specified element
// and returns the element previously at that position.
public E set(int index, E element) {
    RangeCheck(index);

    E oldValue = (E) elementData[index];
    elementData[index] = element;
    return oldValue;
}

// Appends the specified element to the end of this list.
public boolean add(E e) {
    ensureCapacity(size + 1);
    elementData[size++] = e;
    return true;
}

// Inserts the specified element at the specified position in this list.
// The element currently at that position (if any) and all subsequent
// elements are shifted right (their indices increase by 1).
public void add(int index, E element) {
    if (index > size || index < 0)
        throw new IndexOutOfBoundsException("Index: "+index+", Size: "+size);
    // Grow the array if it is not long enough.
    ensureCapacity(size+1);  // Increments modCount!!
    // Copy the size-index elements starting at index to the position
    // starting at index+1 within the same elementData array, i.e. shift
    // the element at index and everything after it one slot to the right.
    System.arraycopy(elementData, index, elementData, index + 1, size - index);
    elementData[index] = element;
    size++;
}

// Appends all elements of the specified collection to the end of this list,
// in the order returned by the collection's iterator.
public boolean addAll(Collection<? extends E> c) {
    Object[] a = c.toArray();
    int numNew = a.length;
    ensureCapacity(size + numNew);  // Increments modCount
    System.arraycopy(a, 0, elementData, size, numNew);
    size += numNew;
    return numNew != 0;
}

// Inserts all elements of the specified collection into this list,
// starting at the specified position.
public boolean addAll(int index, Collection<? extends E> c) {
    if (index > size || index < 0)
        throw new IndexOutOfBoundsException(
            "Index: " + index + ", Size: " + size);

    Object[] a = c.toArray();
    int numNew = a.length;
    ensureCapacity(size + numNew);  // Increments modCount

    int numMoved = size - index;
    if (numMoved > 0)
        System.arraycopy(elementData, index, elementData, index + numNew, numMoved);

    System.arraycopy(a, 0, elementData, index, numNew);
    size += numNew;
    return numNew != 0;
}

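Every book says ArrayList is backed by an array, and we have seen the array among its fields, but how exactly does that work? Take adding an element: if the array is large enough, the element is simply stored at the target position. But what if the array has run out of capacity?

add(E e) first calls ensureCapacity(size + 1), then assigns the element to elementData[size], and then increments size. For example, on the first add, size is 0, so add stores e in elementData[0] and sets size to 1 (equivalent to the two statements elementData[0] = e; size = 1). Could the assignment to elementData[size] run past the end of the array? What prevents that is the call to ensureCapacity(size + 1).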
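To see ensureCapacity, append, and insert working together, here is a stripped-down dynamic array. MiniList is a hypothetical teaching class, not the JDK implementation; it reuses the growth rule from the source above and starts with a capacity of 2 (an arbitrary choice) so that growth happens quickly.

```java
import java.util.Arrays;

// Hypothetical teaching class, not the JDK implementation.
class MiniList {
    private Object[] elementData = new Object[2]; // tiny on purpose, to force growth
    private int size;

    // Grows the backing array when minCapacity exceeds its length.
    private void ensureCapacity(int minCapacity) {
        int oldCapacity = elementData.length;
        if (minCapacity > oldCapacity) {
            int newCapacity = (oldCapacity * 3) / 2 + 1; // rule from the listing above
            if (newCapacity < minCapacity) {
                newCapacity = minCapacity;
            }
            elementData = Arrays.copyOf(elementData, newCapacity);
        }
    }

    // Appends: the slot elementData[size] is guaranteed to exist after ensureCapacity.
    void add(Object e) {
        ensureCapacity(size + 1);
        elementData[size++] = e;
    }

    // Inserts: shifts the tail one slot to the right, then fills the gap.
    void add(int index, Object e) {
        if (index > size || index < 0) {
            throw new IndexOutOfBoundsException("Index: " + index + ", Size: " + size);
        }
        ensureCapacity(size + 1);
        System.arraycopy(elementData, index, elementData, index + 1, size - index);
        elementData[index] = e;
        size++;
    }

    int capacity() {
        return elementData.length;
    }

    @Override
    public String toString() {
        return Arrays.toString(Arrays.copyOf(elementData, size));
    }

    static MiniList demo() {
        MiniList list = new MiniList();
        list.add("a");
        list.add("c");
        list.add(1, "b"); // capacity 2 is full, so the array grows to 4 here
        list.add("d");
        return list;
    }

    public static void main(String[] args) {
        MiniList list = demo();
        System.out.println(list);            // [a, b, c, d]
        System.out.println(list.capacity()); // 4
    }
}
```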

4) Reading elements:

// Returns the element at the specified position in this list.
public E get(int index) {
    RangeCheck(index);

    return (E) elementData[index];
}

5) Removing elements:

ArrayList supports removal either by index or by object, as follows:

remove(int index):

// Removes the element at the specified position in this list.
public E remove(int index) {
    RangeCheck(index);

    modCount++;
    E oldValue = (E) elementData[index];

    int numMoved = size - index - 1;
    if (numMoved > 0)
        System.arraycopy(elementData, index+1, elementData, index, numMoved);
    elementData[--size] = null; // Let gc do its work

    return oldValue;
}

The method checks the range, increments modCount, saves the element about to be removed, shifts every element after the removed position one slot forward, sets the slot at the end of the list to null, and returns the removed element.

remove(Object o)

// Removes the first occurrence of the specified element from this list, if present.
// Only the first match is removed, because ArrayList allows duplicate elements.
public boolean remove(Object o) {
    // ArrayList allows null elements, so the two cases are handled separately.
    if (o == null) {
        for (int index = 0; index < size; index++)
            if (elementData[index] == null) {
                // Like remove(int index): removes the element at the given position.
                fastRemove(index);
                return true;
            }
    } else {
        for (int index = 0; index < size; index++)
            if (o.equals(elementData[index])) {
                fastRemove(index);
                return true;
            }
    }
    return false;
}

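The code shows that the method returns true when something was removed and false otherwise. remove(Object o) scans elementData for the given object and calls fastRemove as soon as it finds a match. Having found the element, and therefore its index, why not simply call remove(index)? Because fastRemove skips the bounds check: finding the element already guarantees that the index is in range. fastRemove also does not return the removed element. Its code, shown below, is otherwise essentially the same as remove(index).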

private void fastRemove(int index) {
    modCount++;
    int numMoved = size - index - 1;
    if (numMoved > 0)
        System.arraycopy(elementData, index+1, elementData, index,
                         numMoved);
    elementData[--size] = null; // Let gc do its work
}
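The two remove overloads are easy to mix up on a list of Integer, because an int argument selects remove(int index) while an Integer argument selects remove(Object o). The sketch below, with the hypothetical class RemoveOverloadDemo, shows the difference.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, not part of the JDK.
class RemoveOverloadDemo {
    private static List<Integer> sample() {
        return new ArrayList<>(Arrays.asList(1, 5, 7, 1));
    }

    // int argument: removes the element at index 1 (the value 5).
    static String removeByIndex() {
        List<Integer> list = sample();
        list.remove(1);
        return list.toString();
    }

    // Integer argument: removes the first occurrence of the value 1.
    static String removeByValue() {
        List<Integer> list = sample();
        list.remove(Integer.valueOf(1));
        return list.toString();
    }

    // remove(Object) reports whether anything was removed.
    static boolean removeMissingValue() {
        return sample().remove(Integer.valueOf(42));
    }

    public static void main(String[] args) {
        System.out.println(removeByIndex());      // [1, 7, 1]
        System.out.println(removeByValue());      // [5, 7, 1]
        System.out.println(removeMissingValue()); // false
    }
}
```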

removeRange(int fromIndex,int toIndex)

protected void removeRange(int fromIndex, int toIndex) {
    modCount++;
    int numMoved = size - toIndex;
    System.arraycopy(elementData, toIndex, elementData, fromIndex,
                     numMoved);

    // Let gc do its work
    int newSize = size - (toIndex-fromIndex);
    while (size != newSize)
        elementData[--size] = null;
}

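The method moves the elements starting at toIndex forward to fromIndex, then sets the now-unused slots at the tail (from the new size up to the old size) to null, decrementing size as it goes.

The method is protected. Why was it declared that way?

Here is one explanation, though it may not be easy to follow: http://stackoverflow.com/questions/2289183/why-is-javas-abstractlists-removerange-method-protected

Consider the following example first: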

         ArrayList<Integer> ints = new ArrayList<Integer>(Arrays.asList(0, 1, 2,  
                 3, 4, 5, 6));  
         // fromIndex low endpoint (inclusive) of the subList  
         // toIndex high endpoint (exclusive) of the subList  
        ints.subList(2, 4).clear();  
         System.out.println(ints);

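The output is [0, 1, 4, 5, 6], which looks just like the result of calling removeRange(int fromIndex, int toIndex). Why is the effect the same? Was removeRange actually called? It was: clear() on the sublist view is inherited from AbstractList, which implements it as removeRange(0, size()), and the sublist forwards that call to the backing list's removeRange with the offsets adjusted. Keeping removeRange protected keeps the public API small while still letting subList(from, to).clear() remove a range efficiently.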
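Because removeRange is protected, only a subclass can call it directly. The sketch below compares a direct call through a hypothetical subclass, RangeList, with the public subList(...).clear() idiom; the method name removeBetween is made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;

// Hypothetical subclass that exposes the protected removeRange method.
class RangeList<E> extends ArrayList<E> {
    private static final long serialVersionUID = 1L;

    RangeList(Collection<? extends E> c) {
        super(c);
    }

    // Removes elements in [fromIndex, toIndex) by calling removeRange directly.
    public void removeBetween(int fromIndex, int toIndex) {
        removeRange(fromIndex, toIndex);
    }

    static String viaSubclass() {
        RangeList<Integer> ints = new RangeList<>(Arrays.asList(0, 1, 2, 3, 4, 5, 6));
        ints.removeBetween(2, 4);
        return ints.toString();
    }

    // The public route to the same result.
    static String viaSubList() {
        ArrayList<Integer> ints = new ArrayList<>(Arrays.asList(0, 1, 2, 3, 4, 5, 6));
        ints.subList(2, 4).clear();
        return ints.toString();
    }

    public static void main(String[] args) {
        System.out.println(viaSubclass()); // [0, 1, 4, 5, 6]
        System.out.println(viaSubList());  // [0, 1, 4, 5, 6]
    }
}
```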

6) Adjusting the array capacity: ensureCapacity

As the element-storing code above shows, every time an element is added, the list checks whether the resulting element count would exceed the current array length. If so, the array is expanded to make room. Expansion is done by the public method ensureCapacity(int minCapacity). Before adding a large number of elements, we can also call ensureCapacity ourselves to increase the capacity of the ArrayList instance and reduce the number of incremental reallocations.

public void ensureCapacity(int minCapacity) {
    modCount++;
    int oldCapacity = elementData.length;
    if (minCapacity > oldCapacity) {
        Object oldData[] = elementData;
        int newCapacity = (oldCapacity * 3)/2 + 1;  // grow by 50% + 1
        if (newCapacity < minCapacity)
            newCapacity = minCapacity;
        // minCapacity is usually close to size, so this is a win:
        elementData = Arrays.copyOf(elementData, newCapacity);
    }
}

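The code shows that when the array is expanded, the elements of the old array are copied into a new one, and each expansion makes the capacity roughly 1.5 times the old capacity. This operation is expensive, so in practice we should avoid expansions where we can. When the number of elements to store is predictable, specify the capacity when constructing the ArrayList instance, or call ensureCapacity to increase the capacity manually as needed.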
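To make the cost concrete, the following sketch replays the growth rule from the listing above, (oldCapacity * 3) / 2 + 1, without touching a real ArrayList. GrowthDemo and its method names are hypothetical, and the numbers apply to this JDK 6 rule only; JDK 7 and later compute oldCapacity + (oldCapacity >> 1) instead.

```java
import java.util.Arrays;

// Hypothetical demo class that simulates the JDK 6 growth rule.
class GrowthDemo {
    // Returns the initial capacity followed by the capacity after each growth step.
    static int[] capacitySequence(int initialCapacity, int growths) {
        int[] capacities = new int[growths + 1];
        capacities[0] = initialCapacity;
        for (int i = 1; i <= growths; i++) {
            capacities[i] = (capacities[i - 1] * 3) / 2 + 1;
        }
        return capacities;
    }

    // Counts how many reallocations occur when adding elements one at a time.
    static int reallocations(int initialCapacity, int elements) {
        int capacity = initialCapacity;
        int count = 0;
        for (int size = 1; size <= elements; size++) {
            if (size > capacity) {
                capacity = (capacity * 3) / 2 + 1;
                count++;
            }
        }
        return count;
    }

    static String firstCapacities() {
        return Arrays.toString(capacitySequence(10, 4));
    }

    public static void main(String[] args) {
        System.out.println(firstCapacities());        // [10, 16, 25, 38, 58]
        System.out.println(reallocations(10, 1000));   // 11
        System.out.println(reallocations(1000, 1000)); // 0
    }
}
```

Under this rule, adding 1000 elements to a default-capacity list triggers 11 array copies, while a list created with capacity 1000 triggers none.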

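Object oldData[] = elementData; // why is oldData needed?
At first glance oldData is never used afterwards, so the line looks redundant. A common explanation is that the local variable keeps the old array referenced so that its memory cannot be reclaimed while the copy is in progress. That is not how Java works: the garbage collector never reclaims a reachable object, and the old array stays reachable throughout the copy because it is passed as an argument to Arrays.copyOf, while the elementData field is reassigned only after copyOf returns. The variable is most likely a leftover from earlier JDK versions, which allocated the new array first and then called System.arraycopy(oldData, 0, elementData, 0, size); there the local reference was genuinely needed. JDK 6 switched to Arrays.copyOf without removing the declaration, and JDK 7 dropped it.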

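The differences between ArrayList and Vector are:

When it runs out of room, ArrayList grows by 50% + 1 by default, while Vector doubles by default.
Vector provides indexOf(obj, start); ArrayList does not.
Vector is thread-safe, but it is not used in most cases, because the synchronization adds overhead.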
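Unlike ArrayList, Vector exposes its capacity through the public capacity() method, so the doubling can be observed directly. The sketch below also shows the two-argument indexOf. VectorDemo and its method names are hypothetical.

```java
import java.util.Arrays;
import java.util.Vector;

// Hypothetical demo class, not part of the JDK.
class VectorDemo {
    // With the default capacityIncrement of 0, an overflowing Vector doubles.
    static int capacityAfterOverflow() {
        Vector<Integer> vector = new Vector<>(10);
        for (int i = 0; i < 11; i++) {
            vector.add(i); // the 11th add exceeds the capacity of 10
        }
        return vector.capacity();
    }

    // indexOf(Object, int) starts the search at the given index.
    static int indexOfFrom() {
        Vector<String> vector = new Vector<>(Arrays.asList("a", "b", "a", "b"));
        return vector.indexOf("a", 1);
    }

    public static void main(String[] args) {
        System.out.println(capacityAfterOverflow()); // 20
        System.out.println(indexOfFrom());           // 2
    }
}
```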

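ArrayList can also shrink the capacity of the backing array to the number of elements the list actually holds. This is done with the trimToSize method:

public void trimToSize() {
    modCount++;
    int oldCapacity = elementData.length;
    if (size < oldCapacity) {
        elementData = Arrays.copyOf(elementData, size);
    }
}

Because elementData gets expanded while size tracks only the number of elements it actually contains, size can be small while elementData.length is large, which wastes space. trimToSize assigns elementData a new array with the same contents whose length equals size, saving that space.

7) Converting to a plain array: toArray

Note that ArrayList has two toArray methods for converting the list to a plain array.

The first calls Arrays.copyOf and returns an array containing the size elements of elementData, that is, it copies the elements at positions 0 through size-1 into a new array and returns it.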

public Object[] toArray() {  
         return Arrays.copyOf(elementData, size);  
 }

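The second behaves as follows. If the array passed in is shorter than size, it returns a new array of length size with the same runtime type as the array passed in. If the array's length equals size, it copies elementData into that array and returns it. If the array is longer than size, it copies elementData and additionally sets the element at index size of the returned array to null.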

public <T> T[] toArray(T[] a) {
    if (a.length < size)
        // Make a new array of a's runtime type, but my contents:
        return (T[]) Arrays.copyOf(elementData, size, a.getClass());
    System.arraycopy(elementData, 0, a, 0, size);
    if (a.length > size)
        a[size] = null;
    return a;
}
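The three cases of toArray(T[] a) can be observed from the outside, as in this sketch. ToArrayDemo and its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, not part of the JDK.
class ToArrayDemo {
    private static List<String> sample() {
        return new ArrayList<>(Arrays.asList("a", "b", "c"));
    }

    // Array too short: a new array of the same runtime type is returned.
    static boolean shorterArrayIsReplaced() {
        String[] target = new String[0];
        String[] result = sample().toArray(target);
        return result != target && result.length == 3;
    }

    // Array of exactly the right length: it is filled and returned as is.
    static boolean exactArrayIsReused() {
        String[] target = new String[3];
        return sample().toArray(target) == target;
    }

    // Array too long: only the slot at index size is set to null.
    static String longerArrayGetsNullMarker() {
        String[] target = {"x", "x", "x", "x", "x"};
        sample().toArray(target);
        return Arrays.toString(target);
    }

    public static void main(String[] args) {
        System.out.println(shorterArrayIsReplaced());    // true
        System.out.println(exactArrayIsReused());        // true
        System.out.println(longerArrayGetsNullMarker()); // [a, b, c, null, x]
    }
}
```

In the last case the element after the null marker keeps its old value, so the null is only useful as an end marker when the list itself contains no null elements.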

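The fail-fast mechanism:
ArrayList also uses a fail-fast mechanism, implemented by tracking the modCount field. When a concurrent modification is detected, the iterator fails quickly and cleanly instead of risking arbitrary, non-deterministic behavior at some undetermined time in the future. For details, see the Fail-Fast section of the article "In-Depth Java Collections Study Series: How HashMap Works".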
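Fail-fast behavior does not need a second thread to show up; modifying the list structurally while iterating over it is enough. The following sketch, with the hypothetical class FailFastDemo, triggers the exception and then shows the safe alternative of removing through the iterator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

// Hypothetical demo class, not part of the JDK.
class FailFastDemo {
    // Removing through the list during iteration changes modCount behind
    // the iterator's back, so the next call to next() fails.
    static boolean listRemoveDuringIterationFails() {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3, 4));
        try {
            for (Integer value : list) {
                if (value == 1) {
                    list.remove(value); // remove(Object), a structural change
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Iterator.remove() keeps the iterator's expected modCount in sync.
    static String removeEvensThroughIterator() {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3, 4));
        for (Iterator<Integer> it = list.iterator(); it.hasNext();) {
            if (it.next() % 2 == 0) {
                it.remove();
            }
        }
        return list.toString();
    }

    public static void main(String[] args) {
        System.out.println(listRemoveDuringIterationFails()); // true
        System.out.println(removeEvensThroughIterator());     // [1, 3]
    }
}
```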

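Summary:

A few of the more important takeaways from the ArrayList source:

1. Note the three different constructors. The no-argument constructor creates an ArrayList with a default capacity of 10. The constructor that takes a Collection converts the collection to an array and assigns it to elementData, the backing array.

2. Note the capacity-growing method ensureCapacity. Every time elements are added (one or a batch), ArrayList calls it to make sure there is enough capacity. When the capacity cannot hold the required number of elements, the new capacity is set to 1.5 times the old capacity plus 1; if that is still not enough, the new capacity is set directly to the argument passed in (the required capacity). Arrays.copyOf() then copies the elements into the new array (see point 3 below). Each expansion copies every existing element into a new array, which is expensive. Because the capacity grows geometrically, expansions are only occasional and appending stays amortized constant time; still, when the number of elements can be determined in advance, specify the capacity up front, and consider LinkedList only when the workload is dominated by insertions and removals rather than indexed access.

3. The ArrayList implementation calls Arrays.copyOf() and System.arraycopy() heavily, so it is worth understanding how these two methods work.

First, Arrays.copyOf(). It has many overloads, but they all follow the same idea. Here is the source of the generic version: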

public static <T> T[] copyOf(T[] original, int newLength) {  
    return (T[]) copyOf(original, newLength, original.getClass());  
}

It clearly delegates to another copyOf method, which takes three parameters; the last one specifies the type of array to produce. Its source is:

public static <T,U> T[] copyOf(U[] original, int newLength, Class<? extends T[]> newType) {  
    T[] copy = ((Object)newType == (Object)Object[].class)  
        ? (T[]) new Object[newLength]  
        : (T[]) Array.newInstance(newType.getComponentType(), newLength);  
    System.arraycopy(original, 0, copy, 0,  
                     Math.min(original.length, newLength));  
    return copy;  
}

It is clear that the method creates a new array of length newLength internally and calls System.arraycopy() to copy the elements of the original array into it.
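A short sketch of what Arrays.copyOf does from the caller's point of view: padding with null when the new length is larger, truncating when it is smaller, and producing a different array type with the three-argument overload. CopyOfDemo and its method names are hypothetical.

```java
import java.util.Arrays;

// Hypothetical demo class, not part of the JDK.
class CopyOfDemo {
    // A longer target is padded with null.
    static String grow() {
        return Arrays.toString(Arrays.copyOf(new String[] {"a", "b"}, 4));
    }

    // A shorter target truncates the source.
    static String shrink() {
        return Arrays.toString(Arrays.copyOf(new String[] {"a", "b", "c"}, 2));
    }

    // The three-argument overload chooses the runtime type of the new array.
    static String retype() {
        Object[] source = {"a", "b"};
        String[] typed = Arrays.copyOf(source, source.length, String[].class);
        return typed.getClass().getSimpleName();
    }

    // The result is always a new array, even when the length is unchanged.
    static boolean returnsNewArray() {
        String[] original = {"a"};
        return Arrays.copyOf(original, 1) != original;
    }

    public static void main(String[] args) {
        System.out.println(grow());            // [a, b, null, null]
        System.out.println(shrink());          // [a, b]
        System.out.println(retype());          // String[]
        System.out.println(returnsNewArray()); // true
    }
}
```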

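Now for System.arraycopy(). The method is declared native and is implemented in the JVM's C/C++ code, so its body is not visible in the Java sources shipped with the JDK, but it can be read in the OpenJDK sources. It ultimately comes down to a memmove()-style copy, so it copies and shifts elements correctly even when the source and destination ranges overlap within the same array, and it is much faster than a hand-written copy loop, which makes it well suited to bulk array operations. It is the recommended way to copy large numbers of array elements.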
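The overlap guarantee is what makes the shifts in add(int, E) and remove(int) work. The sketch below, with the hypothetical class ArrayCopyDemo, shifts a range right and left inside one array and contrasts that with a naive forward loop, which overwrites elements before reading them.

```java
import java.util.Arrays;

// Hypothetical demo class, not part of the JDK.
class ArrayCopyDemo {
    // Open a gap at index 1, as add(int, E) does, then fill it.
    static String shiftRight() {
        int[] data = {1, 2, 3, 4, 0};
        System.arraycopy(data, 1, data, 2, 3); // overlapping ranges, copied safely
        data[1] = 99;
        return Arrays.toString(data);
    }

    // Close the gap at index 1, as remove(int) does, then clear the tail slot.
    static String shiftLeft() {
        int[] data = {1, 2, 3, 4, 5};
        System.arraycopy(data, 2, data, 1, 3);
        data[4] = 0;
        return Arrays.toString(data);
    }

    // A forward loop for a right shift clobbers values it has not read yet.
    static String naiveForwardLoop() {
        int[] data = {1, 2, 3, 4, 0};
        for (int i = 1; i < 4; i++) {
            data[i + 1] = data[i];
        }
        return Arrays.toString(data);
    }

    public static void main(String[] args) {
        System.out.println(shiftRight());       // [1, 99, 2, 3, 4]
        System.out.println(shiftLeft());        // [1, 3, 4, 5, 0]
        System.out.println(naiveForwardLoop()); // [1, 2, 2, 2, 2]
    }
}
```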

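4. ArrayList is backed by an array, so an element at a given position can be reached directly by its index and lookup by index is fast. Inserting or removing an element, however, requires shifting all the elements after it, so insertion and removal are slow, except at the end of the list.

5. In methods that search for the index of a given element, the source handles a null element and a non-null element as two separate cases. ArrayList allows null elements.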
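Since null is a legal element, the search and remove methods accept it as an argument. A minimal sketch, using the hypothetical class NullElementDemo:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, not part of the JDK.
class NullElementDemo {
    private static List<String> sample() {
        return new ArrayList<>(Arrays.asList("a", null, "b", null));
    }

    // Null elements are found by comparing with ==, not equals().
    static int indexOfNull() {
        return sample().indexOf(null);
    }

    static int lastIndexOfNull() {
        return sample().lastIndexOf(null);
    }

    // The cast makes it explicit that remove(Object) is meant, not remove(int).
    static String removeFirstNull() {
        List<String> list = sample();
        list.remove((Object) null);
        return list.toString();
    }

    public static void main(String[] args) {
        System.out.println(indexOfNull());     // 1
        System.out.println(lastIndexOfNull()); // 3
        System.out.println(removeFirstNull()); // [a, b, null]
    }
}
```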

Source: http://www.cnblogs.com/maoyali/p/8805975.html