Java Collections Source Code Analysis (1): ArrayList

Date: 2019-11-13


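Preface

　　The earlier posts on collections only covered how to use them. To understand collections more deeply, we need to read their source code. Hopefully this gives us a better grasp of how they work!

　　Since we are reading source code, how should we approach the source of a class? Here is the method I recommend:

　　　　1) Look at the inheritance structure

　　　　　　See where the class sits in the hierarchy so that you have a rough picture in mind.

　　　　2) Look at the constructors

　　　　　　See what the constructors do, and follow the methods they call.

　　　　3) Look at the commonly used methods

　　　　　　As with the constructors, see how each method implements its functionality.

　　Note: reading source code also raises the question of why the class is designed this way and why it has this inheritance relationship. That leads to design patterns, so we need to know the common design patterns to understand the class more deeply.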

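1. Introduction to ArrayList

1.1 Overview of ArrayList

　　1) ArrayList is an indexed sequence that can grow and shrink dynamically. It is a List implementation backed by an array.

　　2) The class wraps a dynamically reallocated Object[] array. Every ArrayList object has a capacity, which is the length of the Object[] array it wraps. As elements are added to an ArrayList, the capacity grows automatically.

　　　　If you are going to add a large number of elements to an ArrayList, you can call ensureCapacity to increase the capacity once up front, which reduces the number of reallocations and improves performance.

　　3) ArrayList is used much like Vector, but Vector is an older collection with many drawbacks and is not recommended.

　　　　Another difference between ArrayList and Vector: ArrayList is not thread-safe, so when multiple threads access the same ArrayList the program must synchronize access itself, whereas Vector is thread-safe.

　　4) The relationship between ArrayList and Collection: (the original figure is not included)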
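To make point 2) concrete, here is a minimal sketch of reserving capacity before a bulk insert. The class name EnsureCapacityDemo and the helper fill are made up for illustration; only ArrayList.ensureCapacity is a real API. Both code paths produce the same list, and the reserved one simply avoids repeated reallocation along the way.

```java
import java.util.ArrayList;

class EnsureCapacityDemo {
    // Hypothetical helper: fills a list with n integers, optionally reserving capacity first.
    static ArrayList<Integer> fill(int n, boolean reserve) {
        ArrayList<Integer> list = new ArrayList<Integer>();
        if (reserve) {
            // One allocation sized for n elements instead of repeated regrowth.
            list.ensureCapacity(n);
        }
        for (int i = 0; i < n; i++) {
            list.add(i);
        }
        return list;
    }

    public static void main(String[] args) {
        // The contents are identical either way; only the number of internal array copies differs.
        System.out.println(fill(1000000, true).size());
        System.out.println(fill(1000000, false).size());
    }
}
```

How much time this saves depends on the JVM and the element count, so it is worth measuring before relying on it.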

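1.2 The data structure of ArrayList

　　When analyzing a class, its data structure is often its essence. Once you understand the underlying data structure, you understand the implementation approach of the class; the details can be analyzed case by case.

　　The data structure of ArrayList is: (the original figure is not included)

　　Note: the underlying data structure is an array whose element type is Object, so it can hold data of any reference type. Every operation on an ArrayList instance is ultimately an operation on this array.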
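The idea of "an Object[] plus a size counter" can be sketched in a few lines. MiniArrayList below is a hypothetical teaching class, not the JDK implementation; the starting capacity of 4 is arbitrary and the 1.5x growth factor is borrowed from the grow() method analyzed in section 2.4.

```java
import java.util.Arrays;

class MiniArrayList<E> {
    private Object[] elementData = new Object[4]; // backing array (capacity 4 is an arbitrary choice)
    private int size;                             // number of elements actually stored

    void add(E e) {
        if (size == elementData.length) {
            // Backing array is full: copy into a new array about 1.5x as large.
            elementData = Arrays.copyOf(elementData, size + (size >> 1));
        }
        elementData[size++] = e;
    }

    @SuppressWarnings("unchecked")
    E get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("Index: " + index + ", Size: " + size);
        }
        return (E) elementData[index]; // elements are stored as Object and cast on the way out
    }

    int size() { return size; }

    int capacity() { return elementData.length; }

    public static void main(String[] args) {
        MiniArrayList<String> list = new MiniArrayList<String>();
        for (int i = 0; i < 5; i++) {
            list.add("v" + i);
        }
        System.out.println(list.size() + " elements in an array of length " + list.capacity());
    }
}
```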

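2. ArrayList source code analysis

2.1 Inheritance structure and hierarchy

　　Let's look at the inheritance structure of ArrayList:

　　　　　　　　ArrayList extends AbstractList

　　　　　　　　AbstractList extends AbstractCollection

　　Every class inherits from Object, so the inheritance chain of ArrayList is Object, AbstractCollection, AbstractList, ArrayList (the original figure is not included).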

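　　Analysis:

　　　　1) Why does ArrayList extend AbstractList, with AbstractList implementing List<E>, instead of ArrayList implementing List<E> directly?

　　　　　　The idea is this: an interface contains only abstract methods (before Java 8 introduced default methods), while an abstract class can contain both abstract methods and concrete implementations. AbstractList takes advantage of this by implementing the general-purpose methods of the interface once. A concrete class such as ArrayList extends AbstractList, inherits those general-purpose methods, and then implements only the methods specific to itself. The methods shared by the concrete classes at the bottom of the hierarchy are pulled up and implemented once, which keeps the code concise and reduces duplication. So when you see an abstract class sitting above a concrete class, this is usually its purpose.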

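　　　　2) Which interfaces does ArrayList implement?

　　　　　　List<E>: this raises a question. ArrayList's parent class AbstractList already implements List<E>, so why does ArrayList implement it again?

　　　　　　　　　　　　I could not figure this out, so I looked it up. Some people say it makes the code easier to read at a glance. Opinions differ and none of them convinced me, but I found an answer on Stack Overflow that is actually quite interesting.

　　　　　　　　　　　　Here is the link: http://stackoverflow.com/questions/2165204/why-does-linkedhashsete-extend-hashsete-and-implement-sete According to that thread, Josh, the author of the collections framework, said

　　　　　　　　　　　　this was in fact a mistake. When he wrote the code he thought it would be useful, but it turned out not to be. Since it does no harm, it has been left in place to this day.

　　　　　　RandomAccess: this is a marker interface. According to the API documentation, it indicates support for fast random access. It matters for efficiency: for a class that implements it, such as ArrayList, traversing with an ordinary indexed for loop performs better.

　　　　　　　　　　　　　　　　For a class that does not implement it, such as LinkedList, iterating with an Iterator performs better. So the marker simply tells us which way of accessing the data performs better.

　　　　　　Cloneable: a class that implements this interface can use Object.clone().

　　　　　　Serializable: implementing this interface indicates that the class can be serialized. What is serialization? Put simply, an object can be turned into a byte stream for transmission and later turned back from the byte stream into the original object.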
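The RandomAccess marker described above can be checked at run time with instanceof. The sketch below picks a traversal style that way; the class TraversalDemo and its sum method are hypothetical names for illustration, and the code makes no claim about how large the performance difference is.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;
import java.util.RandomAccess;

class TraversalDemo {
    // Sums a list, choosing the traversal style based on the RandomAccess marker.
    static long sum(List<Integer> list) {
        long total = 0;
        if (list instanceof RandomAccess) {
            // Index-based loop: cheap when get(i) is constant time (e.g. ArrayList).
            for (int i = 0; i < list.size(); i++) {
                total += list.get(i);
            }
        } else {
            // Iterator-based loop: avoids repeated get(i) walks (e.g. LinkedList).
            for (Iterator<Integer> it = list.iterator(); it.hasNext(); ) {
                total += it.next();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> arrayList = new ArrayList<Integer>(Arrays.asList(1, 2, 3));
        List<Integer> linkedList = new LinkedList<Integer>(Arrays.asList(1, 2, 3));
        System.out.println(sum(arrayList) + " " + (arrayList instanceof RandomAccess));
        System.out.println(sum(linkedList) + " " + (linkedList instanceof RandomAccess));
    }
}
```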

2.2 Fields of the class

public class ArrayList<E> extends AbstractList<E>
        implements List<E>, RandomAccess, Cloneable, java.io.Serializable
{
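    // Serialization version ID
    private static final long serialVersionUID = 8683452581122892189L;
    // Default initial capacity
    private static final int DEFAULT_CAPACITY = 10;
    // Shared empty array instance
    private static final Object[] EMPTY_ELEMENTDATA = {};
    // Shared empty array instance used for default-sized lists
    private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};
    // Backing array that stores the elements
    transient Object[] elementData;
    // Number of elements actually stored; defaults to 0
    private int size;
    // Maximum array size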
    private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;
}

2.3 Constructors

　　ArrayList has three constructors:

　　1) The no-argument constructor

/**
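    * Constructs an empty list with an initial capacity of ten.
    * (The default capacity is 10, but the backing array only reaches that size on the first add.)
    */
    // ArrayList stores its data in an array named elementData, declared at line 123 of the source as: private transient Object[] elementData;
    public ArrayList() {
        super();        // calls the parent's no-argument constructor, which is empty
        // EMPTY_ELEMENTDATA is an empty Object[]; elementData is also of type Object[].
        // The empty array is given the default size of 10 on the first add, as explained later.
        this.elementData = EMPTY_ELEMENTDATA;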
    }


　　2) Parameterized constructor 1

/**
     * Constructs an empty list with the specified initial capacity.
     *
     * @param  initialCapacity  the initial capacity of the list
     * @throws IllegalArgumentException if the specified initial capacity
     *         is negative
     */
    public ArrayList(int initialCapacity) {
        super(); // the parent's constructor is empty
        if (initialCapacity < 0)    // a negative capacity is rejected with an IllegalArgumentException
            throw new IllegalArgumentException("Illegal Capacity: "+
                                               initialCapacity);
        this.elementData = new Object[initialCapacity]; // allocate elementData with the requested capacity
    }

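　　3) Parameterized constructor 2 (less commonly used)

// This constructor is used less often; an example makes its purpose clear.
    /*
        Student extends Person
        ArrayList<Person>: Person is the type argument here.
        Suppose I also have a Collection<Student>. Because Student extends Person, this constructor lets me
        build an ArrayList<Person> from that Collection<Student>. That is what this constructor is for.
    */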
     public ArrayList(Collection<? extends E> c) {
        elementData = c.toArray();    // convert the collection to an array
        size = elementData.length;   // number of elements in the array
        // c.toArray might (incorrectly) not return Object[] (see 6260652)
        if (elementData.getClass() != Object[].class) // toArray() is implemented differently by each collection, so check the runtime type; if it is not Object[], copy it into a real Object[]
            elementData = Arrays.copyOf(elementData, size, Object[].class);
    }

　　Summary: the ArrayList constructors do just one thing: initialize the container that stores the data, which is simply an array named elementData.
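The Student/Person scenario from the third constructor can be written out as a small runnable sketch. Person, Student and the helper toPeople are hypothetical names for illustration; the only JDK call being demonstrated is the ArrayList(Collection<? extends E>) constructor.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

class CopyConstructorDemo {
    static class Person {
        final String name;
        Person(String name) { this.name = name; }
    }

    static class Student extends Person {
        Student(String name) { super(name); }
    }

    // Builds an ArrayList<Person> from any collection of Person subtypes.
    static List<Person> toPeople(Collection<? extends Person> source) {
        return new ArrayList<Person>(source);
    }

    public static void main(String[] args) {
        Collection<Student> students = Arrays.asList(new Student("Ann"), new Student("Bob"));
        List<Person> people = toPeople(students);
        people.add(new Person("Cid")); // legal: the new list holds Person references
        // The copy is independent of the source collection, but the elements themselves are shared.
        System.out.println(people.size() + " people, " + students.size() + " students");
    }
}
```

The copy is shallow: both collections refer to the same Student objects, only the backing arrays differ.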

2.4 Core methods

　　2.4.1 The add() methods (there are four)

　　　　1) boolean add(E): appends the element to the end of the list

/**
     * Appends the specified element to the end of this list.
     *
     * @param e element to be appended to this list
     * @return <tt>true</tt> (as specified by {@link Collection#add})
     */
    public boolean add(E e) {    
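    // Make sure the internal capacity is sufficient. size is the number of elements stored; since one element is being added, check whether the array can hold size + 1 elements, i.e. whether elementData.length is large enough.
        ensureCapacityInternal(size + 1);  // Increments modCount!!
    // Store e in the next free slot, then increment size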
        elementData[size++] = e;
        return true;
    }

　　　　Analysis:

　　　　　　ensureCapacityInternal(xxx): ensures the internal capacity

private void ensureCapacityInternal(int minCapacity) {
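        if (elementData == EMPTY_ELEMENTDATA) { // is elementData still the shared empty array, i.e. an array of length 0?
    // If so, minCapacity = size + 1 is just 1. An empty array cannot store anything, so minCapacity is raised to 10, the default capacity. Note that elementData itself has not been resized yet at this point.
            minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);
        }
    // The step above only set minCapacity to 10; this call actually checks whether elementData is large enough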
        ensureExplicitCapacity(minCapacity);
    }

　　　　　　ensureExplicitCapacity(xxx)；

private void ensureExplicitCapacity(int minCapacity) {
        modCount++;

        // overflow-conscious code
// If minCapacity is greater than the current length of elementData, the array is too small and its length must be increased. What exactly is minCapacity? There are two cases:

/* Case 1: elementData is still the empty array. On the first add, minCapacity = size + 1 = 1; the previous method (ensureCapacityInternal) detects the empty array and raises minCapacity to 10. Up to this point elementData has not been resized.
   Case 2: elementData is no longer empty. On add, minCapacity = size + 1, i.e. the number of elements after the addition. It is compared with elementData.length; if the length is insufficient, the capacity must be expanded, otherwise the new element would not fit. */

        if (minCapacity - elementData.length > 0)
    // this call is what lets ArrayList grow automatically
            grow(minCapacity);
    }

　　　　　　grow(xxx): the core method of ArrayList, the real secret behind the growing array.

private void grow(int minCapacity) {
        // overflow-conscious code
        int oldCapacity = elementData.length;  // capacity of elementData before growing
        int newCapacity = oldCapacity + (oldCapacity >> 1); // newCapacity is 1.5 times oldCapacity (integer arithmetic, rounded down)
        if (newCapacity - minCapacity < 0) // this applies, for example, when elementData is the empty array: length = 0, so oldCapacity = 0 and newCapacity = 0, and the condition holds. This is where elementData really gets its initial size of 10; everything before was preparation.
            newCapacity = minCapacity;
        if (newCapacity - MAX_ARRAY_SIZE > 0) // if newCapacity exceeds the maximum array size, call hugeCapacity to get the largest capacity that can be granted
            newCapacity = hugeCapacity(minCapacity);
        // minCapacity is usually close to size, so this is a win:
    // the new capacity is settled; copy the array into one of that size
        elementData = Arrays.copyOf(elementData, newCapacity);
    }

　　　　　hugeCapacity();

// This is the method used above. It simply picks the maximum allowed value.
    private static int hugeCapacity(int minCapacity) {
        if (minCapacity < 0) // overflow
            throw new OutOfMemoryError();
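// If minCapacity is greater than MAX_ARRAY_SIZE, return Integer.MAX_VALUE; otherwise return MAX_ARRAY_SIZE. newCapacity is 1.5 times oldCapacity and may have overshot, so minCapacity is used for the decision instead.
// Integer.MAX_VALUE: 2147483647   MAX_ARRAY_SIZE: 2147483639. The largest capacity that can ever be granted is the first value; going beyond it would overflow. In effect ArrayList has two layers of protection.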
        return (minCapacity > MAX_ARRAY_SIZE) ?
            Integer.MAX_VALUE :
            MAX_ARRAY_SIZE;
    }
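The sizing arithmetic in ensureCapacityInternal and grow can be replayed without touching a real ArrayList. The sketch below is a simulation under stated assumptions: GrowthDemo, nextCapacity and capacityAfter are hypothetical names, the MAX_ARRAY_SIZE guard is left out, and "capacity == 0" stands in for the "elementData == EMPTY_ELEMENTDATA" test of a default-constructed list.

```java
class GrowthDemo {
    static final int DEFAULT_CAPACITY = 10;

    // Mirrors the arithmetic of grow() as quoted above, minus the MAX_ARRAY_SIZE guard.
    static int nextCapacity(int oldCapacity, int minCapacity) {
        int newCapacity = oldCapacity + (oldCapacity >> 1); // 1.5x, rounded down
        if (newCapacity - minCapacity < 0) {
            newCapacity = minCapacity;
        }
        return newCapacity;
    }

    // Simulated capacity after appending n elements, one at a time, to a default-constructed list.
    static int capacityAfter(int n) {
        int capacity = 0;
        for (int size = 0; size < n; size++) {
            int minCapacity = size + 1;
            if (capacity == 0) {
                // Stand-in for the EMPTY_ELEMENTDATA check: the first add asks for at least 10.
                minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);
            }
            if (minCapacity - capacity > 0) {
                capacity = nextCapacity(capacity, minCapacity);
            }
        }
        return capacity;
    }

    public static void main(String[] args) {
        // Capacities right after the 1st, 11th, 16th, 23rd and 34th add: 10 15 22 33 49
        int[] counts = {1, 11, 16, 23, 34};
        for (int n : counts) {
            System.out.print(capacityAfter(n) + " ");
        }
        System.out.println();
    }
}
```

Because of the integer shift, the growth factor is only approximately 1.5: 15 grows to 22, not 22.5.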

　　　　2) void add(int, E): adds the element at a specific position, i.e. inserts it

public void add(int index, E element) {
        rangeCheckForAdd(index); // check that the insertion index is valid

// same as analyzed above
        ensureCapacityInternal(size + 1);  // Increments modCount!!
// shift the elements from index onward one position to the right to make room
        System.arraycopy(elementData, index, elementData, index + 1,
                         size - index);
// store the element at the target position
        elementData[index] = element;
        size++; // size increases by 1
    }

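　　　　Analysis:

　　　　　　rangeCheckForAdd(index)

    private void rangeCheckForAdd(int index) {
        if (index > size || index < 0)   // the insertion index must not be greater than size or less than 0
// otherwise an out-of-bounds exception is thrown
            throw new IndexOutOfBoundsException(outOfBoundsMsg(index));
    }

　　　　　　System.arraycopy(...): shifts every element of elementData from the insertion position onward one position to the right. From the API documentation: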

public static void arraycopy(Object src,
int srcPos,
Object dest,
int destPos,
int length)
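src: the source array
srcPos: the starting position in the source array
dest: the destination array
destPos: the starting position in the destination array
length: the number of elements to copy from the starting position

// This paragraph explains how the method works: elements are copied from src, positions srcPos through srcPos+length-1, into dest, positions destPos through destPos+length-1.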
Copies an array from the specified source array, beginning at the specified position, to the specified position of the destination array. A subsequence of array components are copied from
the source array referenced by src to the destination array referenced by dest. The number of components copied is equal to the length argument. The components at positions srcPos through srcPos+length-1
in the source array are copied into positions destPos through destPos+length-1, respectively, of the destination array.

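// One special case: if src and dest are the same array, the copy behaves as if the source range were first copied to a temporary array and then copied from there into the destination range.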
If the src and dest arguments refer to the same array object, then the copying is performed as if the components at positions srcPos through srcPos+length-1 were first copied to
a temporary array with length components and then the contents of the temporary array were copied into positions destPos through destPos+length-1 of the destination array.

// This long passage explains when the three exceptions NullPointerException, IndexOutOfBoundsException and ArrayStoreException are thrown.
If dest is null, then a NullPointerException is thrown.

If src is null, then a NullPointerException is thrown and the destination array is not modified.

Otherwise, if any of the following is true, an ArrayStoreException is thrown and the destination is not modified:

The src argument refers to an object that is not an array.
The dest argument refers to an object that is not an array.
The src argument and dest argument refer to arrays whose component types are different primitive types.
The src argument refers to an array with a primitive component type and the dest argument refers to an array with a reference component type.
The src argument refers to an array with a reference component type and the dest argument refers to an array with a primitive component type.
Otherwise, if any of the following is true, an IndexOutOfBoundsException is thrown and the destination is not modified:

The srcPos argument is negative.
The destPos argument is negative.
The length argument is negative.
srcPos+length is greater than src.length, the length of the source array.
destPos+length is greater than dest.length, the length of the destination array.

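// A special case: if an element of the source cannot be converted to the component type of the destination array, an ArrayStoreException is thrown, but the elements before it have already been copied, so the copy is partial rather than all-or-nothing.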
Otherwise, if any actual component of the source array from position srcPos through srcPos+length-1 cannot be converted to the component type of the destination array by assignment conversion, an ArrayStoreException is thrown.
In this case, let k be the smallest nonnegative integer less than length such that src[srcPos+k] cannot be converted to the component type of the destination array; when the exception is thrown, source array components from positions
srcPos through srcPos+k-1 will already have been copied to destination array positions destPos through destPos+k-1 and no other positions of the destination array will have been modified. (Because of the restrictions already itemized,
this paragraph effectively applies only to the situation where both arrays have component types that are reference types.)

// The parameter list, already explained at the start:
Parameters:
src - the source array.
srcPos - starting position in the source array.
dest - the destination array.
destPos - starting position in the destination data.
length - the number of array elements to be copied.
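The overlapping copy that add(int, E) relies on can be tried on a plain int array. This is a minimal sketch: ArrayCopyDemo and insertAt are hypothetical names, and the caller is assumed to have left one spare slot at the end of the array.

```java
import java.util.Arrays;

class ArrayCopyDemo {
    // Inserts value at index among the first `size` slots of data.
    // Assumption: data.length > size, so there is room for the shifted tail.
    static void insertAt(int[] data, int size, int index, int value) {
        // src and dest are the same array; arraycopy handles the overlap as if via a temporary copy.
        System.arraycopy(data, index, data, index + 1, size - index);
        data[index] = value;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 4, 5, 0}; // four elements in use, one spare slot
        insertAt(data, 4, 2, 3);
        System.out.println(Arrays.toString(data)); // [1, 2, 3, 4, 5]
    }
}
```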


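　　Summary:

　　　　Normally the capacity grows by a factor of 1.5. In the special case where the new capacity would exceed the maximum array size, the maximum allowed value is used instead.

　　　　When we call add, the actual chain of calls is: add, ensureCapacityInternal, ensureExplicitCapacity, grow, hugeCapacity (the original figure is not included).

　　　　Note: a call to add triggers a series of further calls; it may reach grow, and grow may call hugeCapacity.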

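　　Example 1:

　　List<Integer> lists = new ArrayList<Integer>();
　　lists.add(8);

　　　　Note: lists starts with a capacity of 0 because the ArrayList() constructor is used. What steps does the call to lists.add(8) go through? The original figure, which traced the call chain and the size of elementData at the start and at the end, is not included.

　　　　Note: before add is called, elementData = {}. The add call goes all the way down to grow, after which the length of elementData is 10. Control then returns to add, which stores 8 in elementData[0].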

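　　Example 2:

　　List<Integer> lists = new ArrayList<Integer>(6);
　　lists.add(8);

　　　　Note: the ArrayList(int) constructor is used, so elementData is initialized as an Object array of length 6. The steps taken by add(8) were shown in a figure that is not included.

　　　　Note: before add is called, the length of elementData is already 6, so the call chain stops at ensureExplicitCapacity and no growth takes place.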

　　2.4.2 Removal methods

　　　　These removal methods are all similar, so we look at a few of them. fastRemove(int) is private and exists to serve remove(Object).

　　　　1) remove(int): removes the element at the specified position

public E remove(int index) {
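        rangeCheck(index); // check that index is valid

        modCount++; // used for several things, e.g. as the marker for fail-fast detection
        E oldValue = elementData(index); // fetch the element directly by index

        int numMoved = size - index - 1; // number of elements that must be shifted
        if (numMoved > 0)
// already explained above: this shifts the elements
            System.arraycopy(elementData, index+1, elementData, index,
                             numMoved);
// decrement size and set the vacated last slot to null so that the garbage collector can reclaim the object
        elementData[--size] = null; // clear to let GC do its work
// return the removed element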
        return oldValue;
    }

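　　　　2) remove(Object): this method shows that ArrayList can store null values.

// Not much to analyze here: the element is removed by value. The array is scanned in order, and if the element is found its index is passed to fastRemove(index), which deletes it. Internally fastRemove(index) is almost identical to remove(index). The main point is that ArrayList can store null.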
     public boolean remove(Object o) {
        if (o == null) {
            for (int index = 0; index < size; index++)
                if (elementData[index] == null) {
                    fastRemove(index);
                    return true;
                }
        } else {
            for (int index = 0; index < size; index++)
                if (o.equals(elementData[index])) {
                    fastRemove(index);
                    return true;
                }
        }
        return false;
    }
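With a List<Integer> the two overloads are easy to mix up, so here is a short sketch contrasting remove(int), remove(Object) and removal of null. RemoveDemo and sample are hypothetical names; the behavior shown is that of the standard java.util.List API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RemoveDemo {
    // A fresh sample list containing a null element: [10, 1, null, 30]
    static List<Integer> sample() {
        return new ArrayList<Integer>(Arrays.asList(10, 1, null, 30));
    }

    public static void main(String[] args) {
        List<Integer> a = sample();
        Integer removed = a.remove(1);                 // remove(int): deletes the element at index 1
        System.out.println(removed + " -> " + a);      // 1 -> [10, null, 30]

        List<Integer> b = sample();
        boolean found = b.remove(Integer.valueOf(30)); // remove(Object): deletes the value 30
        System.out.println(found + " -> " + b);        // true -> [10, 1, null]

        List<Integer> c = sample();
        boolean nullRemoved = c.remove((Object) null); // null is matched with ==, not equals
        System.out.println(nullRemoved + " -> " + c);  // true -> [10, 1, 30]
    }
}
```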

　　　　3) clear(): sets every element of elementData to null so that the garbage collector can reclaim the objects, and resets size to 0; hence the name clear.

public void clear() {
        modCount++;

        // clear to let GC do its work
        for (int i = 0; i < size; i++)
            elementData[i] = null;

        size = 0;
    }

　　　　4) removeAll(Collection c):

     public boolean removeAll(Collection<?> c) {
         return batchRemove(c, false); // batch removal
     }

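　　　　5) batchRemove(xx, xx): shared by two methods. removeAll() removes only the elements that are contained in the specified collection; retainAll() keeps only the elements that are contained in the specified collection, i.e. the intersection.

// This method is used in two places: with complement = false it serves removeAll(), with complement = true it serves retainAll(), which keeps only the elements also present in c.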
   private boolean batchRemove(Collection<?> c, boolean complement) {
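        final Object[] elementData = this.elementData; // the backing array of this list; call the list A
        int r = 0, w = 0;   // r is the read index driving the loop, w is the write index counting the elements kept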
        boolean modified = false;  
        try {
            for (; r < size; r++)
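// check whether the argument collection c contains the current element of A
                if (c.contains(elementData[r]) == complement)
// if the element is to be kept, write it back into A at position w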
                    elementData[w++] = elementData[r];
        } finally {
            // Preserve behavioral compatibility with AbstractCollection,
            // even if c.contains() throws.
// if contains() threw an exception partway through
            if (r != size) {
// copy the remaining, unexamined elements down so they are kept in A
                System.arraycopy(elementData, r,
                                 elementData, w,
                                 size - r);
                w += size - r;
            }
            if (w != size) {
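// Two uses. In removeAll(), if every element was removed then w is 0 and the effect is the same as clear(): every slot is set to null.
// retainAll(): returns true when the list changed (no common elements, or only some in common) and false when nothing was removed (every element of the list is also in c). So the return value alone does not tell you whether the two collections intersect; check the list afterwards instead: if it still has elements there was an intersection, and if it is empty there was none.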
                // clear to let GC do its work
                for (int i = w; i < size; i++)
                    elementData[i] = null;
                modCount += size - w;
                size = w;
                modified = true;
            }
        }
        return modified;
    }
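The return values of removeAll and retainAll are easier to remember after seeing them side by side. In this sketch BatchRemoveDemo and list are hypothetical names; the calls themselves are the standard java.util.List methods.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class BatchRemoveDemo {
    // Convenience helper: builds a mutable ArrayList from the given values.
    static List<Integer> list(Integer... values) {
        return new ArrayList<Integer>(Arrays.asList(values));
    }

    public static void main(String[] args) {
        List<Integer> a = list(1, 2, 3, 4);
        System.out.println(a.removeAll(Arrays.asList(2, 4)) + " " + a);    // true [1, 3]

        List<Integer> b = list(1, 2, 3, 4);
        System.out.println(b.retainAll(Arrays.asList(2, 4, 6)) + " " + b); // true [2, 4]

        List<Integer> c = list(1, 2);
        System.out.println(c.retainAll(Arrays.asList(1, 2, 3)) + " " + c); // false [1, 2]

        List<Integer> d = list(1, 2);
        System.out.println(d.retainAll(Arrays.asList(9)) + " " + d);       // true []
    }
}
```

The boolean only reports whether the list changed, which is why the third and fourth cases cannot be told apart from "has an intersection" by the return value alone.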

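　　Summary: remove deletes the element at the specified index. The elements after that index are shifted forward by one position, and the last slot of the array is set to null so that the removed object is no longer referenced by the array and can be garbage collected. This is a small trick worth borrowing.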
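The modCount++ in remove is what makes iterators fail fast. The sketch below shows the effect from the outside; FailFastDemo and its two methods are hypothetical names, and the first method deliberately removes the first element, because removing the second-to-last element happens to end the loop before the check runs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

class FailFastDemo {
    // Removes "a" through the list itself while a for-each loop is iterating.
    // Returns true if the iterator detected the change and failed fast.
    static boolean removeDuringForEach() {
        List<String> list = new ArrayList<String>(Arrays.asList("a", "b", "c", "d"));
        try {
            for (String s : list) {
                if (s.equals("a")) {
                    list.remove(s); // bumps modCount behind the iterator's back
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Removes "a" through the iterator, which keeps its expected modCount in sync.
    static List<String> removeViaIterator() {
        List<String> list = new ArrayList<String>(Arrays.asList("a", "b", "c", "d"));
        for (Iterator<String> it = list.iterator(); it.hasNext(); ) {
            if (it.next().equals("a")) {
                it.remove();
            }
        }
        return list;
    }

    public static void main(String[] args) {
        System.out.println(removeDuringForEach()); // true
        System.out.println(removeViaIterator());   // [b, c, d]
    }
}
```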

　　2.4.3 The set() method

public E set(int index, E element) {
        // check that the index is valid
        rangeCheck(index);
        // old value
        E oldValue = elementData(index);
        // assign the new value
        elementData[index] = element;
        // return the old value
        return oldValue;
    }

　　Note: sets the element at the specified index.

　　2.4.4 The indexOf() method

// search from the start of the array for the specified element
    public int indexOf(Object o) {
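        if (o == null) { // the element being searched for is null
            for (int i = 0; i < size; i++) // scan the array and return the index of the first null element
                if (elementData[i]==null)
                    return i;
        } else { // the element being searched for is not null
            for (int i = 0; i < size; i++) // scan the array and return the index of the first element equal to o
                if (o.equals(elementData[i]))
                    return i;
        } 
        // not found: return -1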
        return -1;
    }

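　　Note: searches from the start for an element equal to the specified one. Note that null can be searched for, which means ArrayList can store null elements. The counterpart of this method is lastIndexOf, which searches from the end.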
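A quick sketch of indexOf and lastIndexOf on a list that contains duplicates and null values. IndexOfDemo and sample are hypothetical names used only for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class IndexOfDemo {
    // A sample list with duplicates and nulls: ["x", null, "y", null, "x"]
    static List<String> sample() {
        return new ArrayList<String>(Arrays.asList("x", null, "y", null, "x"));
    }

    public static void main(String[] args) {
        List<String> list = sample();
        System.out.println(list.indexOf("x"));      // 0  (first match from the front)
        System.out.println(list.lastIndexOf("x"));  // 4  (first match from the back)
        System.out.println(list.indexOf(null));     // 1  (null is a legal search key)
        System.out.println(list.lastIndexOf(null)); // 3
        System.out.println(list.indexOf("z"));      // -1 (not found)
    }
}
```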

　　2.4.5 The get() method

public E get(int index) {
        // check that the index is valid
        rangeCheck(index);

        return elementData(index);
    }

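　　Note: get checks whether the index is valid. In this version of the source, rangeCheck only checks whether the index is greater than or equal to size, not whether it is negative; a negative index is caught by the array access itself. Also note that get calls the elementData method, which returns the actual element: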

E elementData(int index) {
        return (E) elementData[index];
    }

　　Note: the returned value is cast down from Object to E. These are small details hidden from application code.
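From the caller's side, both kinds of bad index end up as an IndexOutOfBoundsException or a subclass of it. The exact exception class and message differ between JDK versions, so the sketch below catches only the common supertype; GetBoundsDemo and describe are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class GetBoundsDemo {
    // Returns the element at index, or a short description of the rejection.
    static String describe(List<String> list, int index) {
        try {
            return "value=" + list.get(index);
        } catch (IndexOutOfBoundsException e) {
            // ArrayIndexOutOfBoundsException is a subclass, so it is caught here too.
            return "rejected: " + e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<String>(Arrays.asList("a", "b"));
        System.out.println(describe(list, 0));  // value=a
        System.out.println(describe(list, 2));  // index >= size: rejected by the range check
        System.out.println(describe(list, -1)); // negative index: also rejected
    }
}
```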

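3. Summary

1) ArrayList can store null.
2) ArrayList is essentially an elementData array.
3) What distinguishes ArrayList from a plain array is that it grows automatically; the key method is grow().
4) The difference between removeAll(Collection c) and clear(): removeAll removes the specified batch of elements, while clear removes every element in the list.
5) Because ArrayList is backed by an array, lookup by index is fast, while insertion and removal are much slower because many elements have to be moved.
6) ArrayList implements RandomAccess, so an indexed for loop is recommended for traversing it.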
