JDK Container Study: ArrayList's Underlying Storage and Dynamic Resizing

Date: 2019-11-11

ArrayList: underlying storage and dynamic resizing logic

ArrayList is one of the most commonly used containers. It stores a sequence of objects and offers O(1) reads and writes by index.

I. Underlying data model

Looking at the source code, the class defines the following array-related member fields:

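// Default array capacity
private static final int DEFAULT_CAPACITY = 10;

// Shared static instance used as the backing array of an empty ArrayList,
// so that creating an empty ArrayList does not allocate a new Object[] each time
private static final Object[] EMPTY_ELEMENTDATA = {};

// Shared empty array for lists created with the no-arg constructor; it is kept
// separate so that the first add knows to grow to DEFAULT_CAPACITY
private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};

// The array that actually holds the data
transient Object[] elementData; // non-private to simplify nested class access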

private int size;

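So the underlying data model of ArrayList is straightforward: a single array. With the no-argument constructor the array starts out empty and is grown to the default capacity of 10 on the first add.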

II. Add, remove, and read logic

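Because the underlying data structure is an array, looking up an element by index has constant cost: it is equivalent to reading the array element at that index.

So the first question worth attention is: when adding an element, how is the array expanded if its capacity is insufficient?

And the second: when removing an element, how is the contiguity of the array maintained?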

1. Read interface

Getting the value at a given index of the List is simple to implement:

public E get(int index) {
    // Check that the index does not exceed the upper bound (index >= size)
    rangeCheck(index);
    // Fetch the element from the array
    return elementData(index);
}

private void rangeCheck(int index) {
    if (index >= size)
        throw new IndexOutOfBoundsException(outOfBoundsMsg(index));
}

E elementData(int index) {
    return (E) elementData[index];
}
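
The bounds behaviour of get can be illustrated with a small sketch. Note that rangeCheck as shown only tests index >= size; in that code a negative index is rejected by the array access itself, which throws ArrayIndexOutOfBoundsException, a subclass of IndexOutOfBoundsException. The class name GetBoundsDemo and the helper throwsOnGet are hypothetical names used only for this illustration, and newer JDK versions may perform the check differently while still throwing IndexOutOfBoundsException.

```java
import java.util.ArrayList;
import java.util.List;

class GetBoundsDemo {
    // Returns true if list.get(index) throws IndexOutOfBoundsException
    // (or a subclass such as ArrayIndexOutOfBoundsException).
    static boolean throwsOnGet(List<String> list, int index) {
        try {
            list.get(index);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<>();
        list.add("a");
        list.add("b");

        System.out.println(list.get(1));            // b
        // Index 2 lies inside the backing array (capacity 10) but is >= size,
        // so the size-based check rejects it.
        System.out.println(throwsOnGet(list, 2));   // true
        // A negative index is also rejected.
        System.out.println(throwsOnGet(list, -1));  // true
    }
}
```

The point of the second call is that validity is decided by size, not by the length of the backing array.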

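Two other common read methods are contains and indexOf, which determine whether the list contains a given element and at which index an element sits in the array.

If we were to design these two methods ourselves, we would most likely walk the array and test each element in turn.

The actual JDK implementation is as follows: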

public boolean contains(Object o) {
    return indexOf(o) >= 0;
}

public int indexOf(Object o) {
    if (o == null) {
        for (int i = 0; i < size; i++)
            if (elementData[i]==null)
                return i;
    } else {
        for (int i = 0; i < size; i++)
            if (o.equals(elementData[i]))
                return i;
    }
    return -1;
}

From the implementation, a few points are worth noting:

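size is the actual number of elements in the list
The list is allowed to hold null
The same object may be added to the list multiple times, but indexOf returns the position of the first match
indexOf returns -1 when the element is not present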
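
The points above can be checked with a short example against java.util.ArrayList. The class name IndexOfDemo and the helper sample are hypothetical names chosen for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

class IndexOfDemo {
    // Builds a list containing a duplicate value and two nulls.
    static List<String> sample() {
        List<String> list = new ArrayList<>();
        list.add("x");
        list.add(null);
        list.add("y");
        list.add("x");
        list.add(null);
        return list;
    }

    public static void main(String[] args) {
        List<String> list = sample();
        System.out.println(list.size());           // 5
        System.out.println(list.indexOf("x"));     // 0 (first match wins)
        System.out.println(list.lastIndexOf("x")); // 3
        System.out.println(list.indexOf(null));    // 1 (null is searchable)
        System.out.println(list.contains("z"));    // false
        System.out.println(list.indexOf("z"));     // -1
    }
}
```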

2. Removing elements

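Before looking at adding elements, let's look at how removal is implemented, since it does not involve dynamic resizing. Keep the following questions in mind during the analysis:

Does removing an element in the middle cause the elements after it to be moved?

Does removing the last element cause any rearrangement (or is decrementing size enough)?

First, removal of the value at a given index: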

public E remove(int index) {
    // Bounds check
    rangeCheck(index);

    modCount++;
    E oldValue = elementData(index);

    int numMoved = size - index - 1;
    if (numMoved > 0) { // The removed element is not the last one, so the tail must be copied
        // Array copy implemented by a native method
        System.arraycopy(elementData, index+1, elementData, index,
                         numMoved);
    }
    // Clear the now-unused last slot
    elementData[--size] = null; // clear to let GC do its work

    return oldValue;
}

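The source code answers the two questions above:

Removing an element in the middle causes an array copy
Removing the last element needs no array copy; the last slot is simply set to null
Removal never shrinks the array capacity: the List grows automatically but has no automatic shrinking logic (capacity is only reduced by an explicit call to trimToSize())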
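
To make the shifting concrete, here is a minimal sketch of the same removal logic applied to a plain Object[]. It is a simplified stand-in, not the JDK code: the class RemoveSketch and the method removeAt are hypothetical, and the caller tracks size itself.

```java
import java.util.Arrays;

class RemoveSketch {
    // Removes the element at index from the first `size` slots of `data`,
    // shifting the tail left by one. Returns the new logical size.
    // The array length (capacity) is never changed.
    static int removeAt(Object[] data, int size, int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
        int numMoved = size - index - 1;
        if (numMoved > 0) {
            System.arraycopy(data, index + 1, data, index, numMoved);
        }
        data[--size] = null; // release the reference in the freed slot
        return size;
    }

    public static void main(String[] args) {
        Object[] data = {"a", "b", "c", "d", null, null};
        int size = 4;

        size = removeAt(data, size, 1); // middle element: tail is copied
        System.out.println(size + " " + Arrays.toString(data));
        // 3 [a, c, d, null, null, null]

        size = removeAt(data, size, 2); // last element: no copy needed
        System.out.println(size + " " + Arrays.toString(data));
        // 2 [a, c, null, null, null, null]
    }
}
```

In both cases the array keeps its length of 6, which mirrors the observation that removal does not shrink capacity.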

3. Adding elements

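Given the removal logic, adding an element is easy to picture: shift the element at the insertion index and everything after it back by one position, then assign the new value. The part that needs attention is the resizing mechanism.

The implementation of adding an element: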

public boolean add(E e) {
    ensureCapacityInternal(size + 1);  // Increments modCount!!
    elementData[size++] = e;
    return true;
}
public void add(int index, E element) {
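    // Check that the index is within bounds
    rangeCheckForAdd(index);

    // Grow the array if necessary
    ensureCapacityInternal(size + 1);  // Increments modCount!!
    // Shift the tail back by one position
    System.arraycopy(elementData, index, elementData, index + 1,
                     size - index);
    // Assign the new value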
    elementData[index] = element;
    size++;
}
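
A short usage sketch of add(int, E) follows. One detail worth seeing in action: unlike get, the insertion index may equal size (which appends), while anything beyond size is rejected with IndexOutOfBoundsException. The class AddAtIndexDemo and its helper methods are hypothetical names for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

class AddAtIndexDemo {
    // Builds [a, b, c, d] using a mix of append and positional inserts.
    static List<String> build() {
        List<String> list = new ArrayList<>();
        list.add("b");              // [b]
        list.add(0, "a");           // [a, b]       "b" is shifted back
        list.add(list.size(), "d"); // [a, b, d]    index == size is allowed
        list.add(2, "c");           // [a, b, c, d] "d" is shifted back
        return list;
    }

    // Returns true if inserting past the end is rejected.
    static boolean addBeyondSizeThrows() {
        List<String> list = build();
        try {
            list.add(list.size() + 1, "x");
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(build());               // [a, b, c, d]
        System.out.println(addBeyondSizeThrows()); // true
    }
}
```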

The resizing logic is as follows:

private void ensureCapacityInternal(int minCapacity) {
    if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
        minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);
    }

    ensureExplicitCapacity(minCapacity);
}

private void ensureExplicitCapacity(int minCapacity) {
    modCount++;
    if (minCapacity - elementData.length > 0) {
        // The required capacity exceeds the current array length
        grow(minCapacity);
    }
}

private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

private void grow(int minCapacity) {
    // overflow-conscious code
    int oldCapacity = elementData.length;
    // Growth policy:
    // add half of the old capacity, i.e. grow to 1.5x the old capacity;
    // if that is still not enough, grow to exactly minCapacity
    int newCapacity = oldCapacity + (oldCapacity >> 1);
    if (newCapacity - minCapacity < 0) // 1.5x is still not enough
        newCapacity = minCapacity;
    if (newCapacity - MAX_ARRAY_SIZE > 0)
        newCapacity = hugeCapacity(minCapacity);
    // minCapacity is usually close to size, so this is a win:
    elementData = Arrays.copyOf(elementData, newCapacity);
}

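To summarize the logic above:

Grow first, then shift the array elements, and finally assign the value
Growth policy:
- First try growing to 1.5x the old capacity
- If that is still not enough, grow to exactly the capacity needed to hold all elements
To append at the end of the list, do not use add(index, object); it incurs an unnecessary array-shift call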
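
The growth policy summarized above can be traced with a small simulation. This is a sketch under stated assumptions, not the JDK code: it ignores the MAX_ARRAY_SIZE / hugeCapacity branch, it uses a capacity of 0 as a stand-in for the DEFAULTCAPACITY_EMPTY_ELEMENTDATA check, and the names GrowthSketch, nextCapacity and capacitiesWhileAppending are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class GrowthSketch {
    static final int DEFAULT_CAPACITY = 10;

    // Same arithmetic as grow(): 1.5x the old capacity, or minCapacity
    // if 1.5x is still too small. The huge-capacity branch is omitted.
    static int nextCapacity(int oldCapacity, int minCapacity) {
        int newCapacity = oldCapacity + (oldCapacity >> 1);
        if (newCapacity - minCapacity < 0) {
            newCapacity = minCapacity;
        }
        return newCapacity;
    }

    // Records the capacity after every resize while appending n elements
    // one at a time to a list created with the no-arg constructor.
    static List<Integer> capacitiesWhileAppending(int n) {
        List<Integer> seen = new ArrayList<>();
        int capacity = 0; // assumption: 0 models the shared empty default array
        for (int size = 0; size < n; size++) {
            int minCapacity = size + 1;
            if (capacity == 0) {
                minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);
            }
            if (minCapacity - capacity > 0) {
                capacity = nextCapacity(capacity, minCapacity);
                seen.add(capacity);
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(capacitiesWhileAppending(50));
        // [10, 15, 22, 33, 49, 73]
    }
}
```

Because each resize copies the whole array, a caller who knows the final size in advance can avoid the intermediate copies by passing an initial capacity to the constructor or by calling ensureCapacity.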

(Figure: insertion and removal diagram)

III. Iteration logic

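Containers generally implement the Iterable interface, so iteration relies mainly on the iterator's next() method.

Iterating a List is essentially iterating an array, so the implementation is simple. The only interesting part is the exception thrown on concurrent modification.

First, the iterator class: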

private class Itr implements Iterator<E> {
    int cursor;       // index of next element to return
    int lastRet = -1; // index of last element returned; -1 if no such
    int expectedModCount = modCount;

    public boolean hasNext() {
        return cursor != size;
    }

    @SuppressWarnings("unchecked")
    public E next() {
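        // The check below ensures that if the list was structurally modified during iteration
        // (by another thread, or by this thread outside the iterator), an exception is thrown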
        checkForComodification();
        int i = cursor;
        if (i >= size)
            throw new NoSuchElementException();
        Object[] elementData = ArrayList.this.elementData;
        if (i >= elementData.length)
            throw new ConcurrentModificationException();
        
        // The actual iteration logic is just incrementing the index
        cursor = i + 1;
        return (E) elementData[lastRet = i];
    }

    public void remove() {
       // ...
    }

    @Override
    @SuppressWarnings("unchecked")
    public void forEachRemaining(Consumer<? super E> consumer) {
        //
    }

    final void checkForComodification() {
        if (modCount != expectedModCount)
            throw new ConcurrentModificationException();
    }
}
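
The modCount check can be observed without any extra threads. The sketch below, with the hypothetical class name FailFastDemo and hypothetical helper methods, modifies a list directly inside a for-each loop, which trips checkForComodification on the next call to next(), and then shows the supported alternative of removing through the iterator.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

class FailFastDemo {
    static List<Integer> numbers() {
        List<Integer> list = new ArrayList<>();
        for (int i = 1; i <= 5; i++) {
            list.add(i);
        }
        return list;
    }

    // Removing through the list while a for-each loop is running changes
    // modCount behind the iterator's back, so the next call to next() throws.
    static boolean removeInForEachThrows() {
        List<Integer> list = numbers();
        try {
            for (Integer n : list) {
                if (n == 2) {
                    list.remove(n); // remove(Object), since n is an Integer
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Removing through the iterator keeps expectedModCount in sync.
    static List<Integer> removeEvensWithIterator() {
        List<Integer> list = numbers();
        Iterator<Integer> it = list.iterator();
        while (it.hasNext()) {
            if (it.next() % 2 == 0) {
                it.remove();
            }
        }
        return list;
    }

    public static void main(String[] args) {
        System.out.println(removeInForEachThrows());   // true
        System.out.println(removeEvensWithIterator()); // [1, 3, 5]
    }
}
```

The check is a best-effort bug detector rather than a synchronization mechanism, so it should not be relied on to make concurrent access safe.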