CopyOnWriteArrayList: Getting Started and Source Code Walkthrough

Date: 2021-01-19

CopyOnWriteArrayList

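Official definition

CopyOnWriteArrayList is a thread-safe variant of ArrayList in which all mutative operations (add, set, and so on) are implemented by making a fresh copy of the underlying array.

This is ordinarily too costly, but it may be more efficient than the alternatives when traversal operations vastly outnumber mutations, and it is useful when you cannot or do not want to synchronize traversals yet need to preclude interference among concurrent threads.

The "snapshot" style iterator method uses a reference to the state of the array at the point the iterator was created.

This array never changes during the lifetime of the iterator, so interference is impossible and the iterator is guaranteed not to throw ConcurrentModificationException. The iterator does not reflect additions, removals, or changes made to the list after the iterator was created. Element-changing operations on the iterator itself (remove, set, and add) are not supported; these methods throw UnsupportedOperationException.

All elements are permitted, including null.

Memory consistency effects: as with other concurrent collections, actions in a thread prior to placing an object into a CopyOnWriteArrayList happen-before actions subsequent to the access or removal of that element from the CopyOnWriteArrayList in another thread.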

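Usage example

The examples of this kind found online are all much alike.

ArrayList version

Consider the following example with two kinds of tasks: one iterates over the list, the other modifies the list's elements.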

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class CopyOnWriteArrayListDemo {
    /**
     * Reader task
     */
    private static class ReadTask implements Runnable {
        List<String> list;

        public ReadTask(List<String> list) {
            this.list = list;
        }

        public void run() {
            for (String str : list) {
                System.out.println(str);
            }
        }
    }
    /**
     * Writer task
     */
    private static class WriteTask implements Runnable {
        List<String> list;
        int index;

        public WriteTask(List<String> list, int index) {
            this.list = list;
            this.index = index;
        }

        public void run() {
            list.remove(index);
            list.add(index, "write_" + index);
        }
    }

    public void run() {
        final int NUM = 10;
        List<String> list = new ArrayList<String>();
        for (int i = 0; i < NUM; i++) {
            list.add("main_" + i);
        }
        ExecutorService executorService = Executors.newFixedThreadPool(NUM);
        for (int i = 0; i < NUM; i++) {
            executorService.execute(new ReadTask(list));
            executorService.execute(new WriteTask(list, i));
        }
        executorService.shutdown();
    }

    public static void main(String[] args) {
        new CopyOnWriteArrayListDemo().run();
    }
}

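Running this typically fails with an error such as ConcurrentModificationException.

The reason is that the list is modified while it is being read.

CopyOnWriteArrayList version

Simply replace the list construction (and import java.util.concurrent.CopyOnWriteArrayList):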

List<String> list = new CopyOnWriteArrayList<String>();

The readers then iterate without throwing ConcurrentModificationException.
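The following is a minimal sketch of the same reader/writer experiment, rewritten so that the outcome can be checked in code. The class name CowReadWriteSketch and the method runDemo are made up for illustration. One deliberate change from the demo above: the writer uses set(index, ...) instead of remove(index) followed by add(index, ...), because that pair is two separate operations and another writer can run between them while the list is one element shorter.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class CowReadWriteSketch {

    /** Runs num readers and num writers; returns how many tasks saw a problem. */
    static int runDemo(int num) {
        List<String> list = new CopyOnWriteArrayList<>();
        for (int i = 0; i < num; i++) {
            list.add("main_" + i);
        }
        AtomicInteger failures = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(num);
        for (int i = 0; i < num; i++) {
            final int index = i;
            // Reader: iterates over a snapshot, so concurrent writes cannot disturb it.
            pool.execute(() -> {
                try {
                    int seen = 0;
                    for (String ignored : list) {
                        seen++;
                    }
                    if (seen != num) {
                        failures.incrementAndGet();
                    }
                } catch (RuntimeException e) {
                    failures.incrementAndGet();
                }
            });
            // Writer: replaces one element in a single atomic operation.
            pool.execute(() -> {
                try {
                    list.set(index, "write_" + index);
                } catch (RuntimeException e) {
                    failures.incrementAndGet();
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                failures.incrementAndGet();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            failures.incrementAndGet();
        }
        return failures.get();
    }

    public static void main(String[] args) {
        System.out.println("failed tasks: " + runDemo(10));
    }
}
```

Because set never changes the size, every reader should see exactly num elements regardless of how the tasks interleave.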

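Pros and cons of CopyOnWriteArrayList

Advantages

It guarantees thread safety for concurrent reads and writes from multiple threads.

Disadvantages

Memory consumption

Copying the array naturally costs memory. If the application holds a lot of data and the elements are large, memory usage will be high; in that case another concurrent container such as ConcurrentHashMap can be used instead, where the data model allows it.

How to avoid it

Memory footprint. Because of the copy-on-write mechanism, two arrays reside in memory at the same time while a write is in progress: the old one and the newly written one. (Note: the copy only duplicates the references held by the container; new objects are created only for what is written into the new container, and the old container's objects are still in use, so two sets of object memory exist.) If these objects take up a lot of memory, say about 200 MB, then writing another 100 MB of data brings the footprint to 300 MB, which may well cause frequent Young GC and Full GC. One service in our system used the copy-on-write mechanism to update a large object every night, which caused a 15-second Full GC each night and correspondingly longer response times.

To reduce the memory footprint, you can compress the elements in the container. For example, if the elements are all decimal numbers, consider encoding them in base 36 or base 64. Alternatively, avoid CopyOnWrite containers and use another concurrent container such as ConcurrentHashMap.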
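As a small illustration of the base-36 idea, the sketch below shortens a decimal ID stored as text using the JDK's own radix support (Long.toString and Long.parseLong). The class and method names are hypothetical. Whether this actually saves memory depends on how the elements are stored; keeping the value as a primitive long would be smaller still.

```java
class Base36Sketch {

    /** Encodes a non-negative decimal ID as a base-36 string (digits 0-9, a-z). */
    static String encode(long id) {
        return Long.toString(id, 36);
    }

    /** Decodes a base-36 string produced by encode back into the original ID. */
    static long decode(String compact) {
        return Long.parseLong(compact, 36);
    }

    public static void main(String[] args) {
        long id = 1234567890123L;
        String compact = encode(id);
        System.out.println(id + " -> " + compact + " (" + compact.length() + " chars)");
        System.out.println("round trip ok: " + (decode(compact) == id));
    }
}
```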

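Data consistency

A CopyOnWrite container guarantees only eventual consistency of the data, not real-time consistency: a reader that is already working on an older snapshot (an iterator, for example) does not see writes made after that snapshot was taken. If you need every reader to observe written data immediately, do not use a CopyOnWrite container.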
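The stale-snapshot behaviour can be seen even in a single thread. In this minimal sketch (the class name is made up for illustration), an iterator created before a write keeps walking the old array, while the list itself already contains the new element.

```java
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class SnapshotVisibilitySketch {

    /** Returns {elements seen by the old iterator, list size after the write}. */
    static int[] demo() {
        List<String> list = new CopyOnWriteArrayList<>();
        list.add("a");
        list.add("b");

        Iterator<String> it = list.iterator(); // snapshot of [a, b]
        list.add("c");                         // installs a new array [a, b, c]

        int seen = 0;
        while (it.hasNext()) {
            it.next();
            seen++;
        }
        return new int[] { seen, list.size() };
    }

    public static void main(String[] args) {
        int[] result = demo();
        System.out.println("iterator saw " + result[0] + " elements, list has " + result[1]);
    }
}
```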

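Use cases

CopyOnWrite concurrent containers are meant for concurrent scenarios with many reads and few writes.

Examples include whitelists, blacklists, and the lookup and update of product categories. Suppose we run a search site where users type keywords into a search box, but certain keywords may not be searched. These forbidden keywords are kept in a blacklist that is updated once every night. When a user searches, the site checks whether the keyword is in the blacklist and, if it is, reports that the search is not allowed.

The implementation is as follows: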

import java.util.Map;

/**
 * Blacklist service.
 * Note: CopyOnWriteMap is not a JDK class; it is assumed here to be a
 * custom copy-on-write Map with an initial-capacity constructor.
 */
public class BlackListServiceImpl {
    private static final CopyOnWriteMap<String, Boolean> blackListMap =
            new CopyOnWriteMap<String, Boolean>(1000);

    public static boolean isBlackList(String id) {
        return blackListMap.get(id) != null;
    }

    public static void addBlackList(String id) {
        blackListMap.put(id, Boolean.TRUE);
    }

    /**
     * Adds blacklist entries in bulk.
     *
     * @param ids the entries to add
     */
    public static void addBlackList(Map<String, Boolean> ids) {
        blackListMap.putAll(ids);
    }
}

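The code is simple, but two things need attention when using CopyOnWriteMap:

Reduce resizing overhead. Initialize the CopyOnWriteMap with a size that fits the actual need, so that writes do not pay for resizing.
Use bulk additions. The container is copied on every addition, so fewer additions mean fewer copies. The bulk addBlackList method in the code above is an example.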
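The JDK does not ship a CopyOnWriteMap, so the class used above has to come from somewhere else. The following is one possible minimal sketch of such a map, not the implementation the author used: the class name CopyOnWriteMapSketch, its fields, and the choice of HashMap and synchronized are all assumptions made for illustration. It shows why a bulk putAll costs one copy while repeated put calls cost one copy each.

```java
import java.util.HashMap;
import java.util.Map;

class CopyOnWriteMapSketch<K, V> {

    // Readers always see a fully built map that is never mutated after publication.
    private volatile Map<K, V> current;

    CopyOnWriteMapSketch(int initialCapacity) {
        current = new HashMap<>(initialCapacity);
    }

    /** Lock-free read against the currently published map. */
    V get(Object key) {
        return current.get(key);
    }

    int size() {
        return current.size();
    }

    /** Copies the whole map, applies one change, then publishes the copy. */
    synchronized V put(K key, V value) {
        Map<K, V> copy = new HashMap<>(current);
        V old = copy.put(key, value);
        current = copy;
        return old;
    }

    /** Copies the whole map once, no matter how many entries are added. */
    synchronized void putAll(Map<? extends K, ? extends V> entries) {
        Map<K, V> copy = new HashMap<>(current);
        copy.putAll(entries);
        current = copy;
    }

    static int demo() {
        CopyOnWriteMapSketch<String, Boolean> blackList = new CopyOnWriteMapSketch<>(16);
        blackList.put("spam", Boolean.TRUE);   // one copy

        Map<String, Boolean> batch = new HashMap<>();
        batch.put("scam", Boolean.TRUE);
        batch.put("phish", Boolean.TRUE);
        blackList.putAll(batch);               // one copy for both entries
        return blackList.size();
    }

    public static void main(String[] args) {
        System.out.println("entries: " + demo());
    }
}
```

In this sketch the initial capacity only affects the first backing map; each later write sizes its copy from the current contents.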

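Why is there no concurrent list?

Question: JDK 5 introduced ConcurrentHashMap in java.util.concurrent, and in scenarios that need high concurrency we can use it in place of HashMap.

But why is there no concurrent implementation of ArrayList?

In multithreaded scenarios, is Vector really the only thread-safe array-backed implementation we can choose? Why is there no class in java.util.concurrent that can replace Vector?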

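Someone else's view

ConcurrentHashMap exists mainly to preserve concurrency, as can be seen from its use of lock striping and weakly consistent Map iterators to avoid concurrency bottlenecks (JDK 7 and earlier).

Many ArrayList operations, by contrast, can hardly avoid locking the whole list. Operations such as contains() and random-access get() work against the whole list when searching, so real-time consistency across threads can only be guaranteed with locks, and that limits concurrency.

My own view

That statement is not quite accurate.

If the array length never changed, the array could be split into segments by index and each range locked separately. The problem is that this no longer works once elements are removed or added.

In the end it is a trade-off between performance and safety.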

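A fairly sound answer

The main reason java.util.concurrent does not include a concurrent ArrayList implementation is that it is very hard to build a general-purpose thread-safe List that has no concurrency bottleneck.

The real point/value of classes like ConcurrentHashMap is not that they are thread-safe, but that they are thread-safe without a concurrency bottleneck.

For example, ConcurrentHashMap uses lock striping and weakly consistent Map iterators to sidestep concurrency bottlenecks.

So the problem is that for a data structure like an "array list" nobody knows how to sidestep the bottleneck. Take an operation such as contains(): how do you avoid locking the entire list while you search it?

On the other hand, Queue and Deque (based on linked lists) have concurrent implementations because their interfaces are more restricted than the List interface, and those restrictions make concurrent implementations feasible.

CopyOnWriteArrayList is an interesting case. It avoids the concurrency bottleneck for read-only operations (such as get/contains), but to do so it does a lot of work in its mutating operations and changes the visibility rules for modifications.

In addition, mutating operations lock the entire list, so they are a concurrency bottleneck as well.

Strictly speaking, then, CopyOnWriteArrayList is not a general-purpose concurrent List.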

Source code walkthrough

Class definition

public class CopyOnWriteArrayList<E>
    implements List<E>, RandomAccess, Cloneable, java.io.Serializable {
    private static final long serialVersionUID = 8673264195747942595L;
    }

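It implements the basic List interface, along with RandomAccess, Cloneable, and Serializable.

Fields

Here we see ReentrantLock, the reentrant lock that has come up several times before.

array is easy to understand: earlier List implementations were also backed by an array.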

/** The lock protecting all mutators */
final transient ReentrantLock lock = new ReentrantLock();

/** The array, accessed only via getArray/setArray. */
private transient volatile Object[] array;
/**
 * Gets the array.  Non-private so as to also be accessible
 * from CopyOnWriteArraySet class.
 */
final Object[] getArray() {
    return array;
}
/**
 * Sets the array.
 */
final void setArray(Object[] a) {
    array = a;
}
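To show how these two fields cooperate, here is a stripped-down sketch of a copy-on-write list. It is not the JDK class: TinyCowList and its methods are made up for illustration and support only add, get, and size. Reads go straight to the volatile array without locking; a write takes the lock, copies, and publishes the new array with a single volatile write.

```java
import java.util.Arrays;
import java.util.concurrent.locks.ReentrantLock;

class TinyCowList<E> {

    /** Serializes all writers. */
    private final ReentrantLock lock = new ReentrantLock();

    /** Current contents; replaced wholesale on every write, never modified in place. */
    private volatile Object[] array = new Object[0];

    /** Lock-free read of whatever array is currently published. */
    @SuppressWarnings("unchecked")
    E get(int index) {
        return (E) array[index];
    }

    int size() {
        return array.length;
    }

    boolean add(E e) {
        lock.lock();
        try {
            Object[] old = array;
            Object[] copy = Arrays.copyOf(old, old.length + 1);
            copy[old.length] = e;
            array = copy; // volatile write makes the new array visible to readers
            return true;
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        TinyCowList<String> list = new TinyCowList<>();
        list.add("a");
        list.add("b");
        System.out.println(list.size() + " elements, first = " + list.get(0));
    }
}
```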

Constructors

/**
 * Creates an empty list.
 */
public CopyOnWriteArrayList() {
    setArray(new Object[0]);
}

/**
 * Creates a list containing the elements of the specified
 * collection, in the order they are returned by the collection's
 * iterator.
 *
 * @param c the collection of initially held elements
 * @throws NullPointerException if the specified collection is null
 */
public CopyOnWriteArrayList(Collection<? extends E> c) {
    Object[] elements;
    if (c.getClass() == CopyOnWriteArrayList.class)
        elements = ((CopyOnWriteArrayList<?>)c).getArray();
    else {
        elements = c.toArray();
        // c.toArray might (incorrectly) not return Object[] (see 6260652)
        if (elements.getClass() != Object[].class)
            elements = Arrays.copyOf(elements, elements.length, Object[].class);
    }
    setArray(elements);
}

/**
 * Creates a list holding a copy of the given array.
 *
 * @param toCopyIn the array (a copy of this array is used as the
 *        internal array)
 * @throws NullPointerException if the specified array is null
 */
public CopyOnWriteArrayList(E[] toCopyIn) {
    setArray(Arrays.copyOf(toCopyIn, toCopyIn.length, Object[].class));
}

All of these constructors end up calling the same setArray method:

/**
 * Sets the array.
 */
final void setArray(Object[] a) {
    array = a;
}

This method is very simple: it just installs the given array as the backing array.
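A short usage sketch of the array and collection constructors follows; the class name is hypothetical. It shows one consequence of the array constructor copying its argument: later changes to the source array do not show up in the list.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class CowConstructorSketch {

    static String demo() {
        String[] source = {"a", "b"};
        List<String> fromArray = new CopyOnWriteArrayList<>(source);
        source[0] = "changed"; // the list keeps "a": the constructor copied the array

        List<String> fromCollection = new CopyOnWriteArrayList<>(Arrays.asList("x", "y"));

        return fromArray.get(0) + fromCollection.get(1);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```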

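Core methods

Most methods are much the same as in earlier list implementations, so let us focus on the methods that modify elements:

set

The method uses a ReentrantLock to control locking and unlocking.

The subtle part is that it first checks whether the element currently at index is the same reference as the new element.

If it is the same, one might expect the update to be skipped for better performance. The JDK nevertheless calls setArray once more, which, according to the source comment, preserves volatile write semantics.

If the values differ, the original array is copied, the copy is updated, and the copy is then installed as the new array.

The benefit is that writes do not block reads; the drawback is that it uses more memory and copying the array takes time.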

/**
 * Replaces the element at the specified position in this list with the
 * specified element.
 *
 * @throws IndexOutOfBoundsException {@inheritDoc}
 */
public E set(int index, E element) {
    final ReentrantLock lock = this.lock;
    lock.lock();
    try {
        Object[] elements = getArray();
        E oldValue = get(elements, index);
        if (oldValue != element) {
            int len = elements.length;
            Object[] newElements = Arrays.copyOf(elements, len);
            newElements[index] = element;
            setArray(newElements);
        } else {
            // Not quite a no-op; ensures volatile write semantics
            setArray(elements);
        }
        return oldValue;
    } finally {
        lock.unlock();
    }
}

P.S. All mutating operations here are mutually exclusive.

add

/**
 * Appends the specified element to the end of this list.
 *
 * @param e element to be appended to this list
 * @return {@code true} (as specified by {@link Collection#add})
 */
public boolean add(E e) {
    final ReentrantLock lock = this.lock;
    lock.lock();
    try {
        Object[] elements = getArray();
        int len = elements.length;
        Object[] newElements = Arrays.copyOf(elements, len + 1);
        newElements[len] = e;
        setArray(newElements);
        return true;
    } finally {
        lock.unlock();
    }
}

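This method also locks with ReentrantLock.

It is simpler and more direct than set: because an element is being appended, the new array is simply one element longer.

The JDK performs this kind of array copy with Arrays.copyOf; in my earlier measurements its performance was quite good.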
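As a small sketch of the copy step on its own, the helper below (a hypothetical name, not part of the JDK) appends to a plain Object[] the same way: Arrays.copyOf returns a new, longer array whose extra slot is null, and the original array is left untouched.

```java
import java.util.Arrays;

class CopyOfSketch {

    /** Returns a new array holding all of elements followed by e. */
    static Object[] append(Object[] elements, Object e) {
        int len = elements.length;
        Object[] copy = Arrays.copyOf(elements, len + 1); // new array, last slot is null
        copy[len] = e;
        return copy;
    }

    public static void main(String[] args) {
        Object[] old = {"a", "b"};
        Object[] grown = append(old, "c");
        System.out.println(Arrays.toString(old) + " -> " + Arrays.toString(grown));
    }
}
```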

add(int, E)

/**
 * Inserts the specified element at the specified position in this
 * list. Shifts the element currently at that position (if any) and
 * any subsequent elements to the right (adds one to their indices).
 *
 * @throws IndexOutOfBoundsException {@inheritDoc}
 */
public void add(int index, E element) {
    final ReentrantLock lock = this.lock;
    lock.lock();
    try {
        Object[] elements = getArray();
        int len = elements.length;

        // Bounds check
        if (index > len || index < 0)
            throw new IndexOutOfBoundsException("Index: "+index+
                                                ", Size: "+len);
        Object[] newElements;
        int numMoved = len - index;

        // Inserting at the very end is equivalent to the add method above.
        if (numMoved == 0)
            newElements = Arrays.copyOf(elements, len + 1);
        else {
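            // Otherwise copy the two parts on either side of the insertion point:

            // source 0..index-1   -> destination 0..index-1
            // source index..len-1 -> destination index+1..len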

            newElements = new Object[len + 1];
            System.arraycopy(elements, 0, newElements, 0, index);
            System.arraycopy(elements, index, newElements, index + 1,
                             numMoved);
        }

        // In both cases, store the new element at index
        newElements[index] = element;
        setArray(newElements);
    } finally {
        lock.unlock();
    }
}
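The two System.arraycopy calls can be tried in isolation. This sketch applies the same copying scheme to a plain array, without any locking; insertAt is a hypothetical helper written for illustration, not a JDK method.

```java
import java.util.Arrays;

class InsertAtSketch {

    /** Returns a new array with e inserted at index; the input array is not modified. */
    static Object[] insertAt(Object[] elements, int index, Object e) {
        int len = elements.length;
        if (index < 0 || index > len) {
            throw new IndexOutOfBoundsException("Index: " + index + ", Size: " + len);
        }
        Object[] result = new Object[len + 1];
        // Elements before the insertion point keep their positions.
        System.arraycopy(elements, 0, result, 0, index);
        // Elements from the insertion point onward shift right by one.
        System.arraycopy(elements, index, result, index + 1, len - index);
        result[index] = e;
        return result;
    }

    public static void main(String[] args) {
        Object[] before = {"a", "c", "d"};
        System.out.println(Arrays.toString(insertAt(before, 1, "b")));
    }
}
```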

remove: deleting an element

/**
 * Removes the element at the specified position in this list.
 * Shifts any subsequent elements to the left (subtracts one from their
 * indices).  Returns the element that was removed from the list.
 *
 * @throws IndexOutOfBoundsException {@inheritDoc}
 */
public E remove(int index) {
    final ReentrantLock lock = this.lock;
    lock.lock();
    try {
        Object[] elements = getArray();
        int len = elements.length;
        E oldValue = get(elements, index);
        int numMoved = len - index - 1;

        // If the last element is removed, simply copy indices 0..len-2.
        if (numMoved == 0)
            setArray(Arrays.copyOf(elements, len - 1));
        else {
            // The new array is one element shorter than the original
            Object[] newElements = new Object[len - 1];

            // copy source 0..index-1     -> destination 0..index-1
            // copy source index+1..len-1 -> destination index..len-2
            System.arraycopy(elements, 0, newElements, 0, index);
            System.arraycopy(elements, index + 1, newElements, index,
                             numMoved);
            setArray(newElements);
        }
        return oldValue;
    } finally {
        lock.unlock();
    }
}

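Removing an element follows essentially the same flow as adding one.

At first glance there appears to be no bounds check. The check is implicit: get(elements, index) reads elements[index], and for an invalid index that array access throws ArrayIndexOutOfBoundsException, a subclass of IndexOutOfBoundsException.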
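A quick way to confirm how remove treats a bad index is to call it and catch the exception. This sketch (class and method names are made up) catches the general IndexOutOfBoundsException, so it does not depend on which subclass a particular JDK version throws.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class RemoveBoundsSketch {

    /** Returns true if removing at index from a one-element list throws. */
    static boolean removeThrows(int index) {
        List<String> list = new CopyOnWriteArrayList<>();
        list.add("only");
        try {
            list.remove(index);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("remove(0) throws: " + removeThrows(0));
        System.out.println("remove(5) throws: " + removeThrows(5));
        System.out.println("remove(-1) throws: " + removeThrows(-1));
    }
}
```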

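Iterator

Overview

The COWList iterator differs from the regular ArrayList iterator. A familiar interview question is: how do you remove elements while traversing a list?

The usual answer is to use the Iterator.

The COW Iterator, however, does not support modification at all. My understanding is that this keeps concurrency control confined to the mutating methods described above.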
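For the "remove while traversing" question, a copy-on-write list turns the usual advice around. The sketch below, with hypothetical class and method names, shows that Iterator.remove is rejected, while removing through the list itself inside a for-each loop is safe because the loop walks a snapshot.

```java
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class RemoveWhileIteratingSketch {

    /** Returns true if Iterator.remove is rejected. */
    static boolean iteratorRemoveIsUnsupported() {
        List<Integer> list = new CopyOnWriteArrayList<>(new Integer[] {1, 2, 3});
        Iterator<Integer> it = list.iterator();
        it.next();
        try {
            it.remove();
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    /** Removes even numbers by calling the list's own remove during iteration. */
    static List<Integer> removeEvens() {
        List<Integer> list = new CopyOnWriteArrayList<>(new Integer[] {1, 2, 3, 4});
        for (Integer n : list) {      // iterates over a snapshot of [1, 2, 3, 4]
            if (n % 2 == 0) {
                list.remove(n);       // remove(Object): n is an Integer, not an int index
            }
        }
        return list;
    }

    public static void main(String[] args) {
        System.out.println("iterator.remove unsupported: " + iteratorRemoveIsUnsupported());
        System.out.println("after removing evens: " + removeEvens());
    }
}
```

Each list.remove call copies the array, so for many removals a single list.removeIf(...) call is usually the better fit.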

Implementation

Iterator definition

static final class COWIterator<E> implements ListIterator<E> {
        /** Snapshot of the array */
        private final Object[] snapshot;
        /** Index of element to be returned by subsequent call to next.  */
        private int cursor;

        private COWIterator(Object[] elements, int initialCursor) {
            cursor = initialCursor;
            snapshot = elements;
        }
}

Basic methods

These are the most commonly used basic methods.

public boolean hasNext() {
    return cursor < snapshot.length;
}
public boolean hasPrevious() {
    return cursor > 0;
}
@SuppressWarnings("unchecked")
public E next() {
    if (! hasNext())
        throw new NoSuchElementException();
    return (E) snapshot[cursor++];
}
@SuppressWarnings("unchecked")
public E previous() {
    if (! hasPrevious())
        throw new NoSuchElementException();
    return (E) snapshot[--cursor];
}
public int nextIndex() {
    return cursor;
}
public int previousIndex() {
    return cursor-1;
}

Unsupported operations

public void remove() {
    throw new UnsupportedOperationException();
}
public void set(E e) {
    throw new UnsupportedOperationException();
}
public void add(E e) {
    throw new UnsupportedOperationException();
}

@Override
public void forEachRemaining(Consumer<? super E> action) {
    Objects.requireNonNull(action);
    Object[] elements = snapshot;
    final int size = elements.length;
    for (int i = cursor; i < size; i++) {
        @SuppressWarnings("unchecked") E e = (E) elements[i];
        action.accept(e);
    }
    cursor = size;
}
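A brief usage sketch of forEachRemaining on this iterator follows; the class name is made up for illustration. After one call to next(), the remaining elements of the snapshot are handed to the consumer and the cursor ends up at the end of the snapshot.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ListIterator;
import java.util.concurrent.CopyOnWriteArrayList;

class ForEachRemainingSketch {

    /** Returns the elements visited by forEachRemaining after one next() call. */
    static List<String> demo() {
        CopyOnWriteArrayList<String> list =
                new CopyOnWriteArrayList<>(new String[] {"a", "b", "c"});
        ListIterator<String> it = list.listIterator();
        it.next();                       // consumes "a"; the cursor is now 1

        List<String> rest = new ArrayList<>();
        it.forEachRemaining(rest::add);  // visits "b" and "c" from the snapshot

        if (it.hasNext()) {
            throw new IllegalStateException("cursor should be at the end");
        }
        return rest;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```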