Do you really know the right way to use ArrayList#subList?

Date: 2019-12-05

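Consider this scenario: you are given a list that can grow dynamically, but in the end it must be sorted in ascending order and capped at 20 elements. How would you do it?

That's easy, a few lines of code will do: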

public List<Integer> trimList(List<Integer> list, int add) {
    list.add(add);
    list.sort(null);
    if (list.size() > 20) {
        list = list.subList(0, 20);
    }
    return list;
}

1. Test and verify

Leaving performance optimization aside for now, is there anything wrong with the code above?

I wrote a simple test case. Let's see what happens:

@Test
public void testTri() throws InterruptedException {
    List<Integer> list = new ArrayList<>(30);
    Random random = new Random();
    int cnt = 0;
    while (true) {
        list = trimList(list, random.nextInt(100000));

        Thread.sleep(1);
        ++cnt;
        System.out.println(list + " >> " + cnt);
    }
}

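Change the launch parameters to cap the JVM heap with -Xmx3m, then run the code above. After a while it fails with a stack overflow.

Here is the interesting question: logically the list is capped at 20 elements and holds 21 at most, so how can memory usage keep growing until the JVM gives up?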

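2. The subList method unveiled

Look at how ArrayList#subList is implemented and you will find that getting a sub-list merely resets a few indexes into the same internal array: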

public List<E> subList(int fromIndex, int toIndex) {
    subListRangeCheck(fromIndex, toIndex, size);
    return new SubList(this, 0, fromIndex, toIndex);
}

private class SubList extends AbstractList<E> implements RandomAccess {
    private final AbstractList<E> parent;
    private final int parentOffset;
    private final int offset;
    int size;
  
    SubList(AbstractList<E> parent,
            int offset, int fromIndex, int toIndex) {
        this.parent = parent;
        this.parentOffset = fromIndex;
        this.offset = offset + fromIndex;
        this.size = toIndex - fromIndex;
        this.modCount = ArrayList.this.modCount;
    }
    ...
}

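What comes back is a SubList object. It shares the array that stores the data with the original list and only adds two fields that record the offset at which the sub-list starts.

Now look at SubList's add method. It also inserts straight into the original array, which is equivalent to inserting at the corresponding position of the original list: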

public void add(int index, E e) {
    rangeCheckForAdd(index);
    checkForComodification();
    parent.add(parentOffset + index, e);
    this.modCount = parent.modCount;
    this.size++;
}

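So the line list = list.subList(0, 20); in the implementation above leaks memory. It looks as if it returns a list of length 20, but the array behind that list may be far longer than 20. Each call also wraps the previous view in another SubList, and because add delegates to parent.add in the source quoted above, the call chain gets one level deeper every time, which is consistent with the stack overflow seen in the test.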

To verify the claim above, debug the test case from section 1.

[Animated GIF of the debugging session]
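The same effect can be observed without a debugger. The sketch below is a minimal illustration and not part of the original test: it keeps a second reference (the hypothetical variable root) to the first ArrayList and compares its size with the size of the view that trimList returns. The class name SubListGrowthDemo and the helper sizesAfter are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

class SubListGrowthDemo {

    // Same logic as the leaky trimList from the article.
    static List<Integer> trimList(List<Integer> list, int add) {
        list.add(add);
        list.sort(null);
        if (list.size() > 20) {
            list = list.subList(0, 20); // a view: the backing list is not shrunk
        }
        return list;
    }

    // Hypothetical helper: returns {size of the returned view, size of the root list}.
    static int[] sizesAfter(int calls) {
        List<Integer> root = new ArrayList<>(30); // extra reference to the original list
        List<Integer> list = root;
        for (int i = 0; i < calls; i++) {
            list = trimList(list, i);
        }
        return new int[] {list.size(), root.size()};
    }

    public static void main(String[] args) {
        int[] sizes = sizesAfter(100);
        System.out.println("view size = " + sizes[0]); // capped at 20
        System.out.println("root size = " + sizes[1]); // grows by one element per call
    }
}
```

The view never reports more than 20 elements, while the list it is backed by receives one more element on every call and is never trimmed.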

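3. The right way to use it

As shown above, subList does not create a new list. The old data is still there, we just cannot reach it. The fix is simple: build a new list from the result of subList.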

public List<Integer> trimList(List<Integer> list, int add) {
    list.add(add);
    list.sort(null);
    if (list.size() > 20) {
        list = new ArrayList<>(list.subList(0, 20));
    }
    return list;
}

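Run the test again and the code keeps running without trouble. Look at the counter at the end of each line: it has passed 50,000, whereas the earlier version failed a little after 10,000.

This fixes the memory leak, but GC now runs very frequently. The point of this post is to call out the incorrect use of subList, so optimizing the algorithm is not covered in detail here.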
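One possible way to cut the allocation churn, offered here as a sketch rather than as the author's solution, is to truncate the list in place. The List.subList documentation describes the idiom list.subList(from, to).clear() for removing a range; in this variant the sub-list is only a temporary handle and is never returned. The names TrimInPlaceDemo, trimInPlace, and keepSmallest are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class TrimInPlaceDemo {

    // Variant of trimList that removes the overflow from the same list
    // instead of allocating a new ArrayList on every call.
    static List<Integer> trimInPlace(List<Integer> list, int add) {
        list.add(add);
        list.sort(null);
        if (list.size() > 20) {
            // Clearing the tail view removes those elements from the backing list.
            list.subList(20, list.size()).clear();
        }
        return list;
    }

    // Hypothetical helper: feeds n, n-1, ..., 1 and returns the resulting list.
    static List<Integer> keepSmallest(int n) {
        List<Integer> list = new ArrayList<>(30);
        for (int i = n; i > 0; i--) {
            list = trimInPlace(list, i);
        }
        return list;
    }

    public static void main(String[] args) {
        System.out.println(keepSmallest(100)); // the 20 smallest values, ascending
    }
}
```

Because the returned object is always the original ArrayList, no chain of views builds up and no extra copy is made per call. Whether this is faster in practice depends on the workload and would need measuring.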

4. Going further

What should the following test code print?

@ToString
public static class InnerC {
    private String name;
    private Integer id;

    public InnerC(String name, Integer id) {
        this.name = name;
        this.id = id;
    }
}

@Test
public void subList() {
    List<Integer> list = new ArrayList<>();
    for (int i = 0; i < 20; i++) {
        list.add(i);
    }

    // case 1
    List<Integer> sub = list.subList(10, 15);
    sub.add(100);
    System.out.println("list: " + list);
    System.out.println("sub: " + sub);

    // case 2
    list.set(11, 200);
    System.out.println("list: " + list);
    System.out.println("sub: " + sub);

    // case 3
    list = new ArrayList<>(sub);
    sub.set(0, 999);
    System.out.println("list: " + list);
    System.out.println("sub: " + sub);

    // case 4
    List<InnerC> cl = new ArrayList<>();
    cl.add(new InnerC("a", 1));
    cl.add(new InnerC("a2", 2));
    cl.add(new InnerC("a3", 3));
    cl.add(new InnerC("a4", 4));

    List<InnerC> cl2 = new ArrayList<>(cl.subList(1, 3));
    cl2.get(0).name = "a5";
    cl2.get(0).id = 5;
    System.out.println("list cl: " + cl);
    System.out.println("list cl2: " + cl2);
}

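Before looking at the actual answer, let's reason about it first.

For case 1/2, we know the list returned by subList shares its backing array with the original list, so a change made through one is visible through the other. One caveat: a structural change made directly on the original list, such as add or remove, invalidates the view, and later calls on it throw ConcurrentModificationException. Case 2 only uses set, which is safe.

After case 1 runs, it is equivalent to inserting 100 at index 15 of list
After case 2 runs, index 11 of list corresponds to index 1 of sub, so sub[1] becomes 200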
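One thing worth checking alongside these two cases is what happens when the original list is structurally modified directly. The sketch below (class and method names are hypothetical) contrasts a set on the parent, which the view sees, with an add on the parent, after which ArrayList's sub-list refuses to work.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;

class SubListComodificationDemo {

    // Returns true if the view throws after the parent list was structurally modified.
    static boolean viewFailsAfterParentAdd() {
        List<Integer> list = new ArrayList<>(Arrays.asList(0, 1, 2, 3, 4));
        List<Integer> sub = list.subList(1, 3);

        list.set(1, 42); // non-structural change: visible through the view
        if (sub.get(0) != 42) {
            throw new IllegalStateException("set on the parent should be visible in the view");
        }

        list.add(5); // structural change made on the parent, not through the view
        try {
            sub.size(); // the view detects the mismatch in modification counts
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("view fails after parent add: " + viewFailsAfterParentAdd());
    }
}
```

Structural changes made through the view itself, as in case 1, are fine because the view keeps its own bookkeeping in step with the parent.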

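For case 3/4, a new list is created from the sub-list. Once that is done, does a change on one side still affect the values on the other?

To analyze this scenario we need to look at the source: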

public ArrayList(Collection<? extends E> c) {
    elementData = c.toArray();
    if ((size = elementData.length) != 0) {
        // c.toArray might (incorrectly) not return Object[] (see 6260652)
        if (elementData.getClass() != Object[].class)
            elementData = Arrays.copyOf(elementData, size, Object[].class);
    } else {
        // replace with empty array.
        this.elementData = EMPTY_ELEMENTDATA;
    }
}

// The elements are copied by c.toArray() and, when the result is not an Object[], by Arrays.copyOf, which in turn calls the native method `System.arraycopy`

public static <T,U> T[] copyOf(U[] original, int newLength, Class<? extends T[]> newType) {
    @SuppressWarnings("unchecked")
    T[] copy = ((Object)newType == (Object)Object[].class)
        ? (T[]) new Object[newLength]
        : (T[]) Array.newInstance(newType.getComponentType(), newLength);
    System.arraycopy(original, 0, copy, 0,
                     Math.min(original.length, newLength));
    return copy;
}

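From the source above, whether the two lists affect each other comes down to how the array copy is implemented (deep copy or shallow copy?).

Now let's look at the actual output: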

list: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 100, 15, 16, 17, 18, 19]
sub: [10, 11, 12, 13, 14, 100]
list: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 200, 12, 13, 14, 100, 15, 16, 17, 18, 19]
sub: [10, 200, 12, 13, 14, 100]
list: [10, 200, 12, 13, 14, 100]
sub: [999, 200, 12, 13, 14, 100]
list cl: [BasicTest.InnerC(name=a, id=1), BasicTest.InnerC(name=a5, id=5), BasicTest.InnerC(name=a3, id=3), BasicTest.InnerC(name=a4, id=4)]
list cl2: [BasicTest.InnerC(name=a5, id=5), BasicTest.InnerC(name=a3, id=3)]

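From the output, the analysis of case 1/2 holds up, while the output of case 3/4 is more interesting. In case 3 the new list has its own array, so sub.set(0, 999) changes sub but not list. In case 4 the copy is shallow: cl2 holds references to the same InnerC objects as cl, so changing name and id through cl2 also shows up in cl.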
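The difference between the two cases can be reproduced in isolation. The following is a minimal sketch without Lombok; the class ShallowCopyDemo, the nested Item class, and the method namesInSource are hypothetical names introduced only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

class ShallowCopyDemo {

    // Small mutable element type standing in for InnerC.
    static final class Item {
        String name;

        Item(String name) {
            this.name = name;
        }
    }

    // Returns the names seen through the source list after the copy was modified.
    static String[] namesInSource() {
        List<Item> src = new ArrayList<>();
        src.add(new Item("a"));
        src.add(new Item("b"));
        src.add(new Item("c"));

        // New list with its own array, but the same element objects.
        List<Item> copy = new ArrayList<>(src.subList(0, 2));

        copy.set(0, new Item("x")); // replaces a slot in the copy only
        copy.get(1).name = "y";     // mutates an object that both lists reference

        return new String[] {src.get(0).name, src.get(1).name};
    }

    public static void main(String[] args) {
        String[] names = namesInSource();
        System.out.println(names[0] + ", " + names[1]);
    }
}
```

Replacing a slot (as with the Integer values in case 3) leaves the source untouched, while mutating a shared object (as in case 4) is visible from both lists. If independent elements are needed, each element has to be copied explicitly.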