Rambling About Java Collections

Date: 2020-07-08


Broad categories: List, Set, Queue, Map

Iterable

The Collection interface extends Iterable. Iterable exists to support the for-each loop, and its iterator() method returns an Iterator object.

public interface Iterable<T> {

    Iterator<T> iterator();

    default void forEach(Consumer<? super T> action) {
        Objects.requireNonNull(action);
        for (T t : this) {
            action.accept(t);
        }
    }

    default Spliterator<T> spliterator() {
        return Spliterators.spliteratorUnknownSize(iterator(), 0);
    }
}

Let's look at an example to make that concrete:

LinkedList<Integer> linkedList = new LinkedList<>();
linkedList.add(1);
linkedList.add(2);
linkedList.add(3);

for (Integer integer : linkedList) {
    System.out.println(integer);
}

After decompiling:

LinkedList<Integer> linkedList = new LinkedList();
linkedList.add(1);
linkedList.add(2);
linkedList.add(3);
Iterator var4 = linkedList.iterator();

while(var4.hasNext()) {
    Integer integer = (Integer)var4.next();
    System.out.println(integer);
}
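Since for-each only needs iterator(), any class that implements Iterable can be used in it. Here is a minimal sketch; IntRange is a made-up class for illustration, not part of the JDK.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical class for illustration: the integers in [start, end).
class IntRange implements Iterable<Integer> {
    private final int start;
    private final int end;

    IntRange(int start, int end) {
        this.start = start;
        this.end = end;
    }

    @Override
    public Iterator<Integer> iterator() {
        return new Iterator<Integer>() {
            private int next = start;

            @Override
            public boolean hasNext() {
                return next < end;
            }

            @Override
            public Integer next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return next++;
            }
        };
    }

    // The for-each loop below compiles down to iterator()/hasNext()/next().
    static List<Integer> collect(IntRange range) {
        List<Integer> out = new ArrayList<>();
        for (Integer i : range) {
            out.add(i);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(collect(new IntRange(1, 4))); // [1, 2, 3]
    }
}
```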

Iterator

The Iterable interface hands back this iterator type:

public interface Iterator<E> {
 
    boolean hasNext();

    E next();

    default void remove() {
        throw new UnsupportedOperationException("remove");
    }

    default void forEachRemaining(Consumer<? super E> action) {
        Objects.requireNonNull(action);
        while (hasNext())
            action.accept(next());
    }
}

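Its main purpose is to unify traversal, decoupling a collection's data structure from the way it is accessed.

Let's look at the inner class of ArrayList, the most commonly used collection: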

private class Itr implements Iterator<E> {
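    int cursor;       // index of the next element to return
    int lastRet = -1; // index of the last element returned; -1 if there is none
    int expectedModCount = modCount; // modCount as seen when the iterator was created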

    public boolean hasNext() {
        return cursor != size;
    }

    @SuppressWarnings("unchecked")
    public E next() {
        checkForComodification();
        int i = cursor;
        if (i >= size)
            throw new NoSuchElementException();
        Object[] elementData = ArrayList.this.elementData;
        if (i >= elementData.length)
            throw new ConcurrentModificationException();
        cursor = i + 1;
        return (E) elementData[lastRet = i];
    }

    public void remove() {
        if (lastRet < 0)
            throw new IllegalStateException();
        checkForComodification();

        try {
            ArrayList.this.remove(lastRet);
            cursor = lastRet;
            lastRet = -1;
            expectedModCount = modCount;
        } catch (IndexOutOfBoundsException ex) {
            throw new ConcurrentModificationException();
        }
    }

    @Override
    @SuppressWarnings("unchecked")
    public void forEachRemaining(Consumer<? super E> consumer) {
        Objects.requireNonNull(consumer);
        final int size = ArrayList.this.size;
        int i = cursor;
        if (i >= size) {
            return;
        }
        final Object[] elementData = ArrayList.this.elementData;
        if (i >= elementData.length) {
            throw new ConcurrentModificationException();
        }
        while (i != size && modCount == expectedModCount) {
            consumer.accept((E) elementData[i++]);
        }
        // update once at end of iteration to reduce heap write traffic
        cursor = i;
        lastRet = i - 1;
        checkForComodification();
    }

    final void checkForComodification() {
        if (modCount != expectedModCount)
            throw new ConcurrentModificationException();
    }
}

As we all know, calling remove on an ArrayList inside a for-each loop causes a ConcurrentModificationException:

ArrayList<Integer> arrayList = new ArrayList<>();
arrayList.add(1);
arrayList.add(1);
arrayList.add(1);

for (Integer integer : arrayList) {
    arrayList.remove(integer);
}

Exception in thread "main" java.util.ConcurrentModificationException

Removing through the Iterator does not have this problem, because remove() resets expectedModCount to the new modCount:

public void remove() {
    if (lastRet < 0)
        throw new IllegalStateException();
    checkForComodification();

    try {
        ArrayList.this.remove(lastRet);
        cursor = lastRet;
        lastRet = -1;
        expectedModCount = modCount;
    } catch (IndexOutOfBoundsException ex) {
        throw new ConcurrentModificationException();
    }
}
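In practice that looks like the sketch below. The class and method names are invented for the example; Iterator.remove and Collection.removeIf are the standard JDK calls.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class SafeRemoval {
    // Removes even numbers through the iterator, so expectedModCount stays in sync.
    static List<Integer> removeEvens(List<Integer> input) {
        List<Integer> list = new ArrayList<>(input);
        Iterator<Integer> it = list.iterator();
        while (it.hasNext()) {
            if (it.next() % 2 == 0) {
                it.remove();
            }
        }
        return list;
    }

    // Same result using removeIf (available since Java 8).
    static List<Integer> removeEvensWithRemoveIf(List<Integer> input) {
        List<Integer> list = new ArrayList<>(input);
        list.removeIf(n -> n % 2 == 0);
        return list;
    }

    public static void main(String[] args) {
        System.out.println(removeEvens(Arrays.asList(1, 2, 3, 4)));             // [1, 3]
        System.out.println(removeEvensWithRemoveIf(Arrays.asList(1, 2, 3, 4))); // [1, 3]
    }
}
```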

List

ArrayList

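Dynamic array
Not thread-safe
null elements are allowed
Implements List, RandomAccess, Cloneable, Serializable
Contiguous memory
Adding and removing both change modCount
By default the capacity grows by half on each resize (new capacity = 1.5x the old one)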
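The "grows by half" rule can be sketched as plain arithmetic. This assumes the JDK 8 formula oldCapacity + (oldCapacity >> 1) and a default initial capacity of 10; GrowthSketch is an invented name, and the real ArrayList also handles overflow and minimum-capacity requests that are left out here.

```java
class GrowthSketch {
    // Assumed JDK 8 growth rule: new capacity is old capacity plus half of it.
    static int grow(int oldCapacity) {
        return oldCapacity + (oldCapacity >> 1);
    }

    public static void main(String[] args) {
        int capacity = 10; // assumed default initial capacity
        for (int i = 0; i < 3; i++) {
            capacity = grow(capacity);
            System.out.println(capacity); // 15, 22, 33
        }
    }
}
```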

Vector

Thread-safe
By default the capacity doubles on each resize
Has modCount
Every method that touches the array is marked synchronized

CopyOnWriteArrayList

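Copy-on-write, with locking on writes
Memory-hungry
Readers may see stale data
Iterators never throw ConcurrentModificationException
Best kept to small data sets
Uses ReentrantLock for locking (in JDK 8; later JDKs synchronize on an internal lock object instead)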
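A small sketch of the "no ConcurrentModificationException, but stale reads" behaviour: the iterator works on the snapshot taken when it was created, so elements added during iteration are not seen. SnapshotIteration is an invented name for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class SnapshotIteration {
    // Adds to the list while iterating it and returns what the iterator saw.
    static List<Integer> iterateWhileAdding(List<Integer> list) {
        List<Integer> seen = new ArrayList<>();
        for (Integer i : list) {
            seen.add(i);
            list.add(i + 10); // every write copies the backing array
        }
        return seen;
    }

    public static void main(String[] args) {
        List<Integer> cow = new CopyOnWriteArrayList<>(Arrays.asList(1, 2, 3));
        System.out.println(iterateWhileAdding(cow)); // [1, 2, 3]
        System.out.println(cow);                     // [1, 2, 3, 11, 12, 13]
    }
}
```

Running the same method on a plain ArrayList would throw ConcurrentModificationException on the second call to next().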

Collections.synchronizedList

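Uses synchronized blocks
The lock object (mutex) can be passed in as a constructor argument, otherwise the wrapper itself is used
An existing List has to be passed in to be wrapped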

SynchronizedList(List<E> list) {
    super(list);
    this.list = list;
}
SynchronizedList(List<E> list, Object mutex) {
    super(list, mutex);
    this.list = list;
}
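A usage sketch, with invented class and method names. Individual calls such as add are synchronized by the wrapper, but iteration is not, so the caller has to hold the list's lock while iterating (the Collections.synchronizedList Javadoc says the same).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SyncListDemo {
    // Fills a synchronized list from several threads and counts the elements.
    static int fillConcurrently(int threads, int perThread) {
        List<Integer> list = Collections.synchronizedList(new ArrayList<>());
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    list.add(i); // each add takes the wrapper's mutex
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        int count = 0;
        synchronized (list) { // iteration must be guarded manually
            for (Integer ignored : list) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(fillConcurrently(4, 1000)); // 4000
    }
}
```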

LinkedList

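ArrayList is slow at inserting and removing but fast at updating and looking up by index; LinkedList is exactly the opposite
Implemented as a linked list
When accessing by index (for example get(i) in an indexed for loop), it walks from the head or from the tail depending on whether the index falls in the first or the second half
Adding and removing change modCount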

Map

Four common implementations:

HashMap
Hashtable
LinkedHashMap
TreeMap

HashMap

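HashMap is implemented as an array plus linked lists plus red-black trees (the red-black tree part was added in JDK 1.8). Its key fields are as follows.

transient Node<K,V>[] table;
// number of key-value pairs actually stored
transient int size;
// threshold: when the number of key-value pairs exceeds this value, the table is resized
int threshold;
// load factor; threshold = loadFactor * table.length
final float loadFactor;

The default length of table is 16 and the default loadFactor is 0.75.

Next, the Node data structure: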

static class Node<K,V> implements Map.Entry<K,V> {
    final int hash;
    final K key;
    V value;
    Node<K,V> next;
}

Determining the index into the bucket array

Method 1:
static final int hash(Object key) {   // JDK 1.8 (JDK 1.7 uses a different, multi-shift spreading function)
     int h;
     // h = key.hashCode()   step 1: take the hashCode
     // h ^ (h >>> 16)       step 2: mix in the high bits
     return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}
Method 2:
static int indexFor(int h, int length) {  // JDK 1.7 source; JDK 1.8 does not define this method and inlines its body
     return h & (length-1);  // step 3: modulo
}

In JDK 1.8:
final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
        Node<K,V>[] tab; Node<K,V> p; int n, i;
        if ((tab = table) == null || (n = tab.length) == 0)
            n = (tab = resize()).length;
        // here: the bucket index is (n - 1) & hash
         if ((p = tab[i = (n - 1) & hash]) == null)
            tab[i] = newNode(hash, key, value, null);
        // ... (rest of the method omitted)
}

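The hash algorithm here boils down to three steps: take the key's hashCode, mix in the high bits, and take the modulo.

The modulo step is h & (length - 1). For non-negative h it is equivalent to h % length, because length is always a power of two. & is used because it is cheaper than %, and it also never yields a negative index.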
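A quick check of that equivalence, as a sketch. MaskVsMod is an invented name; indexFor copies the JDK 1.7 method shown earlier, and Math.floorMod is the standard JDK method that, unlike %, never returns a negative result for a positive divisor.

```java
class MaskVsMod {
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        int length = 16; // must be a power of two
        int[] hashes = {0, 5, 37, 1024, -1, -37};
        for (int h : hashes) {
            // The mask agrees with floorMod for every h, including negative ones;
            // plain % agrees only when h is non-negative.
            System.out.println(h + " -> mask=" + indexFor(h, length)
                    + " floorMod=" + Math.floorMod(h, length)
                    + " %=" + (h % length));
        }
    }
}
```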

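(h = key.hashCode()) ^ (h >>> 16) XORs the key's hashCode with its own upper 16 bits.

Why do this? When the table array is small, the mask throws away the high bits of the hashCode entirely, which increases collisions in the low bits. XORing the high bits into the low bits keeps that information.

But why XOR? Aren't & and | binary operators too?

An answer from Zhihu:

Method 1 is what is known as a perturbation function. XORing the high and low halves of the hashCode mixes the two, which makes the low bits more random. The mixed low bits also carry some features of the high bits, so the high-bit information is preserved indirectly. After this perturbation, hash collisions are noticeably reduced.
As for why XOR is used: among the binary operators &, | and ^, XOR mixes best. For random input bits it produces 0 and 1 with equal 50% probability, whereas & is biased towards 0 and | towards 1.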
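Both points can be seen in a small sketch. SpreadDemo and its method names are invented for the example; spread reproduces the h ^ (h >>> 16) step from Method 1, and the hash values are picked by hand so that they differ only in their high bits.

```java
class SpreadDemo {
    // The perturbation step from HashMap.hash() in JDK 1.8.
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    static int index(int hash, int length) {
        return hash & (length - 1);
    }

    // For the four possible (a, b) bit pairs, counts how often each operator yields 1.
    // Returned order: AND, OR, XOR.
    static int[] onesPerOperator() {
        int and = 0, or = 0, xor = 0;
        for (int a = 0; a <= 1; a++) {
            for (int b = 0; b <= 1; b++) {
                and += a & b;
                or += a | b;
                xor += a ^ b;
            }
        }
        return new int[] {and, or, xor};
    }

    public static void main(String[] args) {
        int[] hashes = {0x10000, 0x20000, 0x30000}; // differ only in the high 16 bits
        for (int h : hashes) {
            // Without spreading, all three land in bucket 0 of a 16-slot table;
            // with spreading they land in buckets 1, 2 and 3.
            System.out.println(index(h, 16) + " vs " + index(spread(h), 16));
        }
        int[] ones = onesPerOperator();
        System.out.println(ones[0] + "/4, " + ones[1] + "/4, " + ones[2] + "/4"); // 1/4, 3/4, 2/4
    }
}
```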

https://www.zhihu.com/questio...

https://codeday.me/bug/201709...

The circular linked list problem caused by resizing in JDK 1.7

Below is the JDK 1.7 resize code:

void resize(int newCapacity) {   // the new capacity is passed in
    Entry[] oldTable = table;    // reference to the Entry array before resizing
    int oldCapacity = oldTable.length;
    if (oldCapacity == MAXIMUM_CAPACITY) {  // the old array is already at the maximum size (2^30)
        threshold = Integer.MAX_VALUE; // set the threshold to int's maximum (2^31-1) so no further resize happens
        return;
    }

    Entry[] newTable = new Entry[newCapacity];  // allocate a new Entry array
    transfer(newTable);                         // !! move the data into the new Entry array
    table = newTable;                           // HashMap's table field now refers to the new Entry array
    threshold = (int)(newCapacity * loadFactor);// update the threshold
}

void transfer(Entry[] newTable) {
    Entry[] src = table;                   // src refers to the old Entry array
    int newCapacity = newTable.length;
    for (int j = 0; j < src.length; j++) { // walk the old Entry array
        Entry<K,V> e = src[j];             // take each element of the old Entry array
        if (e != null) {
            src[j] = null; // release the old array's reference (after the for loop the old array references nothing)
            do {
                Entry<K,V> next = e.next;
                int i = indexFor(e.hash, newCapacity); // !! recompute each element's position in the array
                e.next = newTable[i]; // mark [1]
                newTable[i] = e;      // put the element into the array
                e = next;             // move on to the next element in the Entry chain
            } while (e != null);
        }
    }
}
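To see what head insertion does to a chain, here is a single-threaded sketch of the same loop. TransferSketch and its Entry are simplified stand-ins invented for the example (they carry only a hash, no key or value), and the sample hashes 3, 7 and 5 are chosen by hand.

```java
import java.util.ArrayList;
import java.util.List;

class TransferSketch {
    // Simplified stand-in for the JDK 1.7 Entry: hash and next pointer only.
    static final class Entry {
        final int hash;
        Entry next;

        Entry(int hash, Entry next) {
            this.hash = hash;
            this.next = next;
        }
    }

    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    // Same head-insertion loop as transfer() above, returning the new table.
    static Entry[] transfer(Entry[] src, int newCapacity) {
        Entry[] newTable = new Entry[newCapacity];
        for (int j = 0; j < src.length; j++) {
            Entry e = src[j];
            src[j] = null;
            while (e != null) {
                Entry next = e.next;
                int i = indexFor(e.hash, newCapacity);
                e.next = newTable[i]; // new element points at the current head...
                newTable[i] = e;      // ...and becomes the head itself
                e = next;
            }
        }
        return newTable;
    }

    // Lists the hashes in bucket i from head to tail.
    static List<Integer> bucket(Entry[] table, int i) {
        List<Integer> out = new ArrayList<>();
        for (Entry e = table[i]; e != null; e = e.next) {
            out.add(e.hash);
        }
        return out;
    }

    public static void main(String[] args) {
        Entry[] old = new Entry[2];
        old[1] = new Entry(3, new Entry(7, new Entry(5, null))); // bucket 1: 3 -> 7 -> 5
        Entry[] grown = transfer(old, 4);
        System.out.println(bucket(grown, 3)); // [7, 3]  (order reversed)
        System.out.println(bucket(grown, 1)); // [5]
    }
}
```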

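Let's first look at the example from the Meituan blog.

In a single-threaded environment the resize completes normally, but notice that the order is reversed: key7 now comes before key3. This is the key point.

Now let's look at how multiple threads can end up with a circular linked list.

The circular list arises precisely because the chain is reversed during resizing.

JDK 1.8 uses head and tail variables so that the chain is no longer reversed, which removes the cause of the circular list: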

Node<K,V> loHead = null, loTail = null;
Node<K,V> hiHead = null, hiTail = null;

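With head and tail variables, elements are appended at the tail, so the chain is no longer reversed during resizing and the circular linked list problem goes away.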
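A simplified sketch of that tail-append split for a single bucket follows. SplitSketch and its Node are invented stand-ins that carry only a hash; the real resize() in JDK 1.8 also handles tree bins and places the two lists into the new table at index j and j + oldCap.

```java
import java.util.ArrayList;
import java.util.List;

class SplitSketch {
    // Simplified stand-in for HashMap.Node: hash and next pointer only.
    static final class Node {
        final int hash;
        Node next;

        Node(int hash) {
            this.hash = hash;
        }
    }

    // Builds a chain in the given order, e.g. chain(3, 7, 5) is 3 -> 7 -> 5.
    static Node chain(int... hashes) {
        Node head = null, tail = null;
        for (int h : hashes) {
            Node n = new Node(h);
            if (tail == null) head = n; else tail.next = n;
            tail = n;
        }
        return head;
    }

    // Splits one bucket into a "lo" list (stays at the same index) and a "hi" list
    // (moves to index + oldCap), appending at the tail so order is preserved.
    // Returns {loHead, hiHead}.
    static Node[] split(Node head, int oldCap) {
        Node loHead = null, loTail = null;
        Node hiHead = null, hiTail = null;
        Node next;
        for (Node e = head; e != null; e = next) {
            next = e.next;
            if ((e.hash & oldCap) == 0) {
                if (loTail == null) loHead = e; else loTail.next = e;
                loTail = e;
            } else {
                if (hiTail == null) hiHead = e; else hiTail.next = e;
                hiTail = e;
            }
        }
        if (loTail != null) loTail.next = null;
        if (hiTail != null) hiTail.next = null;
        return new Node[] {loHead, hiHead};
    }

    static List<Integer> toList(Node head) {
        List<Integer> out = new ArrayList<>();
        for (Node e = head; e != null; e = e.next) {
            out.add(e.hash);
        }
        return out;
    }

    public static void main(String[] args) {
        Node[] parts = split(chain(3, 7, 5), 2); // old table size 2, bucket 1
        System.out.println(toList(parts[0])); // [5]     stays in bucket 1
        System.out.println(toList(parts[1])); // [3, 7]  moves to bucket 3, order kept
    }
}
```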

Either way, data can still be lost under concurrent access. In the example above, for instance, key5 was lost.

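Hashtable

A legacy class. Much of its functionality is similar to HashMap, and it is thread-safe, but only one thread can write to a Hashtable at any moment. Its concurrency is worse than ConcurrentHashMap, which uses segment locking (in JDK 1.7; JDK 1.8 switched to CAS plus synchronized on individual buckets). Not recommended.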

LinkedHashMap

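LinkedHashMap extends HashMap and maintains a doubly linked list on top of it, which solves HashMap's inability to keep iteration order consistent with insertion order.

It overrides HashMap's newNode method.

It also overrides afterNodeInsertion, which is an empty method in HashMap: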

void afterNodeInsertion(boolean evict) { // possibly remove eldest
    LinkedHashMap.Entry<K,V> first;
    if (evict && (first = head) != null && removeEldestEntry(first)) {
        K key = first.key;
        removeNode(hash(key), key, null, false, true);
    }
}

removeEldestEntry returns false in LinkedHashMap. We can override it to implement an LRU queue.

/**
 * The iteration ordering method for this linked hash map: <tt>true</tt>
 * for access-order, <tt>false</tt> for insertion-order.
 *
 * @serial
 */
final boolean accessOrder;

It defaults to false and controls the order used during iteration: insertion order when false, access order when true.
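Putting removeEldestEntry and accessOrder together gives a small LRU cache. This is only a sketch: LruCache is an invented name, it is not thread-safe, and the initial capacity and load factor are simply the usual defaults. The three-argument LinkedHashMap constructor and removeEldestEntry are standard JDK API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    LruCache(int capacity) {
        super(16, 0.75f, true); // accessOrder = true: get() moves an entry to the end
        this.capacity = capacity;
    }

    // Called by afterNodeInsertion; returning true evicts the eldest (least recently used) entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;
    }

    public static void main(String[] args) {
        LruCache<String, Integer> cache = new LruCache<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");    // "a" is now the most recently used
        cache.put("c", 3); // evicts "b"
        System.out.println(cache.keySet()); // [a, c]
    }
}
```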

TreeMap

static final class Entry<K,V> implements Map.Entry<K,V> {
    K key;
    V value;
    Entry<K,V> left;
    Entry<K,V> right;
    Entry<K,V> parent;
    boolean color = BLACK;
    // ... (constructor and methods omitted)
}

TreeMap is backed by a red-black tree.
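The visible consequence of the tree is that keys are always kept sorted, which also makes neighbour lookups cheap. A short sketch with an invented class name and hand-picked sample data:

```java
import java.util.TreeMap;

class TreeMapDemo {
    // Inserts keys out of order; the tree keeps them sorted.
    static TreeMap<Integer, String> sample() {
        TreeMap<Integer, String> map = new TreeMap<>();
        map.put(5, "five");
        map.put(1, "one");
        map.put(3, "three");
        return map;
    }

    public static void main(String[] args) {
        TreeMap<Integer, String> map = sample();
        System.out.println(map.keySet());      // [1, 3, 5]
        System.out.println(map.firstKey());    // 1
        System.out.println(map.floorKey(4));   // 3 (greatest key <= 4)
        System.out.println(map.ceilingKey(4)); // 5 (smallest key >= 4)
    }
}
```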

Set

Nothing much to say here.

Queue

PriorityQueue

A min-heap by default. For an implementation of heap sort, see the article "Eight Common Sorting Algorithms".

public boolean offer(E e) {
    if (e == null)
        throw new NullPointerException();
    modCount++;
    int i = size;
    if (i >= queue.length)
        grow(i + 1);
    size = i + 1;
    if (i == 0)
        queue[0] = e;
    else
        siftUp(i, e);
    return true;
}

public boolean add(E e) {
    return offer(e);
}
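A quick sketch of the min-heap default, and of turning it into a max-heap by passing a comparator. HeapOrderDemo is an invented name; PriorityQueue(Comparator) and Comparator.reverseOrder() are standard since Java 8.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

class HeapOrderDemo {
    // Polls until empty; poll() always returns the current head of the heap.
    static List<Integer> drain(PriorityQueue<Integer> pq) {
        List<Integer> out = new ArrayList<>();
        while (!pq.isEmpty()) {
            out.add(pq.poll());
        }
        return out;
    }

    public static void main(String[] args) {
        PriorityQueue<Integer> minHeap = new PriorityQueue<>();
        minHeap.addAll(Arrays.asList(5, 1, 3));
        System.out.println(drain(minHeap)); // [1, 3, 5]

        PriorityQueue<Integer> maxHeap = new PriorityQueue<>(Comparator.reverseOrder());
        maxHeap.addAll(Arrays.asList(5, 1, 3));
        System.out.println(drain(maxHeap)); // [5, 3, 1]
    }
}
```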

I strongly recommend the Meituan article on HashMap that this post draws on:
https://tech.meituan.com/2016...
