Unveiling ThreadLocal

Date: 2019-12-06

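Back when I was using C#, I spent quite a while studying its ThreadLocal, as well as LogicalCallContext (ExecutionContext), which can flow across threads. Unfortunately C# was not open source at the time (thankfully there is now .NET Core), so all I could do was hunt for documentation and blog posts. After switching to Java, I finally got to know another way of investigating a problem: rather than only looking things up, I could read and debug the code itself. After that, none of it seemed mysterious anymore.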

Purpose and Core Principles

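In my view, ThreadLocal mainly provides two things:

Convenient parameter passing. It offers a handy "shelf": store a value whenever you like and fetch it when needed, without threading a pile of parameters through every layer of method calls. (We tend to put shared data on the shelf.)
Thread isolation. Each thread's value is independent of every other thread's, which hides the headaches of multithreading.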
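To make the two points above concrete, here is a minimal sketch. The names UserContextDemo, CURRENT_USER and greet() are made up for illustration. Two threads store different users on the same static "shelf", and a method that takes no parameters still sees the right user for its own thread.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class UserContextDemo {
    // The "shelf": a single static ThreadLocal, but each thread sees its own value.
    private static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    // Reads the user from the shelf instead of receiving it as a parameter.
    static String greet() {
        return "hello, " + CURRENT_USER.get();
    }

    // Runs two threads that each set their own user and then call greet().
    static Map<String, String> runTwoThreads() throws InterruptedException {
        Map<String, String> results = new ConcurrentHashMap<>();
        Thread a = new Thread(() -> {
            CURRENT_USER.set("alice");
            results.put("a", greet());
        });
        Thread b = new Thread(() -> {
            CURRENT_USER.set("bob");
            results.put("b", greet());
        });
        a.start();
        b.start();
        a.join();
        b.join();
        return results;
    }

    public static void main(String[] args) throws InterruptedException {
        Map<String, String> results = runTwoThreads();
        System.out.println(results.get("a")); // hello, alice
        System.out.println(results.get("b")); // hello, bob
    }
}
```

Neither thread can observe the other's value, even though both go through the same static field.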

This class provides thread-local variables. These variables differ from their normal counterparts in that each thread that accesses one (via its get or set method) has its own, independently initialized copy of the variable. ThreadLocal instances are typically private static fields in classes that wish to associate state with a thread (e.g., a user ID or Transaction ID).

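The Javadoc comment says it all. ThreadLocal is best understood as a "thread-local variable", in contrast to an ordinary variable. A ThreadLocal is usually a static field, yet the value returned by get() is independent in each thread.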

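The core methods of ThreadLocal:

get() returns the variable's value. If this ThreadLocal has been set in the current thread, that value is returned; otherwise initialValue() is invoked indirectly (via setInitialValue()) to initialize the variable for the current thread, and the initial value is returned.
set() sets the variable's value in the current thread.
protected initialValue() is the initialization hook. The default implementation returns null.
remove() removes the variable's value from the current thread.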
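The four methods above can be exercised in a few lines. This is a small sketch; the class name CoreMethodsDemo and the COUNTER field are illustrative only.

```java
import java.util.Arrays;

class CoreMethodsDemo {
    // Override initialValue() so that get() never returns null for this variable.
    private static final ThreadLocal<Integer> COUNTER = new ThreadLocal<Integer>() {
        @Override
        protected Integer initialValue() {
            return 0;
        }
    };

    static int[] exercise() {
        int first = COUNTER.get();   // nothing set yet, so initialValue() supplies 0
        COUNTER.set(first + 5);      // store a value for the current thread
        int second = COUNTER.get();  // reads back the value that was just set
        COUNTER.remove();            // drop the current thread's entry
        int third = COUNTER.get();   // entry is gone, so initialValue() runs again
        COUNTER.remove();            // leave the thread clean
        return new int[] {first, second, third};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(exercise())); // [0, 5, 0]
    }
}
```

Note that get() after remove() does not fail; it simply re-initializes the variable for the current thread.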

How It Works in Brief

Every thread has a threadLocals field of type ThreadLocalMap. ThreadLocalMap behaves like a Map whose key is the ThreadLocal itself and whose value is the value we set.

public class Thread implements Runnable {
    ThreadLocal.ThreadLocalMap threadLocals = null;
}

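When we call threadLocal.set("xxx"), a key-value pair is put into the current thread's threadLocals map: the key is the ThreadLocal instance (this), not the thread, and the value is whatever you set.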

public void set(T value) {
    Thread t = Thread.currentThread();
    ThreadLocalMap map = getMap(t);
    if (map != null)
        map.set(this, value);
    else
        createMap(t, value);
}

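When we call threadLocal.get(), the current thread's map is looked up first, and the ThreadLocal instance (this) is then used as the key to fetch the value set in this thread.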

public T get() {
    Thread t = Thread.currentThread();
    ThreadLocalMap map = getMap(t);
    if (map != null) {
        ThreadLocalMap.Entry e = map.getEntry(this);
        if (e != null) {
            @SuppressWarnings("unchecked")
            T result = (T)e.value;
            return result;
        }
    }
    return setInitialValue();
}

The Core: ThreadLocalMap

ThreadLocalMap is a customized hash map suitable only for maintaining thread local values. To help deal with very large and long-lived usages, the hash table entries use WeakReferences for keys. However, since reference queues are not used, stale entries are guaranteed to be removed only when the table starts running out of space.

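ThreadLocalMap is a customized hash map that resolves collisions with open addressing (linear probing).

Its Entry is a WeakReference, or more precisely, a subclass of WeakReference.
The ThreadLocal object is passed to the WeakReference as its referent, so entry.get() serves as the map key, while Entry adds a value field that holds the actual value of the thread-local variable.
Because the key is weakly referenced, once the ThreadLocal object has no strong references left, GC reclaims it and the Entry's key, entry.get(), becomes null. But GC only reclaims the referent: the Entry itself is still referenced by the thread's ThreadLocalMap, so it is not reclaimed, and neither is the value object it holds. Unless the thread exits and its whole ThreadLocalMap is released, the value's memory cannot be freed. A memory leak!
The JDK authors naturally thought of this, so many ThreadLocalMap methods call cleanup routines such as expungeStaleEntries() to clear entries whose entry.get() == null and release their values. As long as the thread keeps using other ThreadLocals, the memory held for stale ThreadLocals gets cleaned up along the way.
However, in most of our use cases the ThreadLocal is a static field, so a strong reference to the ThreadLocal always exists and the key of its entry in each thread's ThreadLocalMap never becomes null. The entry is therefore never considered stale, expungeStaleEntries() cannot help, and the value's memory is not released. So once we are really done with a ThreadLocal, we can call remove() ourselves to delete the entry.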

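But is calling remove() really necessary? Our typical scenario is a server whose threads keep handling requests. Each incoming request assigns a new value to the ThreadLocal in some thread, and the previous value object then loses its reference and is reclaimed by GC. So with a static ThreadLocal that is never set to null, retained memory is bounded at one value per thread and does not grow. In that sense there is no leak!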
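Memory aside, there is one more practical reason to consider remove() on reused threads: a value left behind by one task stays visible to the next task that runs on the same worker. The sketch below illustrates this with a single-threaded pool. PooledThreadCleanupDemo and REQUEST_USER are hypothetical names, and the try/finally shape is one common way to clean up, not the only one.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class PooledThreadCleanupDemo {
    private static final ThreadLocal<String> REQUEST_USER = new ThreadLocal<>();

    static String[] run() throws Exception {
        // A single worker thread, so every task below runs on the same thread.
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Callable<String> read = REQUEST_USER::get;

            // Task 1 sets a value and never removes it.
            pool.submit(() -> REQUEST_USER.set("alice")).get();
            // Task 2 runs on the same worker and still sees the leftover value.
            String leftover = pool.submit(read).get();

            // Task 3 sets a value but cleans up in finally.
            pool.submit(() -> {
                REQUEST_USER.set("bob");
                try {
                    // ... handle the request here ...
                } finally {
                    REQUEST_USER.remove();
                }
            }).get();
            // Task 4 finds nothing left behind.
            String afterCleanup = pool.submit(read).get();

            return new String[] {leftover, afterCleanup};
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        String[] r = run();
        System.out.println(r[0]); // alice
        System.out.println(r[1]); // null
    }
}
```

If every request is guaranteed to call set() before any get(), the leftover value is harmless; the finally block matters when that guarantee does not hold.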

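Passing Values Across Threads

A ThreadLocal cannot be passed across threads. That is the whole point of thread isolation! But in some scenarios we do want to pass it along. For example:

Starting a new thread to run a method, while wanting the new thread to read the current thread's context (e.g., User ID, Transaction ID) through a ThreadLocal.
Submitting a task to a thread pool, while wanting the thread that eventually runs the task to inherit the current thread's ThreadLocal values so that it can use the current context.

Let's look at the available approaches.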

InheritableThreadLocal

How it works: the InheritableThreadLocal class extends ThreadLocal and overrides three methods.

public class InheritableThreadLocal<T> extends ThreadLocal<T> {
    // can be ignored
    protected T childValue(T parentValue) {
        return parentValue;
    }

    /**
     * Get the map associated with a ThreadLocal.
     *
     * @param t the current thread
     */
    ThreadLocalMap getMap(Thread t) {
       return t.inheritableThreadLocals;
    }

    /**
     * Create the map associated with a ThreadLocal.
     *
     * @param t the current thread
     * @param firstValue value for the initial entry of the table.
     */
    void createMap(Thread t, T firstValue) {
        t.inheritableThreadLocals = new ThreadLocalMap(this, firstValue);
    }
}

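As you can see, with InheritableThreadLocal the map lives in the thread's inheritableThreadLocals field rather than the threadLocals field used before.

And since the field is called inheritable, it is naturally passed on when a new thread is created. The code is in Thread's init() method: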

if (inheritThreadLocals && parent.inheritableThreadLocals != null)
    this.inheritableThreadLocals =
        ThreadLocal.createInheritedMap(parent.inheritableThreadLocals);

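Up to this point, inheritableThreadLocals lets a parent thread pass its ThreadLocal values to a child thread at the moment the child is created, and this already covers most needs [1]. But a serious problem shows up when threads are reused [2], for example when InheritableThreadLocal is used to pass values through a thread pool: values are copied only when a new thread is created, and reusing an existing thread does not trigger any copy.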
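The difference between a freshly created thread and a reused pool thread can be observed directly. The following is a minimal sketch with made-up names (InheritableDemo, CONTEXT); it assumes the pool's default thread factory, which creates workers with the ordinary Thread constructor and therefore inherits values from the submitting thread.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class InheritableDemo {
    private static final InheritableThreadLocal<String> CONTEXT = new InheritableThreadLocal<>();

    static String[] run() throws Exception {
        CONTEXT.set("request-1");

        // Case 1: a brand-new child thread copies the parent's value at creation time.
        String[] seenByChild = new String[1];
        Thread child = new Thread(() -> seenByChild[0] = CONTEXT.get());
        child.start();
        child.join();

        // Case 2: a pool worker is created on the first submit and copies "request-1" then.
        ExecutorService pool = Executors.newFixedThreadPool(1);
        try {
            Callable<String> read = CONTEXT::get;
            pool.submit(read).get();

            // The submitting thread moves on to a new context...
            CONTEXT.set("request-2");
            // ...but the reused worker still holds the value copied at its creation.
            String seenByReusedWorker = pool.submit(read).get();

            return new String[] {seenByChild[0], seenByReusedWorker};
        } finally {
            pool.shutdown();
            CONTEXT.remove();
        }
    }

    public static void main(String[] args) throws Exception {
        String[] r = run();
        System.out.println(r[0]); // request-1
        System.out.println(r[1]); // request-1, not request-2
    }
}
```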

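At this point the JDK can do no more. C# offers LogicalCallContext (and the ExecutionContext mechanism) to solve this; in Java you have to extend the threading classes yourself to get this capability.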

Alibaba's Open-Source transmittable-thread-local

GitHub repository.

transmittable-thread-local can be used in three ways (the decorator pattern!):

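Decorating Runnable and Callable
Decorating the thread pool
Using a Java Agent to decorate (modify at run time) the JDK thread pool implementation classes.

The official documentation explains the usage very clearly.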

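A brief look at how it works:

To solve ThreadLocal propagation with thread pools, the ThreadLocal values current at task submission must be carried over to the thread that runs the task.
As for how, the natural approach is to **capture all ThreadLocals of the current thread before submitting the task, store them, and then replay the captured ThreadLocals in the target thread when the task actually runs**.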
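The capture-then-replay idea can be sketched in plain Java without the library. The following is not TTL's implementation: ContextRunnable and CONTEXT are hypothetical names, and it handles a single known ThreadLocal rather than all registered ones. It also restores the worker's previous value afterwards so that the pooled thread is left as it was found.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ContextRunnable implements Runnable {
    // A plain (non-inheritable) ThreadLocal standing in for "the context".
    static final ThreadLocal<String> CONTEXT = new ThreadLocal<>();

    private final String captured;   // snapshot taken on the submitting thread
    private final Runnable delegate;

    ContextRunnable(Runnable delegate) {
        this.captured = CONTEXT.get(); // capture: runs where the wrapper is constructed
        this.delegate = delegate;
    }

    @Override
    public void run() {
        String backup = CONTEXT.get(); // remember what the worker thread had
        CONTEXT.set(captured);         // replay the snapshot in the worker thread
        try {
            delegate.run();
        } finally {
            // restore; this sketch treats null as "no value was set"
            if (backup == null) {
                CONTEXT.remove();
            } else {
                CONTEXT.set(backup);
            }
        }
    }

    static String demo() throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            pool.submit(() -> { }).get(); // make sure the worker thread already exists
            CONTEXT.set("request-42");
            String[] seen = new String[1];
            pool.submit(new ContextRunnable(() -> seen[0] = CONTEXT.get())).get();
            return seen[0];
        } finally {
            pool.shutdown();
            CONTEXT.remove();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo()); // request-42
    }
}
```

The important detail is where each step runs: capture happens in the constructor on the submitting thread, while replay and restore happen inside run() on the worker thread.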

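At the code level, taking the Runnable decorator as an example:

When a TtlRunnable is created, capture() is always called first to capture the ThreadLocals of the current thread. (The snippet below shows the TtlCallable constructor, which follows the same pattern.)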

private TtlCallable(@Nonnull Callable<V> callable, boolean releaseTtlValueReferenceAfterCall) {
    this.capturedRef = new AtomicReference<Object>(capture());
    ...
}

capture() is a static method of the Transmitter class:

public static Object capture() {
    Map<TransmittableThreadLocal<?>, Object> captured = new HashMap<TransmittableThreadLocal<?>, Object>();
    for (TransmittableThreadLocal<?> threadLocal : holder.get().keySet()) {
        captured.put(threadLocal, threadLocal.copyValue());
    }
    return captured;
}

In run(), the previously captured ThreadLocals are replayed first, and the original values are restored once the task finishes.

public void run() {
    Object captured = capturedRef.get();
    ...
    Object backup = replay(captured);
    try {
        runnable.run();
    } finally {
        restore(backup); 
    }
}

[Figure: sequence diagram]

Applications

Spring MVC's static holder class RequestContextHolder: getRequestAttributes() returns the RequestAttributes bound to the current thread, which is held in an InheritableThreadLocal<RequestAttributes> when the inheritable flag is set. This also implies that the value can be passed on to threads created by the current thread, but not to threads that already exist.

As for when it gets set, see its Javadoc: Holder class to expose the web request in the form of a thread-bound RequestAttributes object. The request will be inherited by any child threads spawned by the current thread if the inheritable flag is set to true. Use RequestContextListener or org.springframework.web.filter.RequestContextFilter to expose the current web request. Note that org.springframework.web.servlet.DispatcherServlet already exposes the current request by default.
Database connections in Spring, and sessions in Hibernate.
Several use cases summarized by Alibaba's TTL.

...

Some Pitfalls

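Recommended reading: 談談ThreadLocal的設計及不足 (On the Design and Shortcomings of ThreadLocal)

It raises two questions that any ThreadLocal design has to consider, and explains how ThreadLocal answers them.
1. When a Thread exits, how are resources released so that memory does not leak?
2. The map data may be accessed by multiple threads and be subject to contention, so concurrent synchronization has to be considered.
It also covers the memory leak where a ThreadLocal has been garbage-collected but its associated Entry remains, which Lucene has solved:
1. When reading ThreadLocal I also wondered: since the Entry's key is a WeakReference, why not make the value a WeakReference as well? Wouldn't that remove the leak?
2. On second thought, if the value were weakly referenced, there would be no guarantee it is still there when needed, because GC could reclaim it.
3. To address this, Lucene additionally keeps a WeakHashMap<Thread, T>, so the value is not cleared as long as the thread is alive.
4. That in turn brings back multithreaded access, which requires locking.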
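The Lucene approach described above can be sketched roughly as follows. This is a simplified illustration, not Lucene's actual code: the class name HardRefThreadLocal is made up, and details such as periodic purging of dead threads are omitted. The value is reachable weakly through the ThreadLocal and strongly through a WeakHashMap keyed by thread, and access to that shared map is synchronized.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.WeakHashMap;

class HardRefThreadLocal<T> {
    // The per-thread slot only holds a weak reference to the value.
    private final ThreadLocal<WeakReference<T>> local = new ThreadLocal<>();
    // Strong references to the values, keyed weakly by thread: an entry keeps
    // its value alive while the thread is alive and can go away once it dies.
    private final Map<Thread, T> hardRefs = new WeakHashMap<>();

    public void set(T value) {
        local.set(new WeakReference<>(value));
        synchronized (hardRefs) { // shared across threads, so it needs a lock
            hardRefs.put(Thread.currentThread(), value);
        }
    }

    public T get() {
        WeakReference<T> ref = local.get();
        return ref == null ? null : ref.get();
    }

    // Drops every strong reference at once, so values held for other threads
    // become collectable without waiting for those threads to exit.
    public void close() {
        synchronized (hardRefs) {
            hardRefs.clear();
        }
        local.remove();
    }

    public static void main(String[] args) {
        HardRefThreadLocal<String> tl = new HardRefThreadLocal<>();
        tl.set("value");
        System.out.println(tl.get()); // value
        tl.close();
        System.out.println(tl.get()); // null
    }
}
```

The cost is visible in set() and close(): unlike a plain ThreadLocal, every write touches a structure shared by all threads, which is the locking trade-off mentioned in the last point.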