Flask source code analysis: contexts

Date: 2019-11-09


This is one article in a series analyzing the Flask source code.

Contexts (application context and request context)

Context has always been one of the harder concepts to grasp in computing. An answer to a question on Zhihu explains it in plain terms:

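Every piece of a program has many external variables. Only simple functions like Add have none. Once a piece of your program has external variables, it is incomplete and cannot run on its own. To make it run, you have to supply a value for each of those external variables, one by one. The set of these values is called the context.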
-- vzchapp

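For example, in Flask a view function needs to know about the request it is handling (the request URL, parameters, method, and so on) and about the application (such as the database initialized in the app) in order to run correctly.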

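The most direct approach is to wrap this information in an object and pass it to the view function as an argument. But then every view function would have to declare the corresponding parameter, even if the function never uses it.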

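Flask's approach is to expose this information as something resembling global variables; when a view function needs it, it can obtain it with from flask import request. Unlike ordinary globals, however, these objects must be dynamic: with multiple threads or coroutines, each thread or coroutine has to get its own object so that they do not interfere with one another.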

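How is this achieved? If you are familiar with Python multithreading, you will know a very similar concept, threading.local, which lets each thread see only its own data when accessing a variable. Conceptually the principle is simple: the object keeps a dictionary that maps thread ids to data, and on every read it dynamically looks up the data for the current thread id. Flask's contexts are implemented in a similar way, as explained in detail below.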
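To make the threading.local behaviour concrete, here is a minimal sketch using only the standard library. The names store, worker and results are made up for this illustration.

```python
import threading

store = threading.local()   # one object, but a separate namespace per thread
results = {}

def worker(name):
    store.value = name          # visible only inside this thread
    results[name] = store.value

threads = [threading.Thread(target=worker, args=(n,)) for n in ("a", "b")]
for t in threads:
    t.start()
for t in threads:
    t.join()

# The main thread never assigned store.value, so it does not see one.
main_sees_value = hasattr(store, "value")
```

Each worker reads back exactly what it wrote, while the main thread sees no value attribute at all.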

Flask has two kinds of context: the application context and the request context. The context-related code is defined in globals.py, and the file is very short:

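from functools import partial

from werkzeug.local import LocalStack, LocalProxy

# _request_ctx_err_msg and _app_ctx_err_msg are error-message strings
# defined near the top of globals.py; they are omitted from this excerpt.


def _lookup_req_object(name):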
    top = _request_ctx_stack.top
    if top is None:
        raise RuntimeError(_request_ctx_err_msg)
    return getattr(top, name)


def _lookup_app_object(name):
    top = _app_ctx_stack.top
    if top is None:
        raise RuntimeError(_app_ctx_err_msg)
    return getattr(top, name)


def _find_app():
    top = _app_ctx_stack.top
    if top is None:
        raise RuntimeError(_app_ctx_err_msg)
    return top.app


# context locals
_request_ctx_stack = LocalStack()
_app_ctx_stack = LocalStack()
current_app = LocalProxy(_find_app)
request = LocalProxy(partial(_lookup_req_object, 'request'))
session = LocalProxy(partial(_lookup_req_object, 'session'))
g = LocalProxy(partial(_lookup_app_object, 'g'))

Flask provides two contexts: the application context and the request context. The application context gives rise to two variables, current_app and g, while the request context gives rise to request and session.

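The implementation relies on two things: LocalStack and LocalProxy. Together they let us fetch the contents of the two contexts dynamically, so that in a concurrent program every view function sees its own context and nothing gets mixed up.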

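Both LocalStack and LocalProxy are provided by werkzeug and defined in local.py. Before analyzing these two classes, we first introduce another basic class from that file, Local. Local achieves an effect similar to threading.local: it isolates global variables across threads or coroutines. Here is its code: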

# since each thread has its own greenlet we can just use those as identifiers
# for the context.  If greenlets are not available we fall back to the
# current thread ident depending on where it is.
try:
    from greenlet import getcurrent as get_ident
except ImportError:
    try:
        from thread import get_ident
    except ImportError:
        from _thread import get_ident

class Local(object):
    __slots__ = ('__storage__', '__ident_func__')

    def __init__(self):
        # Data lives in __storage__; all later accesses operate on this attribute
        object.__setattr__(self, '__storage__', {})
        object.__setattr__(self, '__ident_func__', get_ident)

    def __call__(self, proxy):
        """Create a proxy for a name."""
        return LocalProxy(self, proxy)

    # Clear all data stored for the current thread/coroutine
    def __release_local__(self):
        self.__storage__.pop(self.__ident_func__(), None)

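    # The next three methods implement attribute get, set and delete.
    # Each one calls `self.__ident_func__` to get the id of the current thread
    # or coroutine, then accesses the inner dictionary stored for that id.
    # Getting or deleting a missing attribute raises AttributeError.
    # To the caller this looks like plain attribute access on the instance;
    # the dictionary and the thread/coroutine switching stay hidden.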
    def __getattr__(self, name):
        try:
            return self.__storage__[self.__ident_func__()][name]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        ident = self.__ident_func__()
        storage = self.__storage__
        try:
            storage[ident][name] = value
        except KeyError:
            storage[ident] = {name: value}

    def __delattr__(self, name):
        try:
            del self.__storage__[self.__ident_func__()][name]
        except KeyError:
            raise AttributeError(name)

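As you can see, all data in a Local object is kept in the __storage__ attribute, which is a nested dictionary: map[ident]map[key]value. The key of the outer dictionary is the identity of the thread or coroutine, and its value is another dictionary that holds the user-defined key-value pairs. When the user accesses an attribute of the instance, the access becomes a lookup in the inner dictionary, and the outer key is supplied automatically. __ident_func__ is either greenlet's getcurrent or the thread module's get_ident, so it yields the id of the thread or coroutine the current code is running in.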

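Besides these basic operations, Local also implements __release_local__, which clears (tears down) the data (state) of the current thread or coroutine. The __call__ operation creates a LocalProxy object; LocalProxy is covered below.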
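The nested-dictionary layout is easier to see in a small experiment. The sketch below is not werkzeug's code: MiniLocal is a hypothetical, threads-only simplification (no greenlet support, no __slots__), and the _storage and release names are invented for this illustration.

```python
import threading

class MiniLocal:
    """Simplified stand-in for werkzeug's Local (threads only)."""

    def __init__(self):
        # Bypass our own __setattr__ so _storage lands on the instance itself.
        object.__setattr__(self, "_storage", {})

    def __getattr__(self, name):
        try:
            return self._storage[threading.get_ident()][name]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        self._storage.setdefault(threading.get_ident(), {})[name] = value

    def release(self):
        # Drop everything stored for the current thread.
        self._storage.pop(threading.get_ident(), None)


local = MiniLocal()
local.user = "main"
seen = {}

def worker():
    seen["before"] = hasattr(local, "user")   # the main thread's value is invisible here
    local.user = "worker"
    seen["after"] = local.user

t = threading.Thread(target=worker)
t.start()
t.join()

idents_stored = len(local._storage)   # one inner dict per thread that wrote
main_value = local.user
local.release()
after_release = hasattr(local, "user")
```

The outer dictionary ends up with one entry per thread that wrote to the object, and releasing only removes the entry of the calling thread.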

With Local understood, let's return to the other two classes.

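LocalStack is a stack structure built on top of Local. If Local provides attribute access that is isolated per thread or coroutine, then LocalStack provides stack access with the same isolation. Below is its implementation; it offers the push and pop methods and the top property.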

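__release_local__ clears the stack data of the current thread or coroutine, and __call__ returns a proxy object for the top element of the current thread's or coroutine's stack.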

class LocalStack(object):
    """This class works similar to a :class:`Local` but keeps a stack
    of objects instead. """

    def __init__(self):
        self._local = Local()

    def __release_local__(self):
        self._local.__release_local__()

    def __call__(self):
        def _lookup():
            rv = self.top
            if rv is None:
                raise RuntimeError('object unbound')
            return rv
        return LocalProxy(_lookup)

    # push, pop and top implement the stack operations;
    # note that the stack data is stored in the self._local.stack attribute
    def push(self, obj):
        """Pushes a new item to the stack"""
        rv = getattr(self._local, 'stack', None)
        if rv is None:
            self._local.stack = rv = []
        rv.append(obj)
        return rv

    def pop(self):
        """Removes the topmost item from the stack, will return the
        old value or `None` if the stack was already empty.
        """
        stack = getattr(self._local, 'stack', None)
        if stack is None:
            return None
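        elif len(stack) == 1:
            # release_local() is a module-level helper in local.py that
            # simply calls self._local.__release_local__()
            release_local(self._local)
            return stack[-1]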
        else:
            return stack.pop()

    @property
    def top(self):
        """The topmost item on the stack.  If the stack is empty,
        `None` is returned.
        """
        try:
            return self._local.stack[-1]
        except (AttributeError, IndexError):
            return None
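As a rough illustration of what "isolated stack access" means, the following sketch builds a tiny stack on top of threading.local instead of werkzeug's Local. MiniLocalStack and the context strings are hypothetical names used only here.

```python
import threading

class MiniLocalStack:
    """Hypothetical per-thread stack, backed by threading.local."""

    def __init__(self):
        self._local = threading.local()

    def push(self, obj):
        stack = getattr(self._local, "stack", None)
        if stack is None:
            self._local.stack = stack = []
        stack.append(obj)
        return stack

    def pop(self):
        stack = getattr(self._local, "stack", None)
        if not stack:
            return None
        return stack.pop()

    @property
    def top(self):
        stack = getattr(self._local, "stack", None)
        return stack[-1] if stack else None


ctx_stack = MiniLocalStack()
ctx_stack.push("main-ctx")
seen = {}

def worker():
    seen["top_before"] = ctx_stack.top    # this thread's stack starts empty
    ctx_stack.push("worker-ctx")
    seen["top_after"] = ctx_stack.top
    seen["popped"] = ctx_stack.pop()

t = threading.Thread(target=worker)
t.start()
t.join()

main_top = ctx_stack.top   # untouched by the worker thread
```

Every thread works against its own list, so pushing and popping in the worker never disturbs what the main thread sees at the top.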

We saw earlier how the request context stack is defined; it is simply a LocalStack instance:

_request_ctx_stack = LocalStack()

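It keeps the request contexts of the current thread or coroutine on a stack, to be read back when they are needed. As for why a stack is used rather than Local directly, the answer comes later; you may want to think about it first.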

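LocalProxy is a proxy for an object bound to a Local; it forwards every operation performed on itself to the proxied object. Its constructor accepts either a Local (together with a name) or a callable. When a callable is given, calling it must return the object to be proxied, and all subsequent attribute operations are forwarded to whatever the callable returns.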

class LocalProxy(object):
    """Acts as a proxy for a werkzeug local.
    Forwards all operations to a proxied object. """
    __slots__ = ('__local', '__dict__', '__name__')

    def __init__(self, local, name=None):
        object.__setattr__(self, '_LocalProxy__local', local)
        object.__setattr__(self, '__name__', name)

    def _get_current_object(self):
        """Return the current object."""
        if not hasattr(self.__local, '__release_local__'):
            return self.__local()
        try:
            return getattr(self.__local, self.__name__)
        except AttributeError:
            raise RuntimeError('no object bound to %s' % self.__name__)

    @property
    def __dict__(self):
        try:
            return self._get_current_object().__dict__
        except RuntimeError:
            raise AttributeError('__dict__')

    def __getattr__(self, name):
        if name == '__members__':
            return dir(self._get_current_object())
        return getattr(self._get_current_object(), name)

    def __setitem__(self, key, value):
        self._get_current_object()[key] = value

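The key to the implementation is storing the Local instance (or callable) passed in as an argument in the __local attribute, and defining the _get_current_object() method to fetch the object that belongs to the current thread or coroutine.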

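NOTE: attributes whose names begin with two underscores are stored as _ClassName__variable. That is why the value set here through "_LocalProxy__local" can later be read via self.__local. For more on this, see the related question on Stack Overflow.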
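The name-mangling rule can be checked in a few lines. Holder, secret and reveal are hypothetical names chosen for this illustration.

```python
class Holder:
    def __init__(self, value):
        # Store under the mangled name explicitly, as LocalProxy.__init__ does.
        object.__setattr__(self, "_Holder__secret", value)

    def reveal(self):
        # Inside the class body, self.__secret is compiled to self._Holder__secret.
        return self.__secret


h = Holder(42)
via_method = h.reveal()
via_mangled_name = h._Holder__secret
# Outside the class no mangling happens, so the literal name "__secret" is absent.
has_plain_name = hasattr(h, "__secret")
```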

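LocalProxy then overrides all the magic methods (methods whose names start and end with two underscores), and each of them forwards the actual operation to the proxied object. Only a few magic methods are shown here; if you are interested, the source lists all of them.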
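The forwarding idea can be sketched without werkzeug. MiniProxy below is a hypothetical, heavily reduced proxy that handles only attribute access, indexing and len(); the real LocalProxy covers far more operations.

```python
class MiniProxy:
    """Hypothetical stripped-down proxy: forwards a few operations only."""

    def __init__(self, lookup):
        self._lookup = lookup   # callable returning the current target

    def __getattr__(self, name):
        # Only reached for names not found on the proxy itself.
        return getattr(self._lookup(), name)

    def __getitem__(self, key):
        return self._lookup()[key]

    def __len__(self):
        return len(self._lookup())


stack = []
current = MiniProxy(lambda: stack[-1])   # always resolves to the top of the stack

stack.append({"user": "alice"})
first = current["user"]

stack.append({"user": "bob", "role": "admin"})
second = current["user"]          # same proxy, different target now
size = len(current)
keys = sorted(current.keys())     # attribute access is forwarded too
```

The proxy object never changes, yet every operation is resolved against whatever the lookup callable returns at that moment, which is the property the request and session proxies depend on.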

Back to the implementation of the request context:

_request_ctx_stack = LocalStack()
request = LocalProxy(partial(_lookup_req_object, 'request'))
session = LocalProxy(partial(_lookup_req_object, 'session'))

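Looking at this code again, it should now make sense: _request_ctx_stack is a stack isolated per thread or coroutine, and every access to request calls _lookup_req_object, which takes the request context at the top of the stack and returns its request attribute.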

So when does the request context get placed on the stack? Remember these two lines from the wsgi_app() method introduced earlier?

ctx = self.request_context(environ)
ctx.push()

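Every time app.__call__ is invoked, the corresponding request information is pushed onto the stack, and it is popped again once the request has been handled.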
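The push-then-pop lifecycle can be sketched with plain Python objects. This is only an illustration of the pattern, not Flask's code: stack stands in for _request_ctx_stack, each context is just a dict, and handle, view and broken_view are hypothetical names.

```python
stack = []     # stands in for _request_ctx_stack
events = []

def handle(environ, view):
    ctx = {"environ": environ}
    stack.append(ctx)                  # push before the view runs
    try:
        return view()
    finally:
        stack.pop()                    # pop even if the view raised
        events.append(("popped", environ["PATH_INFO"]))

def view():
    # The view takes no arguments; it reads the current request from the stack.
    return "handling " + stack[-1]["environ"]["PATH_INFO"]

def broken_view():
    raise ValueError("boom")

response = handle({"PATH_INFO": "/hello"}, view)

try:
    handle({"PATH_INFO": "/broken"}, broken_view)
except ValueError:
    pass
```

Popping in a finally block is what keeps the stack balanced when a view fails.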

Let's look at request_context; this method consists of a single line of code:

def request_context(self, environ):
    return RequestContext(self, environ)

It instantiates RequestContext, passing in self and the request's environ dictionary as arguments. Tracing RequestContext to its definition, we find it in ctx.py; the code is as follows:

class RequestContext(object):
    """The request context contains all request relevant information.  It is
    created at the beginning of the request and pushed to the
    `_request_ctx_stack` and removed at the end of it.  It will create the
    URL adapter and request object for the WSGI environment provided.
    """

    def __init__(self, app, environ, request=None):
        self.app = app
        if request is None:
            request = app.request_class(environ)
        self.request = request
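        self.url_adapter = app.create_url_adapter(self.request)
        # App contexts pushed implicitly by this request context (used by
        # push/pop below); other attributes of the original __init__ are omitted.
        self._implicit_app_ctx_stack = []
        self.match_request()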

    def match_request(self):
        """Can be overridden by a subclass to hook into the matching
        of the request.
        """
        try:
            url_rule, self.request.view_args = \
                self.url_adapter.match(return_rule=True)
            self.request.url_rule = url_rule
        except HTTPException as e:
            self.request.routing_exception = e

    def push(self):
        """Binds the request context to the current context."""
        # Before we push the request context we have to ensure that there
        # is an application context.
        app_ctx = _app_ctx_stack.top
        if app_ctx is None or app_ctx.app != self.app:
            app_ctx = self.app.app_context()
            app_ctx.push()
            self._implicit_app_ctx_stack.append(app_ctx)
        else:
            self._implicit_app_ctx_stack.append(None)

        _request_ctx_stack.push(self)

        self.session = self.app.open_session(self.request)
        if self.session is None:
            self.session = self.app.make_null_session()

    def pop(self, exc=_sentinel):
        """Pops the request context and unbinds it by doing that.  This will
        also trigger the execution of functions registered by the
        :meth:`~flask.Flask.teardown_request` decorator.
        """
        app_ctx = self._implicit_app_ctx_stack.pop()

        try:
            clear_request = False
            if not self._implicit_app_ctx_stack:
                self.app.do_teardown_request(exc)

                request_close = getattr(self.request, 'close', None)
                if request_close is not None:
                    request_close()
                clear_request = True
        finally:
            rv = _request_ctx_stack.pop()

            # get rid of circular dependencies at the end of the request
            # so that we don't require the GC to be active.
            if clear_request:
                rv.request.environ['werkzeug.request'] = None

            # Get rid of the app as well if necessary.
            if app_ctx is not None:
                app_ctx.pop(exc)

    def auto_pop(self, exc):
        if self.request.environ.get('flask._preserve_context') or \
           (exc is not None and self.app.preserve_context_on_exception):
            self.preserved = True
            self._preserved_exc = exc
        else:
            self.pop(exc)

    def __enter__(self):
        self.push()
        return self

    def __exit__(self, exc_type, exc_value, tb):
        self.auto_pop(exc_value)

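Each request context stores information about the current request, such as the request object and the app object. At the end of initialization it also calls match_request, which performs the route matching.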

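The push operation saves the request's ApplicationContext (a new app context is created if the top of _app_ctx_stack does not belong to the app handling this request) and the RequestContext onto their respective stacks, and after pushing it also stores the session information. pop does the reverse: it pops the request context and the application context and performs some cleanup.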
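The implicit handling of the application context is the subtle part of push and pop, so here is a reduced sketch of just that logic. The Sketch* classes and the plain lists app_stack and request_stack are hypothetical stand-ins; sessions, teardown callbacks and thread isolation are left out.

```python
app_stack = []       # stands in for _app_ctx_stack
request_stack = []   # stands in for _request_ctx_stack

class SketchAppContext:
    def __init__(self, app):
        self.app = app

    def push(self):
        app_stack.append(self)

    def pop(self):
        app_stack.pop()

class SketchRequestContext:
    def __init__(self, app, path):
        self.app = app
        self.path = path
        self._implicit_app_ctx_stack = []

    def push(self):
        top = app_stack[-1] if app_stack else None
        if top is None or top.app != self.app:
            # No matching app context on top: create and push one implicitly.
            app_ctx = SketchAppContext(self.app)
            app_ctx.push()
            self._implicit_app_ctx_stack.append(app_ctx)
        else:
            self._implicit_app_ctx_stack.append(None)
        request_stack.append(self)

    def pop(self):
        app_ctx = self._implicit_app_ctx_stack.pop()
        request_stack.pop()
        if app_ctx is not None:
            app_ctx.pop()   # only pop what this request context pushed itself

    def __enter__(self):
        self.push()
        return self

    def __exit__(self, exc_type, exc_value, tb):
        self.pop()


depths = []   # (app stack depth, request stack depth) at each step
with SketchRequestContext("app-a", "/outer"):
    depths.append((len(app_stack), len(request_stack)))
    with SketchRequestContext("app-a", "/inner"):
        depths.append((len(app_stack), len(request_stack)))   # app context reused
    with SketchRequestContext("app-b", "/other"):
        depths.append((len(app_stack), len(request_stack)))   # second app context
depths.append((len(app_stack), len(request_stack)))
```

A nested request for the same app reuses the application context already on top, while a request for a different app pushes a second one, and each request context pops only what it pushed.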

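At this point the implementation of contexts should be fairly clear: whenever a request comes in, Flask first creates the two important context objects that the current thread or coroutine needs and stores them on isolated stacks, so that the view function can read this information straight from the stacks while handling the request.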

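NOTE: because there is only one app instance, the application contexts of different requests all refer to the same app object, even though push normally creates a separate application context object for each request.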

By now the implementation and purpose of contexts have been largely covered. Two questions remain unanswered.

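Why are the request context and the application context kept separate? Doesn't every request have both of them anyway?
Why are the request context and the application context both implemented as stacks? Can a single request really have several request contexts or application contexts?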

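The answer to the first question is "flexibility"; the answer to the second is "multiple applications". In normal operation each request corresponds to one request context and one application context, but in tests or in a Python shell the user can create a request context or an application context on its own, and that flexibility supports different usage scenarios. A stack also makes redirects easier to implement, since a handler can read the information of the several requests along the redirect path from the stack. The application context is a stack for similar reasons: tests can push multiple contexts, and Flask also allows several applications to run side by side: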

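# In newer Werkzeug releases DispatcherMiddleware lives in
# werkzeug.middleware.dispatcher instead of werkzeug.wsgi.
from werkzeug.wsgi import DispatcherMiddleware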
from frontend_app import application as frontend
from backend_app import application as backend

application = DispatcherMiddleware(frontend, {
    '/backend':     backend
})

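This example uses werkzeug's DispatcherMiddleware to dispatch requests to multiple apps. In such a setup _app_ctx_stack can end up holding two application contexts at once, for instance when code running under one app's context pushes a context for the other app.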