A Brief Look at the Django Source: First Look at WSGI

Date: 2019-12-09

As a scripting language, Python is now widely used for web back-end development, and Python web application frameworks keep multiplying: Bottle, Django, Flask, and so on.

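When an HTTP request reaches the server, the server receives it and calls the web application, which parses the request, produces the response data, and hands it back to the server. Two parties are involved here: the server and the application. There has to be a contract that both of them follow, so that any server or application written against it is broadly interoperable. The situation resembles the early days of computer networking, when each major company had its own communication protocol; that only fragments the market, and a single shared protocol is preferable.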

For Python, that contract is WSGI (Python Web Server Gateway Interface). The details are specified in PEP 333.

I used Django throughout my internship; what follows are my notes on learning WSGI with Django as the example.

Application

On the application side, a callable like the following must be provided:

def simple_app(environ, start_response):
    """Possibly the simplest handler there is."""
    status = '200 OK'
    response_headers = [('Content-type', 'text/plain')]
    start_response(status, response_headers)
    return ['Hello world!\n']  # the return value must be iterable

Besides a function, any callable will do, for example an instance of a class that implements __call__.
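
To make the class-based variant concrete, here is a minimal sketch. The names HelloApp and application are my own, not from PEP 333, and the code assumes Python 3, where the response body has to be bytes rather than str.

```python
class HelloApp:
    """A WSGI application written as a class; its instances are callable."""

    def __init__(self, greeting="Hello world!\n"):
        self.greeting = greeting

    def __call__(self, environ, start_response):
        body = self.greeting.encode("utf-8")  # Python 3 WSGI bodies are bytes
        start_response("200 OK", [
            ("Content-Type", "text/plain; charset=utf-8"),
            ("Content-Length", str(len(body))),
        ])
        return [body]  # still an iterable, just like the function version


# The server is handed an instance and calls it exactly like a function.
application = HelloApp()
```

The class form is mostly useful when the application needs configuration or state that would otherwise live in module-level globals.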

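The server calls it. Here environ is a dictionary holding CGI-style environment variables such as REQUEST_METHOD, SCRIPT_NAME, and QUERY_STRING; start_response is a callback that simple_app invokes to begin the HTTP response. The prototype of start_response looks roughly like this: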

def start_response(status, response_headers, exc_info=None):
    ...
    return write  # write is returned only for compatibility with older web frameworks; new frameworks do not need it
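
As a side note on environ, the sketch below shows an application reading a few of those keys. echo_app is a name I made up for illustration, and the code assumes Python 3 (urllib.parse, bytes body).

```python
from urllib.parse import parse_qs


def echo_app(environ, start_response):
    """Report a few CGI-style keys that the server placed in environ."""
    method = environ.get("REQUEST_METHOD", "GET")
    path = environ.get("SCRIPT_NAME", "") + environ.get("PATH_INFO", "")
    query = parse_qs(environ.get("QUERY_STRING", ""))  # dict of name -> list of values

    lines = [
        "method = %s" % method,
        "path = %s" % path,
        "query = %r" % sorted(query.items()),
    ]
    body = "\n".join(lines).encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [body]
```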

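The parameters are: status, the status line (for example '200 OK'); response_headers, the HTTP headers, which can be modified; and exc_info, information about an error. An error may occur while the response data is being generated, in which case the HTTP headers need to be updated; this is done by calling start_response again, this time with exc_info. A more detailed implementation might therefore look like this: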

def start_response(status, response_headers, exc_info=None):
    if exc_info:
        try:
            # do stuff w/exc_info here
            pass
        finally:
            exc_info = None    # Avoid circular ref.
    return write
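
To fill in the "do stuff" part, here is one possible sketch of a server-side start_response, assuming Python 3. The state dict and the make_start_response helper are hypothetical names of mine; the rule it follows (replace the headers if nothing has been sent yet, otherwise re-raise the application's error) is the one the PEP describes.

```python
import sys


def make_start_response(state):
    """Return a start_response bound to one request's `state` dict (hypothetical helper)."""

    def write(data):
        # Legacy write() callable: the first write means headers are on the wire.
        state["headers_sent"] = True
        state["body"].append(data)

    def start_response(status, response_headers, exc_info=None):
        if exc_info:
            try:
                if state["headers_sent"]:
                    # Too late to replace the headers: re-raise the app's error.
                    raise exc_info[1].with_traceback(exc_info[2])
            finally:
                exc_info = None  # avoid a circular reference
        elif state["status"] is not None:
            raise AssertionError("start_response called twice without exc_info")
        state["status"] = status
        state["headers"] = list(response_headers)
        return write

    return start_response


# Usage: the error occurs before any output, so the headers can still be replaced.
state = {"status": None, "headers": [], "headers_sent": False, "body": []}
start_response = make_start_response(state)
start_response("200 OK", [("Content-Type", "text/plain")])
try:
    raise ValueError("boom")
except ValueError:
    start_response("500 Internal Server Error",
                   [("Content-Type", "text/plain")], sys.exc_info())
```

After the usage code runs, the recorded status is the 500 line, because no body data had been written when the error was reported.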

Server

On the server side, the simplest job imaginable is to call simple_app() and then send the data to the client:

result = simple_app(environ, start_response)  # the name need not be simple_app
try:
    for data in result:
        if data:    # don't send headers until body appears
            write(data)
    if not headers_sent:
        write('')   # send headers now if body was empty
finally:
    if hasattr(result, 'close'):
        result.close()
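
The fragment above leans on write and headers_sent being defined elsewhere. A self-contained version of the same loop, as a sketch assuming Python 3, could look like the following. run_once is a made-up helper that collects the response in memory instead of sending it to a client; setup_testing_defaults comes from wsgiref.util and fills environ with plausible test values.

```python
from wsgiref.util import setup_testing_defaults


def run_once(app):
    """Call a WSGI app once and return (status, headers, body) without network I/O."""
    environ = {}
    setup_testing_defaults(environ)  # REQUEST_METHOD, PATH_INFO, wsgi.* keys, ...

    response = {"status": None, "headers": None}
    chunks = []

    def start_response(status, response_headers, exc_info=None):
        response["status"] = status
        response["headers"] = response_headers
        return chunks.append  # stands in for the legacy write() callable

    result = app(environ, start_response)
    try:
        for data in result:
            if data:  # skip empty chunks
                chunks.append(data)
    finally:
        if hasattr(result, "close"):
            result.close()
    return response["status"], response["headers"], b"".join(chunks)


def simple_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello world!\n"]
```

Calling run_once(simple_app) returns the status line, the header list, and the joined body, which is handy for poking at an application without starting a server.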

Note that WSGI does not spell out how the web application and the server work internally; it only standardizes the interface between them.

The Python wsgiref module

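Now let's see how Django implements WSGI. Django ships with a small server that is convenient for local testing, so when you first start learning Django there is no need to set up an Apache or nginx server. Django's built-in server is based on the Python wsgiref module, which comes with its own test code: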

# demo_app() is the application
def demo_app(environ,start_response):
    from StringIO import StringIO
    stdout = StringIO()
    print >>stdout, "Hello world!"
    print >>stdout
    h = environ.items(); h.sort()
    for k,v in h:
        print >>stdout, k,'=', repr(v)
    start_response("200 OK", [('Content-Type','text/plain')])
    return [stdout.getvalue()]

def make_server(
    host, port, app, server_class=WSGIServer, handler_class=WSGIRequestHandler
):
    """Create a new WSGI server listening on `host` and `port` for `app`"""
    server = server_class((host, port), handler_class)
    server.set_app(app)
    return server

if __name__ == '__main__':
    httpd = make_server('', 8000, demo_app)
    sa = httpd.socket.getsockname()
    print "Serving HTTP on", sa[0], "port", sa[1], "..."
    import webbrowser
    webbrowser.open('http://localhost:8000/xyz?abc')
    httpd.handle_request()  # serve one request, then exit
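
The listing above is Python 2 code. For anyone on Python 3, the following sketch does roughly the same thing with the make_server and demo_app that wsgiref.simple_server still provides. The serve_one_request helper, the use of port 0, the background thread, and the 5-second timeout are my own choices so that the example can fetch its own page instead of opening a browser.

```python
import http.client
import threading
from wsgiref.simple_server import make_server, demo_app


def serve_one_request(app, path="/xyz?abc"):
    """Start a wsgiref server on a free local port, fetch one URL, return the body."""
    httpd = make_server("127.0.0.1", 0, app)  # port 0: let the OS pick a free port
    httpd.timeout = 5  # do not wait forever if the client never connects
    port = httpd.server_address[1]

    # handle_request() serves exactly one request and then returns.
    worker = threading.Thread(target=httpd.handle_request)
    worker.start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        conn.request("GET", path)
        body = conn.getresponse().read()
        conn.close()
        return body
    finally:
        worker.join()
        httpd.server_close()
```

serve_one_request(demo_app) should return the text that demo_app prints: the "Hello world!" line followed by the sorted environ entries.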

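Python's standard library is full of tools, and out of necessity that brings a lot of parent classes with it. To make things clearer, I drew the following UML diagram from the wsgiref module and its bundled test case (note that it covers only wsgiref, not Django):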

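By the time I had read through all of this I was dizzy; the inheritance relationships really are somewhat complicated. So here is a brief summary of how the test code executes: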

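In make_server(), WSGIServer is the server class, responsible for receiving requests, calling the application to process them, and returning the response;
WSGIRequestHandler is the request handler class and is registered with WSGIServer;
the WSGIServer.application attribute is then set (set_app(app));
the server instance is returned.
Next the browser opens, which issues a request. The server instance httpd (a WSGIServer) calls its own handle_request() to process it. The flow is as follows:
request --> WSGIServer receives it
--> WSGIServer.handle_request() is called
--> _handle_request_noblock() is called
--> process_request() is called
--> finish_request() is called
--> finish_request() instantiates WSGIRequestHandler
--> instantiation calls handle()
--> handle() instantiates ServerHandler
--> ServerHandler.run() is called
--> run() calls application(); this is where the real logic lives
--> run() then calls ServerHandler.finish_response() to return the data
--> back in process_request(), WSGIServer.shutdown_request() is called to close the request (the BaseServer default does nothing; TCPServer, from which WSGIServer derives, overrides it to close the socket)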
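
The last part of that chain (run() calling the application and then finish_response()) can be tried without any socket. ServerHandler in wsgiref.simple_server is a subclass of wsgiref.handlers.SimpleHandler, so the sketch below drives SimpleHandler directly with in-memory streams. It assumes Python 3, and run_through_handler and simple_app are names I chose for the illustration.

```python
import io
import sys
from wsgiref.handlers import SimpleHandler
from wsgiref.util import setup_testing_defaults


def simple_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello world!\n"]


def run_through_handler(app):
    """Drive an app through SimpleHandler.run() and return the raw bytes written."""
    environ = {}
    setup_testing_defaults(environ)  # a fake request, no real client involved
    stdin, stdout = io.BytesIO(), io.BytesIO()
    handler = SimpleHandler(stdin, stdout, sys.stderr, environ)
    handler.run(app)  # calls app(environ, start_response), then finish_response()
    return stdout.getvalue()
```

The returned bytes contain the status line and headers written by the handler, followed by the body the application returned.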

P.S. The application is an attribute of WSGIServer, so why is it called from ServerHandler? Because when WSGIRequestHandler is instantiated, WSGIServer passes itself in, so when WSGIRequestHandler in turn instantiates ServerHandler it can fetch the real application through WSGIRequestHandler.server.get_app().
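
Stripped of all the socket handling, the relationship looks like the toy sketch below. ToyServer and ToyRequestHandler are invented stand-ins, not the real wsgiref classes; they only mirror the idea that the server hands itself to the handler, and the handler reaches back for the application.

```python
class ToyServer:
    """Stands in for WSGIServer: it stores the application and builds handlers."""

    def __init__(self, handler_class):
        self.handler_class = handler_class
        self.application = None

    def set_app(self, application):
        self.application = application

    def get_app(self):
        return self.application

    def finish_request(self, request):
        # The server passes *itself* to the handler it instantiates.
        return self.handler_class(request, self)


class ToyRequestHandler:
    """Stands in for WSGIRequestHandler: the work happens during __init__."""

    def __init__(self, request, server):
        self.request = request
        self.server = server
        self.response = self.handle()

    def handle(self):
        app = self.server.get_app()  # reach back to the server for the app
        collected = {}

        def start_response(status, headers, exc_info=None):
            collected["status"] = status

        body = b"".join(app({"PATH_INFO": self.request}, start_response))
        return collected["status"], body
```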

Summary

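To sum up: when starting the server, however you go about it, you must pass it an application, whether that is a function or an instance of a class that implements __call__. When a request reaches the server, the server calls application() itself and gets the response data back. How exactly to respond to the request is something application() is free to work out in detail.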

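Admittedly the call chain is very long, and that is before adding any parsing of the HTTP headers (extracting cookies and so on). If the only goal were to respond with "helloworld", WSGIServer.finish_request() could send the response data directly, WSGIRequestHandler and ServerHandler could be dropped altogether, and all you would need to supply is an application()! In practice, though, things are never as simple as responding with "helloworld"...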

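How WSGI works inside Django is a topic for the next part. The Django source walkthrough starts here! I have put my annotated copy of the Django source on GitHub as Decode-Django; fork it if you are interested. This article is best read alongside the source of the Python wsgiref, BaseHTTPServer.py, and SocketServer.py modules.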

搗亂 2013-9-4

http://daoluan.net