Python: WSGI and Gunicorn

Date: 2019-11-17

Tags: python, wsgi, gunicorn. Category: Python


WSGI and Gunicorn

WSGI

WSGI (Web Server Gateway Interface) is a standard interface, defined for the Python language, between web servers and web applications.

WSGI works like a bridge: one side connects to the web server, the other to the application.

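A WSGI server can be understood as a web server that conforms to the WSGI specification. It receives a request from the client, parses it and packs the result into the environ dictionary, calls the registered WSGI app as the specification prescribes, and finally returns the response to the client.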
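To make the server's role concrete, here is a minimal sketch of the request cycle just described. It is not how any real server is implemented: the names handle_raw_request and hello_app and the tiny request parser are assumptions made for illustration, and only a subset of the environ keys a real server provides is filled in.

```python
import io
import sys


def handle_raw_request(app, raw_request: bytes) -> bytes:
    """Toy WSGI 'server': parse one HTTP request, call the app, build the response."""
    head, _, body = raw_request.partition(b"\r\n\r\n")
    request_line = head.split(b"\r\n")[0].decode("latin-1")
    method, target, protocol = request_line.split(" ")
    path, _, query = target.partition("?")

    # Pack the parsed request into the environ dict (illustrative subset of keys).
    environ = {
        "REQUEST_METHOD": method,
        "PATH_INFO": path,
        "QUERY_STRING": query,
        "SERVER_NAME": "localhost",
        "SERVER_PORT": "8000",
        "SERVER_PROTOCOL": protocol,
        "wsgi.version": (1, 0),
        "wsgi.url_scheme": "http",
        "wsgi.input": io.BytesIO(body),
        "wsgi.errors": sys.stderr,
        "wsgi.multithread": False,
        "wsgi.multiprocess": False,
        "wsgi.run_once": True,
    }

    captured = {}

    def start_response(status, headers, exc_info=None):
        # The app reports status and headers through this callback.
        captured["status"] = status
        captured["headers"] = headers

    # Call the registered app and collect the body chunks it returns.
    payload = b"".join(app(environ, start_response))

    lines = [f"HTTP/1.1 {captured['status']}"]
    lines += [f"{name}: {value}" for name, value in captured["headers"]]
    return "\r\n".join(lines).encode("latin-1") + b"\r\n\r\n" + payload


def hello_app(environ, start_response):
    body = f"Hello from {environ['PATH_INFO']}".encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]


response = handle_raw_request(
    hello_app, b"GET /demo?x=1 HTTP/1.1\r\nHost: localhost\r\n\r\n")
```

The server never needs to know what hello_app does internally; it only relies on the (environ, start_response) calling convention.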

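A WSGI interface involves the following three components:

Server: handles the request, provides the environment information and a callback to the application, and receives the web response as the return value;
Middleware: sits between the server and the application; it can rewrite the environment information and route a request to different application objects based on the target URL;
Application: a callable object that takes the environment information and the callback, and returns the response body.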
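The URL-routing role of middleware mentioned in the list above can be sketched as follows. This is only an illustration: make_dispatcher, make_text_app, not_found and the /blog and /shop prefixes are hypothetical names, and a production dispatcher would normally also adjust SCRIPT_NAME and PATH_INFO before delegating.

```python
def make_text_app(text):
    """Build a trivial WSGI app that always answers with the given text."""
    def app(environ, start_response):
        body = text.encode("utf-8")
        start_response("200 OK", [("Content-Type", "text/plain"),
                                  ("Content-Length", str(len(body)))])
        return [body]
    return app


def not_found(environ, start_response):
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"not found"]


def make_dispatcher(routes, default_app):
    """Return a WSGI callable that picks an application by URL prefix."""
    def dispatcher(environ, start_response):
        path = environ.get("PATH_INFO", "")
        for prefix, app in routes.items():
            # Match the prefix itself or anything below it, but not "/blogger".
            if path == prefix or path.startswith(prefix + "/"):
                return app(environ, start_response)
        return default_app(environ, start_response)
    return dispatcher


application = make_dispatcher(
    {"/blog": make_text_app("blog"), "/shop": make_text_app("shop")},
    not_found,
)


def call(app, path):
    """Helper for trying the dispatcher without a real server."""
    seen = {}

    def start_response(status, headers, exc_info=None):
        seen["status"] = status

    body = b"".join(app({"PATH_INFO": path}, start_response))
    return seen["status"], body
```

To the server, application is just another WSGI app; the routing is invisible to it.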

The following example illustrates how WSGI works:

from wsgiref.simple_server import make_server

def simple_app(environ, start_response):
    # Send the status line and headers through the server-provided callback.
    status = '200 OK'
    response_headers = [('Content-type', 'text/plain')]
    start_response(status, response_headers)
    # The body must be an iterable of bytes.
    return [u"This is hello wsgi app".encode('utf8')]

httpd = make_server('', 8000, simple_app)
print("Serving on port 8000...")
httpd.serve_forever()

In this example, simple_app is an HTTP handler function that conforms to the WSGI standard. It takes two parameters:

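environ: a dict object containing all the information about the request;
start_response: a callback function that the app calls with the HTTP status code and the HTTP response headers.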

In addition, the iterable returned by simple_app's return statement is used as the HTTP response body.
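Because the contract is just "call it with environ and start_response, then iterate over the result", a WSGI app can be exercised without any server. The sketch below uses wsgiref.util.setup_testing_defaults from the standard library to fill in a plausible environ; the helper name call_app is an assumption for illustration.

```python
from wsgiref.util import setup_testing_defaults


def simple_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"This is hello wsgi app"]


def call_app(app, path="/"):
    """Call a WSGI app by hand and return (status, headers, body)."""
    environ = {"PATH_INFO": path}
    setup_testing_defaults(environ)  # fills in the remaining required keys

    seen = {}

    def start_response(status, headers, exc_info=None):
        seen["status"], seen["headers"] = status, headers

    result = app(environ, start_response)
    try:
        body = b"".join(result)
    finally:
        # The WSGI spec asks callers to close the iterable if it has close().
        if hasattr(result, "close"):
            result.close()
    return seen["status"], seen["headers"], body


status, headers, body = call_app(simple_app)
```

This is also a convenient pattern for unit-testing an app's status, headers and body.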

In fact, a WSGI app only needs to be a callable object, so it does not have to be a function; an instance of a class that implements __call__ works too. Sample code:

from wsgiref.simple_server import make_server

class AppClass:

    def __call__(self, environ, start_response):
        status = '200 OK'
        response_headers = [('Content-type', 'text/plain')]
        start_response(status, response_headers)
        # The body chunks must be bytes, not str.
        return [b"hello world!"]

app = AppClass()
httpd = make_server('', 8000, app)
print("Serving on port 8000...")
httpd.serve_forever()

middleware

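The concept of middleware is harder to grasp than that of application and server.
Consider a callable that itself conforms to the application standard, takes a callable (an application) as its argument, and returns a callable.
To the server, it is a callable that conforms to the standard, so it is an application.
To the application, it is the thing that calls the application, so it is a server.
Such a callable is called middleware. The concept is very close to that of a decorator.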

A middleware example:

def exampleApplication(environ, start_response):
    if 'visited' in environ['superSession']:
        text = "You have already visited!"
    else:
        environ['superSession']['visited'] = 1
        text = "This is your first visit."
    start_response('200 OK', [('Content-type','text/plain')])
    # The body chunks must be bytes.
    return [text.encode('utf-8')]

def session(application):
    def app(environ, start_response):
        if "superSession" not in environ:
            import superSession
            environ["superSession"] = superSession.session()
        return application(environ, start_response)
    return app

application = session(exampleApplication)

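In the code above, the session function attaches a session object to the environ dictionary, which makes it possible to track the user's visits.
exampleApplication reads the user's visit state from the environ dictionary.
We call the session function middleware: it sits between the server and the application and pre-processes requests coming from the server, while remaining transparent to both the server and the application.
The benefit of middleware is that it (the session function in this example) makes it easy to add new features to a WSGI program.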
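As a self-contained illustration of adding a feature without touching the app, here is a sketch of a middleware that wraps start_response to append a timing header. The function name timing_middleware and the header name X-Elapsed-Ms are assumptions made for this example, and the measurement only covers the time until the app calls start_response.

```python
import time


def timing_middleware(application):
    """Wrap a WSGI app so every response carries an X-Elapsed-Ms header."""
    def wrapped(environ, start_response):
        started = time.perf_counter()

        def timed_start_response(status, headers, exc_info=None):
            elapsed_ms = (time.perf_counter() - started) * 1000
            headers = list(headers) + [("X-Elapsed-Ms", f"{elapsed_ms:.3f}")]
            return start_response(status, headers, exc_info)

        return application(environ, timed_start_response)
    return wrapped


# Because middleware takes a callable and returns a callable,
# it can be applied with decorator syntax.
@timing_middleware
def hello(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]


seen = {}


def record(status, headers, exc_info=None):
    seen["status"], seen["headers"] = status, headers


body = b"".join(hello({}, record))
```

The hello function is unchanged by the wrapping; only the headers that reach the server differ.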

We can also package middleware as a class, so that existing middleware can be reused through inheritance. The class must define __call__.

class Session:  
    def __init__(self, application):  
        self.application = application  
  
    def __call__(self, environ, start_response):  
        if "superSession" not in environ:  
            import superSession  
            environ["superSession"] = superSession.session() # Options would obviously need specifying  
        return self.application(environ,start_response)  
          
application = Session(exampleApplication)
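The reuse-through-inheritance idea can be sketched with a small base class whose subclasses override a single hook. All names here (HeaderMiddleware, extra_headers, ServerTag, NoCache, the X-Served-By header) are hypothetical and chosen only for illustration.

```python
class HeaderMiddleware:
    """Base middleware: appends the headers returned by extra_headers()."""

    def __init__(self, application):
        self.application = application

    def extra_headers(self, environ):
        return []  # subclasses override this hook

    def __call__(self, environ, start_response):
        def patched(status, headers, exc_info=None):
            merged = list(headers) + self.extra_headers(environ)
            return start_response(status, merged, exc_info)

        return self.application(environ, patched)


class ServerTag(HeaderMiddleware):
    def extra_headers(self, environ):
        return [("X-Served-By", "demo")]


class NoCache(HeaderMiddleware):
    def extra_headers(self, environ):
        return [("Cache-Control", "no-store")]


def hello(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]


# Middleware instances stack: the innermost wrapper adds its header first.
application = NoCache(ServerTag(hello))

seen = {}


def record(status, headers, exc_info=None):
    seen["headers"] = headers


body = b"".join(application({}, record))
```

Each subclass inherits the __call__ plumbing and only states what it adds.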

Gunicorn

Gunicorn (Green Unicorn) is a widely used, high-performance Python WSGI HTTP server for UNIX. It is a port of Ruby's Unicorn project and uses a pre-fork worker model. It is simple to use, light on resources, and fast.

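As a container for WSGI apps, Gunicorn is compatible with a variety of web frameworks (Flask, Django, and so on). Thanks to technologies such as gevent, Gunicorn can considerably improve the performance of a WSGI app, particularly for I/O-bound workloads, with almost no changes to the app's code.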

A Gunicorn usage example:

$ cat myapp.py
def app(environ, start_response):
    data = b"Hello, World!\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(data)))
    ])
    return iter([data])
$ gunicorn -w 4 myapp:app
[2014-09-10 10:22:28 +0000] [30869] [INFO] Listening at: http://127.0.0.1:8000 (30869)
[2014-09-10 10:22:28 +0000] [30869] [INFO] Using worker: sync
[2014-09-10 10:22:28 +0000] [30874] [INFO] Booting worker with pid: 30874
[2014-09-10 10:22:28 +0000] [30875] [INFO] Booting worker with pid: 30875
[2014-09-10 10:22:28 +0000] [30876] [INFO] Booting worker with pid: 30876
[2014-09-10 10:22:28 +0000] [30877] [INFO] Booting worker with pid: 30877
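The same options can be kept in a configuration file, which Gunicorn reads as ordinary Python. The sketch below is one possible gunicorn.conf.py, not a recommended production setup: the concrete values are assumptions, and worker_class = "gevent" presumes that gevent is installed.

```python
# gunicorn.conf.py (illustrative values only)
import multiprocessing

bind = "127.0.0.1:8000"                        # address to listen on
workers = multiprocessing.cpu_count() * 2 + 1  # number of worker processes
worker_class = "gevent"                        # default is "sync"
worker_connections = 1000                      # max simultaneous clients per gevent worker
timeout = 30                                   # seconds before a silent worker is restarted
```

It would be used as gunicorn -c gunicorn.conf.py myapp:app, with the app code left unchanged.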

Multi-process model

Gunicorn has one master process and several worker processes. The master creates the workers by pre-forking, somewhat like Nginx.

Below is the code with which the master process forks a worker process:

def spawn_worker(self):
    self.worker_age += 1
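    # Create the worker. Note that the app object here is not the real WSGI app object but Gunicorn's own app object;
    # Gunicorn's app object is responsible for importing the WSGI app object that we wrote.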
    worker = self.worker_class(self.worker_age, self.pid, self.LISTENERS,
                                self.app, self.timeout / 2.0,
                                self.cfg, self.log) 
    pid = os.fork()
    if pid != 0:  # Parent process: returns and goes on to create the other workers; once all workers exist it enters its own message loop
        self.WORKERS[pid] = worker
        return pid

    # Process Child
    worker_pid = os.getpid()
    try:
        ..........
        worker.init_process()  # Child process: initializes the worker and enters the worker's message loop
        sys.exit(0)
    except SystemExit:
        raise
    ............

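Inside worker.init_process(), the Gunicorn app object held by the worker imports our WSGI app. In other words, each worker child process instantiates our WSGI app object on its own, so the WSGI app objects in different workers are independent and do not interfere with one another.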
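The isolation between workers follows directly from how fork works, and can be demonstrated with a small sketch that is independent of Gunicorn. It assumes a Unix-like system (os.fork is not available on Windows); run_workers and request_count are hypothetical names for this illustration.

```python
import os

request_count = 0  # module-level state; each process gets its own copy after fork


def run_workers(num_workers):
    """Fork num_workers children; each bumps its own counter and reports it."""
    global request_count
    read_fd, write_fd = os.pipe()
    pids = []
    for _ in range(num_workers):
        pid = os.fork()
        if pid == 0:
            # Child: a real worker would enter its event loop here.
            os.close(read_fd)
            request_count += 1  # changes only this child's copy
            os.write(write_fd, f"{os.getpid()} {request_count}\n".encode())
            os._exit(0)
        pids.append(pid)  # parent: remember the child and keep forking

    os.close(write_fd)
    for pid in pids:
        os.waitpid(pid, 0)  # reap every child
    with os.fdopen(read_fd) as reader:
        reports = dict(line.split() for line in reader)
    return pids, reports


pids, reports = run_workers(3)
```

Every child reports a count of 1 and the parent's request_count stays at 0, which is why in-process state such as caches or counters is not shared between workers.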

The master maintains a fixed number of workers:

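def manage_workers(self):
    if len(self.WORKERS.keys()) < self.num_workers:
        self.spawn_workers()
    # Sort the workers by age so that the oldest ones are stopped first.
    workers = sorted(self.WORKERS.items(), key=lambda w: w[1].age)
    while len(workers) > self.num_workers:
        (pid, _) = workers.pop(0)
        self.kill_worker(pid, signal.SIGQUIT)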

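Once all workers have been created, the workers and the master each enter their own message loop.
The master's event loop just receives signals and manages the worker processes, while a worker's event loop listens for network events and handles them (accepting new connections, closing connections, handling requests, sending responses, and so on). Client connections therefore end up being served by the worker processes.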

worker

There are many kinds of workers, including ggevent, geventlet, gtornado, and others. This article focuses on ggevent.

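When a ggevent worker starts, it starts several server objects: the worker first creates one server object per listener (note: there is a set of listeners because Gunicorn can bind to a set of addresses, and each address corresponds to one listener), and each server object runs in its own gevent pool object. Waiting for connections and handling them is actually done in the server objects.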

    # Create a server object for each listener.
    for s in self.sockets:
        pool = Pool(self.worker_connections)  # create the gevent pool
        if self.server_class is not None:
            # create the server object
            server = self.server_class(
                s, application=self.wsgi, spawn=pool, log=self.log,
                handler_class=self.wsgi_handler, **ssl_args)
        .............
        server.start()  # start the server: begin waiting for and serving connections
        servers.append(server)
        .........

The server_class in the code above is in fact a subclass of gevent's WSGI server:

class PyWSGIServer(pywsgi.WSGIServer):
    base_env = BASE_WSGI_ENV

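Note the arguments passed to server_class: s is the socket on which the server listens for connections; spawn is the gevent coroutine pool; application is our WSGI app (put simply, the app you wrote with Flask or Django), and this is how our app is handed to a Gunicorn worker to run; handler_class is a subclass of gevent's pywsgi.WSGIHandler.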

WSGI Server

Waiting for connections and handling them actually happens in gevent's WSGIServer and WSGIHandler.
Finally, let us look at the main parts of their implementation:

WSGIServer's start function calls start_accepting to handle incoming connections. Once start_accepting obtains an accepted socket, it calls do_handle to process that socket:

def do_handle(self, *args):
    spawn = self._spawn
    spawn(self._handle, *args)

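As can be seen, WSGIServer creates a coroutine to handle the socket. In other words, within WSGIServer one coroutine is responsible for one HTTP connection. The self._handle function running in the coroutine in turn calls WSGIHandler's handle function to keep processing HTTP requests: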

def handle(self):
    try:
        while self.socket is not None:
            result = self.handle_one_request()  # handle one HTTP request
            if result is None:
                break
            if result is True:
                continue
            self.status, response_body = result
            self.socket.sendall(response_body)  # send the response message
          ..............
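The one-handler-per-connection model can be imitated with the standard library alone. The sketch below deliberately swaps gevent's coroutines for operating-system threads (socketserver.ThreadingMixIn), so it illustrates only the structure, not gevent's implementation or performance; ThreadedWSGIServer, QuietHandler and fetch_once are hypothetical names.

```python
import http.client
import threading
from socketserver import ThreadingMixIn
from wsgiref.simple_server import WSGIRequestHandler, WSGIServer, make_server


class ThreadedWSGIServer(ThreadingMixIn, WSGIServer):
    """Hands every accepted connection to a new thread (stand-in for a greenlet)."""
    daemon_threads = True


class QuietHandler(WSGIRequestHandler):
    def log_message(self, format, *args):
        pass  # keep the demo output clean


def app(environ, start_response):
    # Report which thread is serving this connection.
    body = threading.current_thread().name.encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]


def fetch_once():
    """Start the server on a free port, send one request, return (status, body)."""
    httpd = make_server("127.0.0.1", 0, app,
                        server_class=ThreadedWSGIServer,
                        handler_class=QuietHandler)
    port = httpd.server_address[1]
    acceptor = threading.Thread(target=httpd.serve_forever,
                                name="acceptor", daemon=True)
    acceptor.start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        conn.request("GET", "/")
        resp = conn.getresponse()
        result = (resp.status, resp.read().decode("utf-8"))
        conn.close()
        return result
    finally:
        httpd.shutdown()
        httpd.server_close()


status, handler_thread = fetch_once()
```

The accepting loop runs in the thread named acceptor, while the app runs in a separate per-connection thread, mirroring the split between accepting a socket and spawning a handler for it.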

Inside the loop of handle, handle_one_request first reads the HTTP request and initializes the WSGI environment, and then eventually calls run_application to process the request:

def run_application(self):
    self.result = self.application(self.environ, self.start_response)
    self.process_result()

This is the point where our app is actually called.

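Summary: Gunicorn starts a group of worker processes, and all worker processes share one set of listeners. Each worker creates one WSGI server per listener. Whenever an HTTP connection arrives, the WSGI server creates a coroutine to handle it; the coroutine first initializes the WSGI environment and then calls the user-provided app object to handle the HTTP request.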
