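So what is webpy, and what can we learn from reading its source code?
In short, webpy is an open-source web application framework (official site: http://webpy.org/).
Its source code is clean and compact. Reading it is a quick way to pick up Python syntax (google anything you don't recognize) and to see advanced Python features, such as reflection and decorators, in real use. webpy also ships with a simple built-in HTTP server (the documentation recommends it for development only; production should use something like Apache), which makes it an ideal example for anyone who wants a basic understanding of how an HTTP server is implemented. The server code also demonstrates techniques such as thread pools and message queues. Beyond that, webpy includes a template rendering engine, a DB framework, and more, and each of these parts can be studied on its own.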
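Java web development has the Servlet specification. Does Python web development have an equivalent?
The answer is WSGI, which defines how a server interacts with your webapp.
For more on the WSGI specification, see the following link: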
http://ivory.idyll.org/articles/wsgi-intro/what-is-wsgi.html
Now let's use webpy's built-in WSGIServer to write a simple webapp that follows the WSGI specification, e.g.:
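    import web.wsgiserver

    def my_wsgi_app(env, start_response):
        # A WSGI app receives the request environ and a start_response callable.
        status = '200 OK'
        response_headers = [('Content-type', 'text/plain')]
        start_response(status, response_headers)
        # The return value is an iterable of body chunks.
        return ['Hello world!']

    # Serve the app on 127.0.0.1:8080 with webpy's bundled CherryPy WSGI server.
    server = web.wsgiserver.CherryPyWSGIServer(("127.0.0.1", 8080), my_wsgi_app)
    server.start()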
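To see the WSGI contract without any server involved, the callable can also be invoked by hand. The following is a minimal sketch in Python 3 (the excerpts in this post are Python 2, so the body here is bytes rather than str). The names `fake_start_response` and `captured` are made up for illustration and are not part of webpy.

```python
def my_wsgi_app(env, start_response):
    status = '200 OK'
    response_headers = [('Content-type', 'text/plain')]
    start_response(status, response_headers)
    return [b'Hello world!']  # Python 3 WSGI bodies are bytes


captured = {}  # hypothetical holder for what the app passes to the server


def fake_start_response(status, headers, exc_info=None):
    # A real server would remember these and write them to the socket later.
    captured['status'] = status
    captured['headers'] = headers


env = {'REQUEST_METHOD': 'GET', 'PATH_INFO': '/'}
body = b''.join(my_wsgi_app(env, fake_start_response))
print(captured['status'], body)
```

The server's whole job, as the rest of this post shows, is to build `env`, supply `start_response`, and write the returned chunks back to the client.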
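Running this code starts the server on 127.0.0.1:8080.
Before reading the WSGIServer code, let's look at a diagram that outlines the server's internal execution flow:
[Figure: WSGIServer internal execution flow]
Next, let's look at the code. PS: to keep the main flow clear, I only list the core code segments.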
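    class CherryPyWSGIServer(HTTPServer):

        def __init__(self, bind_addr, wsgi_app, numthreads=10, server_name=None,
                     max=-1, request_queue_size=5, timeout=10, shutdown_timeout=5):
            # Thread pool that processes accepted connections.
            self.requests = ThreadPool(self, min=numthreads or 1, max=max)
            # The WSGI application to call for each request.
            self.wsgi_app = wsgi_app
            # Gateway class that adapts HTTP requests to WSGI 1.0.
            self.gateway = WSGIGateway_10
            # Either a (host, port) tuple or a UNIX socket path.
            self.bind_addr = bind_addr


    class HTTPServer(object):

        def start(self):
            self.socket = None
            msg = "No socket could be created"

            if isinstance(self.bind_addr, basestring):
                # A string bind_addr is a UNIX socket path; remove any stale file.
                try:
                    os.unlink(self.bind_addr)
                except OSError:
                    pass
                info = [(socket.AF_UNIX, socket.SOCK_STREAM, 0, "", self.bind_addr)]
            else:
                # Otherwise bind_addr is a (host, port) tuple for a TCP socket.
                host, port = self.bind_addr
                try:
                    info = socket.getaddrinfo(host, port, socket.AF_UNSPEC,
                                              socket.SOCK_STREAM, 0, socket.AI_PASSIVE)
                except socket.gaierror:
                    # Fallback address handling is omitted from this excerpt.
                    pass

            # Try each candidate address until one binds successfully.
            for res in info:
                af, socktype, proto, canonname, sa = res
                try:
                    self.bind(af, socktype, proto)
                except socket.error:
                    if self.socket:
                        self.socket.close()
                    self.socket = None
                    continue
                break
            if not self.socket:
                raise socket.error(msg)

            # Start listening on the bound socket.
            self.socket.listen(self.request_queue_size)

            # Start the worker threads.
            self.requests.start()

            self.ready = True
            while self.ready:
                # Accept one connection per iteration and hand it to the pool.
                self.tick()

        def bind(self, family, type, proto=0):
            # Create the socket.
            self.socket = socket.socket(family, type, proto)
            # Allow the address to be reused right after a restart.
            self.socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
            # Bind to the configured address.
            self.socket.bind(self.bind_addr)

        def tick(self):
            try:
                # Block until a client connects.
                s, addr = self.socket.accept()

                # Wrap the client socket in an HTTPConnection object.
                makefile = CP_fileobject
                conn = self.ConnectionClass(self, s, makefile)

                # Queue the connection for a worker thread.
                self.requests.put(conn)
            except socket.error:
                # Error handling is omitted from this excerpt.
                pass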
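The bind, listen, accept, enqueue sequence above can be tried in isolation. Below is a minimal Python 3 sketch using only the standard library. `bind_first`, `pending`, and the single in-process client are assumptions made for the demo and are not webpy code; port 0 is used so the OS picks a free port.

```python
import queue
import socket


def bind_first(host, port):
    """Try each address from getaddrinfo until one binds (hypothetical helper)."""
    last_error = None
    for af, socktype, proto, _canon, sa in socket.getaddrinfo(
            host, port, socket.AF_UNSPEC, socket.SOCK_STREAM, 0,
            socket.AI_PASSIVE):
        sock = None
        try:
            sock = socket.socket(af, socktype, proto)
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
            sock.bind(sa)
            return sock
        except OSError as exc:
            last_error = exc
            if sock is not None:
                sock.close()
    raise last_error or OSError("No socket could be created")


listener = bind_first("127.0.0.1", 0)  # port 0: let the OS choose
listener.listen(5)
pending = queue.Queue()  # stands in for the thread pool's queue


def tick():
    # Accept one connection and queue it, as HTTPServer.tick does.
    conn, addr = listener.accept()
    pending.put((conn, addr))


port = listener.getsockname()[1]
client = socket.create_connection(("127.0.0.1", port))
tick()
conn, addr = pending.get()
client.sendall(b"ping")
received = conn.recv(4)
print(addr[0], received)

for s in (client, conn, listener):
    s.close()
```

Note that the accepting thread never reads from the connection itself; it only queues it, which keeps the accept loop responsive while workers do the slow part.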
As we saw above, the requests attribute of HTTPServer is a thread pool (backed internally by a message queue). Now let's see how the author implements that thread pool:
    class ThreadPool(object):

        def __init__(self, server, min=10, max=-1):
            # The server that owns this pool.
            self.server = server
            # Minimum and maximum number of worker threads (max <= 0: no limit).
            self.min = min
            self.max = max
            # The worker threads.
            self._threads = []
            # Queue of pending connections shared by all workers.
            self._queue = Queue.Queue()
            # Expose the queue's blocking get() as the pool's get().
            self.get = self._queue.get

        def start(self):
            # Create and start the minimum number of workers.
            for i in range(self.min):
                self._threads.append(WorkerThread(self.server))
            for worker in self._threads:
                worker.start()

        def put(self, obj):
            # Queue a connection (or a shutdown request) for the workers.
            self._queue.put(obj)

        def grow(self, amount):
            # Add up to `amount` workers without exceeding self.max.
            for i in range(amount):
                if self.max > 0 and len(self._threads) >= self.max:
                    break
                worker = WorkerThread(self.server)
                self._threads.append(worker)
                worker.start()

        def shrink(self, amount):
            # Drop dead threads first; iterate over a copy because the list
            # is modified inside the loop.
            for t in self._threads[:]:
                if not t.isAlive():
                    self._threads.remove(t)
                    amount -= 1

            # Ask surplus workers to exit, never going below self.min.
            if amount > 0:
                for i in range(min(amount, len(self._threads) - self.min)):
                    self._queue.put(_SHUTDOWNREQUEST)


    class WorkerThread(threading.Thread):

        def __init__(self, server):
            self.ready = False
            self.server = server

            threading.Thread.__init__(self)

        def run(self):
            self.ready = True
            while True:
                # Block until a connection is available on the queue.
                conn = self.server.requests.get()

                # The shutdown sentinel tells this worker to exit.
                if conn is _SHUTDOWNREQUEST:
                    return

                self.conn = conn

                try:
                    # Read the request and send the response.
                    conn.communicate()
                finally:
                    conn.close()
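The same pattern, a shared queue plus a sentinel object that tells workers to exit, can be reduced to a few lines. This is a minimal Python 3 sketch, not the webpy implementation: `TinyPool`, `Worker`, and `_SHUTDOWN` are hypothetical names, and the "work" is a plain callable rather than an HTTP connection.

```python
import queue
import threading

_SHUTDOWN = object()  # sentinel, playing the role of _SHUTDOWNREQUEST


class Worker(threading.Thread):
    def __init__(self, tasks, results):
        super().__init__(daemon=True)
        self.tasks = tasks
        self.results = results

    def run(self):
        while True:
            task = self.tasks.get()   # blocks until something is queued
            if task is _SHUTDOWN:
                return
            self.results.put(task())


class TinyPool:
    def __init__(self, size):
        self.tasks = queue.Queue()
        self.results = queue.Queue()
        self.workers = [Worker(self.tasks, self.results) for _ in range(size)]

    def start(self):
        for worker in self.workers:
            worker.start()

    def put(self, task):
        self.tasks.put(task)

    def stop(self):
        # One sentinel per worker; each worker consumes exactly one and exits.
        for _ in self.workers:
            self.tasks.put(_SHUTDOWN)
        for worker in self.workers:
            worker.join()


pool = TinyPool(3)
pool.start()
for n in range(5):
    pool.put(lambda n=n: n * n)
pool.stop()
squares = sorted(pool.results.get() for _ in range(5))
print(squares)
```

Because the queue is FIFO, the sentinels queued by `stop` are only dequeued after every real task has been picked up, so no work is lost on shutdown.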
As we just saw, a WorkerThread takes an HTTPConnection object off the message queue and calls its communicate method. So what does communicate actually do?
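    class HTTPConnection(object):

        # Class used to parse and answer each request.
        RequestHandlerClass = HTTPRequest

        def __init__(self, server, sock, makefile=CP_fileobject):
            self.server = server
            self.socket = sock
            # Buffered file objects for reading from and writing to the socket
            # (rbufsize and wbufsize are not shown in this excerpt).
            self.rfile = makefile(sock, "rb", self.rbufsize)
            self.wfile = makefile(sock, "wb", self.wbufsize)

        def communicate(self):
            # Create a request object for this connection.
            req = self.RequestHandlerClass(self.server, self)
            # Parse the HTTP request.
            req.parse_request()
            # Send the response.
            req.respond()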
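Before looking at how HTTPRequest.parse_request parses an HTTP request, let's review the HTTP protocol. HTTP is a line-oriented text protocol, and a request usually consists of the following parts:
Request line (request method, URI path, HTTP protocol version)
Request headers (for example User-Agent, Host, and so on)
A blank line
An optional entity body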
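As a concrete illustration of those four parts, here is a hand-written request and a few lines of Python 3 that split it apart. The request itself (path, host, user agent) is made up for the example.

```python
raw_request = (
    b"GET /hello?name=web HTTP/1.1\r\n"   # request line
    b"Host: 127.0.0.1:8080\r\n"           # request headers
    b"User-Agent: example-client\r\n"
    b"\r\n"                               # blank line ends the headers
)                                         # no entity body for this GET

# Everything before the first blank line is the head; the rest is the body.
head, _, body = raw_request.partition(b"\r\n\r\n")
request_line, *header_lines = head.split(b"\r\n")
method, uri, protocol = request_line.split(b" ", 2)
print(method, uri, protocol, header_lines, body)
```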
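The HTTPRequest.parse_request method parses the byte stream from the socket according to the HTTP specification and extracts information from it (which is eventually packaged into an env and passed to the webapp):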
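    class HTTPRequest(object):

        def parse_request(self):
            # Limit how many bytes the request line and headers may occupy.
            self.rfile = SizeCheckWrapper(self.conn.rfile,
                                          self.server.max_request_header_size)
            # Parse the request line.
            self.read_request_line()
            # Parse the request headers.
            success = self.read_request_headers()

        def read_request_line(self):
            # Read the first line, e.g. "GET /path?x=1 HTTP/1.1".
            request_line = self.rfile.readline()

            # Split it into method, URI, and protocol version.
            method, uri, req_protocol = request_line.strip().split(" ", 2)
            self.uri = uri
            self.method = method
            self.request_protocol = req_protocol

            scheme, authority, path = self.parse_request_uri(uri)

            # Separate the query string from the path.
            qs = ''
            if '?' in path:
                path, qs = path.split('?', 1)
            self.path = path
            self.qs = qs

        def read_request_headers(self):
            # Fill self.inheaders from the header lines.
            read_headers(self.rfile, self.inheaders)
            return True


    def read_headers(rfile, hdict=None):
        if hdict is None:
            hdict = {}

        while True:
            line = rfile.readline()
            if not line:
                # The connection closed before the headers ended.
                raise ValueError("Illegal end of headers.")
            if line == '\r\n':
                # A blank line marks the end of the headers.
                break

            k, v = line.split(":", 1)
            # Normalize the name, e.g. "user-agent" -> "User-Agent".
            k = k.strip().title()
            v = v.strip()
            hname = k

            # Repeated comma-separated headers are merged into one value.
            if k in comma_separated_headers:
                existing = hdict.get(hname)
                if existing:
                    v = ", ".join((existing, v))

            hdict[hname] = v

        return hdict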
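The parsing steps above can be exercised against an in-memory stream instead of a socket. The following is a simplified Python 3 sketch and not the webpy code: `parse` is a hypothetical function, `COMMA_SEPARATED` is an assumed two-entry subset chosen for the demo, and bytes are decoded as latin-1 so they can be handled as text.

```python
import io

COMMA_SEPARATED = {"Accept", "Cache-Control"}  # assumed subset for the demo


def parse(rfile):
    # Request line: method, URI, protocol version.
    request_line = rfile.readline().decode("latin-1")
    method, uri, protocol = request_line.strip().split(" ", 2)
    path, _, qs = uri.partition("?")

    # Header lines until the blank line (or end of stream).
    headers = {}
    while True:
        line = rfile.readline().decode("latin-1")
        if line in ("\r\n", "\n", ""):
            break
        key, value = line.split(":", 1)
        key, value = key.strip().title(), value.strip()
        # Merge repeated comma-separated headers into one value.
        if key in COMMA_SEPARATED and key in headers:
            value = ", ".join((headers[key], value))
        headers[key] = value
    return method, path, qs, protocol, headers


rfile = io.BytesIO(
    b"GET /hello?name=web HTTP/1.1\r\n"
    b"host: 127.0.0.1:8080\r\n"
    b"accept: text/html\r\n"
    b"accept: text/plain\r\n"
    b"\r\n"
)
method, path, qs, protocol, headers = parse(rfile)
print(method, path, qs, protocol, headers)
```

Splitting each header on the first colon only is what keeps a value such as `127.0.0.1:8080` intact.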
That completes our look at how HTTPRequest.parse_request parses an HTTP request. Next, let's see how HTTPRequest.respond responds to it:
    class HTTPRequest(object):

        def respond(self):
            # Hand the request to the gateway configured on the server.
            self.server.gateway(self).respond()
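Before reading further, let's think for a moment: why have this gateway at all? Why not hand the request directly to the webapp?
My own view is that it comes down to layering and code reuse. There may be many web specifications to support. Today we use WSGI, but tomorrow there might be a "ysgi", and the day after that a "zsgi". With the current design, we only need to replace the gateway attribute of HTTPServer and leave the rest of the code untouched (similar to the DAO layer in Java). Now let's look at how the gateway is implemented (recall from the start of this article that the gateway registered on the server is WSGIGateway_10):
WSGI gateway: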
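    class WSGIGateway(Gateway):
        def __init__(self, req):
            self.req = req
            # Build the WSGI environ for this request.
            self.env = self.get_environ()

        def get_environ(self):
            # Subclasses build the environ for a specific WSGI version.
            raise NotImplementedError

        def respond(self):
            # Call the webapp with the environ and the start_response callback.
            response = self.req.server.wsgi_app(self.env, self.start_response)

            # Write each body chunk returned by the webapp to the client.
            for chunk in response:
                self.write(chunk)

        def start_response(self, status, headers, exc_info=None):
            # Record the status and headers chosen by the webapp.
            self.req.status = status
            self.req.outheaders.extend(headers)

            return self.write

        def write(self, chunk):
            # Send the response headers once, before the first chunk.
            if not self.req.sent_headers:
                self.req.sent_headers = True
                self.req.send_headers()
            # Send the body chunk.
            self.req.write(chunk)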
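The interplay between start_response and write is easier to follow with the socket replaced by a list. The sketch below is Python 3 and purely illustrative: `FakeRequest` is a hypothetical stand-in for HTTPRequest that records what would be sent, and `MiniGateway` is a stripped-down gateway that sends headers once, before the first body chunk.

```python
class FakeRequest:
    """Hypothetical stand-in for HTTPRequest; records output in a list."""

    def __init__(self):
        self.status = None
        self.outheaders = []
        self.sent_headers = False
        self.wire = []  # what would have been written to the socket

    def send_headers(self):
        self.wire.append("HTTP/1.1 %s" % self.status)
        for name, value in self.outheaders:
            self.wire.append("%s: %s" % (name, value))
        self.wire.append("")  # blank line after the headers

    def write(self, chunk):
        self.wire.append(chunk)


class MiniGateway:
    def __init__(self, req, app, env):
        self.req, self.app, self.env = req, app, env

    def respond(self):
        for chunk in self.app(self.env, self.start_response):
            self.write(chunk)

    def start_response(self, status, headers, exc_info=None):
        # Only record the status and headers; nothing is sent yet.
        self.req.status = status
        self.req.outheaders.extend(headers)
        return self.write

    def write(self, chunk):
        # Headers go out lazily, exactly once, before the first chunk.
        if not self.req.sent_headers:
            self.req.sent_headers = True
            self.req.send_headers()
        self.req.write(chunk)


def demo_app(env, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello ", b"world!"]


req = FakeRequest()
MiniGateway(req, demo_app, {"REQUEST_METHOD": "GET"}).respond()
print(req.wire)
```

Deferring the headers until the first write gives the webapp a chance to change its mind (for example, to report an error) before anything reaches the client.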
WSGIGateway_10 inherits from WSGIGateway and implements the get_environ method:
    class WSGIGateway_10(WSGIGateway):

        def get_environ(self):
            # Build the WSGI 1.0 environ dict from the parsed request.
            req = self.req
            env = {
                'ACTUAL_SERVER_PROTOCOL': req.server.protocol,
                'PATH_INFO': req.path,
                'QUERY_STRING': req.qs,
                'REMOTE_ADDR': req.conn.remote_addr or '',
                'REMOTE_PORT': str(req.conn.remote_port or ''),
                'REQUEST_METHOD': req.method,
                'REQUEST_URI': req.uri,
                'SCRIPT_NAME': '',
                'SERVER_NAME': req.server.server_name,
                'SERVER_PROTOCOL': req.request_protocol,
                'SERVER_SOFTWARE': req.server.software,
                'wsgi.errors': sys.stderr,
                'wsgi.input': req.rfile,
                'wsgi.multiprocess': False,
                'wsgi.multithread': True,
                'wsgi.run_once': False,
                'wsgi.url_scheme': req.scheme,
                'wsgi.version': (1, 0),
            }

            # Expose each request header as HTTP_<NAME>,
            # e.g. "User-Agent" becomes HTTP_USER_AGENT.
            for k, v in req.inheaders.iteritems():
                env["HTTP_" + k.upper().replace("-", "_")] = v

            return env
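The header-to-environ conversion at the end of get_environ can be tried on its own. This Python 3 sketch uses a hypothetical helper, `headers_to_environ`, and made-up header values. The extra step that moves Content-Type and Content-Length to unprefixed keys follows the WSGI specification (PEP 3333) and is not shown in the excerpt above.

```python
def headers_to_environ(inheaders):
    env = {}
    for name, value in inheaders.items():
        # "User-Agent" -> "HTTP_USER_AGENT"
        env["HTTP_" + name.upper().replace("-", "_")] = value
    # PEP 3333 names these two keys without the HTTP_ prefix.
    for key in ("CONTENT_TYPE", "CONTENT_LENGTH"):
        if "HTTP_" + key in env:
            env[key] = env.pop("HTTP_" + key)
    return env


env = headers_to_environ({
    "Host": "127.0.0.1:8080",
    "User-Agent": "example-client",
    "Content-Type": "text/plain",
})
print(env)
```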