Python (00): I/O Models in Python Programs

Date: 2020-07-06

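The five I/O models

To understand the I/O models, first review four terms: synchronous, asynchronous, blocking, and non-blocking.

Synchronous I/O
Asynchronous I/O
Blocking I/O
Non-blocking I/O

The five I/O models are blocking I/O, non-blocking I/O, signal-driven I/O (rarely used), I/O multiplexing, and asynchronous I/O. The first four are all forms of synchronous I/O.

Ordered from most blocking to least blocking, the five models are: blocking I/O > non-blocking I/O > I/O multiplexing > signal-driven I/O > asynchronous I/O. Their efficiency therefore generally goes from low to high in the same order.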

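1. Blocking I/O model

On Linux, every socket is blocking by default. Unless told otherwise, almost all I/O interfaces (including the socket interfaces) are blocking.

When a server may face thousands or even tens of thousands of simultaneous client requests, a thread pool or connection pool can relieve part of the pressure but cannot solve the whole problem. In short, the multi-threaded model handles small-scale request loads conveniently and efficiently, but it hits a bottleneck under large-scale load. Non-blocking interfaces are one way to try to address this.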
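
As a rough illustration of the thread-pool approach described above, the sketch below lets a small pool of worker threads serve blocking sockets. It is not from the original article: the names handle and serve_pairs are hypothetical, and socket.socketpair() stands in for real client connections so the example runs without a network.

```python
import socket
from concurrent.futures import ThreadPoolExecutor


def handle(conn):
    """Blocking handler: recv() parks this worker thread until data arrives."""
    with conn:
        data = conn.recv(1024)          # blocks the worker thread
        conn.sendall(data.upper())


def serve_pairs(n_clients, max_workers=4):
    """Serve n_clients in-process 'connections' with a fixed-size thread pool."""
    replies = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        clients = []
        for _ in range(n_clients):
            server_side, client_side = socket.socketpair()
            pool.submit(handle, server_side)   # queued if all workers are busy
            clients.append(client_side)
        for i, client in enumerate(clients):
            with client:
                client.sendall(b'msg%d' % i)
                replies.append(client.recv(1024))
    return replies


if __name__ == '__main__':
    print(serve_pairs(6))
```

Each worker thread is tied up for as long as its recv() blocks, so the pool size caps how many slow clients can be served at once. That is the bottleneck the paragraph above refers to.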

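2. Non-blocking I/O model

With non-blocking I/O, the user process has to keep actively asking the kernel whether the data is ready. Used on its own, the non-blocking I/O model is not recommended, because this polling loop burns CPU.

Non-blocking means not waiting. For example, when you create a socket and connect to an address, or call recv to receive data, the call waits by default (until the connection succeeds or data arrives) before the following code runs.
After setblocking(False), neither call waits any longer. Instead it raises BlockingIOError, which you simply catch.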
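
The behavior of setblocking(False) can be checked locally with a minimal sketch like the following. The function name try_recv_nonblocking is hypothetical, and socket.socketpair() is used in place of a remote server so that no network access is needed.

```python
import select
import socket


def try_recv_nonblocking():
    """Show that recv() on a non-blocking socket raises instead of waiting."""
    a, b = socket.socketpair()
    with a, b:
        a.setblocking(False)
        try:
            a.recv(1024)                 # nothing has been sent yet
            first = 'data'
        except BlockingIOError:
            first = 'would block'        # the call returned immediately

        b.sendall(b'ping')
        # Wait (at most 1 second) until the kernel reports the socket readable.
        select.select([a], [], [], 1.0)
        second = a.recv(1024)
    return first, second


if __name__ == '__main__':
    print(try_recv_nonblocking())
```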

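Asynchronous means notification: once an operation finishes, a callback function or some other follow-up action runs automatically. In a crawler, for example, you send a request to an address such as baidu.com, and when the request completes the callback runs by itself.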
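
One way to picture "run a callback when the request finishes" is the sketch below, written with asyncio. It is an illustration only: fake_request is a hypothetical stand-in that sleeps instead of doing real network I/O, and run_with_callback is a hypothetical helper name.

```python
import asyncio


async def fake_request(url):
    """Stand-in for a real HTTP request (assumption: no network is used)."""
    await asyncio.sleep(0.01)
    return 'response from ' + url


def run_with_callback(url):
    results = []

    async def main():
        task = asyncio.create_task(fake_request(url))
        # The callback is invoked by the event loop once the task is done.
        task.add_done_callback(lambda t: results.append(t.result()))
        await task
        await asyncio.sleep(0)   # give any pending callbacks a chance to run

    asyncio.run(main())
    return results


if __name__ == '__main__':
    print(run_with_callback('baidu.com'))
```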

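3. I/O multiplexing model (event-driven)

Asynchronous, non-blocking frameworks built on an event loop, such as Twisted and Scrapy, achieve concurrency within a single thread.

What does I/O multiplexing do? It checks whether any of several sockets has changed state (connection established, data received), that is, whether it has become readable or writable.

The operating system offers three mechanisms for detecting socket state changes:

select: at most 1024 sockets (the usual FD_SETSIZE limit); every watched socket is scanned in a loop.
poll: no limit on the number of watched sockets; still scans in a loop (level-triggered).
epoll: no limit on the number of watched sockets; readiness is reported through callbacks (level-triggered by default, with edge-triggered mode available).

Python modules:

select.select
select.epoll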
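
A minimal sketch of what "detecting which sockets changed" looks like with select.select follows. The helper name readable_indexes is hypothetical, and the sockets are local socket.socketpair() pairs rather than real network connections.

```python
import select
import socket


def readable_indexes(n=3, send_to=(0, 2)):
    """Watch n sockets and report which ones the kernel says are readable."""
    pairs = [socket.socketpair() for _ in range(n)]
    try:
        for i in send_to:
            pairs[i][1].sendall(b'x')            # make only some sockets readable
        watched = [pair[0] for pair in pairs]
        # One call checks every watched socket; the timeout is 1 second.
        ready, _, _ = select.select(watched, [], [], 1.0)
        return sorted(watched.index(sock) for sock in ready)
    finally:
        for a, b in pairs:
            a.close()
            b.close()


if __name__ == '__main__':
    print(readable_indexes())
```

The same pattern scales to many sockets in one thread: the program sleeps inside select() and only touches the sockets that are reported ready.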

Concurrent requests built on I/O multiplexing plus non-blocking sockets (for example, 100 requests on one thread)

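import select
import socket

# Create the socket and switch it to non-blocking mode
client = socket.socket()
client.setblocking(False)

# A non-blocking connect returns immediately and raises BlockingIOError
try:
    client.connect(('www.baidu.com', 80))
except BlockingIOError:
    pass

# Wait until the socket is writable, which means the connection has completed
select.select([], [client], [])

# Send the request
client.sendall(b'GET /s?wd=alex HTTP/1.0\r\nhost:www.baidu.com\r\n\r\n')

# Collect the response
chunk_list = []
while True:
    # Wait until the socket is readable so recv does not raise BlockingIOError
    select.select([client], [], [])
    chunk = client.recv(8096)
    if not chunk:
        break
    chunk_list.append(chunk)

body = b''.join(chunk_list)
print(body.decode('utf-8', errors='replace'))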

The selectors module

# Server
from socket import *
import selectors

sel=selectors.DefaultSelector()
def accept(server_fileobj,mask):
    conn,addr=server_fileobj.accept()
    sel.register(conn,selectors.EVENT_READ,read)

def read(conn,mask):
    try:
        data=conn.recv(1024)
        if not data:
            print('closing',conn)
            sel.unregister(conn)
            conn.close()
            return
        conn.send(data.upper()+b'_SB')
    except Exception:
        print('closing', conn)
        sel.unregister(conn)
        conn.close()



server_fileobj=socket(AF_INET,SOCK_STREAM)
server_fileobj.setsockopt(SOL_SOCKET,SO_REUSEADDR,1)
server_fileobj.bind(('127.0.0.1',8088))
server_fileobj.listen(5)
server_fileobj.setblocking(False)  # put the listening socket in non-blocking mode
# Equivalent to appending server_fileobj to select's read list
# and binding the callback accept to it
sel.register(server_fileobj, selectors.EVENT_READ, accept)

while True:
    events = sel.select()  # check all registered file objects for ones whose data is ready
    for sel_obj, mask in events:
        callback = sel_obj.data          # e.g. callback = accept
        callback(sel_obj.fileobj, mask)  # e.g. accept(server_fileobj, 1)

# Client
from socket import *
c=socket(AF_INET,SOCK_STREAM)
c.connect(('127.0.0.1',8088))

while True:
    msg=input('>>: ')
    if not msg:continue
    c.send(msg.encode('utf-8'))
    data=c.recv(1024)
    print(data.decode('utf-8'))

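4. Asynchronous I/O

asyncio is a standard-library module introduced in Python 3.4 that provides built-in support for asynchronous I/O.

The asyncio programming model is a message loop. We get a reference to an EventLoop from the asyncio module and hand it the coroutines we want to run. That is all it takes to get asynchronous I/O.

Hello world with asyncio looks like this. The code uses the generator-based style of Python 3.4; @asyncio.coroutine was deprecated in Python 3.8 and removed in Python 3.11, so it runs only on older interpreters: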

import asyncio

@asyncio.coroutine
def hello():
    print("Hello world!")
    # Call asyncio.sleep(1) asynchronously:
    r = yield from asyncio.sleep(1)
    print("Hello again!")

# Get the EventLoop:
loop = asyncio.get_event_loop()
# Run the coroutine
loop.run_until_complete(hello())
loop.close()

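@asyncio.coroutine marks a generator as a coroutine, and we then hand that coroutine to the EventLoop to run.

hello() first prints Hello world!. The yield from syntax then lets us conveniently call another generator. Because asyncio.sleep() is also a coroutine, the thread does not wait for asyncio.sleep(). It suspends hello() and moves on to the next iteration of the message loop. When asyncio.sleep() returns, the thread receives the return value from yield from (None here) and continues with the next statement.

Think of asyncio.sleep(1) as an I/O operation that takes 1 second. During that time the main thread does not wait; it runs whatever other coroutines in the EventLoop are ready, which is how concurrent execution is achieved.

Let's try wrapping two coroutines in Tasks: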

import threading
import asyncio

@asyncio.coroutine
def hello():
    print('Hello world! (%s)' % threading.current_thread())
    yield from asyncio.sleep(1)
    print('Hello again! (%s)' % threading.current_thread())

loop = asyncio.get_event_loop()
tasks = [hello(), hello()]
loop.run_until_complete(asyncio.wait(tasks))
loop.close()

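Observe the execution:

Hello world! (<_MainThread(MainThread, started 140735195337472)>)
Hello world! (<_MainThread(MainThread, started 140735195337472)>)
(pauses for about 1 second)
Hello again! (<_MainThread(MainThread, started 140735195337472)>)
Hello again! (<_MainThread(MainThread, started 140735195337472)>)

The printed thread name shows that both coroutines are executed concurrently by the same thread.

If asyncio.sleep() is replaced with a real I/O operation, multiple coroutines can still be executed concurrently by a single thread.

Let's use asyncio's asynchronous network connections to fetch the home pages of sina, sohu, and 163: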

import asyncio

@asyncio.coroutine
def wget(host):
    print('wget %s...' % host)
    connect = asyncio.open_connection(host, 80)
    reader, writer = yield from connect
    header = 'GET / HTTP/1.0\r\nHost: %s\r\n\r\n' % host
    writer.write(header.encode('utf-8'))
    yield from writer.drain()
    while True:
        line = yield from reader.readline()
        if line == b'\r\n':
            break
        print('%s header > %s' % (host, line.decode('utf-8').rstrip()))
    # Ignore the body, close the socket
    writer.close()

loop = asyncio.get_event_loop()
tasks = [wget(host) for host in ['www.sina.com.cn', 'www.sohu.com', 'www.163.com']]
loop.run_until_complete(asyncio.wait(tasks))
loop.close()

The result looks like this:

wget www.sohu.com...
wget www.sina.com.cn...
wget www.163.com...
(waits for a while)
(prints the sohu headers)
www.sohu.com header > HTTP/1.1 200 OK
www.sohu.com header > Content-Type: text/html
...
(prints the sina headers)
www.sina.com.cn header > HTTP/1.1 200 OK
www.sina.com.cn header > Date: Wed, 20 May 2015 04:56:33 GMT
...
(prints the 163 headers)
www.163.com header > HTTP/1.0 302 Moved Temporarily
www.163.com header > Server: Cdn Cache Server V2.0
...

So the three connections are completed concurrently by one thread through coroutines.
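
The same stream API can be tried without reaching external sites. The sketch below is an assumption-laden local variant, not the article's code: it starts a small echo server on 127.0.0.1 with asyncio.start_server, uses the async/await syntax of Python 3.5+ together with asyncio.run (Python 3.7+), and the names handle, fetch, main, and demo are hypothetical.

```python
import asyncio


async def handle(reader, writer):
    """Server side: echo one line back to the client."""
    line = await reader.readline()
    writer.write(b'echo: ' + line)
    await writer.drain()
    writer.close()
    await writer.wait_closed()


async def fetch(port, text):
    """Client side: send one line and read the one-line reply."""
    reader, writer = await asyncio.open_connection('127.0.0.1', port)
    writer.write(text.encode('utf-8') + b'\n')
    await writer.drain()
    reply = await reader.readline()
    writer.close()
    await writer.wait_closed()
    return reply.decode('utf-8').rstrip()


async def main(texts):
    # Port 0 asks the OS for any free port; read back the one it picked.
    server = await asyncio.start_server(handle, '127.0.0.1', 0)
    port = server.sockets[0].getsockname()[1]
    async with server:
        # All client connections are driven concurrently by one thread.
        return await asyncio.gather(*(fetch(port, t) for t in texts))


def demo():
    return asyncio.run(main(['sina', 'sohu', '163']))


if __name__ == '__main__':
    print(demo())
```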

async/await

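The @asyncio.coroutine decorator provided by asyncio marks a generator as a coroutine, and inside a coroutine yield from calls another coroutine to perform an asynchronous operation.

To simplify asynchronous I/O and mark it more clearly, Python 3.5 introduced the new async and await syntax, which makes coroutine code more concise and readable.

Note that async and await are new syntax for coroutines. Switching to them takes just two simple substitutions:

Replace @asyncio.coroutine with async;
Replace yield from with await.

Compare with the code from the previous section: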

@asyncio.coroutine
def hello():
    print("Hello world!")
    r = yield from asyncio.sleep(1)
    print("Hello again!")

Rewritten with the new syntax:

async def hello():
    print("Hello world!")
    r = await asyncio.sleep(1)
    print("Hello again!")

The rest of the code stays the same.
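
For reference, a complete runnable version might look like the sketch below. It assumes Python 3.7 or later, where asyncio.run() can replace the explicit get_event_loop() / run_until_complete() / close() sequence. The log list and the run_hello helper are additions for illustration, not part of the original example.

```python
import asyncio


async def hello(log):
    log.append('Hello world!')
    await asyncio.sleep(0.1)      # stands in for a slow I/O operation
    log.append('Hello again!')


def run_hello():
    log = []
    # asyncio.run creates an event loop, runs the coroutine, and closes the loop.
    asyncio.run(hello(log))
    return log


if __name__ == '__main__':
    for line in run_hello():
        print(line)
```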

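Summary

asyncio provides thorough support for asynchronous I/O;

Asynchronous operations are performed inside a coroutine with yield from (or with await from Python 3.5 on);

Multiple coroutines can be wrapped as a group of Tasks and executed concurrently.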
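
The last point can be sketched with asyncio.create_task and asyncio.gather (Python 3.7+). The worker coroutine, its delays, and the run_tasks helper are hypothetical; the delays only simulate I/O waits.

```python
import asyncio
import time


async def worker(name, delay, log):
    log.append(name + ' start')
    await asyncio.sleep(delay)        # simulated I/O wait
    log.append(name + ' done')


async def main(log):
    # Wrapping coroutines in Tasks schedules them on the event loop right away.
    t1 = asyncio.create_task(worker('a', 0.3, log))
    t2 = asyncio.create_task(worker('b', 0.2, log))
    await asyncio.gather(t1, t2)


def run_tasks():
    log = []
    started = time.perf_counter()
    asyncio.run(main(log))
    return log, time.perf_counter() - started


if __name__ == '__main__':
    events, elapsed = run_tasks()
    print(events, round(elapsed, 2))
```

Both workers start before either finishes, and the total time is close to the longest delay rather than the sum of the two, which is what concurrent execution on a single thread means here.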
