網絡編程之進程池與線程池

時間 2019-11-11

原文原文鏈接

網絡編程之進程池與線程池

1、進程池與線程池

在剛開始學多進程或多線程時，咱們火燒眉毛地基於多進程或多線程實現併發的套接字通訊，然而這種實現方式的致命缺陷是：服務的開啓的進程數或線程數都會隨着併發的客戶端數目地增多而增多，這會對服務端主機帶來巨大的壓力，甚至於不堪重負而癱瘓，因而咱們必須對服務端開啓的進程數或線程數加以控制，讓機器在一個本身能夠承受的範圍內運行，這就是進程池或線程池的用途，例如進程池，就是用來存放進程的池子，本質仍是基於多進程，只不過是對開啓進程的數目加上了限制。html

介紹：python

官網：https://docs.python.org/dev/library/concurrent.futures.html

concurrent.futures模塊提供了高度封裝的異步調用接口
ThreadPoolExecutor：線程池，提供異步調用
ProcessPoolExecutor: 進程池，提供異步調用
Both implement the same interface, which is defined by the abstract Executor class.

基本方法：git

一、submit(fn, *args, **kwargs)
異步提交任務

二、map(func, *iterables, timeout=None, chunksize=1) 
取代for循環submit的操做

三、shutdown(wait=True) 
至關於進程池的pool.close()+pool.join()操做
wait=True，等待池內全部任務執行完畢回收完資源後才繼續
wait=False，當即返回，並不會等待池內的任務執行完畢
但無論wait參數爲什麼值，整個程序都會等到全部任務執行完畢
submit和map必須在shutdown以前

四、result(timeout=None)
取得結果

五、add_done_callback(fn)
回調函數

2、進程池

介紹：github

一、The ProcessPoolExecutor class is an Executor subclass that uses a pool of processes to execute calls asynchronously. ProcessPoolExecutor uses the multiprocessing module, which allows it to side-step the Global Interpreter Lock but also means that only picklable objects can be executed and returned.

二、class concurrent.futures.ProcessPoolExecutor(max_workers=None, mp_context=None)

三、An Executor subclass that executes calls asynchronously using a pool of at most max_workers processes. If max_workers is None or not given, it will default to the number of processors on the machine. If max_workers is lower or equal to 0, then a ValueError will be raised.

用法：編程

from concurrent.futures import ThreadPoolExecutor,ProcessPoolExecutor
import os,time,random
def task(n):
    print('%s is runing' %os.getpid())
    time.sleep(random.randint(1,3))
    return n**2
if __name__ == '__main__':
    executor=ProcessPoolExecutor(max_workers=3)
    futures=[]
    for i in range(11):
        future=executor.submit(task,i)
        futures.append(future)
    executor.shutdown(True)
    print('+++>')
    for future in futures:
        print(future.result())

3、線程池

介紹：json

一、ThreadPoolExecutor is an Executor subclass that uses a pool of threads to execute calls asynchronously.

二、class concurrent.futures.ThreadPoolExecutor(max_workers=None, thread_name_prefix='')

三、An Executor subclass that uses a pool of at most max_workers threads to execute calls asynchronously.

四、Changed in version 3.5: If max_workers is None or not given, it will default to the number of processors on the machine, multiplied by 5, assuming that ThreadPoolExecutor is often used to overlap I/O instead of CPU work and the number of workers should be higher than the number of workers for ProcessPoolExecutor.

五、New in version 3.6: The thread_name_prefix argument was added to allow users to control the threading.Thread names for worker threads created by the pool for easier debugging.

用法：網絡

把ProcessPoolExecutor換成ThreadPoolExecutor，其他用法所有相同

4、map方法

from concurrent.futures import ThreadPoolExecutor,ProcessPoolExecutor
import os,time,random
def task(n):
    print('%s is runing' %os.getpid())
    time.sleep(random.randint(1,3))
    return n**2
if __name__ == '__main__':
    executor=ThreadPoolExecutor(max_workers=3)
    # for i in range(11):
    #     future=executor.submit(task,i)
    executor.map(task,range(1,12)) #map取代了for+submit

5、回調函數

能夠爲進程池或線程池內的每一個進程或線程綁定一個函數，該函數在進程或線程的任務執行完畢後自動觸發，並接收任務的返回值看成參數，該函數稱爲回調函數。多線程

from concurrent.futures import ThreadPoolExecutor,ProcessPoolExecutor
from multiprocessing import Pool
import requests
import json
import os
def get_page(url):
    print('<進程%s> get %s' %(os.getpid(),url))
    respone=requests.get(url)
    if respone.status_code == 200:
        return {'url':url,'text':respone.text}
def parse_page(res):
    res=res.result()
    print('<進程%s> parse %s' %(os.getpid(),res['url']))
    parse_res='url:<%s> size:[%s]\n' %(res['url'],len(res['text']))
    with open('db.txt','a') as f:
        f.write(parse_res)
if __name__ == '__main__':
    urls=[
        'https://www.baidu.com',
        'https://www.python.org',
        'https://www.openstack.org',
        'https://help.github.com/',
        'http://www.sina.com.cn/'
    ]
    p=ProcessPoolExecutor(3)
    for url in urls:
        p.submit(get_page,url).add_done_callback(parse_page) #parse_page拿到的是一個future對象obj，須要用obj.result()拿到結果

相關標籤/搜索

每日一句

每一个你不满意的现在，都有一个你没有努力的曾经。