Python Multiprocess Programming Basics (Illustrated Edition)

Date: 2019-11-17

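Multiprocess programming is essential knowledge for Python programmers moving to an advanced level. We are used to driving processes through the multiprocessing library, yet often do not know how it works underneath. Below I briefly walk through the commonly used building blocks of multiprocess programming, calling the native process primitives directly, to help readers understand the mechanism. The code runs on Linux. If you do not have a Linux machine, you can try it in Docker or a virtual machine.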

docker pull python:2.7

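Creating a Child Process

Python creates a child process with os.fork(). The fork call returns twice, once in the parent process and once in the child: in the parent it returns the child's pid, and in the child it returns 0. If the fork fails, usually because the operating system is short on resources, the underlying C call returns a negative value; Python's os.fork() reports that case by raising OSError instead.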

import os

def create_child():
    pid = os.fork()
    if pid > 0:
        print 'in father process'
        return True
    elif pid == 0:
        print 'in child process'
        return False
    else:
        # os.fork() raises OSError on failure, so this branch is never reached
        raise OSError('fork failed')
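To make the "returns twice" behaviour concrete, here is a minimal sketch that handles the failure case as an exception and prints the process IDs seen on each side. The helper name fork_once is hypothetical, and the code uses single-argument print() so that it also runs on Python 3.

```python
import os
import sys


def fork_once():
    """Fork once and return (role, pid); role is 'parent' or 'child'."""
    try:
        pid = os.fork()
    except OSError as exc:
        # A failed fork surfaces as an exception, not as a negative pid
        raise RuntimeError('fork failed: %s' % exc)
    if pid == 0:
        return 'child', os.getpid()
    return 'parent', pid


parent_pid = os.getpid()
role, pid = fork_once()
if role == 'child':
    print('child %d, my parent is %d' % (pid, os.getppid()))
    sys.stdout.flush()  # os._exit skips buffered output, so flush first
    os._exit(0)  # leave at once so the child runs nothing further
os.waitpid(pid, 0)  # wait for the child to finish before reporting
print('parent %d created child %d' % (parent_pid, pid))
```

The child leaves through os._exit rather than falling off the end of the script, so any code placed after this snippet runs in the parent only.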

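Creating Multiple Child Processes

Calling create_child several times creates several child processes, provided that create_child only ever runs in the parent process. A child process must not call it again.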

# coding: utf-8
# child.py
import os

def create_child(i):
    pid = os.fork()
    if pid > 0:
        print 'in father process'
        return pid
    elif pid == 0:
        print 'in child process', i
        return 0
    else:
        # os.fork() raises OSError on failure, so this branch is never reached
        raise OSError('fork failed')

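for i in range(10):  # loop 10 times to create 10 child processes
    pid = create_child(i)
    # pid == 0 means we are in a child: leave the loop at once, otherwise
    # the child would fork children of its own, and so would its
    # descendants, producing far too many processes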
    if pid == 0:
        break

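Running python child.py prints output like the following; the order in which parent and child lines interleave can differ between runs: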

in father process
in father process
in child process 0
in child process 1
in father process
in child process 2
in father process
in father process
in child process 3
in father process
in child process 4
in child process 5
in father process
in father process
in child process 6
in child process 7
in father process
in child process 8
in father process
in child process 9
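A variation on the loop above, as a sketch: the parent keeps every child's pid in a list so it can wait for them afterwards, and each child exits on the spot instead of breaking out of the loop. The name spawn_children and the use of the loop index as the exit code are illustrative choices, not part of the original program.

```python
import os


def spawn_children(n):
    """Fork n children from the parent only and return their pids."""
    pids = []
    for i in range(n):
        pid = os.fork()
        if pid == 0:
            # Child: exit right here so it never continues the loop
            os._exit(i)
        pids.append(pid)
    return pids


pids = spawn_children(3)
exit_codes = []
for pid in pids:
    _, status = os.waitpid(pid, 0)  # wait for this particular child
    exit_codes.append(os.WEXITSTATUS(status))
print(exit_codes)  # [0, 1, 2]
```

Because the parent waits for the pids in the order it created them, the list of exit codes is deterministic even though the children themselves may run in any order.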

Putting a Process to Sleep

time.sleep suspends the process for any length of time. The unit is seconds, and the value may be fractional.

import time

for i in range(5):
    print 'hello'
    time.sleep(1)  # sleep for 1s
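Since the argument may be fractional, a short sketch that sleeps for a quarter of a second and measures how long the call actually took:

```python
import time

start = time.time()
time.sleep(0.25)  # a quarter of a second
elapsed = time.time() - start
print('slept for about %.2f seconds' % elapsed)
```

The measured time is usually slightly longer than the requested value, because the process has to be scheduled again after the sleep ends.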

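Killing a Child Process

os.kill(pid, sig_num) sends a signal to the process whose process ID is pid. Commonly used values of sig_num are SIGKILL (forced kill, equivalent to kill -9), SIGTERM (asks the target to exit, equivalent to kill with no signal argument), and SIGINT (equivalent to pressing Ctrl+C on the keyboard).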

# coding: utf-8
# kill.py

import os
import time
import signal


def create_child():
    pid = os.fork()
    if pid > 0:
        return pid
    elif pid == 0:
        return 0
    else:
        # os.fork() raises OSError on failure, so this branch is never reached
        raise OSError('fork failed')


pid = create_child()
if pid == 0:
    while True:  # child prints a string in an endless loop
        print 'in child process'
        time.sleep(1)
else:
    print 'in father process'
    time.sleep(5)  # parent sleeps 5s, then kills the child
    os.kill(pid, signal.SIGKILL)
    time.sleep(5)  # parent sleeps another 5s to see whether the child still prints

Running python kill.py, we see the following console output:

in father process
in child process
# wait 1s
in child process
# wait 1s
in child process
# wait 1s
in child process
# wait 1s
in child process
# waited 5s

This shows that once os.kill has run, the child process stops producing output.
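The signal names used with os.kill are ordinary integers. A small sketch that prints the three mentioned in this section; the values in the comments are the ones used on Linux and may differ on other systems.

```python
import signal

# Expected on Linux: SIGINT = 2, SIGKILL = 9, SIGTERM = 15
for name in ('SIGINT', 'SIGKILL', 'SIGTERM'):
    print('%s = %d' % (name, getattr(signal, name)))
```

This is why kill -9 on the command line and os.kill(pid, signal.SIGKILL) in Python do the same thing.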

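Zombie Child Processes

In the example above, if we quickly check the process state with ps -ef|grep python right after os.kill has run, we find the child process shown with an odd marker, <defunct>: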

root        12     1  0 11:22 pts/0    00:00:00 python kill.py
root        13    12  0 11:22 pts/0    00:00:00 [python] <defunct>

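Once the parent process terminates, the child disappears with it. So what does <defunct> mean? It marks a "zombie process". When a child process ends, it immediately becomes a zombie. Like a corpse, a zombie does nothing, yet the operating system does not release everything it held: its entry in the process table (its pid and exit status) stays occupied until the parent collects it.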
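The zombie state can also be observed from inside Python. The sketch below is Linux-only, because it reads the state letter from /proc/<pid>/stat, where Z stands for zombie. The helper name proc_state is hypothetical, and the final os.waitpid call is the reaping step covered in the next section.

```python
import os
import time


def proc_state(pid):
    """Return the one-letter process state from /proc/<pid>/stat (Linux only)."""
    with open('/proc/%d/stat' % pid) as f:
        data = f.read()
    # The state letter follows the parenthesised command name
    return data.rsplit(')', 1)[1].split()[0]


pid = os.fork()
if pid == 0:
    os._exit(0)  # child exits at once

state = None
for _ in range(50):  # poll for up to about 1s until the child has exited
    state = proc_state(pid)
    if state == 'Z':
        break
    time.sleep(0.02)
print(state)  # Z

os.waitpid(pid, 0)  # collect the child so the zombie entry is released
gone = not os.path.exists('/proc/%d' % pid)
print(gone)  # True
```

The child has finished running, but its /proc entry stays visible in state Z until the parent collects it.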

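Reaping Child Processes

How do we get rid of a zombie process for good? The parent process has to call waitpid(pid, options) to "reap" the child; only then does the child vanish completely. waitpid returns the child's exit status. Think of it as the child's last words: the child cannot rest until the parent has heard them.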

# coding: utf-8

import os
import time
import signal


def create_child():
    pid = os.fork()
    if pid > 0:
        return pid
    elif pid == 0:
        return 0
    else:
        # os.fork() raises OSError on failure, so this branch is never reached
        raise OSError('fork failed')


pid = create_child()
if pid == 0:
    while True:  # child prints a string in an endless loop
        print 'in child process'
        time.sleep(1)
else:
    print 'in father process'
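    time.sleep(5)  # parent sleeps 5s, then kills the child
    os.kill(pid, signal.SIGKILL)  # SIGKILL, matching the (pid, 9) result shown below
    ret = os.waitpid(pid, 0)  # reap the child
    print ret  # see what actually comes back
    time.sleep(5)  # parent sleeps another 5s to check whether the child still exists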

Running python kill.py prints:

in father process
in child process
in child process
in child process
in child process
in child process
in child process
(125, 9)

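We see that waitpid returns a tuple. The first element is the child's pid. What about the second element, 9? Its meaning varies between operating systems, but on Unix it is normally a 16-bit integer: the high 8 bits hold the process's exit status, and the low 8 bits hold the number of the signal that caused the process to exit (the top bit of that low byte is set if a core file was produced). So in this example the exit status is 0 and the signal number is 9. Remember the kill -9 command? That 9 is the signal that forcibly kills a process.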

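If we switch os.kill to a different signal and look at the result again, for example os.kill(pid, signal.SIGTERM), the returned value becomes (138, 15), where 15 is the integer value of SIGTERM.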
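Rather than picking the 16-bit status apart by hand, the os module provides helpers for it: os.WIFEXITED, os.WEXITSTATUS, os.WIFSIGNALED and os.WTERMSIG. The following sketch contrasts a child that exits by itself with one that is killed; the function name describe and the exit code 3 are illustrative.

```python
import os
import signal
import time


def describe(status):
    """Turn a waitpid status into a readable sentence."""
    if os.WIFSIGNALED(status):
        return 'killed by signal %d' % os.WTERMSIG(status)
    if os.WIFEXITED(status):
        return 'exited with code %d' % os.WEXITSTATUS(status)
    return 'other status %d' % status


# First child: exits on its own with code 3
pid = os.fork()
if pid == 0:
    os._exit(3)
_, status = os.waitpid(pid, 0)
first = (status, describe(status))

# Second child: sleeps until the parent sends SIGKILL
pid = os.fork()
if pid == 0:
    while True:
        time.sleep(1)
os.kill(pid, signal.SIGKILL)
_, status = os.waitpid(pid, 0)
second = (status, describe(status))

print(first)   # (768, 'exited with code 3')
print(second)  # (9, 'killed by signal 9')
```

The value 768 is 3 shifted into the high byte (3 * 256), while the killed child carries the signal number 9 in the low byte.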

waitpid(pid, 0) also serves to wait for the child to finish: if the child has not exited yet, the call blocks until it does.
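When blocking is not wanted, the options argument can be set to os.WNOHANG, in which case waitpid returns (0, 0) at once if the child is still running. A minimal sketch of polling this way; the polling interval and the retry limit are arbitrary choices.

```python
import os
import signal
import time

pid = os.fork()
if pid == 0:
    while True:  # child keeps running until it is killed
        time.sleep(1)

# The child is still alive, so a non-blocking wait returns (0, 0) at once
first = os.waitpid(pid, os.WNOHANG)
print(first)  # (0, 0)

os.kill(pid, signal.SIGKILL)
status = None
for _ in range(500):  # poll instead of blocking, for at most about 5s
    done, status = os.waitpid(pid, os.WNOHANG)
    if done == pid:
        break
    time.sleep(0.01)
print(status)  # 9
```

Polling like this lets a parent do other work between checks instead of sitting inside a blocking waitpid call.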

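Catching Signals

The default action for SIGTERM is to terminate the process, but we can install our own handler for SIGTERM so that the process does not exit.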

# coding: utf-8

import os
import time
import signal


def create_child():
    pid = os.fork()
    if pid > 0:
        return pid
    elif pid == 0:
        return 0
    else:
        # os.fork() raises OSError on failure, so this branch is never reached
        raise OSError('fork failed')


pid = create_child()
if pid == 0:
    signal.signal(signal.SIGTERM, signal.SIG_IGN)
    while True:  # child prints a string in an endless loop
        print 'in child process'
        time.sleep(1)
else:
    print 'in father process'
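    time.sleep(5)  # parent sleeps 5s before signalling the child
    os.kill(pid, signal.SIGTERM)  # send SIGTERM
    time.sleep(5)  # parent sleeps another 5s to check whether the child still exists
    os.kill(pid, signal.SIGKILL)  # send SIGKILL
    time.sleep(5)  # parent sleeps another 5s to check whether the child still exists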

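In the child process we installed a handler for SIGTERM; SIG_IGN means the signal is ignored. After the first os.kill the child keeps printing, which shows that it was not killed. After the second os.kill the child finally stops printing, because SIGKILL can be neither caught nor ignored.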
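The same experiment can be checked programmatically instead of by watching the output. In this sketch the child tells the parent through a pipe that SIG_IGN is installed before the parent sends SIGTERM; the pipe is an addition of this example (not part of the program above) and removes the small chance of the signal arriving before the handler is set.

```python
import os
import signal
import time

r, w = os.pipe()
pid = os.fork()
if pid == 0:
    os.close(r)
    signal.signal(signal.SIGTERM, signal.SIG_IGN)
    os.write(w, b'R')  # tell the parent that SIGTERM is now ignored
    while True:
        time.sleep(1)

os.close(w)
os.read(r, 1)  # block until the child reports it is ready
os.close(r)

os.kill(pid, signal.SIGTERM)
time.sleep(0.2)  # give the signal time to be delivered
still_running = os.waitpid(pid, os.WNOHANG) == (0, 0)
print(still_running)  # True: SIGTERM was ignored

os.kill(pid, signal.SIGKILL)  # SIGKILL cannot be ignored
_, status = os.waitpid(pid, 0)
print(status)  # 9
```
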

Next we switch to a custom signal handler: when the child receives SIGTERM, it prints a line and then exits.

# coding: utf-8

import os
import sys
import time
import signal


def create_child():
    pid = os.fork()
    if pid > 0:
        return pid
    elif pid == 0:
        return 0
    else:
        # os.fork() raises OSError on failure, so this branch is never reached
        raise OSError('fork failed')


def i_will_die(sig_num, frame):  # custom signal handler
    print "child will die"
    sys.exit(0)


pid = create_child()
if pid == 0:
    signal.signal(signal.SIGTERM, i_will_die)
    while True:  # child prints a string in an endless loop
        print 'in child process'
        time.sleep(1)
else:
    print 'in father process'
    time.sleep(5)  # parent sleeps 5s, then asks the child to exit
    os.kill(pid, signal.SIGTERM)
    time.sleep(5)  # parent sleeps another 5s to check whether the child still exists

The output is:

in father process
in child process
in child process
in child process
in child process
in child process
child will die

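A signal handler takes two parameters. The first, sig_num, is the integer value of the signal that was caught. The second, frame, is harder to grasp and rarely used: it is the Python stack frame object that was executing when the signal interrupted the program. Readers do not need to understand it in depth.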
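To see both handler arguments in use, this sketch has the child's handler send the signal number and the name of the interrupted code object back to the parent through a pipe. The handler name report_and_exit and the pipe-based reporting are assumptions made for this example; for code interrupted at the top level of a script, the name is typically <module>.

```python
import os
import signal
import time


def report_and_exit(sig_num, frame):
    # frame is the interrupted Python stack frame (it may be None)
    where = frame.f_code.co_name if frame is not None else '?'
    os.write(w, ('%d %s' % (sig_num, where)).encode('ascii'))
    os._exit(0)


r, w = os.pipe()
pid = os.fork()
if pid == 0:
    os.close(r)
    signal.signal(signal.SIGTERM, report_and_exit)
    os.write(w, b'R')  # tell the parent the handler is installed
    while True:
        time.sleep(1)

os.close(w)
os.read(r, 1)  # block until the child reports it is ready
os.kill(pid, signal.SIGTERM)
message = os.read(r, 64).decode('ascii')  # what the handler wrote
os.close(r)
_, status = os.waitpid(pid, 0)
print(message)  # starts with 15, the number of SIGTERM on Linux
print(status)   # 0: the child exited normally from inside its handler
```

Because the handler ends the process with a normal exit, the status seen by the parent is 0 rather than the signal number 15.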

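A Multiprocess Parallel Computation Example

Next we use multiple processes to compute pi. There is a series whose limit gives pi, and we will use that formula: pi^2 / 8 = 1/1^2 + 1/3^2 + 1/5^2 + ..., so pi = sqrt(8 * (1/1^2 + 1/3^2 + 1/5^2 + ...)).

First, the single-process version: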

import math

def pi(n):
    s = 0.0
    for i in range(n):
        s += 1.0/(2*i+1)/(2*i+1)
    return math.sqrt(8 * s)

print pi(10000000)

Output:

3.14159262176

The program takes a little while to produce a result, but the value is already very close to pi.

Next comes the multiprocess version, which uses Redis for inter-process communication.

# coding: utf-8

import os
import sys
import math
import redis


def slice(mink, maxk):
    s = 0.0
    for k in range(mink, maxk):
        s += 1.0/(2*k+1)/(2*k+1)
    return s


def pi(n):
    pids = []
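    unit = n // 10  # integer division: number of terms per child
    client = redis.StrictRedis()
    client.delete("result")  # make sure the result list starts empty
    del client  # drop the connection before forking
    for i in range(10):  # split the work across 10 child processes
        mink = unit * i
        maxk = mink + unit
        pid = os.fork()
        if pid > 0:
            pids.append(pid)
        else:
            s = slice(mink, maxk)  # child starts computing its slice
            client = redis.StrictRedis()
            client.rpush("result", repr(s))  # hand over the result; repr keeps full float precision
            sys.exit(0)  # child is done
    for pid in pids:
        os.waitpid(pid, 0)  # wait for each child to finish
    sum = 0
    client = redis.StrictRedis()
    for s in client.lrange("result", 0, -1):
        sum += float(s)  # collect the children's partial sums
    return math.sqrt(sum * 8)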


print pi(10000000)
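For readers without a Redis server at hand, the same split-and-collect idea can be sketched with plain pipes. This swaps in a different technique from the Redis version above: each child writes its partial sum into its own os.pipe and the parent reads the results back. The names partial_sum and pi_with_pipes, the default of 4 workers, and the smaller n are all choices made for this example.

```python
import math
import os


def partial_sum(mink, maxk):
    """Sum 1/(2k+1)^2 for k in [mink, maxk)."""
    s = 0.0
    for k in range(mink, maxk):
        s += 1.0 / (2 * k + 1) / (2 * k + 1)
    return s


def pi_with_pipes(n, workers=4):
    unit = n // workers  # terms per child (n should be divisible by workers)
    children = []
    for i in range(workers):
        r, w = os.pipe()
        pid = os.fork()
        if pid == 0:
            os.close(r)
            s = partial_sum(unit * i, unit * (i + 1))
            os.write(w, repr(s).encode('ascii'))  # send the result to the parent
            os._exit(0)
        os.close(w)  # parent keeps only the read end
        children.append((pid, r))

    total = 0.0
    for pid, r in children:
        data = b''
        while True:  # read until the child closes its end of the pipe
            chunk = os.read(r, 64)
            if not chunk:
                break
            data += chunk
        os.close(r)
        os.waitpid(pid, 0)  # reap the child
        total += float(data.decode('ascii'))
    return math.sqrt(8 * total)


result = pi_with_pipes(100000)
print(result)
```

With only 100000 terms the result agrees with pi to about five decimal places. A larger n gets closer, at the cost of more computation in each child.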