Python: Wrapping Up Processes and a First Look at Threads

Date: 2019-12-09


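1. Data sharing

　　from multiprocessing import Manager

　　Manager re-wraps all of the more convenient data-sharing classes and, on top of what multiprocessing already provides, adds two new shared types: list and dict.

　　Limitation: the set of supported data types is very limited.

　　　　The managed list and dict are not data-safe for compound operations such as d['count'] -= 1; you have to add a lock yourself to keep the data safe.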

from multiprocessing import Manager,Process,Lock

def work(d,lock):
    with lock:
        d['count'] -= 1

if __name__ == '__main__':
    lock = Lock()
    with Manager() as m:   # m = Manager()
        dic = m.dict({'count':100})
        p_lst = []
        for i in range(10):
            p = Process(target=work, args=(dic, lock))
            p_lst.append(p)
            p.start()
        for p in p_lst:
            p.join()
        print(dic)
#{'count': 90}

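with ......
    a block of statements
The dis module
Python's context management
Before the block of statements runs, some action is performed automatically (open)
After the block of statements runs, some action is performed automatically (close)

Object-oriented magic methods (double-underscore methods)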
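
To make the link between the with statement and the magic methods concrete, here is a minimal sketch of a context manager. The class name Tracked and its events list are hypothetical, invented only for this illustration; the __enter__ and __exit__ hooks are the standard protocol that with relies on.

```python
class Tracked:
    """Hypothetical context manager that records when it is entered and exited."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        # Runs automatically before the body of the with block (the "open" step).
        self.events.append('enter')
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Runs automatically after the body, even if the body raised (the "close" step).
        self.events.append('exit')
        return False  # do not swallow exceptions


with Tracked() as t:
    t.events.append('body')

print(t.events)   # ['enter', 'body', 'exit']
```

Manager() and Lock() in the example above can be used with with for the same reason: both objects provide these two methods.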

# Callback functions in Pool

import os
from multiprocessing import Pool

def func(i):
    print('first task', os.getpid())
    return '*'*i

def call_back(res):   # callback function
    print('callback:', os.getpid())
    print('res--->', res)

if __name__ == '__main__':
    p = Pool()
    print('main process', os.getpid())
    p.apply_async(func, args=(1,), callback=call_back)
    p.close()
    p.join()

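　　After func finishes, the callback function is executed.

　　The return value of func is passed to the callback as its argument.

　　The callback runs in the main process.

　　Use case: the child processes have heavy computation to do, and the callback waits for each result and does some simple processing on it.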

import re
from urllib.request import urlopen
from multiprocessing import Pool

url_lst = [
    'http://www.baidu.com',
    'http://www.sohu.com',
    'http://www.sogou.com',
    'http://www.4399.com',
    'http://www.cnblogs.com',
]

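def get_url(url):
    response = urlopen(url)
    # raw string, so that \. is a regex escape rather than a string escape
    ret = re.search(r'www\.(.*?)\.com', url)
    print('%s finished' % ret.group(1))
    return ret.group(1), response.read()

def call(content):
    # receives the (site name, page bytes) tuple returned by get_url
    name, con = content
    with open(name + '.html', 'wb') as f:
        f.write(con)

if __name__ == '__main__':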
    p = Pool()
    for url in url_lst:
        p.apply_async(get_url,args=(url,),callback=call)
    p.close()
    p.join()

The child processes fetch the web pages; the main process handles the results.

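2. Thread fundamentals

　　A process is the smallest unit of resource allocation in a computer, and a process is still a noticeable burden on the operating system.

　　When a process is created, the resources the operating system allocates to it include roughly: code, data, files, and so on.

(1) Why threads exist

　　A thread is a lightweight concept. It owns no process resources of its own: a thread is only responsible for executing code and has no separate code, data, or files.

　　A thread is the smallest unit that the CPU can schedule; on most computers today, what the CPU executes is code running inside threads.

　　The relationship between threads and processes: every process has at least one thread doing work.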

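Characteristics of threads:

　　All threads in the same process share that process's resources.

　　They are lightweight and have no resources of their own.

Differences between processes and threads:

　　The resources they occupy, how efficiently they are scheduled, and whether resources are shared.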

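Parallelism of threads:

　　Threads can run in parallel in Java, C++, C#, and so on.

　　In CPython, multiple threads within one process cannot run in parallel.

　　The reason: the CPython interpreter contains a global interpreter lock (GIL), so threads cannot make full use of multiple cores. At any given moment, only one thread in a process can be executing Python code on the CPU.

　　The GIL does limit the efficiency of your program, but for now it helps make switching between threads more efficient.

　　If you want to write compute-heavy programs, use multiple processes or switch to a different interpreter.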
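
The other side of this rule is that I/O-heavy work still benefits from threads, because a thread that is blocked waiting releases the GIL. The following is a rough sketch under stated assumptions: time.sleep stands in for a blocking I/O call, and fake_io and results are hypothetical names used only here.

```python
import time
from threading import Thread

def fake_io(results, i):
    # time.sleep stands in for a blocking I/O call; the GIL is released while waiting.
    time.sleep(0.2)
    results[i] = i * i

results = [None] * 5
start = time.time()
threads = [Thread(target=fake_io, args=(results, i)) for i in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.time() - start

print(results)                      # [0, 1, 4, 9, 16]
print('elapsed: %.2fs' % elapsed)   # typically close to 0.2s rather than 5 * 0.2s
```

Run one after another, the five waits would take about a second; with threads they overlap. A pure computation in place of the sleep would not overlap in the same way under the GIL.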

(2) The threading module

# Concurrency

import os
from threading import Thread

def func(i):
    print('child thread:', i, os.getpid())

print('main thread', os.getpid())
for i in range(10):
    t = Thread(target=func, args=(i,))
    t.start()

# The gap between processes and threads

import os
import time
from threading import Thread
from multiprocessing import Process


def func(i):
    print('child:', os.getpid())

if __name__ == '__main__':
    start = time.time()
    t_lst = []
    for i in range(100):
        t = Thread(target=func, args=(i,))
        t.start()
        t_lst.append(t)
    for t in t_lst:
        t.join()
    end = time.time()-start

    start = time.time()
    t_lst = []
    for i in range(100):
        p = Process(target=func, args=(i,))
        p.start()
        t_lst.append(p)
    for p in t_lst:
        p.join()
    end2 = time.time()-start
    print(end, end2)
#0.0279843807220459 13.582834720611572

# Data sharing between threads

from threading import Thread

num = 100
def func():
    global num
    num -= 1            # every thread subtracts 1

t_lst = []
for i in range(100):
    t = Thread(target=func)   # create one hundred threads
    t.start()
    t_lst.append(t)
for t in t_lst:
    t.join()
print(num)   #0
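
Because the threads share num, the statement num -= 1 is a read-modify-write on shared data, and it is not guaranteed to be atomic. The example above happens to print 0 because each thread does very little work, but in general a lock is the safe way to protect such an update. A minimal sketch, reusing the names from the example above and adding a threading.Lock:

```python
from threading import Thread, Lock

num = 100
lock = Lock()

def func():
    global num
    with lock:        # only one thread at a time may run the read-modify-write
        num -= 1

t_lst = [Thread(target=func) for _ in range(100)]
for t in t_lst:
    t.start()
for t in t_lst:
    t.join()
print(num)   # 0
```

This is the same pattern as the Manager example in section 1, where the lock guards d['count'] -= 1 across processes.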

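Other uses of the Thread class

Methods on Thread instances
  # is_alive(): returns whether the thread is alive.
  # getName(): returns the thread name.
  # setName(): sets the thread name.

Functions provided by the threading module:
  # threading.currentThread(): returns the current Thread object.
  # threading.enumerate(): returns a list of the running threads. "Running" means after the thread has started and before it has finished; threads that have not started yet or have already terminated are not included.
  # threading.activeCount(): returns the number of running threads; same result as len(threading.enumerate()).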

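import time
from threading import currentThread,Thread
def func():
    time.sleep(2)

t = Thread(target=func)
t.start()
print(t.is_alive())    # True (checks whether the thread is still alive)
print(t.getName())  # Thread-1 (Python 3.10+ appends the target name: Thread-1 (func))
t.setName('tt')
print(t.getName())   # tt (the name has been changed)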

def func():
    print('child thread:', currentThread().ident)
    time.sleep(2)
print('main thread:',currentThread().ident)
t = Thread(target=func)
t.start()
# currentThread().ident returns the thread's identifier (a thread id, not a process pid)

from threading import enumerate
def func():
    print('child thread:', currentThread().ident)
    time.sleep(2)

print('main thread:', currentThread().ident)
for i in range(10):
    t = Thread(target=func)
    t.start()
print(len(enumerate()))
# enumerate() returns a list of the running threads; len() gives how many there are

from threading import activeCount
def func():
    print('child thread:', currentThread().ident)
    time.sleep(2)

print('main thread:', currentThread().ident)
for i in range(10):
    t = Thread(target=func)
    t.start()
print(activeCount())
# activeCount() returns the number of running threads; same result as len(threading.enumerate())
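
The camelCase names used above (getName, setName, currentThread, activeCount) have been deprecated since Python 3.10 in favour of the name attribute and the snake_case functions current_thread and active_count. A short sketch of the same operations with the newer spellings; the variable names alive_while_running and count_matches are made up for this illustration.

```python
import time
import threading
from threading import Thread

def func():
    time.sleep(0.2)

t = Thread(target=func)
t.start()
alive_while_running = t.is_alive()          # True while func is still sleeping
t.name = 'tt'                               # replaces setName('tt')
current_name = threading.current_thread().name
count_matches = threading.active_count() == len(threading.enumerate())
t.join()

print(alive_while_running, t.name, current_name, count_matches)
```

After join() returns, t.is_alive() is False, which matches the definition of "running" given in the list above.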


(3) Daemon threads

import time
from threading import Thread

def func():
    while True:
        time.sleep(1)
        print(123)

def func2():
    print('func2 start')
    time.sleep(3)
    print('func2 end')

t1 = Thread(target=func)
t2 = Thread(target=func2)
t1.daemon = True   # mark t1 as a daemon thread; must be set before start()
t1.start()
t2.start()
print('main thread code finished')
# func2 start
# main thread code finished
#123
#123
#func2 end

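　　A daemon thread does not end when the main thread's code finishes; it ends only after the other (non-daemon) child threads have finished as well.

　　The main thread ending means the main process ends.

　　The main thread waits for all non-daemon threads to finish.

　　Once the main thread has ended, the daemon thread ends along with the main process. It does not end at the moment the main thread's code finishes.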
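
The ordering described above can be observed directly. The following is only a sketch: heartbeat, worker, and the stop event are hypothetical names, and the Event exists purely so the sketch can shut its daemon down cleanly instead of relying on interpreter exit.

```python
import time
from threading import Thread, Event

stop = Event()   # hypothetical helper, used only to end the daemon loop cleanly

def heartbeat():
    # Loops until asked to stop; as a daemon it would not keep the process alive.
    while not stop.wait(0.05):
        pass

def worker():
    time.sleep(0.2)

t1 = Thread(target=heartbeat, daemon=True)   # same effect as t1.daemon = True before start()
t2 = Thread(target=worker)
t1.start()
t2.start()
t2.join()                               # the non-daemon thread has finished ...
daemon_still_running = t1.is_alive()    # ... while the daemon is still going
print(t1.daemon, daemon_still_running)

stop.set()
t1.join()
```

Without the stop event, the daemon would simply be terminated once the main thread and all non-daemon threads had finished.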

#################################################################################

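Threads
Relationship between threads and processes
    Every process contains at least one thread
    A thread cannot exist on its own
Differences between threads and processes
    Data is shared between threads in the same process
    Data is isolated between processes
    A thread is the smallest unit executed by the CPU
        Scheduled by the operating system
    A process is the smallest unit of resource allocation in a computer
python
    GIL: the global interpreter lock, a global lock
        Lives in the CPython interpreter
        Locks threads: at any given moment only one thread per process can use the CPU
            What is locked is the thread, not the data
    For IO-heavy programs: multithreading
    For compute-heavy (CPU) programs: multiprocessing
        cpu*1 ~ cpu*2

threading
Thread
    Daemon thread: ends only after the main thread has ended

socket_server: IO multiplexing + multithreading
Frameworks and concurrency: the concepts of multithreading and coroutines (flask)
Crawlers: thread pools, coroutines

set, dict, list
Generators
Advanced object orientation: magic methods
Pipes
The socket_server source code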
