Process: one execution of a program (the program is loaded into memory and the system allocates resources for it to run). Each process has its own memory space, data stack, and so on; processes can communicate with each other, but they cannot share data directly.
Thread: all threads run inside the same process and share the same runtime environment. Each independent thread has a program entry point, a sequential execution sequence, and an exit point.
A thread's execution can be preempted, interrupted, or temporarily suspended (put to sleep) so that other threads can run. All threads within a process share the same data space.
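A minimal sketch of this difference, assuming a hypothetical worker() function: a thread mutates the very list the main thread sees, while a child process only mutates its own copy.

import threading
import multiprocessing

data = []

def worker():
    data.append(1)  # mutate the module-level list

if __name__ == '__main__':
    t = threading.Thread(target=worker)
    t.start()
    t.join()
    print data  # [1] -- the thread shares the process's data space

    p = multiprocessing.Process(target=worker)
    p.start()
    p.join()
    print data  # still [1] -- the child process modified only its own copy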
import threading

def thread_job():
    print "this is added thread,number is {}".format(threading.current_thread())

def main():
    added_thread = threading.Thread(target=thread_job)  # create a new thread
    added_thread.start()  # run the added thread
    print threading.active_count()  # how many threads are currently active
    print threading.enumerate()  # which threads are active
    print threading.current_thread()  # which thread is running this code

if __name__ == "__main__":
    main()
this is added thread,number is <Thread(Thread-6, started 6244)>
6
[<HistorySavingThread(IPythonHistorySavingThread, started 7588)>, <ParentPollerWindows(Thread-3, started daemon 3364)>, <Heartbeat(Thread-5, started daemon 3056)>, <_MainThread(MainThread, started 1528)>, <Thread(Thread-6, started 6244)>, <Thread(Thread-4, started daemon 4700)>]
<_MainThread(MainThread, started 1528)>
# join: wait until a thread has finished executing before returning to the main thread
import threading
import time

def T1_job():
    print "T1 start\n"
    for i in range(10):
        time.sleep(0.1)
    print "T1 finish"

def T2_job():
    print 'T2 start'
    print 'T2 finish'

def main():
    thread1 = threading.Thread(target=T1_job)  # create the threads
    thread2 = threading.Thread(target=T2_job)
    thread1.start()  # run them
    thread2.start()
    thread1.join()   # block until each worker has finished
    thread2.join()
    print 'all done\n'

if __name__ == "__main__":
    main()

T1 start
T2 start
T2 finish
T1 finish
all done
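For contrast, a minimal sketch (the shortened T1_job here is illustrative) of what happens when join() is omitted: the main thread races ahead, so 'all done' is printed before the worker finishes.

import threading
import time

def T1_job():
    print "T1 start"
    time.sleep(1)
    print "T1 finish"

if __name__ == "__main__":
    t = threading.Thread(target=T1_job)
    t.start()
    print 'all done'  # appears before "T1 finish", because we never join()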
# queue: each worker thread puts its result into a queue, and the main
# thread collects the results afterwards. This stands in for return,
# because a thread's target function cannot return a value to the caller.
import threading
from Queue import Queue

def job(l, q):
    q.put([i**2 for i in l])  # square every element and store the result

def multithreading(data):
    q = Queue()
    threads = []
    for i in xrange(4):
        t = threading.Thread(target=job, args=(data[i], q))
        t.start()
        threads.append(t)
    for thread in threads:
        thread.join()
    results = []
    for _ in range(4):
        results.append(q.get())
    print results

if __name__ == "__main__":
    data = [[1,2,3],[4,5,6],[3,4,3],[5,5,5]]
    multithreading(data)

[[1, 4, 9], [16, 25, 36], [9, 16, 9], [25, 25, 25]]
# Locks for multithreading: a lock lets one thread's critical section run
# to completion before another thread enters its own
import threading

def T1_job():
    global A, lock
    lock.acquire()
    for i in xrange(10):
        A += 1
        print 'T1_job', A
    lock.release()

def T2_job():
    global A, lock
    lock.acquire()
    for i in xrange(10):
        A += 10
        print 'T2_job', A
    lock.release()

if __name__ == "__main__":
    lock = threading.Lock()
    A = 0  # shared global variable
    thread1 = threading.Thread(target=T1_job)  # create the threads
    thread2 = threading.Thread(target=T2_job)
    thread1.start()  # run them
    thread2.start()
    thread1.join()
    thread2.join()
The GIL is not a feature of the Python language itself; it is a mechanism introduced by CPython: a single global exclusive lock.
At any given moment, only one line of bytecode is executing within a single interpreter process.
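As a hedged aside, specific to CPython 2: the interpreter considers switching threads every sys.getcheckinterval() bytecode instructions (100 by default), which can be inspected and tuned:

import sys

print sys.getcheckinterval()  # 100 by default: bytecodes between thread-switch checks
sys.setcheckinterval(1000)    # check for a switch less often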
# In Python, 3 threads does not mean 3x the speed. Because of the GIL,
# only one thread computes at any moment and the interpreter keeps
# switching between them. So threads help when the tasks are different
# and can divide the work (e.g. one thread sends messages while another
# receives), but for crunching a large pile of data, multithreading does
# not help; you need multiprocessing, where each core has its own
# separate memory space and the processes do not interfere with each other.
import time
import threading
from Queue import Queue

def job(l, q):
    q.put(sum(l))

def normal(l):
    print sum(l)

def multithreading(l):
    q = Queue()
    threads = []
    for i in range(3):
        t = threading.Thread(target=job, args=(l, q), name='T{}'.format(i))
        t.start()
        threads.append(t)
    [t.join() for t in threads]
    total = 0
    for _ in range(3):
        total += q.get()
    print total

if __name__ == '__main__':
    l = list(xrange(1000000))
    s_t = time.time()
    normal(l*3)
    print 'normal time:', time.time() - s_t
    s_t = time.time()
    multithreading(l)
    print 'multithreading time:', time.time() - s_t

1499998500000
normal time: 0.297999858856
1499998500000
multithreading time: 0.25200009346
The multiprocessing library makes up for the inefficiency that the thread library suffers because of the GIL. It fully replicates the interface provided by the thread library, which makes migration easy; the only difference is that it uses multiple processes instead of multiple threads. Each process gets its own independent GIL. On Windows, however, the overhead of multiprocessing is much larger than that of multithreading, while on Linux the two are comparable. Multiprocessing is also more stable.
multiprocessing's Process class represents a single process object.
import multiprocessing as mp
import threading as td
import time

def job(q):
    res = 0
    for i in range(100000):
        res += i + i**2
    q.put(res)

def normal():
    res = 0
    for i in range(100000):
        res += i + i**2
    print 'normal:', res

def multithread():
    q = mp.Queue()  # multiprocessing's Queue also works fine for threads
    t1 = td.Thread(target=job, args=(q,))
    # t2 = td.Thread(target=job, args=(q,))
    t1.start()
    # t2.start()
    t1.join()
    # t2.join()
    res1 = q.get()
    # res2 = q.get()
    print 'thread:', res1

def multiprocess():
    q = mp.Queue()
    p1 = mp.Process(target=job, args=(q,))
    # p2 = mp.Process(target=job, args=(q,))
    p1.start()
    # p2.start()
    p1.join()
    # p2.join()
    res1 = q.get()
    # res2 = q.get()
    print 'multiprocess:', res1

if __name__ == '__main__':
    st = time.time()
    normal()
    st1 = time.time()
    print 'normal time:', st1 - st
    multithread()
    st2 = time.time()
    print 'thread:', st2 - st1
    multiprocess()
    print 'process:', time.time() - st2
# Process pool: unlike a bare Process, jobs submitted to a Pool can return values
import multiprocessing as mp

def job(x):
    return x**2

def multiprocess():
    pool = mp.Pool()  # defaults to one worker per core; pass processes=N to override
    res = pool.map(job, range(10))  # takes an iterable and distributes the work across processes
    print res
    res = pool.apply_async(job, (2,))  # runs a single call in one process; iterate to get map-like behaviour
    print res.get()
    multi_res = [pool.apply_async(job, (i,)) for i in range(10)]  # one async call per item
    print [res.get() for res in multi_res]

if __name__ == '__main__':
    multiprocess()
# In multiprocessing, a global variable is handed to each CPU as a separate
# copy, so the processes cannot communicate through it.
# Use shared memory to communicate instead.
import multiprocessing as mp

value = mp.Value('d', 1)  # 'd' is a double, 'i' a signed int
array = mp.Array('i', [1, 3, 4])  # one-dimensional only, unlike numpy arrays
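A short sketch of how these shared objects are read and written (the job() function here is illustrative, not from the original): the parent sees the child's writes because both processes map the same shared memory.

import multiprocessing as mp

def job(v, a):
    v.value += 1.5  # a Value is accessed through its .value attribute
    for i in range(len(a)):
        a[i] *= 2   # an Array is indexed like a flat list

if __name__ == '__main__':
    value = mp.Value('d', 1)          # 'd' = double
    array = mp.Array('i', [1, 3, 4])
    p = mp.Process(target=job, args=(value, array))
    p.start()
    p.join()
    print value.value  # 2.5
    print array[:]     # [2, 6, 8]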
# Lock for processes
import multiprocessing as mp
import time

def job(v, num, l):
    l.acquire()
    for i in range(10):
        time.sleep(0.1)
        v.value += num
        print v.value
    l.release()

def multiprocess():
    v = mp.Value('i', 0)  # shared memory
    l = mp.Lock()
    p1 = mp.Process(target=job, args=(v, 1, l))
    p2 = mp.Process(target=job, args=(v, 3, l))
    p1.start()
    p2.start()
    p1.join()
    p2.join()

if __name__ == '__main__':
    multiprocess()
The fork operation: called once, it returns twice. The operating system automatically copies the current process and returns in both the parent and the child: the child always gets a return value of 0, while the parent always gets the child's PID. Inside the child, getppid() retrieves the parent's PID, and getpid() retrieves the current process's ID.
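A minimal sketch of this behaviour (POSIX only, so it will not run on Windows):

import os

print 'before fork, pid: {}'.format(os.getpid())
pid = os.fork()  # returns twice: once in the parent, once in the child
if pid == 0:
    # child branch: fork() returned 0
    print 'child: getpid() = {}, getppid() = {}'.format(os.getpid(), os.getppid())
else:
    # parent branch: fork() returned the child's PID
    print 'parent: getpid() = {}, child pid = {}'.format(os.getpid(), pid)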