Coroutines: yield-based coroutines, gevent-based coroutines, and a simple crawler

Note: the running time of a coroutine program is determined by the execution time of its longest coroutine (see the first sketch at the end of this post).

Coroutine: also known as a micro-thread or fiber; in English, coroutine. A coroutine is a user-space lightweight thread. A coroutine has its own register context and stack. When the scheduler switches away from a coroutine, its register context and stack are saved elsewhere; when it switches back, the saved register context and stack are restored. So a coroutine preserves the state of the previous call (that is, a particular combination of all its local state), and each re-entry resumes that saved state; in other words, execution continues from the point in the logical flow where it last left off.

Advantages of coroutines:
(1) No overhead from thread context switches.
(2) No overhead from locks and synchronization around atomic operations.
(3) Control flow is easy to switch, which simplifies the programming model.
(4) High concurrency + high scalability + low cost: a single CPU can easily support tens of thousands of coroutines, which makes them well suited to high-concurrency workloads.

Disadvantages:
(1) Coroutines cannot use multiple cores: a coroutine is essentially single-threaded, so it cannot spread work across the cores of one CPU. Coroutines must be combined with processes to run on multiple CPUs, but most everyday applications do not need this unless they are CPU-bound.
(2) A blocking operation (such as blocking IO) blocks the entire program (demonstrated in the second sketch at the end of this post).

1. yield-based coroutines:

(1)

def consumer(name):
    print(" ")
    while True:
        new_baozi = yield          # pause here until producer sends a value
        print("[%s] is eating baozi %s" % (name, new_baozi))

def producer():
    r = con.__next__()             # prime the generators: run them to the first yield
    r = con2.__next__()
    n = 0
    while n < 5:
        n += 1
        con.send(n)                # resume consumer; n becomes new_baozi
        con2.send(n)
        print("is making baozi %s" % n)

if __name__ == '__main__':
    con = consumer("c1")
    con2 = consumer("c2")
    p = producer()

(2)

def f():
    print('ok1')
    count = yield 5
    print(count)
    print('ok2')
    yield 6

# print(f())  # <generator object f at 0x00000164C3A8F4F8>
gen = f()
# ret = next(gen)  # equivalent to the send(None) below
# print(ret)
ret = gen.send(None)  # Note: the first send must be None; the generator has to reach a yield before it can receive a value
print(ret)            # 5
x = gen.send(8)       # 8 is assigned to count; runs on to the next yield
print(x)              # 6

2. gevent-based coroutines:

greenlet is the low-level switching primitive that gevent is built on; every switch is explicit:

#author: wylkjj
#date: 2019/5/13
from greenlet import greenlet

def test1():
    print(12)
    gr2.switch()  # switch to gr2
    print(34)
    gr2.switch()  # switch to gr2

def test2():
    print(56)
    gr1.switch()  # switch back to gr1
    print(78)

gr1 = greenlet(test1)  # create a greenlet object
# print(gr1)
gr2 = greenlet(test2)  # create a greenlet object
gr1.switch()           # start execution

Result: 12 56 34 78

A simple gevent example:

import gevent

def foo():
    print('Running in foo')
    gevent.sleep(0)  # simulate blocking IO; control switches to bar
    print('Explicit context switch to foo again')

def bar():
    print('Explicit context to bar')
    gevent.sleep(0)  # simulate blocking IO; control switches back to foo
    print('Implicit switch back bar')

gevent.joinall([
    gevent.spawn(foo),
    gevent.spawn(bar),
])

3. A simple crawler:

(1) Fetch a page and write it to a file:

from urllib.request import urlopen

def f(url):
    print('GET:%s' % url)
    resp = urlopen(url)
    data = resp.read()
    with open('bilibili.html', 'wb') as fp:  # renamed from f to avoid shadowing the function
        fp.write(data)
    print('%d bytes received from %s.' % (len(data), url))

f('https://www.bilibili.com/')

(2) Fetch several pages concurrently:

#author: wylkjj
#date: 2019/5/14
import gevent, time
from gevent import monkey
monkey.patch_all()  # patch the standard library so blocking IO cooperates with gevent
from urllib.request import urlopen

def f(url):
    print('GET:%s' % url)
    resp = urlopen(url)
    data = resp.read()
    print('%d bytes received from %s.' % (len(data), url))

# Serial version for comparison, about 8.78 s:
# for url in ['https://www.python.org/', 'https://www.yahoo.com/', 'https://github.com/']:
#     f(url)
start = time.time()
gevent.joinall([  # concurrent version, about 3.73 s: coroutines save time
    gevent.spawn(f, 'https://www.python.org/'),
    gevent.spawn(f, 'https://www.yahoo.com/'),
    gevent.spawn(f, 'https://github.com/'),
])
print(time.time() - start)
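
Two closing sketches. First, to make the note at the top concrete — that a coroutine program's total running time is determined by its longest coroutine — a minimal gevent sketch (the task names and sleep durations are invented for illustration):

import time
import gevent

def task(name, seconds):
    gevent.sleep(seconds)  # cooperative sleep: yields control to the other greenlets
    print('%s finished after %s s' % (name, seconds))

start = time.time()
gevent.joinall([
    gevent.spawn(task, 'short', 1),
    gevent.spawn(task, 'medium', 2),
    gevent.spawn(task, 'long', 3),
])
print('total: %.1f s' % (time.time() - start))  # about 3 s (the longest task), not 1 + 2 + 3 = 6 s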
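
Second, a sketch of drawback (2): without monkey.patch_all(), a plain blocking call such as time.sleep never yields to the gevent hub, so it stalls every greenlet (again a minimal sketch with invented task names):

import time
import gevent

def blocking_task(name):
    time.sleep(1)  # plain blocking sleep: never yields to the gevent hub
    print('%s done' % name)

start = time.time()
gevent.joinall([
    gevent.spawn(blocking_task, 'a'),
    gevent.spawn(blocking_task, 'b'),
])
print('total: %.1f s' % (time.time() - start))  # about 2 s: each sleep blocks the whole program in turn
# With monkey.patch_all() applied first (as in section 3), time.sleep is
# replaced by a cooperative version and the same code takes about 1 s.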