Python Study Notes 9: Multithreading and Multiprocessing

Date: 2019-12-08

1. Threads & Processes

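To the operating system, a task is a process. Opening a browser starts a browser process, opening Notepad starts a Notepad process, opening two Notepad windows starts two Notepad processes, and opening Word starts a Word process. A process is a collection of many resources.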

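Some processes do more than one thing at a time. Word, for example, can handle typing, spell checking, and printing at once. To do several things at once inside a single process, it has to run several "subtasks" at the same time. These subtasks inside a process are called threads.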

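Since every process has to do at least one thing, every process has at least one thread. A complex process like Word can have many threads, and they can run concurrently. Multiple threads are scheduled the same way as multiple processes: the operating system switches rapidly among them so that each one runs briefly in turn, which makes them appear to run simultaneously. Truly simultaneous execution of several threads requires a multi-core CPU. A thread is the smallest unit of execution, and a process consists of at least one thread.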
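
The relationship between a process and its threads can be checked from inside Python. The following is a minimal sketch, not part of the original notes; the helper name `report` is made up for illustration. It starts three threads and records which process each one runs in.

```python
import os
import threading

def report(pids):
    # Every thread records the ID of the process it is running in.
    pids.append(os.getpid())

pids = []
workers = [threading.Thread(target=report, args=(pids,)) for _ in range(3)]
for w in workers:
    w.start()
for w in workers:
    w.join()

# The interpreter itself already runs in one thread: the main thread.
print("main thread:", threading.main_thread().name)
# All worker threads report the same process ID as the main thread.
print("all in one process:", set(pids) == {os.getpid()})
```

All three threads report the same process ID, which matches the description above: threads are execution units inside one process.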

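Doing a job alone is slow; having several people share it is faster. Programs are the same: to make one run faster, use multiple processes or multiple threads. Multithreading in Python is widely criticized, though. The reason is that the Python interpreter (CPython) uses a global interpreter lock, the GIL, which lets only one thread execute Python bytecode at a time, so the threads of one process cannot make use of multiple CPU cores. The threads still appear to run together because execution rotates among the tasks: task 1 runs for 0.01 seconds, then task 2 runs for 0.01 seconds, then task 3 runs for 0.01 seconds, and so on. Each task really runs in turn, but the CPU is so fast that it feels as though all tasks are running at once. Switching from one task to another in this way is called context switching.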
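
The effect of the GIL on CPU-bound work can be observed with a small timing experiment. This is a rough sketch, not from the original notes: the function `count_down` and the value of `N` are arbitrary choices for illustration, and the timings depend on the machine and the interpreter build.

```python
import threading
import time

def count_down(n):
    # Pure CPU work: no I/O and no sleeping.
    while n > 0:
        n -= 1
    return n

N = 2_000_000  # arbitrary amount of work

# Serial: two runs, one after the other.
start = time.perf_counter()
count_down(N)
count_down(N)
serial_seconds = time.perf_counter() - start

# Threaded: the same two runs, each in its own thread.
start = time.perf_counter()
threads = [threading.Thread(target=count_down, args=(N,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded_seconds = time.perf_counter() - start

print(f"serial:   {serial_seconds:.3f}s")
print(f"threaded: {threaded_seconds:.3f}s")
```

On a standard CPython build the threaded run is usually no faster than the serial one, because only one thread executes bytecode at any moment. Treat the numbers as indicative only.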

2. Multithreading. In Python, multithreading is done with the threading module.

Here is a simple multithreaded example:

    import threading
    import time

    def sayhi(num):  # the function each thread runs
        print("running on number:%s" % num)
        time.sleep(3)

    if __name__ == '__main__':
        t1 = threading.Thread(target=sayhi, args=(1,))  # create a thread instance
        t2 = threading.Thread(target=sayhi, args=(2,))  # create another thread instance
        t1.start()  # start the first thread
        t2.start()  # start the second thread

Here is another way to start threads, by subclassing:

    import threading
    import time

    class MyThread(threading.Thread):
        def __init__(self, num):
            threading.Thread.__init__(self)
            self.num = num

        def run(self):  # the code each thread runs; start() calls it in the new thread
            print("running on number:%s" % self.num)
            time.sleep(3)

    if __name__ == '__main__':
        t1 = MyThread(1)
        t2 = MyThread(2)
        t1.start()
        t2.start()

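There is no real difference between the two approaches; they are just two ways of writing the same thing. I personally prefer the first because it is simpler.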

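Waiting for threads. When several threads are running, each one runs independently and is not affected by the others. If something must happen only after a particular thread has finished, you have to wait for it. That is what join does: it blocks until the thread ends.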

            import threading
            import time
            def run():
                print('qqq')
                time.sleep(1)
                print('done!')
            lis = []
            for i in range(5):
                t = threading.Thread(target=run)
                lis.append(t)
                t.start()
            for t in lis:
                t.join()
            print('over')
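
A common reason to wait with join is that the main thread needs what the workers produced. The sketch below is an illustration rather than part of the original notes; the `square` function and the `results` dictionary are invented for the example.

```python
import threading
import time

results = {}

def square(n):
    time.sleep(0.1)      # stand-in for slow work
    results[n] = n * n   # each thread writes only its own key

threads = [threading.Thread(target=square, args=(n,)) for n in range(5)]
for t in threads:
    t.start()

# Without these joins, `results` could still be incomplete at the print below.
for t in threads:
    t.join()

print(sorted(results.items()))
```

After every join has returned, all five threads are finished, so it is safe to read `results`.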

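Daemon threads. What is a daemon thread? Picture yourself as a king (a non-daemon thread) with many servants (daemon threads) who all exist to serve you. Once you die, your servants are buried with you. In other words, when all non-daemon threads of a program have finished, the program exits and any daemon threads still running are terminated with it. In the example below, the main thread prints over and exits right away, so the daemon threads are normally killed before they can print done!.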

            import threading
            import time
            def run():
                print('qqq')
                time.sleep(1)
                print('done!')
            for i in range(5):
                t = threading.Thread(target=run)
            t.daemon = True  # mark as a daemon thread; setDaemon() is deprecated since Python 3.10
                t.start()
            print('over')
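
The daemon flag can also be passed directly to the Thread constructor. The following sketch is an illustration, not the author's code; the function name `servant` and the `stop` event are assumptions made for the example.

```python
import threading

stop = threading.Event()

def servant():
    # Blocks until told to stop. As a daemon thread it is simply
    # abandoned if the rest of the program finishes first.
    stop.wait()

# daemon=True has the same effect as setting t.daemon = True before start().
t = threading.Thread(target=servant, daemon=True)
t.start()

print("daemon flag:", t.daemon)
print("still running:", t.is_alive())
print("main thread is done")  # the program may exit here even though servant() is still waiting
```

If `daemon=True` were removed, the program would hang at exit, because the interpreter waits for every non-daemon thread and `servant` never returns on its own.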

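Thread locks. When many threads operate on the same data at once, problems can occur. The fix is to put a lock on that data so that only one thread can operate on it at any given time.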

        import threading
        from threading import Lock
        num = 0
        lock = Lock()  # create a lock
        def run():
            global num
            lock.acquire()  # take the lock
            num += 1
            lock.release()  # release the lock
        
        lis = []
        for i in range(5):
            t = threading.Thread(target=run)
            t.start()
            lis.append(t)
        for t in lis:
            t.join()
        print('over',num)
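
A lock can also be used as a context manager, which releases it automatically even if the protected code raises an exception. This is a small sketch of that variant, not the original author's code; the names `counter` and `add_many` are made up for the example.

```python
import threading

counter = 0
lock = threading.Lock()

def add_many(times):
    global counter
    for _ in range(times):
        with lock:        # acquired here, released when the block exits
            counter += 1

threads = [threading.Thread(target=add_many, args=(10_000,)) for _ in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 50000
```

Because every increment happens while the lock is held, no update is lost and the total is exactly 5 x 10,000.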

Here is a simple crawler that shows the effect of multithreading:

        import threading
        import requests,time
        urls  ={
            "baidu":'http://www.baidu.com',
            "blog":'http://www.nnzhp.cn',
            "besttest":'http://www.besttest.cn',
            "taobao":"http://www.taobao.com",
            "jd":"http://www.jd.com",
        }
        def run(name,url):
            res = requests.get(url)
            with open(name+'.html','w',encoding=res.encoding) as fw:
                fw.write(res.text)
        
        
        start_time = time.time()
        lis = []
        for url in urls:
            t = threading.Thread(target=run,args=(url,urls[url]))
            t.start()
            lis.append(t)
        for t in lis:
            t.join()
        end_time = time.time()
        print('run time is %s'%(end_time-start_time))
        
        # Single-threaded version, for comparing run times
        # start_time = time.time()
        # for url in urls:
        #     run(url,urls[url])
        # end_time = time.time()
        # print('run time is %s'%(end_time-start_time))
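
The same comparison can be tried without network access by simulating the wait. This is a hedged sketch: `fake_download` and its 0.2 second delay are invented stand-ins for `requests.get`, so the numbers only illustrate the pattern.

```python
import threading
import time

def fake_download(name, delay=0.2):
    # Sleeping stands in for waiting on the network.
    time.sleep(delay)

names = ["baidu", "blog", "besttest", "taobao", "jd"]

# Single-threaded: each wait happens one after another.
start = time.perf_counter()
for name in names:
    fake_download(name)
serial_seconds = time.perf_counter() - start

# Multithreaded: the waits overlap.
start = time.perf_counter()
threads = [threading.Thread(target=fake_download, args=(name,)) for name in names]
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded_seconds = time.perf_counter() - start

print(f"serial:   {serial_seconds:.2f}s")
print(f"threaded: {threaded_seconds:.2f}s")
```

Threads help here despite the GIL because a thread that is blocked waiting (on sleep, or on a socket in the real crawler) releases the GIL, so the other threads can proceed in the meantime.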

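3. Multiprocessing. As noted above, multithreading in Python cannot make use of multiple CPU cores. To use them, you need multiple processes, which Python provides through the multiprocessing module.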

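    from multiprocessing import Process
    import time

    def f(name):
        time.sleep(2)
        print('hello', name)

    if __name__ == '__main__':  # needed where child processes are started by spawning, e.g. on Windows
        p = Process(target=f, args=('niu',))  # create a child process
        p.start()  # start it
        p.join()   # wait for it to finish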
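
For CPU-bound work spread over several inputs, the multiprocessing module also offers a process pool. The sketch below is an illustration beyond the original notes; the function `sum_of_squares` and the pool size of 2 are arbitrary choices.

```python
from multiprocessing import Pool

def sum_of_squares(n):
    # CPU-bound work; each call runs inside a worker process.
    return sum(i * i for i in range(n))

if __name__ == '__main__':
    # Two worker processes share the three inputs between them.
    with Pool(processes=2) as pool:
        results = pool.map(sum_of_squares, [10, 100, 1000])
    print(results)  # [285, 328350, 332833500]
```

Each worker is a separate process with its own interpreter and its own GIL, so the calls can run on different cores at the same time. `pool.map` returns the results in the same order as the inputs.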