[Solo APUE Series] Chapter 11: Threads [1]

Date: 2019-11-17


Originally published at 靜雅齋; please credit the source when reposting.

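Thread Concepts

The previous chapters all worked with a model of multiple processes, each with a single thread. Early Unix environments in particular had no thread model, so there was no notion of a thread: a process could do only one thing at any given moment. Multithreading lets a process own several threads, so it can do more than one thing at a time. I will not dwell on the benefits and drawbacks of threads, since most readers have probably experienced them first-hand.
Note that multithreading does not depend on having multiple processors or cores. Threads exist to handle asynchronous work and concurrency, and even on a single core they can improve performance: for example, while an I/O thread is blocked, the other threads can take the CPU, so resources are used more effectively.
Earlier chapters also described what a process's memory space looks like and what it contains; threads extend that picture. In a Unix environment, "threads" usually means pthreads (POSIX threads). A process starts with a single thread of control, and the program can create and manage more threads through the API the system provides. Threads are very lightweight because the program text, the data segment, global memory, and file descriptors are all shared among the threads of a process. So although threads are cheap to create, manage, and destroy, sharing also means they can contend for the same resources. What each thread holds privately is mainly its execution context, such as its register values and its stack.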

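Thread Identification

Just as a process has a process ID, a thread has its own thread ID. A process ID is unique across the whole system, whereas a thread ID is meaningful only within the process it belongs to; comparing thread IDs taken from two different processes tells you nothing.
Modern Unix environments provide the pthread_t data type to hold a thread ID. Implementations differ: some use an integer and some use a structure, so portable code must not manipulate this type directly.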

int pthread_equal(pthread_t t1, pthread_t t2);
pthread_t pthread_self(void);

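The first function compares two thread IDs, and the second returns the calling thread's own ID. Because the representation of a thread ID is unspecified, printing one for debugging is awkward. The usual approach is to use a third-party debugging library, or to write your own debug function that prints either a structure or an integer depending on the current system.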

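Thread Creation

As mentioned earlier, a process normally starts with just one thread. When more threads are needed, the developer calls library functions to create, manage, and destroy them. A new thread is created with the following function: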

int pthread_create(pthread_t *restrict thread, const pthread_attr_t *restrict attr, void *(*start_routine)(void *), void *restrict arg);

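The Chinese translation of the book has some awkward passages. For example, it says:

When pthread_create returns successfully, the thread ID of the newly created thread is set to the memory location pointed to by tidp.

That sentence is quite confusing. The Unix manual actually says: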

The pthread_create() function is used to create a new thread, with attributes specified by attr, within a process. If attr is NULL, the default attributes are used. If the attributes specified by attr are modified later, the thread's attributes are not affected. Upon successful completion, pthread_create() will store the ID of the created thread in the location specified by thread.

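In other words, pthread_create creates a new thread with the attributes given by attr, or with the default attributes if attr is NULL, and later changes to attr do not affect a thread that has already been created. On success, the new thread's ID is stored in the memory location that the thread argument points to. That should make things clear.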

Upon its creation, the thread executes start_routine, with arg as its sole argument. If start_routine returns, the effect is as if there was an implicit call to pthread_exit(), using the return value of start_routine as the exit status. Note that the thread in which main() was originally invoked differs from this. When it returns from main(), the effect is as if there was an implicit call to exit(), using the return value of main() as the exit status.

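Once created, the thread runs the function given by start_routine, with arg as its only argument. If start_routine returns, the effect is an implicit call to pthread_exit() with the return value of start_routine as the exit status. Note that the thread in which main() was originally invoked behaves differently: when it returns from main(), the effect is an implicit call to exit() with the return value of main() as the exit status, which terminates the whole process.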

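Thread Termination

A thread can be thought of as a lightweight process. If any thread calls exit, _Exit, or _exit, the whole process terminates. A single thread can also terminate on its own, without ending the process, in the following three ways:

The thread returns from its start routine; the return value is the thread's exit code
The thread is canceled by another thread in the same process
The thread calls pthread_exit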

void pthread_exit(void *value_ptr);

The pthread_exit() function terminates the calling thread and makes the value value_ptr available to any successful join with the terminating thread.

This introduces something new, a "successful join". It refers to pthread_join, a function similar to wait:

int pthread_join(pthread_t thread, void **value_ptr);

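As noted above, a thread can end in three ways: returning, being canceled, or calling pthread_exit. If the thread simply returns, the location pointed to by value_ptr receives the return code; if it calls pthread_exit, that location receives the value passed to pthread_exit; and if the thread was canceled, it is set to PTHREAD_CANCELED.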

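#include "include/apue.h"   /* the book's helper header (provides err_exit); path as in the author's setup */
#include <pthread.h>

/* Thread 1 terminates by returning from its start routine. */
void *thr_fn1(void *arg)
{
    printf("thread 1 returning\n");
    return((void *)1);
}

/* Thread 2 terminates by calling pthread_exit. */
void *thr_fn2(void *arg)
{
    printf("thread 2 exiting\n");
    pthread_exit((void *)2);
}

int main(int argc, char *argv[])
{
    int err;
    pthread_t tid1, tid2;
    void *tret;   /* receives each thread's exit status */

    /* pthread functions return an error number instead of setting errno. */
    err = pthread_create(&tid1, NULL, thr_fn1, NULL);
    if (err != 0)
        err_exit(err, "can't create thread 1");
    err = pthread_create(&tid2, NULL, thr_fn2, NULL);
    if (err != 0)
        err_exit(err, "can't create thread 2");
    /* Block until thread 1 has terminated and collect its exit status. */
    err = pthread_join(tid1, &tret);
    if (err != 0)
        err_exit(err, "can't join with thread 1");
    printf("thread 1 exit code %ld\n", (long)tret);
    /* Same for thread 2. */
    err = pthread_join(tid2, &tret);
    if (err != 0)
        err_exit(err, "can't join with thread 2");
    printf("thread 2 exit code %ld\n", (long)tret);
    exit(0);
}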

The output is as follows:

thread 1 returning
thread 2 exiting
thread 1 exit code 1
thread 2 exit code 2

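Besides showing that pthread_exit and return have the same effect, this reveals something the book does not spell out. After a thread exits, its termination status is kept until it is collected. By analogy with parent and child processes, the main thread created the child threads, so they wait for another thread (here the main thread) to reclaim them with pthread_join, and pthread_join blocks until the target thread has terminated. Real-world thread code is, of course, rarely this simple.
Looking at the thread functions, both the argument and the return value are untyped pointers (void *), which means we can pass any kind of data. Keep in mind, though, that C distinguishes stack allocation from heap allocation. For a stack-allocated variable we have to consider whether its memory has already been reclaimed by the time it is accessed, so in cases like this the data is usually allocated on the heap and managed by hand.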

int pthread_cancel(pthread_t thread);

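pthread_cancel sends a cancellation request to the thread given by the thread argument. The target thread's cancelability state and type determine when the cancellation takes effect. When it does, the target thread's cancellation cleanup handlers are called. After the last cleanup handler returns, the destructors for the thread's thread-specific data are called, and after the last destructor returns, the thread terminates.
pthread_cancel is an asynchronous request, so it does not wait for the thread to actually terminate. If you later wait for the thread with pthread_join, the exit status you get back is the constant PTHREAD_CANCELED.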

void pthread_cleanup_push(void (*routine)(void *), void *arg);
void pthread_cleanup_pop(int execute);

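Just as a process can have exit handlers, a thread can have thread cleanup handlers. The function names suggest that the handler pointers are kept on a stack, which means handlers are called in the reverse of the order in which they were registered.
pthread_cleanup_push pushes the routine function pointer onto the top of the calling thread's cleanup stack, to be called when that thread exits. In other words, it affects only the current thread.
pthread_cleanup_pop removes the routine at the top of the stack. If execute is nonzero, the handler is also run; if no handler is registered, pthread_cleanup_pop does nothing.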

pthread_cleanup_push() must be paired with a corresponding pthread_cleanup_pop(3) in the same lexical scope.

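pthread_cleanup_push must be paired with pthread_cleanup_pop in the same scope. The book's explanation is that the two functions may be implemented as macros.
Note: the registered cleanup handlers run only when the thread calls pthread_exit(), responds to a cancellation request, or calls pthread_cleanup_pop with a nonzero argument. They do not run when the thread simply returns from its thread function.
In my testing, Apple's system does implement these two functions as macros. As a result, returning from the thread function between the push and the matching pop causes a segmentation fault. My guess is that the stack has been overwritten by the time of the return, yet the cleanup handlers are still invoked.
Here is my own code: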

#include "include/apue.h"   /* the book's helper header (provides err_exit); path as in the author's setup */
#include <pthread.h>

/* Cleanup handler: prints the string it was registered with. */
void cleanup(void *arg)
{
    printf("cleanup: %s\n", (char *)arg);
}

/* Thread 1 pops both handlers without running them, then returns. */
void *thr_fn1(void *arg)
{
    printf("thread 1 start\n");
    pthread_cleanup_push(cleanup, "thread 1 first handler");
    pthread_cleanup_push(cleanup, "thread 1 second handler");
    printf("thread 1 push complete\n");
    pthread_cleanup_pop(0);   /* 0: remove the handler, do not execute it */
    pthread_cleanup_pop(0);
    return((void *)1);
}

/* Thread 2 calls pthread_exit while both handlers are still registered. */
void *thr_fn2(void *arg)
{
    printf("thread 2 start\n");
    pthread_cleanup_push(cleanup, "thread 2 first handler");
    pthread_cleanup_push(cleanup, "thread 2 second handler");
    printf("thread 2 push complete\n");
    if (arg)
        pthread_exit((void *)2);   /* handlers run here, in reverse order */
    pthread_cleanup_pop(0);        /* needed so every push has a matching pop */
    pthread_cleanup_pop(0);
    pthread_exit((void *)2);
}

int main(int argc, char *argv[])
{
    int err;
    pthread_t tid1, tid2;
    void *tret;   /* receives each thread's exit status */

    /* Both threads get a non-NULL argument, so thr_fn2 takes the early exit. */
    err = pthread_create(&tid1, NULL, thr_fn1, (void *)1);
    if (err != 0)
        err_exit(err, "can't create thread 1");
    err = pthread_create(&tid2, NULL, thr_fn2, (void *)1);
    if (err != 0)
        err_exit(err, "can't create thread 2");
    err = pthread_join(tid1, &tret);
    if (err != 0)
        err_exit(err, "can't join with thread 1");
    printf("thread 1 exit code %ld\n", (long)tret);
    err = pthread_join(tid2, &tret);
    if (err != 0)
        err_exit(err, "can't join with thread 2");
    printf("thread 2 exit code %ld\n", (long)tret);
    exit(0);
}

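Here I modified the first thread from the book's example so that it runs pthread_cleanup_pop() before executing the return statement. That avoids the error, but the cleanup handlers still do not run, because pthread_cleanup_pop(0) removes each handler without executing it.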

~/Development/Unix » ./a.out
thread 1 start
thread 2 start
thread 1 push complete
thread 2 push complete
cleanup: thread 2 second handler
cleanup: thread 2 first handler
thread 1 exit code 1
thread 2 exit code 2

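So in practice, if a thread relies on cleanup handlers, it should terminate by calling pthread_exit() while the handlers are still registered, rather than by returning from the thread function.
As we know, when a process terminates, its parent has to clean up after it. When a thread terminates, its termination status is retained until pthread_join is called for it. If the thread is detached with pthread_detach, however, its storage is reclaimed as soon as it exits: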

int pthread_detach(pthread_t thread);

The pthread_detach() function is used to indicate to the implementation that storage for the thread thread can be reclaimed when the thread terminates. If thread has not terminated, pthread_detach() will not cause it to terminate. The effect of multiple pthread_detach() calls on the same target thread is unspecified.

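pthread_detach marks a thread so that its storage can be reclaimed once it terminates. If the thread has not yet terminated, pthread_detach does not cause it to terminate.
Here is what is going on: a newly created thread is joinable by default. So, just as with a process, once it terminates someone must call pthread_join to collect its return value and reclaim its resources. In many cases we create a thread and never care about it afterwards, and that is when this function should be used to detach it.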
