Linux Processes and Signals: Concepts and Operations

Date: 2019-11-13


Processes

Main reference: http://www.bogotobogo.com/Linux/linux_process_and_signals.php
Translator: 李秋豪

Signals and processes control almost every task of the operating system.

Typing ps -ef in a shell produces output like the following:

(Translator's note: -e Select all processes. Identical to -A; -f Do full-format listing. This option can be combined with many other UNIX-style options to add additional columns. It also causes the command arguments to be printed.)

UID        PID  PPID  C STIME TTY          TIME CMD
root         1     0  0  2010 ?        00:01:48 init 
root     21033     1  0 Apr04 ?        00:00:39 crond
root     24765     1  0 Apr08 ?        00:00:01 /usr/sbin/httpd

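Every process is assigned a unique integer called the process identifier (PID), usually in the range 2 to 32768 (on Linux the upper limit can be read from /proc/sys/kernel/pid_max). When processes are started, numbering begins at 2 at the lowest, because 1 is reserved for the init process. As the example above shows, init manages the other processes.

When we run a program, the executable code stored on disk is loaded into a region of memory. Normally a Linux process cannot write to that region, so it can be shared safely.

System libraries can be shared in the same way. Even if many programs call printf, only one copy of it needs to be in memory.

Unlike shared libraries, a program may have its own variables. These are kept in the process's own stack space and cannot be shared with other processes. Each process also has its own environment variables, and its own program counter (PC), which records where execution has reached. (For threads of execution, see linux pthread.)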


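Process Table

The process table holds every process currently loaded in memory, and the ps command displays it. By default, however, ps only shows processes that stay connected to a terminal, a pseudo-terminal, or a serial line. The remaining processes do not need to interact with a user terminal; they are run by the operating system to manage shared resources. To show all processes, use the -e and -f options.

(Translator's note: To see every process on the system using standard syntax: ps -ef)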


System Processes

$ ps -ax
  PID TTY      STAT   TIME COMMAND
    1 ?        Ss     1:48 init [3]
    2 ?        S<     0:03 [migration/0]
    3 ?        SN     0:00 [ksoftirqd/0]
 ....
 2981 ?        S<sl  10:14 auditd
 2983 ?        S<sl   3:43 /sbin/audispd
 ....
 3428 ?        SLs    0:00 ntpd -u ntp:ntp -p /var/run/ntpd.pid -g
 3464 ?        Ss     0:00 rpc.rquotad
 3508 ?        S<     0:00 [nfsd4]
 ....
 3812 tty1     Ss+    0:00 /sbin/mingetty tty1
 3813 tty2     Ss+    0:00 /sbin/mingetty tty2
 3814 tty3     Ss+    0:00 /sbin/mingetty tty3
 3815 tty4     Ss+    0:00 /sbin/mingetty tty4
.....
19874 pts/1    R+     0:00 ps -ax
19875 pts/1    S+     0:00 more
21033 ?        Ss     0:39 crond
24765 ?        Ss     0:01 /usr/sbin/httpd

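The STAT codes have the following meanings:

STAT Code	Description
R	Running or runnable (on the run queue).
D	Uninterruptible sleep (waiting), usually for I/O to complete.
S	Sleeping, usually waiting for an event such as a signal or for input to become available.
T	Stopped, usually by shell job control or because the process is under the control of a debugger.
Z	Defunct ("zombie") process.
N	Low-priority task, nice (translator's note: nice is covered later).
W	Paging.
s	The process is a session leader.
+	The process is in the foreground process group.
l	The process is multithreaded.
<	High-priority task.

Look at this process: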

1 ?        Ss     1:48 init [3]

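Every child process is created by a parent process through fork. When Linux starts, it runs a single process: init, with PID 1. init is the system's process manager and the direct or indirect ancestor of every other process. The processes that init forks go on to fork processes of their own (much like a virus spreading). Login is one example: init forks a getty process for each terminal, through which we can log in, as shown here: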

3812 tty1     Ss+    0:00 /sbin/mingetty tty1

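The getty process waits for activity on its terminal, prints the login prompt for the user, and then hands control to the login program, which sets up the user's environment and starts a shell. When the user exits that shell, init starts another getty process.

Starting new processes and waiting for them to finish is a basic task of an operating system. We can do the same ourselves with the system calls fork(), exec(), and wait().

A system call is a controlled entry point into the kernel through which a process can ask the kernel to perform services on its behalf.

More precisely, a system call switches the processor from user mode to kernel mode, so that the CPU can access protected kernel memory. Through the system call API the kernel offers processes a rich set of services.

Process Scheduling

Let's look at the STAT of ps ax itself: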

23603 pts/1    R+     0:00 ps ax

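R means that process 23603 is in the runnable state. In other words, ps is reporting its own state. The indicator only says that the program is ready to run, not necessarily that it is running at this instant (see the material below; it may be sitting on a runqueue). R+ shows that the process is in the foreground process group, so it is not waiting for other processes to finish or for input or output to complete. This is also why ps output may show two or more R+ processes.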
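The state letter that ps prints can also be read directly. The sketch below is a minimal example, assuming a Linux /proc file system in which /proc/<pid>/stat has the form "pid (comm) state ..."; process_state is a hypothetical helper name.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Return the one-letter state (R, S, D, T, Z, ...) of a process,
 * or '?' if /proc/<pid>/stat cannot be read or parsed. */
char process_state(pid_t pid)
{
    char path[64];
    char line[1024];
    char *close_paren;
    FILE *fp;

    snprintf(path, sizeof path, "/proc/%ld/stat", (long)pid);
    fp = fopen(path, "r");
    if (fp == NULL)
        return '?';
    if (fgets(line, sizeof line, fp) == NULL) {
        fclose(fp);
        return '?';
    }
    fclose(fp);

    /* The command name may itself contain spaces or ')',
     * so look for the last ')' in the line. */
    close_paren = strrchr(line, ')');
    if (close_paren == NULL || close_paren[1] != ' ' || close_paren[2] == '\0')
        return '?';
    return close_paren[2];
}
```

A process that asks for its own state is necessarily executing at that moment, so process_state(getpid()) reports R, just as ps ax shows R+ for itself.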

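(Translator: thanks to 胡堯 for his guidance here; I had misunderstood a few sentences earlier.)

Translator: see Process State Definition and Runnable Process Definition. (I will translate both when I have time.) (Update: both are now translated, as "Linux Process State Flags" and "Linux Runnable Processes".)

1. Excerpt from the first part of Process State Definition: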

Process state is the state field in the process descriptor.

A process descriptor is a task_struct-type data structure whose fields contain all of the information about a single process. A process, also referred to as a task, is an instance of a program in execution.

A data structure is a way of storing data in a computer so that it can be used efficiently. task_struct is a relatively large data structure (roughly 1.7 kilobytes on a 32-bit machine) that is designed to hold all the information that the kernel (i.e., the core of the operating system) has and needs about a process.

The state field in the process descriptor describes what is currently happening to a process. This field contains one of the following five flags (i.e., values):

TASK_RUNNING: The process is runnable, and it is either currently running or it is on a runqueue waiting to run. This is the only possible state for a process executing in user space (i.e., that portion of system memory in which user processes run); it can also apply to a process in kernel space (i.e., that portion of memory in which the kernel executes and provides its services) that is actively running. A runnable process is a process that is in the TASK_RUNNING process state.

A runqueue is the basic data structure in the scheduler, and it contains the list of runnable processes for the CPU (central processing unit), or for one CPU on a multiprocessor system. The scheduler, also called the process scheduler, is a part of the kernel that allocates the scarce CPU time among the various runnable processes on the system.

2.Runnable Process Definition

A runnable process is a process which is in the TASK_RUNNING process state.

A process, also referred to as a task, is an instance of a program in execution. A process state is a field in the process descriptor. This field can accept any of five possible flags (i.e., values), one of which is TASK_RUNNING.

A process descriptor is a task_struct-type data structure whose fields contain all of the information regarding a single process. Its process state field describes what is currently happening to the process. A data structure is a way of storing data in a computer so that it can be used efficiently. A task_struct data structure is a data structure that is used to describe a process on the system.

The TASK_RUNNING state means that the process is runnable, and it is either currently running or on a runqueue waiting to run. This is the only possible state for a process executing in user space (i.e., that portion of system memory in which user processes run); it can also apply to a process in kernel space (i.e., that portion of memory in which the kernel executes and provides its services) that is actively running.

A runqueue is the basic data structure in the scheduler, and it contains the list of runnable processes for the CPU (central processing unit), or for one CPU on a multiprocessor system. The scheduler, also called the process scheduler, is a part of the kernel that allocates the scarce CPU time among the various runnable processes on the system.

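The Linux kernel uses a process scheduler that decides, based on process priority, which process gets the next CPU time slice.

Normally several programs compete for the same computing resources. A program that runs in short bursts and then pauses for input is said to behave well, in contrast to one that hogs the CPU continuously. Well-behaved programs are termed nice programs, and this niceness can be measured.

The operating system derives a process's priority from its nice value. Programs that run for long stretches without pausing generally get lower priority (Translator: I did not follow this at first. Would such a CPU-hungry, not-at-all-nice program really be given low priority?), whereas programs that pause are rewarded. This keeps interactive programs responsive: while a program waits for user input, the system raises its priority, so it is already at high priority when it is ready to resume.

The nice value (niceness) is an integer from -20 to 19 on Linux: -20 is the highest priority and 19 the lowest (some systems allow 20). A child process inherits its nice value from its parent, and the default is usually 0. The nice command starts a program with a given nice value, and the renice command changes the nice value of a running process. By default, nice raises the nice value by 10, which lowers the priority. Only a user with root privileges can lower a nice value (raise priority). On Linux, /etc/security/limits.conf can be edited to let other users or groups lower nice values.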
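The same adjustment can be made from inside a program. The sketch below uses the POSIX calls nice() and getpriority(); the helper names and the -100 error value are choices made for this example only.

```c
#include <errno.h>
#include <sys/resource.h>
#include <unistd.h>

/* Current nice value of the calling process (-20..19 on Linux),
 * or -100 on error. getpriority() can legitimately return -1,
 * so errno is used to detect failure. */
int current_niceness(void)
{
    int value;

    errno = 0;
    value = getpriority(PRIO_PROCESS, 0);
    if (value == -1 && errno != 0)
        return -100;
    return value;
}

/* Raise the caller's nice value by `increment`, which lowers its priority.
 * Returns the new nice value, or -100 on error. */
int lower_own_priority(int increment)
{
    int result;

    errno = 0;
    result = nice(increment);
    if (result == -1 && errno != 0)
        return -100;
    return result;
}
```

A positive increment needs no special privileges; a negative one fails for an ordinary user, which matches the rule that only root may raise priority.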

The ps command with the -l option shows the nice value of each process in the NI column:

(Translator's note: -l Long format. The -y option is often useful with this. -y Do not show flags; show rss in place of addr. This option can only be used with -l.)

F S   UID   PID  PPID  C PRI  NI ADDR SZ WCHAN  TTY          TIME CMD
0 S   601 12649 12648  0  75   0 -  1135 wait   pts/0    00:00:00 bash
0 S   601 12681 12649  0  76   0 -  1122 wait   pts/0    00:00:00 myTest.sh
0 S   601 12682 12681  0  76   0 -   929 -      pts/0    00:00:00 sleep
0 R   601 12683 12649  0  76   0 -  1054 -      pts/0    00:00:00 ps

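Here myTest.sh is running with the default nice value of 0 (translator's note: the NI column). If it had instead been started like this:

$ nice ./myTest.sh &

its nice value would be increased by 10: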

F S   UID   PID  PPID  C PRI  NI ADDR SZ WCHAN  TTY          TIME CMD
0 S   601  9835  9834  0  75   0 -  1135 wait   pts/1    00:00:00 bash
0 S   601 12744 12649  0  86  10 -  1122 wait   pts/0    00:00:00 myTest.sh
0 S   601 12745 12744  0  86  10 -   929 -      pts/0    00:00:00 sleep
0 R   601 12746 12649  0  76   0 -  1054 -      pts/0    00:00:00 ps

The same can be done for a process that is already running:

$ renice 10 12681
12681: old priority 0, new priority 10

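With a higher nice value, the program runs less often. In the output below, the STAT column now carries an N flag, which shows that the nice value differs from the default.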

$ ps x
12649 pts/0    Ss     0:00 -bash
12744 pts/0    SN     0:00 /bin/bash ./myTest.sh
12745 pts/0    SN     0:00 sleep 100
12867 pts/0    R+     0:00 ps x

The PPID field of ps output indicates the parent process ID: the PID of the process that caused this process to start or, if that process is no longer running, init (PID 1).

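init Process / Daemon Processes

(Translator's note: the word daemon has two meanings: 1. (esp. in Greek mythology) a supernatural being that is half god, half man. 2. a spirit that inspires somebody to action or creativity; a guardian spirit.)

When the system boots, the kernel creates a process named init (from /sbin/init), the ancestor of all other processes.

Every other process on the system is created by init or one of its descendants calling fork(). The init process always has PID 1 and runs with superuser privileges. It cannot be killed and only terminates when the machine shuts down. Its main job is to create and monitor the other processes that the running system needs.

A daemon is a special-purpose process (for example syslogd or httpd). It is created and handled by the system like any other process, but differs from an ordinary process in two ways:

It is long-lived. A daemon is usually started at system boot and runs until the machine shuts down.
It runs in the background, with no controlling terminal from which it could read input or to which it could write output.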

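Creating a New Process

We can start one program from inside another. The system library function creates a new process for this purpose. The following example runs ps by calling system.

// mySysCall.cpp

#include <cstdlib>    // std::system
#include <iostream>

int main()
{
  std::system("ps ax");   // runs the command through a shell and waits for it
  std::cout << "Done." << std::endl;
  return 0;
}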

Running this program produces:

$./mySysCall
  PID TTY      STAT   TIME COMMAND
    1 ?        Ss     1:48 init [3]
....
24447 pts/0    S+     0:00 ./mySysCall
24448 pts/0    R+     0:00 ps ax
Done.

Because system starts the new process through a shell, we can also change the call to:

system("ps ax &");

Running the new version produces:

Done.
  PID TTY      STAT   TIME COMMAND
    1 ?        Ss     1:48 init [3]
....
24849 pts/1    Ss+    0:00 -bash
25802 pts/1    R      0:00 ps ax

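Now system returns as soon as the shell command has finished. Because we asked the shell to run the new program in the background, the shell returns right after ps has been started. This is the same as typing the following command in a shell:

$ ps ax &

Once the shell returns, our program prints "Done." and exits before ps has had a chance to finish its output. This behavior can be confusing, so we need finer control over how processes behave.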
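One small step toward that control is to inspect what system() returns. The sketch below decodes the return value with the wait-status macros from sys/wait.h; run_command is a hypothetical helper name, and the example assumes a POSIX shell is available.

```c
#include <stdlib.h>
#include <sys/wait.h>

/* Run `command` through the shell and return its exit code (0..255).
 * Returns -1 if the shell could not be started or if the command
 * was terminated by a signal. */
int run_command(const char *command)
{
    int status = system(command);

    if (status == -1)
        return -1;
    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    return -1;
}
```

For instance, run_command("exit 3") yields 3. A command started with a trailing & reports the status of the shell that launched it, not of the background job.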

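exec() System Call

An exec function replaces the current process image with a new one, specified by a path or file argument. We can use exec to hand execution over from our program to another.

For example, when we type ls in bash, the shell acts as the parent: it calls fork() to create a child process, and the child then calls exec() to turn itself into ls.

exec is more efficient than system because the original program no longer runs once the new one has started.

(Translator's note: The exec() family of functions replaces the current process image with a new process image.)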

/* Execute PATH with arguments ARGV and environment from `environ'.  */
extern int execv (__const char *__path, char *__const __argv[])
     __THROW __nonnull ((1));

/* Execute PATH with all arguments after PATH until a NULL pointer,
   and the argument after that for environment.  */
extern int execle (__const char *__path, __const char *__arg, ...)
     __THROW __nonnull ((1));

/* Execute PATH with all arguments after PATH until
   a NULL pointer and environment from `environ'.  */
extern int execl (__const char *__path, __const char *__arg, ...)
     __THROW __nonnull ((1));

/* Execute FILE, searching in the `PATH' environment variable if it contains
   no slashes, with arguments ARGV and environment from `environ'.  */
extern int execvp (__const char *__file, char *__const __argv[])
     __THROW __nonnull ((1));

/* Execute FILE, searching in the `PATH' environment variable if
   it contains no slashes, with all arguments after FILE until a
   NULL pointer and environment from `environ'.  */
extern int execlp (__const char *__file, __const char *__arg, ...)
     __THROW __nonnull ((1));

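These functions are usually implemented on top of execve. The functions with a p suffix search the directories listed in the PATH environment variable for the program to run; if the executable is not on that path, an absolute file name including its directories must be passed to the function instead.

The global variable environ passes the environment to the new program. Alternatively, execle and execve take an extra argument: an array of strings that becomes the new program's environment.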
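To make the environment argument concrete, here is a minimal sketch that runs a shell script in a child whose environment consists only of the strings we pass to execle. The function name run_script_with_env and the use of /bin/sh are assumptions made for this example.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run `script` with /bin/sh in a child process whose environment is
 * exactly `envp` (a NULL-terminated array of "NAME=value" strings).
 * Returns the script's exit code, or -1 on failure. */
int run_script_with_env(const char *script, char *const envp[])
{
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return -1;
    if (pid == 0) {
        /* The argument list ends with a null pointer; the environment
         * array follows it as the final argument. */
        execle("/bin/sh", "sh", "-c", script, (char *)NULL, envp);
        _exit(127);   /* reached only if execle failed */
    }
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With char *env[] = { "GREETING=hello", NULL }, the script [ "$GREETING" = hello ] exits with 0, because the child sees the variable we supplied rather than the caller's environment.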

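Here is an example that uses execlp:

(Translator's note: unistd.h is the header in C and C++ that provides access to the POSIX operating system API. It is defined by the POSIX.1 standard, the base of the Single UNIX Specification, so every conforming operating system and compiler should provide it, including all official versions of Unix, Mac OS X, Linux, and so on.)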

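// my_ps.c

#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>

int main()
{
  printf("ps with execlp\n");
  /* The argument list must end with a null pointer of type char *. */
  execlp("ps", "ps", (char *)NULL);
  /* Reached only if execlp fails. */
  printf("Done.\n");
  exit(0);
}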

When we run this program, we get only the standard ps output and no "Done." message. Likewise, the ps output contains no process named my_ps.

$./my_ps
ps with execlp
  PID TTY          TIME CMD
12377 pts/0    00:00:00 bash
18304 pts/0    00:00:00 ps

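The program prints "ps with execlp" and then calls execlp(), which searches the directories in the PATH environment variable for a program named ps. It then executes ps in place of my_ps, just as if we had typed this shell command:

$ ps

(Translator's note: bash has a built-in exec command. Commands we normally run in bash execute in a child process and do not replace the shell process. Running exec ps directly in bash "exits" bash immediately, so you never get to see the output. If we instead type bash inside a bash session and then exec ps, the output appears as expected, but at that point we are already back in the first bash, and a single exit leaves the shell.)

So when ps finishes we get a shell prompt instead of returning to my_ps, and the second printf never prints "Done.". The new program started by exec keeps the PID and the nice value of the program that called exec.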
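The claim that the PID survives exec can be checked with a short experiment. In the sketch below, a child replaces itself with sh -c 'echo $$' and the parent compares the PID printed by the shell with the one fork() returned. The function name exec_keeps_pid and the use of /bin/sh are assumptions of this example.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if the PID reported by the exec'ed shell equals the PID that
 * fork() returned for the child, 0 if it differs, -1 on error. */
int exec_keeps_pid(void)
{
    int fds[2];
    char buf[32];
    ssize_t n;
    pid_t pid;

    if (pipe(fds) == -1)
        return -1;
    pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        dup2(fds[1], STDOUT_FILENO);   /* shell output goes into the pipe */
        close(fds[0]);
        close(fds[1]);
        execl("/bin/sh", "sh", "-c", "echo $$", (char *)NULL);
        _exit(127);                    /* reached only if exec failed */
    }
    close(fds[1]);
    n = read(fds[0], buf, sizeof buf - 1);
    close(fds[0]);
    waitpid(pid, NULL, 0);
    if (n <= 0)
        return -1;
    buf[n] = '\0';
    return atol(buf) == (long)pid;
}
```

The shell variable $$ expands to the shell's own PID, so a match shows that the program changed while the process stayed the same.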

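To have a process do several things at once, we can use threads or create an entirely separate process, as init does, instead of replacing the current process as exec does.

One way to do that is to call fork().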

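fork() and execv()

In the code below, fork() first creates a child process in the parent; the child then calls execv() to replace its copy of the parent's code with the program named by path.

#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Translator's note: the original declared main(char *path, char *argv[]),
 * which is not a valid signature for main; here the path comes from argv[1]. */
int main(int argc, char *argv[])
{
    if (argc < 2)
    {
        fprintf(stderr, "usage: %s path [args...]\n", argv[0]);
        return 1;
    }
    pid_t pid = fork();
    if (pid == 0)
    {
        printf("Child\n");
        execv(argv[1], &argv[1]);   /* returns only on failure */
        perror("execv");
        _exit(127);
    }
    else
    {
        printf("Parent %d\n", (int)pid);
    }
    printf("Parent prints this line \n");
    return 0;
}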

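fork() System Call

(Translator's note: a Chinese tutorial, "linux中fork（）函數詳解（原創！！實例講解）", is worth reading first. I found it well explained; its treatment of stdio buffers across fork and of fork's return values is especially interesting.)

We create a new process by calling fork(). This system call duplicates the current process and creates a new entry in the process table; many attributes of the new process are the same as those of its parent.

The key to understanding fork() is that when it returns there are two processes, and each continues executing from the point where fork() returns.

Linux makes a copy of the parent's entire address space and gives it to the child, so parent and child start with address spaces that have identical contents, including the code. The two processes are nevertheless independent: each has its own environment, data space, file descriptors, and so on. Combined with exec(), fork() is therefore exactly the call we need to create new programs.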
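The independence of the two address spaces is easy to demonstrate. In this minimal sketch (the function name is invented for the example), the child overwrites a variable and the parent checks its own copy afterwards.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* The child overwrites its copy of `value`; the parent's copy is untouched.
 * Returns the parent's value after the child has exited, or -1 on error. */
int parent_value_after_child_write(void)
{
    int value = 42;
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return -1;
    if (pid == 0) {
        value = 1000;                   /* changes the child's copy only */
        _exit(value == 1000 ? 0 : 1);
    }
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;
    return value;                       /* still 42 in the parent */
}
```

Sharing data between parent and child therefore needs an explicit mechanism such as a pipe, shared memory, or the exit status.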

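Note also that fork() is called once but returns twice! (Translator's note: this sentence originally appeared earlier, but it seems to fit better here.)

In the parent, fork() returns the PID of the newly created child. This is useful because a parent may create many children and track them (with the wait() function). (Translator's note: wait, waitpid, waitid - wait for process to change state.) In the child, fork() returns 0. If needed, a process can get its own PID with getpid() and its parent's PID with getppid(). (Translator's note: if the parent has already died, the PPID will be 1.) If fork() fails it returns -1. Typical causes are a limit on the number of child processes (CHILD_MAX, with errno set to EAGAIN), or too little room in the process table for another entry or too little (virtual) memory (errno set to ENOMEM).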
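The relationship between the values returned by fork(), getpid(), and getppid() can be sketched as follows. The function name child_sees_parent_pid is hypothetical; the child reports its finding through its exit status.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if getppid() in the child equals getpid() in the parent,
 * 0 if it does not, -1 on error. */
int child_sees_parent_pid(void)
{
    pid_t parent = getpid();
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return -1;                           /* fork failed, no child exists */
    if (pid == 0)
        _exit(getppid() == parent ? 0 : 1);  /* child: fork() returned 0 */

    /* parent: fork() returned the child's PID */
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

The check holds as long as the parent is still alive when the child calls getppid(), which is guaranteed here because the parent is blocked in waitpid().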

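So which process runs first after fork(), the child or the parent?

In fact, the order is unspecified: it depends on the scheduler, and a program must not rely on it.

The figure referred to above is from "The Linux Programming Interface".

To summarize:

The fork() system call takes no arguments and returns a PID. Its purpose is to create a new process, which becomes a child of the caller. After the child has been created, both parent and child execute the instruction that follows the fork() call. We therefore have to tell the parent from the child, which we do by checking the return value of fork():

If it returns a negative value, the call failed.
If it returns 0, we are in the newly created child.
If it returns a positive value, that value is the PID of the newly created child. It has type pid_t, declared in sys/types.h, which is normally an integer type. A process can also call getpid() to retrieve its own PID.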

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>   /* fork, sleep (translator's note: missing in the original) */

#define BUF_SIZE 150

int main()
{
  int pid = fork();
  char buf[BUF_SIZE];
  int print_count;

  switch (pid)
  {
    case -1:
      perror("fork failed");
      exit(1);
    case 0:
      /* When fork() returns 0, we are in the child process. */
      print_count = 10;
      sprintf(buf,"child process: pid = %d", pid);
      break;
    default: /* + */
      /* When fork() returns a positive number, we are in the parent process
       * (the fork return value is the PID of the newly created child process) */
      print_count = 5;
      sprintf(buf,"parent process: pid = %d", pid);
      break;
  }
  for(;print_count > 0; print_count--) {
      puts(buf);
      sleep(1);
  }
  exit(0);
}

Output is:

child process: pid = 0
parent process: pid = 13510
child process: pid = 0
parent process: pid = 13510
child process: pid = 0
parent process: pid = 13510
child process: pid = 0
parent process: pid = 13510
child process: pid = 0
parent process: pid = 13510
child process: pid = 0
child process: pid = 0
child process: pid = 0
child process: pid = 0
child process: pid = 0

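As the output shows, fork() returns the child's PID in the parent and 0 in the child. A child created with fork() runs independently of its parent. Sometimes, though, we want to know when the child has finished. If the parent ends before the child, as in this example, the result can be confusing, so we use the wait() function to wait for the child to finish.

Translator's note: on my machine (Ubuntu 16.04, gcc 5.4, bash 4.3.48) the output is: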

frank@under:~/tmp$ ./a.out 
parent process: pid = 24238
child process: pid = 0
parent process: pid = 24238
child process: pid = 0
parent process: pid = 24238
child process: pid = 0
parent process: pid = 24238
child process: pid = 0
parent process: pid = 24238
child process: pid = 0
child process: pid = 0
frank@under:~/tmp$ child process: pid = 0 # here
child process: pid = 0
child process: pid = 0
child process: pid = 0
ls
a.out  hellolinux  hellolinux.c  test.c  test.s
frank@under:~/tmp$

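The marked line is interesting: the bash prompt appears before the child has finished. My guess was that bash only waits for the parent to finish before accepting new input, and does not care about any children the parent created. So I ran a small experiment: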

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define BUF_SIZE 150

int main()
{
  int pid = fork();
  char buf[BUF_SIZE];
  int print_count;

  switch (pid)
  {
    case -1:
      perror("fork failed");
      exit(1);
    case 0:
      /* When fork() returns 0, we are in the child process. */
      print_count = 10;
      sprintf(buf,"child process: pid = %d", pid);
      break;
    default: /* + */
      /* When fork() returns a positive number, we are in the parent process
       * (the fork return value is the PID of the newly created child process) */
      print_count = 5;
      sprintf(buf,"parent process: pid = %d", pid);
      break;
  }
  for(;print_count > 0; print_count--) {
      puts(buf);
      sleep(1);
  }
  if(pid)
  {
    return 1;//the parent process
  }
  else
  {
    return 0;//the child process
  }
}

If bash also waited for the child, then because the child finishes last, the exit status seen by bash would be 0; otherwise it would be 1. The output:

frank@under:~/tmp$ ./a.out 
parent process: pid = 26290
child process: pid = 0
parent process: pid = 26290
child process: pid = 0
parent process: pid = 26290
child process: pid = 0
parent process: pid = 26290
child process: pid = 0
parent process: pid = 26290
child process: pid = 0
child process: pid = 0
frank@under:~/tmp$ child process: pid = 0
child process: pid = 0
child process: pid = 0
child process: pid = 0
echo $?
1
frank@under:~/tmp$

The exit status is 1, which confirms the guess.

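wait() System Call

The main purpose of wait() is to synchronize with child processes.

It suspends the parent until one of its children terminates.
Its return value is the PID of the child that terminated; on a successful return the kernel has reaped that child.
If child_status != NULL, the value it points to is set to indicate why the child terminated.
If the parent has several children, wait() returns as soon as any one of them terminates.
waitpid() can be used to wait for a specific child.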
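The last point can be sketched as follows: two children are started, and the parent waits for the second one by PID. The function name and the exit codes 11 and 22 are arbitrary choices for this example.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Start two children that exit with codes 11 and 22, then wait for the
 * second one specifically. Returns its exit code, or -1 on error. */
int wait_for_second_child(void)
{
    pid_t first, second;
    int status;
    int code = -1;

    first = fork();
    if (first == -1)
        return -1;
    if (first == 0)
        _exit(11);

    second = fork();
    if (second == 0)
        _exit(22);
    if (second > 0 && waitpid(second, &status, 0) == second && WIFEXITED(status))
        code = WEXITSTATUS(status);

    waitpid(first, NULL, 0);   /* reap the first child too, so no zombie is left */
    return code;
}
```

A plain wait() in the same place could return either child, depending on which one terminates first.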

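A parent needs to know when one of its children terminates, changes state, or is stopped by a signal. wait() is one way of monitoring children (the SIGCHLD signal is another).

(Translator's note: SIGCHLD 20,17,18 Ign Child stopped or terminated)

wait() blocks the calling process until one of its children exits or a signal is received. It takes the address of an integer as its argument and returns the PID of the child that finished.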

#include <sys/wait.h>
pid_t wait(int *child_status);

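Once again: the main reason to call wait() is to wait for a child to finish.

A call to wait() goes one of two ways:

If the caller has children when wait() is called, the caller is suspended until one of them terminates, and then resumes.
If the caller has no children when wait() is called, wait() does not block: it returns -1 immediately (with errno set to ECHILD).

The system call wait(&status) serves two purposes:

If no child has exited yet, the caller is suspended until one of its children terminates, and then resumes. (Translator: this has been said several times already; I strongly suspect the author is padding.)
The child's termination status is returned in the status variable passed to wait().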

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>     /* fork (translator's note: missing in the original) */
#include <sys/wait.h>   /* wait (translator's note: missing in the original) */

#define BUF_SIZE 150

int main()
{
  int pid = fork();
  char buf[BUF_SIZE];
  int print_count;

  switch (pid)
  {
    case -1:
      perror("fork failed");
      exit(1);
    case 0:
      print_count = 10;
      sprintf(buf,"child process: pid = %d", pid);
      break;
    default:
      print_count = 5;
      sprintf(buf,"parent process: pid = %d", pid);
      break;
  }
  /* Translator's note: the original tested if(!pid), which selects the child;
   * only the parent should wait. */
  if(pid) {
    int status;
    wait(&status);   /* block until the child has terminated */
  }
  for(;print_count > 0; print_count--) puts(buf);
  exit(0);
}

Now the parent waits for the child to finish before it starts printing:

child process: pid = 0
child process: pid = 0
child process: pid = 0
child process: pid = 0
child process: pid = 0
child process: pid = 0
child process: pid = 0
child process: pid = 0
child process: pid = 0
child process: pid = 0
parent process: pid = 22652
parent process: pid = 22652
parent process: pid = 22652
parent process: pid = 22652
parent process: pid = 22652

Translator's note: a slightly modified version that shows the return value of wait() and what it does to status:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/wait.h>

#define BUF_SIZE 150

int main()
{
  int pid = fork();
  char buf[BUF_SIZE];
  int print_count;
  int status = 12345;
  int pid_child;

  switch (pid)
  {
    case -1:
      perror("fork failed");
      exit(1);
    case 0:
      print_count = 10;
      sprintf(buf,"child process: pid = %d", pid);
      break;
    default:
      print_count = 5;
      sprintf(buf,"parent process: pid = %d", pid);
      break;
  }
  if(pid) {
    pid_child = wait(&status);
  }
  for(;print_count > 0; print_count--) puts(buf);
    if (pid)
    {

      printf("pid = %d\n", pid);
      printf("pid_child = %d\n", pid_child);
      printf("status = %d\n", status);
    }
  exit(0);
}

Output:

frank@under:~/tmp$ ./a.out 
child process: pid = 0
child process: pid = 0
child process: pid = 0
child process: pid = 0
child process: pid = 0
child process: pid = 0
child process: pid = 0
child process: pid = 0
child process: pid = 0
child process: pid = 0
parent process: pid = 842
parent process: pid = 842
parent process: pid = 842
parent process: pid = 842
parent process: pid = 842
pid = 842
pid_child = 842
status = 0
frank@under:~/tmp$

Clearly status has been set to 0, and the value returned by wait() is the child's PID.
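The raw status value is an encoding rather than the exit code itself, so it is normally decoded with the macros from sys/wait.h. The sketch below shows one way to do that; observe_child and its return convention (negative numbers for signals, -1000 for errors) are inventions of this example.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that either exits with `exit_code` or, if `kill_signal`
 * is non-zero, sends that signal to itself. Returns what the parent sees:
 * the exit code (0..255) after a normal exit, the negated signal number
 * if the child was killed by a signal, or -1000 on error. */
int observe_child(int exit_code, int kill_signal)
{
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return -1000;
    if (pid == 0) {
        if (kill_signal != 0)
            raise(kill_signal);
        _exit(exit_code);
    }
    if (waitpid(pid, &status, 0) == -1)
        return -1000;
    if (WIFEXITED(status))
        return WEXITSTATUS(status);   /* normal termination */
    if (WIFSIGNALED(status))
        return -WTERMSIG(status);     /* terminated by a signal */
    return -1000;
}
```

A child that calls exit(0) gives status 0, as in the output above, while a child that exits with code 7 gives a raw status that differs from 7 on Linux and only yields 7 through WEXITSTATUS.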

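exit() Library Function / _exit() System Call

The exit(status) library function terminates the calling process and releases all the resources it holds (memory, open file descriptors, and so on) so that the kernel can reallocate them to other processes. The status argument determines the process's termination status, which the parent can retrieve with wait().

In addition, exit() is layered on top of the _exit() system call. "...After a fork(), generally only one of the parent and child terminates by calling exit(); the other process should terminate using _exit()." (The Linux Programming Interface)

Translator: see the "Linux Programmer's Manual":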

The function _exit() terminates the calling process "immediately".  Any open file descriptors belonging to the process are closed; any children of the process are inherited by process 1, init, and the process's parent is sent a SIGCHLD signal.

The  value  status  is  returned to the parent process as the process's exit status, and can be collected using one of the  wait(2)  family  of calls.

The function _Exit() is equivalent to _exit().
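One visible difference between the two calls is that exit() runs handlers registered with atexit() (and flushes stdio buffers) before terminating, whereas _exit() does not. The sketch below makes this observable through a pipe; all names in it are chosen for this example.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int report_fd = -1;   /* write end of the pipe, used by the handler */

static void on_exit_handler(void)
{
    /* A failed write is ignored in this sketch. */
    if (write(report_fd, "x", 1) == -1) { }
}

/* Fork a child that registers an atexit() handler and then terminates with
 * exit() (use_underscore_exit == 0) or _exit() (use_underscore_exit != 0).
 * Returns 1 if the handler ran, 0 if it did not, -1 on error. */
int atexit_handler_ran(int use_underscore_exit)
{
    int fds[2];
    char byte;
    ssize_t n;
    pid_t pid;

    if (pipe(fds) == -1)
        return -1;
    pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        close(fds[0]);
        report_fd = fds[1];
        atexit(on_exit_handler);
        if (use_underscore_exit)
            _exit(0);   /* straight to the kernel: no handlers, no stdio flush */
        exit(0);        /* runs atexit handlers and flushes stdio first */
    }
    close(fds[1]);
    n = read(fds[0], &byte, 1);   /* 1 byte if the handler ran, 0 at end of file */
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return n == 1 ? 1 : (n == 0 ? 0 : -1);
}
```

This is one reason for the advice quoted above: a forked child that calls exit() would also flush any stdio data it inherited from the parent, which can lead to duplicated output.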

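Zombie Processes

Parent and child usually have different lifetimes: either the parent outlives the child or the other way round.

What happens if a child terminates before its parent has called wait()? Even though the child is gone, the parent should still be allowed to call wait() later to find out how the child terminated. The kernel handles this by turning the child into a zombie. Most of the resources held by the child are released so that the system can reuse them.

In fact, when a process terminates it does not vanish from memory immediately: its process descriptor (translator's note: I will list the process descriptor in a later article on process states) stays in memory, which takes very little space. The process state becomes EXIT_ZOMBIE and the parent is told through the SIGCHLD signal that the child has died. The parent is then expected to call wait() to read the zombie's exit status and other information. After that wait() call, the zombie is removed from memory completely.

This normally happens very quickly, so you will not see zombies piling up on your machine. If a parent never calls wait(), however, its zombie children stay in memory until the parent itself ends. (From what-is-a-zombie-process-on-linux.)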

// file - zombie.c

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>   /* fork, sleep (translator's note: missing in the original) */
#define BUF_SIZE 150

int main()
{
  int pid = fork();
  char buf[BUF_SIZE];
  int print_count;

  switch (pid)
  {
    case -1:
      perror("fork failed");
      exit(1);
    case 0:
      print_count = 2;
      sprintf(buf,"child process: pid = %d", pid);
      break;
    default:
      print_count = 10;
      sprintf(buf,"parent process: pid = %d", pid);
      break;
  }
  for(;print_count > 0; print_count--) {
      puts(buf);
      sleep(1);
  }
  exit(0);
}

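If you run the code above, the child finishes before the parent and remains a zombie until the parent ends, as shown below.

(Translator's note: PID 25351, S column Z, CMD <defunct>.)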

$ ./zombie
$ ps -la
F S   UID   PID  PPID  C PRI  NI ADDR SZ WCHAN  TTY          TIME CMD
0 S   601 25350 12377  0  75   0 -   381 -      pts/0    00:00:00 zombie
1 Z   601 25351 25350  0  78   0 -     0 exit   pts/0    00:00:00 zomb <defunct>
0 R   601 25352 12377  0  77   0 -  1054 -      pts/0    00:00:00 ps

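The following description is from what-is-a-zombie-process-on-linux.

Zombie processes use almost no system resources: each one needs only a tiny amount of memory for its process descriptor. Each zombie does, however, keep its PID. Linux has a finite supply of PIDs (32767 is the maximum on 32-bit systems). If zombies accumulate quickly, for example because of a badly written server program, the system can run out of PIDs and no other process can be started.

So a handful of zombies does no harm, although they do point to a bug in their parent.

If a parent terminates abnormally, its children become children of init. A zombie stays in memory until init reaps it; even though that takes only a short time, it holds on to its PID until then.

We cannot kill a zombie with SIGKILL the way we kill a normal process. As in the movies, a zombie cannot be killed by a signal, not even by the silver bullet (translator's note: Silver Bullet is a cocktail; the name comes from Western fantasy stories, in which silver bullets drive off evil), SIGKILL. This is deliberate: it guarantees that the parent can always eventually call wait(). Remember that a few zombies are nothing to worry about unless they accumulate quickly. Still, there are ways to get rid of them.

One way is to send the SIGCHLD signal to the zombie's parent. This tells the parent to call wait() and clean up its zombie children. The kill command sends the signal (pid is the parent's PID):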

kill -s SIGCHLD pid

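If the parent does not handle SIGCHLD properly, however, this has no effect. We then have to kill the parent: its remaining zombie children are adopted by init, which periodically calls wait() to clean up zombies, so they do not linger.

If a parent keeps producing zombies, it has to be debugged so that it calls wait() correctly and reaps its zombie children.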

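A zombie is not the same as an orphan process. An orphan is a process that is still running but whose parent has terminated. Orphans are not zombies; they are adopted by init. (Translator: vivid naming!)
In other words, after the parent terminates, getppid() in the child returns 1 (init). This can be used to check whether a process's parent has died (assuming the child was not created by init in the first place).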

Signals

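A signal is a notification, a message sent by the operating system or by an application. Signals are a one-way, asynchronous notification mechanism: a signal may be sent by the kernel to a process, by one process to another, or by a process to itself. Signals typically tell a process about an event such as a segmentation fault or the user pressing Ctrl-C.

The Linux kernel implements about 30 signals, each identified by an integer from 1 to 31. Signals carry no arguments, and their names are mostly self-explanatory. For example SIGKILL, signal number 9, tells a program that someone wants to kill it, and SIGHUP signals that a terminal hangup has occurred; it is signal number 1 on the i386 architecture.

Apart from SIGKILL and SIGSTOP, which always terminate or stop the process, a process can control what happens when it receives a signal. It can:

Accept the default action, which may be to terminate the process, terminate and dump core, stop the process, or do nothing.
Or explicitly ignore or handle the signal:
1. Ignore it, so that the signal is silently discarded.
2. Handle it: on receipt, control jumps to a user-defined signal handler, and when the handler returns, execution resumes where it was interrupted.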

Signal	Number	Description
SIGHUP	1	Hangup (POSIX)
SIGINT	2	Terminal interrupt (ANSI)
SIGQUIT	3	Terminal quit (POSIX)
SIGILL	4	Illegal instruction (ANSI)
SIGTRAP	5	Trace trap (POSIX)
SIGIOT	6	IOT Trap (4.2 BSD)
SIGBUS	7	BUS error (4.2 BSD)
SIGFPE	8	Floating point exception (ANSI)
SIGKILL	9	Kill(can't be caught or ignored) (POSIX)
SIGUSR1	10	User defined signal 1 (POSIX)
SIGSEGV	11	Invalid memory segment access (ANSI)
SIGUSR2	12	User defined signal 2 (POSIX)
SIGPIPE	13	Write on a pipe with no reader, Broken pipe (POSIX)
SIGALRM	14	Alarm clock (POSIX)
SIGTERM	15	Termination (ANSI)
SIGSTKFLT	16	Stack fault
SIGCHLD	17	Child process has stopped or exited, changed (POSIX)
SIGCONT	18	Continue executing, if stopped (POSIX)
SIGSTOP	19	Stop executing(can't be caught or ignored) (POSIX)
SIGTSTP	20	Terminal stop signal (POSIX)
SIGTTIN	21	Background process trying to read, from TTY (POSIX)
SIGTTOU	22	Background process trying to write, to TTY (POSIX)
SIGURG	23	Urgent condition on socket (4.2 BSD)
SIGXCPU	24	CPU limit exceeded (4.2 BSD)
SIGXFSZ	25	File size limit exceeded (4.2 BSD)
SIGVTALRM	26	Virtual alarm clock (4.2 BSD)
SIGPROF	27	Profiling alarm clock (4.2 BSD)
SIGWINCH	28	Window size change (4.3 BSD, Sun)
SIGIO	29	I/O now possible (4.2 BSD)
SIGPWR	30	Power failure restart (System V)

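The term raise refers to generating a signal, and the term catch refers to receiving one.

Signals are raised by error conditions. They are also generated by the shell and terminal handlers to cause interrupts, and they can be sent explicitly from one process to another to modify its behavior.

Signals can be:

Raised
Caught
Acted upon
Ignored

If a process receives a signal such as SIGFPE without having arranged to catch it, the process is terminated immediately and a core dump file is normally created. This file is an image of the process's memory and can be used for debugging. (SIGKILL also terminates the process immediately, but its default action does not produce a core dump.)

See coredump debugging.

A common example: when we type the interrupt character (Ctrl+C), the SIGINT signal is sent to the foreground process, that is, the program currently running. This terminates the program unless it has arranged to catch the signal.

The kill command sends a signal to a process. For example, to send the hangup signal to the process whose PID is pid_number: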

kill -HUP pid_number
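The kill command is a thin layer over the kill() system call. As a minimal sketch (the function name and the pipe-based handshake are choices made for this example), the code below starts a child, sends it a signal, and reports which signal ended it.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Start a child that only waits for signals, send it `sig`, and return the
 * number of the signal that terminated it (0 if it exited normally,
 * -1 on error). */
int signal_that_killed_child(int sig)
{
    int ready[2];
    int status;
    char byte;
    pid_t pid;

    if (pipe(ready) == -1)
        return -1;
    pid = fork();
    if (pid == -1) {
        close(ready[0]);
        close(ready[1]);
        return -1;
    }
    if (pid == 0) {
        close(ready[0]);
        signal(sig, SIG_DFL);   /* make sure the default action applies */
        close(ready[1]);        /* tells the parent that we are ready */
        for (;;)
            pause();
    }
    close(ready[1]);
    if (read(ready[0], &byte, 1) < 0) { }   /* returns 0 (EOF) once the child is ready */
    close(ready[0]);

    if (kill(pid, sig) == -1) {
        kill(pid, SIGKILL);     /* do not leave the child behind */
        waitpid(pid, NULL, 0);
        return -1;
    }
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

Calling signal_that_killed_child(SIGHUP) corresponds to running kill -HUP on a process that has not installed a handler for SIGHUP.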

killall is a handy variant of kill that sends the same signal to every process running a given command. For example, to send a reread signal to all processes running inetd:

$ killall -HUP inetd

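This command makes the inetd program reread its configuration file.

In the following example, the program reacts to the first Ctrl+C instead of terminating. Pressing Ctrl+C a second time does terminate it:

Translator's note (on the signal function):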

SYNOPSIS
#include <signal.h>

typedef void (*sighandler_t)(int);

sighandler_t signal(int signum, sighandler_t handler);

DESCRIPTION
The  behavior  of  signal()  varies  across UNIX versions, and has also varied historically across different versions of Linux.  Avoid its use: use sigaction(2) instead.   See  Portability below.

signal()  sets  the  disposition  of the signal signum to handler, which is either SIG_IGN,SIG_DFL, or the address of a programmer-defined function (a "signal handler").

If the signal signum is delivered to the process, then one of the following happens:

*  If the disposition is set to SIG_IGN, then the signal is ignored.

*  If the disposition is set to SIG_DFL, then the default action associated with the signal(see signal(7)) occurs.

*  If  the  disposition is set to a function, then first either the disposition is reset to SIG_DFL, or the signal is blocked (see Portability below), and then  handler  is  called with  argument  signum.   If  invocation of the handler caused the signal to be blocked, then the signal is unblocked upon return from the handler.

The signals SIGKILL and SIGSTOP cannot be caught or ignored.

Program:

#include <stdio.h>
#include <unistd.h>
#include <signal.h>

void my_signal_interrupt(int sig)
{
  printf("I got signal %d\n", sig);
  (void) signal(SIGINT, SIG_DFL);
}

int main()
{
  (void) signal(SIGINT,my_signal_interrupt);

  while(1) {
      printf("Waiting for interruption...\n");
      sleep(1);
  }
}

The output is:

Waiting for interruption...
Waiting for interruption...
Waiting for interruption...
Waiting for interruption...
Waiting for interruption...
Waiting for interruption...
I got signal 2
Waiting for interruption...
Waiting for interruption...
Waiting for interruption...
Waiting for interruption...
Waiting for interruption...
Waiting for interruption...
Waiting for interruption...
Waiting for interruption...
Waiting for interruption...

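When we press Ctrl+C, SIGINT is delivered to the process. Because we arranged for my_signal_interrupt() to handle this signal, the program does not terminate but enters my_signal_interrupt(). There we print "I got signal %d\n" and reset the handling of SIGINT to the default action, so the second SIGINT terminates the program.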
