Using the TCP Keepalive Timer on Linux

Date: 2019-11-10


1. What is the keepalive timer? [1]

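Many newcomers to TCP/IP are surprised to learn that no data at all flows across an idle TCP connection. That is, if neither process at the ends of a TCP connection is sending data to the other, nothing is exchanged between the two TCP modules. You may find polling in other networking protocols, but TCP has none. This means we can start a client process that establishes a TCP connection with a server and walk away for hours, days, weeks, or months, and the connection stays up. Intermediate routers can crash and reboot, phone lines can go down and come back up, but as long as neither host at the ends of the connection reboots, the connection remains established.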

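This assumes that neither application, client or server, has an application-level timer that detects inactivity and causes either application to terminate. There are times, however, when a server needs to know whether the client's host has crashed and is down, or has crashed and rebooted. Many implementations provide the keepalive timer for this purpose.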

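Keepalive is a controversial feature. Many people feel that this polling of the other end, if needed at all, should be done by the application and not implemented in TCP. Furthermore, if there is a temporary outage on an intermediate network between the two end systems, the keepalive option can cause an otherwise good connection between the two processes to be terminated. For example, if a keepalive probe is sent just while an intermediate router is crashing and rebooting, TCP will conclude that the client's host has crashed, which is not what happened.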

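Keepalive is not part of the TCP specification. The Host Requirements RFC lists three reasons not to use it: (1) keepalives can cause a good connection to be dropped during a transient failure, (2) they consume unnecessary bandwidth, and (3) they cost money on an internet that charges by the packet. Nevertheless, many implementations provide the keepalive timer.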

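Some server applications hold resources on behalf of a client and need to know whether the client's host has crashed. The keepalive timer provides that detection for them. Many versions of the Telnet and Rlogin servers enable the keepalive option by default.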

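A common example of the need for the keepalive timer is a personal computer user who logs in to a host with Telnet over TCP/IP. If the user simply powers off the computer at the end of the session without logging off, a half-open connection is left behind. Figure 18.16 of [1] shows that sending data across a half-open connection causes a reset to be returned, but that was on the client side, with the client sending the data. If the client disappears and leaves the half-open connection on the server's end while the server is waiting for data from the client, the server will wait forever. The keepalive feature is intended to detect these half-open connections from the server side.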

2. How does keepalive work? [1]

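In this description, we call the end that enables the keepalive option the server and the other end the client. Nothing prevents a client from setting the option, but it is normally set on the server. If both ends need to detect whether the other has disappeared, both can set it (NFS is an example).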

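If there is no activity on a given connection for two hours, the server sends a probe segment to the client. (The examples in [1] show what the probe segment looks like.) The client host must be in one of the following four states: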

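1) The client host is still up and running and reachable from the server. The client's TCP responds normally, and the server knows the other end is still up. The server's TCP resets the keepalive timer for another two hours. If application traffic crosses the connection before those two hours expire, the timer is reset again for two hours after that exchange of data.

2) The client host has crashed and is either down or in the process of rebooting. In either case its TCP does not respond. The server receives no response to its probe and times out after 75 seconds. The server sends a total of 10 such probes, 75 seconds apart. If it receives no response at all, it considers the client host down and terminates the connection.

3) The client host has crashed and rebooted. In this case the server receives a response to its keepalive probe, but the response is a reset, which causes the server to terminate the connection.

4) The client host is up and running but unreachable from the server. This is the same as state 2, because TCP cannot distinguish between the two. All it can tell is that no replies were received to its probes.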

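The server does not have to worry about the client host being shut down and then rebooted (meaning an operator shutdown rather than a crash). When an operator shuts the system down, all application processes (that is, the client process) are terminated, and the client's TCP sends a FIN on the connection. On receiving the FIN, the server's TCP reports an end-of-file to the server process, which lets the server detect this situation.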

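In the first state, the server application has no idea that a keepalive probe took place. Everything is handled at the TCP layer, and the probes are transparent to the application until one of states 2, 3, or 4 occurs. In those three states the server's TCP returns an error to the server application. (Normally the server has issued a read from the network and is waiting for data from the client. If the keepalive feature returns an error, it is delivered to the server as the return value of that read.) In state 2 the error is something like "connection timed out". In state 3 it is "connection reset by peer". State 4 may look like a connection timeout, or a different error may be returned, depending on whether an ICMP error related to the connection was received.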
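The mapping from states 2, 3, and 4 to the error seen by the application can be sketched in C as follows. This is only an illustration: the enum and the function name are hypothetical, and the errno values are an assumption based on how Linux usually reports these conditions (ETIMEDOUT for a probe timeout, ECONNRESET for a reset, EHOSTUNREACH or ENETUNREACH when an ICMP error arrived).

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical labels for the client states described in the text. */
enum peer_state {
    PEER_UNKNOWN     = 0,   /* some other error, not attributed to keepalive */
    PEER_NO_RESPONSE = 2,   /* state 2: host down, every probe timed out     */
    PEER_REBOOTED    = 3,   /* state 3: a probe was answered with a reset    */
    PEER_UNREACHABLE = 4    /* state 4: an ICMP error reported no route      */
};

/* Map the errno left by a failed read()/recv() to one of the states.
 * Assumption: err is a real errno value, so it is never 0. */
enum peer_state classify_keepalive_error(int err)
{
    assert(err != 0);
    switch (err) {
    case ETIMEDOUT:    return PEER_NO_RESPONSE;
    case ECONNRESET:   return PEER_REBOOTED;
    case EHOSTUNREACH:
    case ENETUNREACH:  return PEER_UNREACHABLE;
    default:           return PEER_UNKNOWN;
    }
}
```

Since TCP cannot tell states 2 and 4 apart without an ICMP error, a state-4 failure may well arrive as ETIMEDOUT and be classified here as state 2.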

3. How to use keepalive on Linux [2]

Linux has built-in support for keepalive. You need to enable TCP/IP networking in order to use it. You also need procfs support and sysctl support to be able to configure the kernel parameters at runtime.

The procedures involving keepalive use three user-driven variables:

tcp_keepalive_time
the interval between the last data packet sent (simple ACKs are not considered data) and the first keepalive probe; after the connection is marked to need keepalive, this counter is not used any further
tcp_keepalive_intvl
the interval between subsequent keepalive probes, regardless of what the connection has exchanged in the meantime
tcp_keepalive_probes
the number of unacknowledged probes to send before considering the connection dead and notifying the application layer

Remember that keepalive support, even if configured in the kernel, is not the default behavior in Linux. Programs must request keepalive control for their sockets using the setsockopt interface. There are relatively few programs implementing keepalive, but you can easily add keepalive support for most of them following the instructions.

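As the passage above makes clear, the Linux kernel includes keepalive support, controlled by three parameters: tcp_keepalive_time (the idle time before keepalive starts), tcp_keepalive_intvl (the interval between keepalive probes), and tcp_keepalive_probes (the number of probes sent when the peer does not answer). How are these three parameters configured?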

There are two ways to configure keepalive parameters inside the kernel via userspace commands:

procfs interface
sysctl interface

We mainly discuss how this is accomplished on the procfs interface because it's the most used, recommended and the easiest to understand. The sysctl interface, particularly regarding the sysctl(2) syscall and not the sysctl(8) tool, is only here for the purpose of background knowledge.

The procfs interface

This interface requires both sysctl and procfs to be built into the kernel, and procfs mounted somewhere in the filesystem (usually on /proc, as in the examples below). You can read the values for the actual parameters by "catting" files in the /proc/sys/net/ipv4/ directory:

  # cat /proc/sys/net/ipv4/tcp_keepalive_time
  7200

  # cat /proc/sys/net/ipv4/tcp_keepalive_intvl
  75

  # cat /proc/sys/net/ipv4/tcp_keepalive_probes
  9

The first two parameters are expressed in seconds, and the last is a plain count. This means that the keepalive routines wait for two hours (7200 secs) before sending the first keepalive probe, and then resend it every 75 seconds. If nine consecutive probes receive no ACK response, the connection is marked as broken.
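As a rough worked example, the time a dead peer goes unnoticed is the idle period plus one interval per unanswered probe. The helper below is a hypothetical sketch of that arithmetic, not kernel code, and the real timing can differ slightly.

```c
#include <assert.h>

/* Approximate seconds of silence before the connection is marked broken:
 * the idle period, then `probes` unanswered probes spaced `intvl` apart. */
long keepalive_detection_seconds(long idle, long intvl, long probes)
{
    assert(idle >= 0 && intvl >= 0 && probes >= 0);
    return idle + intvl * probes;
}
```

With the default values shown above, keepalive_detection_seconds(7200, 75, 9) gives 7875 seconds, a little over two hours and eleven minutes.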

Modifying these values is straightforward: you need to write new values into the files. Suppose you decide to configure the host so that keepalive starts after ten minutes of channel inactivity, and then sends probes at intervals of one minute. Because of the high instability of our network trunk and the low value of the interval, suppose you also want to increase the number of probes to 20.

Here's how we would change the settings:

  # echo 600 > /proc/sys/net/ipv4/tcp_keepalive_time
  # echo 60 > /proc/sys/net/ipv4/tcp_keepalive_intvl
  # echo 20 > /proc/sys/net/ipv4/tcp_keepalive_probes

To be sure that all succeeds, recheck the files and confirm these new values are showing in place of the old ones.
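The same check can be done from a program by reading the procfs files as ordinary text. The sketch below assumes procfs is mounted on /proc; the function name is hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Read one integer from a text file such as
 * /proc/sys/net/ipv4/tcp_keepalive_time.
 * Returns 0 on success and stores the value in *out, or -1 on failure. */
int read_long_from_file(const char *path, long *out)
{
    assert(path != NULL && out != NULL);

    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;                      /* file missing or not readable */

    int matched = fscanf(fp, "%ld", out);
    fclose(fp);
    return matched == 1 ? 0 : -1;       /* -1 if no number was found */
}
```

After the echo commands above, calling it with "/proc/sys/net/ipv4/tcp_keepalive_time" would be expected to store 600 in *out.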

That completes the configuration of the three parameters. To make the settings persist across reboots, see reference [2].

4. How to use keepalive in a program [2]-[4]

All you need to enable keepalive for a specific socket is to set the specific socket option on the socket itself. The prototype of the function is as follows:

  int setsockopt(int s, int level, int optname,
                 const void *optval, socklen_t optlen)

The first parameter is the socket, previously created with socket(2); the second one must be SOL_SOCKET, and the third must be SO_KEEPALIVE. The fourth parameter must be a boolean integer value, indicating that we want to enable the option, while the last is the size of the value passed before.

According to the manpage, 0 is returned upon success, and -1 is returned on error (and errno is properly set).

There are also three other socket options you can set for keepalive when you write your application. They all use the SOL_TCP level instead of SOL_SOCKET, and they override system-wide variables only for the current socket. If you read without writing first, the current system-wide parameters will be returned.

TCP_KEEPCNT: overrides tcp_keepalive_probes
TCP_KEEPIDLE: overrides tcp_keepalive_time
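TCP_KEEPINTVL: overrides tcp_keepalive_intvl

As we can see, keepalive is an on/off option that is enabled through a function call. Specifically, the following code can be used:

  int keepAlive = 1; // enable the keepalive option
  setsockopt(rs, SOL_SOCKET, SO_KEEPALIVE, (void *)&keepAlive, sizeof(keepAlive));

The second parameter described in the English text above can also be set to SOL_TCP in order to configure the three keepalive parameters (see reference [3] for the code); a program that does this needs the header netinet/tcp.h. In practice, the keepalive parameters can of course also be configured through system calls.
For the other parameters of setsockopt, see reference [4].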
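The per-socket overrides can be read back with getsockopt, which also shows the system-wide value when nothing has been written yet. The two helpers below are a minimal sketch with hypothetical names. They use IPPROTO_TCP as the level, which on Linux has the same value as SOL_TCP.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Read one TCP-level keepalive option (TCP_KEEPIDLE, TCP_KEEPINTVL or
 * TCP_KEEPCNT) from fd. Returns the value, or -1 on error. */
int get_keepalive_opt(int fd, int optname)
{
    int value = 0;
    socklen_t len = sizeof value;

    assert(fd >= 0);
    if (getsockopt(fd, IPPROTO_TCP, optname, &value, &len) != 0)
        return -1;
    return value;
}

/* Override one TCP-level keepalive option for this socket only.
 * Returns 0 on success, -1 on error (errno is set by setsockopt). */
int set_keepalive_opt(int fd, int optname, int value)
{
    assert(fd >= 0);
    return setsockopt(fd, IPPROTO_TCP, optname, &value, sizeof value);
}
```

On a freshly created TCP socket, get_keepalive_opt(fd, TCP_KEEPIDLE) should report the current tcp_keepalive_time. After set_keepalive_opt(fd, TCP_KEEPCNT, 5), reading TCP_KEEPCNT back should give 5 for that socket only.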

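5. How to tell whether a TCP connection has been dropped [3]
When TCP detects that the peer socket is no longer usable (a probe cannot be sent, or a probe receives no ACK in response), select reports the socket as readable, recv returns -1, and errno is set to ETIMEDOUT.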

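A description found online says: "ET (edge-triggered) is the high-speed mode and supports only non-blocking sockets. In this mode the kernel notifies you through epoll when a descriptor goes from not ready to ready. It then assumes you know the descriptor is ready and sends no further readiness notifications for it until you do something that makes the descriptor not ready again."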
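The quoted behaviour can be observed with a small self-contained sketch. It is an illustration under assumptions: a pipe stands in for the socket, the function name is hypothetical, and epoll_wait is called with a timeout of 0 so it never blocks.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Registers the read end of a pipe edge-triggered, writes one byte, then
 * polls twice without reading. Stores both epoll_wait() results.
 * Returns 0 on success, -1 if any setup step failed. */
int edge_trigger_demo(int *first, int *second)
{
    int p[2];
    struct epoll_event ev, events[20];

    assert(first != NULL && second != NULL);
    if (pipe(p) != 0)
        return -1;

    int epfd = epoll_create1(0);
    if (epfd < 0) {
        close(p[0]);
        close(p[1]);
        return -1;
    }

    ev.events = EPOLLIN | EPOLLET;      /* edge-triggered read readiness */
    ev.data.fd = p[0];
    int rc = epoll_ctl(epfd, EPOLL_CTL_ADD, p[0], &ev);
    if (rc == 0 && write(p[1], "x", 1) != 1)
        rc = -1;

    if (rc == 0) {
        /* Timeout 0: return immediately with whatever is ready. */
        *first = epoll_wait(epfd, events, 20, 0);   /* edge: not ready -> ready */
        *second = epoll_wait(epfd, events, 20, 0);  /* data unread, no new edge */
    }

    close(epfd);
    close(p[0]);
    close(p[1]);
    return rc;
}
```

On Linux the first poll should report one ready descriptor and the second should report none, because the byte was never read and no new data arrived to create another edge.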

I wrote some code to test this:

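Suppose the client sends 100 bytes and the server reads only 10 bytes at a time. In my test, if epoll_wait was given a timeout and the client sent nothing more, an IN event still fired once the timeout expired. If it was set to never time out (epoll_wait(epfd,events,20,-1)) and the client sent nothing more, no further IN event was produced. An IN event arrives again if the client sends more data or if the server re-arms the IN event with EPOLL_CTL_MOD, and the unread data can then be read.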

If OUT is armed after an IN event and IN is not armed again in the OUT handler, data sent by the client triggers the OUT event.

if (events[i].events & EPOLLIN)
{
    // after a read event, watch only for write readiness
    ev.data.fd = sockfd;
    ev.events = EPOLLOUT | EPOLLET;
    epoll_ctl(epfd, EPOLL_CTL_MOD, sockfd, &ev);
}
else if (events[i].events & EPOLLOUT)
{
    cout << "EPOLLOUT" << endl;
    // ev.data.fd = sockfd;
    // register interest in read events again
    // ev.events = EPOLLIN | EPOLLET;
    // switch the event handled on sockfd back to EPOLLIN
    // epoll_ctl(epfd, EPOLL_CTL_MOD, sockfd, &ev);
}

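If the client sends data several times and the single-threaded server, after the first IN event, keeps reading until an error occurs, then no further IN event fires for the later sends as long as the server has already received that data.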

The single-threaded code is as follows:

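if (events[i].events & EPOLLIN)
{
    printf("EPOLLIN\n");
    sockfd = events[i].data.fd;
    char buf;
    sleep(3);   /* give the client time to send several times */
    /* read one byte at a time until read() fails or returns 0 */
    while (read(sockfd, &buf, 1) > 0)
    {
        printf("%c\n", buf);
    }
}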
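The read-until-error loop can be written so that it tells an empty buffer apart from a closed or failed connection. The following is a sketch under assumptions: the descriptor is non-blocking, and both function names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Put fd into non-blocking mode; returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    assert(fd >= 0);
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Read from a non-blocking fd in small chunks until the kernel buffer is
 * empty (EAGAIN/EWOULDBLOCK) or the peer closes (read returns 0).
 * Returns the total number of bytes read, or -1 on any other error. */
long drain_fd(int fd)
{
    char buf[10];               /* deliberately small, like the 10-byte reads */
    long total = 0;

    assert(fd >= 0);
    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);
        if (n > 0) {
            total += n;
        } else if (n == 0) {
            return total;       /* orderly close by the peer */
        } else if (errno == EINTR) {
            continue;           /* interrupted by a signal, retry */
        } else if (errno == EAGAIN || errno == EWOULDBLOCK) {
            return total;       /* nothing left to read for now */
        } else {
            return -1;          /* real error, e.g. ETIMEDOUT or ECONNRESET */
        }
    }
}
```

Draining until EAGAIN is what makes edge-triggered mode safe: once the buffer is empty, the next arrival of data creates a new edge and a new IN event.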

When the client program closes the socket or exits abnormally, an IN event fires and recv returns 0.

The cases of shutdown, an unplugged network cable, and power loss also need to be covered, each tested with and without keep-alive.

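With a blocking socket:

(1) Without keep-alive, power loss and an unplugged cable produce no IN event. A shutdown behaves like a program exit: an IN event fires and read returns 0.

(2) Setting keep-alive on the client has no effect on the server, and no IN event is produced.

(3) With keep-alive set on the server, unplugging the cable produces an IN event after roughly TCP_KEEPINTVL*TCP_KEEPCNT (counted from the first probe, which is sent once the TCP_KEEPIDLE idle period has passed). If there is no pending data, recv returns -1 with the error "Connection timed out" (ETIMEDOUT). If the kernel receive queue still holds data, the IN event fires only once and recv returns the number of bytes read; reading further yields "Connection timed out".

With a non-blocking socket:

(1) With the cable still plugged in, looping on read after an IN event returns EAGAIN, "Resource temporarily unavailable" (may be the same value as EWOULDBLOCK), once there is no more data to read.

(2) All other cases are the same as for a blocking socket.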

The code for setting keep-alive is as follows:

#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

int socket_set_keepalive(int fd)
{
    int alive, idle, cnt, intv;

    /* Set: use keepalive on fd */
    alive = 1;
    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &alive, sizeof alive) != 0) {
        printf("Set keepalive error: %s.\n", strerror(errno));
        return -1;
    }

    /* After 10 seconds without data, start sending keepalive probes */
    idle = 10;
    if (setsockopt(fd, SOL_TCP, TCP_KEEPIDLE, &idle, sizeof idle) != 0) {
        printf("Set keepalive idle error: %s.\n", strerror(errno));
        return -1;
    }

    /* If there is no reply, resend the probe after 5 seconds */
    intv = 5;
    if (setsockopt(fd, SOL_TCP, TCP_KEEPINTVL, &intv, sizeof intv) != 0) {
        printf("Set keepalive intv error: %s.\n", strerror(errno));
        return -1;
    }

    /* After 3 consecutive unanswered probes, treat the connection as dead */
    cnt = 3;
    if (setsockopt(fd, SOL_TCP, TCP_KEEPCNT, &cnt, sizeof cnt) != 0) {
        printf("Set keepalive cnt error: %s.\n", strerror(errno));
        return -1;
    }

    return 0;
}