Keepalive in practice: clients, servers, and nginx

2. TCP keepalive overview

In order to understand what TCP keepalive (which we will just call keepalive) does, you need do nothing more than read the name: keep TCP alive. This means that you will be able to check your connected socket (also known as TCP sockets), and determine whether the connection is still up and running or if it has broken.


2.1. What is TCP keepalive?

The keepalive concept is very simple: when you set up a TCP connection, you associate a set of timers. Some of these timers deal with the keepalive procedure. When the keepalive timer reaches zero, you send your peer a keepalive probe packet with no data in it and the ACK flag turned on. You can do this because of the TCP/IP specifications, as a sort of duplicate ACK, and the remote endpoint will have no arguments, as TCP is a stream-oriented protocol. On the other hand, you will receive a reply from the remote host (which doesn't need to support keepalive at all, just TCP/IP), with no data and the ACK set.

If you receive a reply to your keepalive probe, you can assert that the connection is still up and running without worrying about the user-level implementation. In fact, TCP permits you to handle a stream, not packets, and so a zero-length data packet is not dangerous for the user program.

This procedure is useful because if the other peers lose their connection (for example by rebooting) you will notice that the connection is broken, even if you don't have traffic on it. If the keepalive probes are not replied to by your peer, you can assert that the connection cannot be considered valid and then take the correct action.


2.2. Why use TCP keepalive?

You can live quite happily without keepalive, so if you're reading this, you may be trying to understand if keepalive is a possible solution for your problems. Either that or you've really got nothing more interesting to do instead, and that's okay too. :)

Keepalive is non-invasive, and in most cases, if you're in doubt, you can turn it on without the risk of doing something wrong. But do remember that it generates extra network traffic, which can have an impact on routers and firewalls.

In short, use your brain and be careful.

In the next section we will distinguish between the two target tasks for keepalive:

 

  • Checking for dead peers

  • Preventing disconnection due to network inactivity

 


2.3. Checking for dead peers

Keepalive can be used to advise you when your peer dies before it is able to notify you. This could happen for several reasons, like kernel panic or a brutal termination of the process handling that peer. Another scenario that illustrates when you need keepalive to detect peer death is when the peer is still alive but the network channel between it and you has gone down. In this scenario, if the network doesn't become operational again, you have the equivalent of peer death. This is one of those situations where normal TCP operations aren't useful to check the connection status.

Think of a simple TCP connection between Peer A and Peer B: there is the initial three-way handshake, with one SYN segment from A to B, the SYN/ACK back from B to A, and the final ACK from A to B. At this time, we're in a stable status: connection is established, and now we would normally wait for someone to send data over the channel. And here comes the problem: unplug the power supply from B and instantaneously it will go down, without sending anything over the network to notify A that the connection is going to be broken. A, from its side, is ready to receive data, and has no idea that B has crashed. Now restore the power supply to B and wait for the system to restart. A and B are now back again, but while A knows about a connection still active with B, B has no idea. The situation resolves itself when A tries to send data to B over the dead connection, and B replies with an RST packet, causing A to finally close the connection.

Keepalive can tell you when another peer becomes unreachable without the risk of false-positives. In fact, if the problem is in the network between two peers, the keepalive action is to wait some time and then retry, sending the keepalive packet before marking the connection as broken.

    _____                                                     _____
   |     |                                                   |     |
   |  A  |                                                   |  B  |
   |_____|                                                   |_____|
      ^                                                         ^
      |--->--->--->-------------- SYN -------------->--->--->---|
      |---<---<---<------------ SYN/ACK ------------<---<---<---|
      |--->--->--->-------------- ACK -------------->--->--->---|
      |                                                         |
      |                                       system crash ---> X
      |
      |                                     system restart ---> ^
      |                                                         |
      |--->--->--->-------------- PSH -------------->--->--->---|
      |---<---<---<-------------- RST --------------<---<---<---|
      |                                                         |

      

2.4. Preventing disconnection due to network inactivity

The other useful goal of keepalive is to prevent inactivity from disconnecting the channel. It's a very common issue, when you are behind a NAT proxy or a firewall, to be disconnected without a reason. This behavior is caused by the connection tracking procedures implemented in proxies and firewalls, which keep track of all connections that pass through them. Because of the physical limits of these machines, they can only keep a finite number of connections in their memory. The most common and logical policy is to keep newest connections and to discard old and inactive connections first.

Returning to Peers A and B, reconnect them. Once the channel is open, suppose we wait for an event to occur and then communicate it to the other peer. What if the event occurs only after a long period of time? Our connection still has its purpose, but by then the proxy no longer knows about it. So when we finally send data, the proxy isn't able to handle it correctly, and the connection breaks.

Because the usual implementation moves a connection to the top of the list when one of its packets arrives, and picks the connection at the bottom of the queue when it needs to eliminate an entry, periodically sending packets over the network is a good way to stay in pole position with little risk of deletion.

    _____           _____                                     _____
   |     |         |     |                                   |     |
   |  A  |         | NAT |                                   |  B  |
   |_____|         |_____|                                   |_____|
      ^               ^                                         ^
      |--->--->--->---|----------- SYN ------------->--->--->---|
      |---<---<---<---|--------- SYN/ACK -----------<---<---<---|
      |--->--->--->---|----------- ACK ------------->--->--->---|
      |               |                                         |
      |               | <--- connection deleted from table      |
      |               |                                         |
      |--->- PSH ->---| <--- invalid connection                 |
      |               |                                         |

      

3. Using TCP keepalive under Linux

Linux has built-in support for keepalive. You need to enable TCP/IP networking in order to use it. You also need procfs support and sysctl support to be able to configure the kernel parameters at runtime.

The procedures involving keepalive use three user-driven variables:

 

tcp_keepalive_time

the interval between the last data packet sent (simple ACKs are not considered data) and the first keepalive probe; after the connection is marked to need keepalive, this counter is not used any further

tcp_keepalive_intvl

the interval between subsequent keepalive probes, regardless of what the connection has exchanged in the meantime

tcp_keepalive_probes

the number of unacknowledged probes to send before considering the connection dead and notifying the application layer

 

Remember that keepalive support, even if configured in the kernel, is not the default behavior in Linux. Programs must request keepalive control for their sockets using the setsockopt interface. There are relatively few programs implementing keepalive, but you can easily add keepalive support for most of them by following the instructions in the full HOWTO, linked below.
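
As a minimal sketch of that interface (this is not part of the HOWTO excerpt; sockfd is assumed to be an already-created, connected TCP socket):

    #include <sys/socket.h>

    int optval = 1;
    /* sockfd: an existing connected TCP socket (assumed for illustration).
     * Ask the kernel to run the keepalive procedure on it. */
    if (setsockopt(sockfd, SOL_SOCKET, SO_KEEPALIVE, &optval, sizeof(optval)) < 0) {
        perror("setsockopt(SO_KEEPALIVE)");
    }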

http://www.tldp.org/HOWTO/html_single/TCP-Keepalive-HOWTO/

TCP keepalive is not the same thing as a long-lived (persistent) TCP connection. When we act as the server and a client connects, if keepalive is enabled and the peer sends no data for longer than a certain time (determined by the kernel parameters), our side sends an ACK probe packet to the peer to check whether the TCP/IP connection is still valid (the peer may have crashed or lost its network). If keepalive is not enabled and the client goes down, the server will never know and will keep holding the dead connection.
Of course, the option can also be used on the client side. The client socket will, at regular intervals (roughly two hours), send a packet over the idle connection to the server. This packet has no purpose other than checking whether the server is still alive. If the server does not respond, the client socket sends another packet after about 11 minutes, and if there is still no response within 12 minutes, the client socket is closed. If the option is turned off, the client socket may stay open for a long time even though the server is gone.
Although keepalive's benefits are limited, many developers advocate handling timeouts and dead sockets in higher-level application code. Also keep in mind that keepalive does not let you specify probing values per socket endpoint, so a solution often suggested as better than keepalive is to adjust the socket timeout options instead.
Put bluntly: this option is of little use to application-layer programs. The application layer itself can learn the state of the server or client and decide whether to keep the socket.

 

Many application-layer protocols have a heartbeat mechanism: typically the client sends a packet to the server at short intervals to tell the server it is still online, and possibly to carry some necessary data. Typical protocols that use heartbeat packets are IM protocols such as QQ, MSN, and Fetion.

Anyone who has studied TCP/IP knows that the two main transport-layer protocols are UDP and TCP: UDP is connectionless and packet-oriented, while TCP is connection-oriented and stream-oriented.

So it is easy to understand why a client using UDP (for example the early "OICQ"; I hear OICQ.com was recently snapped up by a domain squatter, what an old memory) needs to send heartbeat packets to the server periodically to tell the server it is online.

However, MSN and today's QQ usually use TCP connections. Even though the TCP/IP layer provides an optional KeepAlive mechanism (ACK-only packets), they still implement higher-level heartbeat packets. That seems to waste both bandwidth and CPU, which is a bit puzzling.

Looking into the details, TCP's KeepAlive mechanism works like this. First, it is apparently off by default; you have to use setsockopt to set SO_KEEPALIVE (at the SOL_SOCKET level) to 1 to turn it on, and you can tune three parameters, tcp_keepalive_time / tcp_keepalive_probes / tcp_keepalive_intvl, which control how long the connection must be idle before keepalive ACK packets start, how many unanswered ACK packets it takes before the peer is considered dead, and the interval between two ACK packets. On the Ubuntu Server 10.04 machine I tested, the defaults are 7200 seconds (2 hours, seriously?), 9 probes, and 75 seconds. The connection thus has a timeout window: if there is no traffic on the connection, the window keeps shrinking, and when it reaches zero TCP sends the peer an empty packet with the ACK flag set (the KeepAlive probe). When the peer receives that packet, if the connection is fine it should reply with an ACK; if the connection is in error (for example the peer rebooted and lost the connection state) it should reply with an RST; if the peer does not reply at all, the local side resends the probe every intvl seconds, and if probes packets in a row are ignored, the connection is considered broken.

There is a very detailed article here: http://tldp.org/HOWTO/html_single/TCP-Keepalive-HOWTO . It covers an introduction to KeepAlive, the relevant kernel parameters, the C programming interface, and how to enable KeepAlive for existing applications (whether or not you can modify their source code). It is well worth a careful read.

Section 2.4 of that article, "Preventing disconnection due to network inactivity", is about preventing connections from being dropped because of network inactivity (no packets for a long time). Many network devices, especially NAT routers, cannot keep track of all the connections passing through them because of hardware limits (memory, CPU), so when necessary they pick some inactive connections from the connection table and kick them out. The typical policy is LRU: drop the connection that has gone longest without data. By using TCP's KeepAlive mechanism (lowering the time parameter), the connection produces ACK packets at short intervals, which reduces the risk of being dropped; the cost, of course, is extra network and CPU load.

As mentioned above, many IM protocols implement their own heartbeat mechanism instead of relying directly on the underlying one; I don't know the real reason.

In my view, simple protocols can just use the underlying mechanism: it is completely transparent to the upper layers, lowers development effort, and there is no per-connection state to manage. Protocols that implement their own heartbeats presumably want to carry some data along with each heartbeat packet so the server can learn more about the client's state. For example, some clients love collecting user information... since a packet has to be sent anyway, they might as well stuff some data in it, otherwise the packet headers would go to waste...

https://www.felix021.com/blog/read.php?2076

 

Why do we need KeepAlive?

Before talking about KeepAlive, let's go over some basic TCP knowledge (it is simple; experts can skip it). The first thing to be clear about is that there is no such thing as a "request" at the TCP layer. You often hear people say "send a request at the TCP layer", and that is simply wrong.

 

TCP is a means of communication; "request" is a transactional concept. HTTP is a transactional protocol, so saying "send an HTTP request" is fine. Interviewers also often report that candidates for operations positions are fuzzy about the basic TCP three-way handshake: asked how TCP establishes a connection, a candidate starts with "suppose I am the client, I send a request to the server, the server sends a request back to me..."


There is no concept of a request at the TCP layer; only a transactional protocol such as HTTP has requests. TCP segments carry the HTTP protocol's requests (Request) and responses (Response).

Now we can explain why KeepAlive exists. After a connection is established, the application or upper-layer protocol may never send data, or may send data only at very long intervals. When no segments have been transferred for a long time, how do we determine whether the peer is still online? Has it dropped off, or is there simply no data to send? Does the connection still need to be kept? This situation had to be considered in the design of TCP.

TCP solves this in a clever way: after a certain amount of time, TCP automatically sends the peer a segment with no data. If the peer replies to this segment, it is still online and the connection can be kept; if no segment comes back and several retries fail, the connection is considered lost and there is no reason to keep it.

How do you enable KeepAlive?

KeepAlive is not enabled by default, and on Linux there is no global switch that turns on TCP KeepAlive. An application that needs KeepAlive must enable it individually on its TCP socket. The Linux kernel has three options that affect KeepAlive behavior:

1. net.ipv4.tcp_keepalive_intvl = 75
2. net.ipv4.tcp_keepalive_probes = 9
3. net.ipv4.tcp_keepalive_time = 7200

tcp_keepalive_time is in seconds and is how long a TCP connection may go without any data segments before probe segments start; tcp_keepalive_intvl, also in seconds, is the interval between one probe segment and the next; tcp_keepalive_probes is the number of probes to send.
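
These can be inspected or changed at runtime; as a sketch (the value 600 below is purely illustrative, not a recommendation):

    # show the current defaults
    sysctl net.ipv4.tcp_keepalive_time net.ipv4.tcp_keepalive_intvl net.ipv4.tcp_keepalive_probes
    # start probing after 10 minutes of idle time instead of 2 hours
    sysctl -w net.ipv4.tcp_keepalive_time=600
    # equivalently, through procfs
    echo 600 > /proc/sys/net/ipv4/tcp_keepalive_time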

The TCP socket also has three options corresponding to the kernel ones, set per socket through the setsockopt system call:

TCP_KEEPCNT: overrides tcp_keepalive_probes
TCP_KEEPIDLE: overrides tcp_keepalive_time
TCP_KEEPINTVL: overrides tcp_keepalive_intvl

For example, taking my system's defaults, the kernel's tcp_keepalive_time is 7200 s. If I enable KeepAlive on a socket in my application and set TCP_KEEPIDLE to 60, the TCP stack will send the first probe segment once it sees that the connection has been idle for 60 s with no data transfer.
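
A short C sketch of that example (the function name and the intvl/cnt values are made up for illustration; only TCP_KEEPIDLE matches the 60 s mentioned above):

    #include <netinet/in.h>
    #include <netinet/tcp.h>
    #include <sys/socket.h>

    /* Enable keepalive on a connected TCP socket fd and override the
     * kernel defaults for this socket only. Returns 0 on success, -1 on error. */
    static int enable_keepalive(int fd)
    {
        int on    = 1;
        int idle  = 60;   /* TCP_KEEPIDLE: idle seconds before the first probe    */
        int intvl = 10;   /* TCP_KEEPINTVL: seconds between probes (illustrative) */
        int cnt   = 5;    /* TCP_KEEPCNT: unanswered probes before giving up      */

        if (setsockopt(fd, SOL_SOCKET,  SO_KEEPALIVE,  &on,    sizeof(on))    < 0) return -1;
        if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE,  &idle,  sizeof(idle))  < 0) return -1;
        if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &intvl, sizeof(intvl)) < 0) return -1;
        if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT,   &cnt,   sizeof(cnt))   < 0) return -1;
        return 0;
    }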

Are TCP KeepAlive and HTTP Keep-Alive the same thing?

Probably many people only realize on seeing this question that the "KeepAlive" they usually talk about is not what they thought. Unless it is made explicit whether the TCP layer or the HTTP layer is meant, the two must not be conflated: TCP KeepAlive and HTTP Keep-Alive are completely different concepts.

TCP-layer KeepAlive has been explained above. So what is HTTP-layer Keep-Alive? When describing TCP connection establishment, I drew a three-way-handshake diagram. After the TCP connection is established, HTTP uses it to transfer the HTTP request (Request) and response (Response) data; a complete HTTP transaction looks like the figure below:

In that figure I simplified HTTP(Req) and HTTP(Resp); in reality the request and the response each take several TCP segments.

From the figure you can see that a complete HTTP transaction has four phases: connection setup, sending the request, receiving the response, and closing the connection. In the early days HTTP mostly carried text, and a single request could fetch all the data to be returned; today, rendering a complete page takes many requests for images, JS, CSS, and so on. If every HTTP request had to set up and tear down its own TCP connection, that overhead would be completely unnecessary.

With HTTP Keep-Alive enabled, an existing TCP connection can be reused: after a request has been fully answered, the server does not close the TCP connection immediately but waits a while for a possible second request from the browser (typically the browser sends the second request as soon as the first one returns). If only one connection can exist at a time, then the more requests a single TCP connection handles, the more TCP setup and teardown cost Keep-Alive saves.

Of course, a browser usually opens several connections to fetch resources from the server, but with Keep-Alive on, resource loading is still faster. Since HTTP/1.1, Keep-Alive is on by default and is controlled by the Connection header field: Connection: keep-alive means on, Connection: close means off. Note that HTTP spells it Keep-Alive, which also differs from TCP's KeepAlive spelling. So TCP KeepAlive and HTTP Keep-Alive are not the same thing.
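
For illustration, a hypothetical HTTP/1.1 exchange over a reused connection looks roughly like this (the path, host, and header values are made up):

    GET /index.html HTTP/1.1
    Host: www.example.com
    Connection: keep-alive

    HTTP/1.1 200 OK
    Content-Length: 1234
    Connection: keep-alive
    Keep-Alive: timeout=5, max=100

The same TCP connection is then reused for the next request; with Connection: close in either direction, the connection is torn down after the response instead.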

How is Nginx's TCP KeepAlive configured?

As mentioned at the beginning, the problem I ran into recently was this: the client sends a request to the Nginx server, the server needs some time to compute before it replies, and that time exceeds the 90 s LVS session-persistence limit. Capturing packets on the server with tcpdump and analyzing them locally in Wireshark gives the result shown in the second figure: the timestamps of the 5th segment and the last segment differ by roughly 90 s.

Once I had confirmed that the problem was the LVS session-persistence timeout expiring, I started looking for how to configure Nginx's TCP KeepAlive. The first option I found was keepalive_timeout. A colleague told me how keepalive_timeout works: a value of 0 disables keepalive, and a positive value is the number of seconds the connection is kept. So I set keepalive_timeout to 75 s, but testing showed it had no effect.

Clearly keepalive_timeout does not solve the TCP-level KeepAlive problem. In fact Nginx has quite a few keepalive-related options; a typical Nginx deployment looks like this:


At the TCP level, Nginx has to care about KeepAlive both towards the Client and towards the Upstream; at the HTTP level, Nginx has to care about Keep-Alive towards the Client and, if the Upstream speaks HTTP, towards the Upstream as well. In short, it is fairly involved.

So once you are clear about TCP-level KeepAlive versus HTTP Keep-Alive, you will not get Nginx's KeepAlive settings wrong. When I was solving this problem I was not sure whether Nginx had an option for configuring TCP keepalive, so I opened the Nginx source code and searched for TCP_KEEPIDLE; the relevant code is shown below:

 

From the context of that code I found that TCP KeepAlive is configurable, so I went on to look for which option configures it, and finally found that the so_keepalive parameter of the listen directive configures KeepAlive on the TCP socket.
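
As a sketch (the values are illustrative, not the article's actual configuration), so_keepalive takes keepidle:keepintvl:keepcnt and overrides the kernel defaults for that listening socket:

    server {
        # first probe after 60 s of idle time, then every 10 s,
        # drop the connection after 5 unanswered probes
        listen 80 so_keepalive=60s:10s:5;
    }

so_keepalive=on enables keepalive with the kernel defaults, and any of the three fields may be left empty to keep its system default.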

http://www.bubuko.com/infodetail-260176.html




Having covered KeepAlive, this time let's look at the effect of enabling or not enabling KeepAlive in the typical scenario of Nginx reverse-proxying Tomcat. The numbers behind the following observations are only a qualitative reference; test against your actual business workload.


  • Obviously, keepalive only takes effect when both the client and the server support it;
  • Without keepalive (whether it is the client or the server that lacks support), the server actively closes the TCP connection, leaving a large number of connections in TIME_WAIT;
  • Whether keepalive is actually used is fairly involved; it is not decided by a single HTTP header alone;
  • Nginx appears to maintain a pool of long-lived connections to its upstream, so you rarely see TIME_WAIT there; the connections stay in ESTABLISHED.

Nginx has two places where KeepAlive is configured:

One is keepalive_timeout under the http block, which sets the connection timeout towards the client (the downstream in the figure); the other is the keepalive directive inside an upstream block; note that its unit is a number of connections, not a time.
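
A minimal sketch of the two places (the directive names are Nginx's own; the values and the upstream name are made up):

    http {
        keepalive_timeout 65;              # HTTP keep-alive towards the client, in seconds

        upstream tomcat_backend {          # hypothetical upstream name
            server 127.0.0.1:8080;
            keepalive 32;                  # number of idle upstream connections kept per worker, not a time
        }

        server {
            listen 80;
            location / {
                proxy_pass http://tomcat_backend;
                proxy_http_version 1.1;          # needed for keep-alive towards the upstream
                proxy_set_header Connection "";  # strip the client's Connection header
            }
        }
    }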
