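I wrote a test program that sends data from a client to a server, with the server's reads controlled by hand. After the client had sent a certain amount of data it could not send any more; at that point the server began reading 1 KB at a time.
Intuitively, once the server had read 1 KB the client should have been able to resume sending. In practice the client stayed blocked, and it could only send again after the server had read a certain amount of data.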
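The experiment can be approximated on the loopback interface with a short script. This is only a sketch under stated assumptions, not the original test program: the helper names (`fill_until_blocked`, `drain_until_writable`, `run_experiment`) and the `BUF_SIZE` value are made up for illustration, and the small socket buffers are set only so the run finishes quickly. The number of bytes that must be drained depends on the operating system and its buffer settings.

```python
import socket
import time

CHUNK = 1024          # the server reads 1 KB per step, as in the experiment
BUF_SIZE = 64 * 1024  # assumed socket buffer size, kept small so the run is quick


def fill_until_blocked(client):
    """Send until the kernel stops accepting data; return the bytes queued."""
    payload = b"x" * 4096
    total, stalls = 0, 0
    while stalls < 5:                 # require several stalls in a row
        try:
            total += client.send(payload)
            stalls = 0
        except BlockingIOError:
            stalls += 1
            time.sleep(0.05)          # let queued data reach the receiver
    return total


def drain_until_writable(conn, client):
    """Read CHUNK bytes at a time until the client can send again."""
    drained = 0
    while True:
        try:
            data = conn.recv(CHUNK)
        except socket.timeout:
            return drained            # nothing left to read
        if not data:
            return drained
        drained += len(data)
        time.sleep(0.001)             # give the stack time to exchange ACKs
        try:
            client.send(b"x")
            return drained            # the client is writable again
        except BlockingIOError:
            pass                      # still blocked: keep draining


def run_experiment():
    """Return (bytes sent before blocking, bytes drained before unblocking)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, BUF_SIZE)
    listener.bind(("127.0.0.1", 0))   # any free loopback port
    listener.listen(1)

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, BUF_SIZE)
    client.connect(listener.getsockname())
    conn, _ = listener.accept()       # accept, but do not read yet
    conn.settimeout(5)
    client.setblocking(False)
    try:
        sent = fill_until_blocked(client)
        drained = drain_until_writable(conn, client)
    finally:
        client.close()
        conn.close()
        listener.close()
    return sent, drained


if __name__ == "__main__":
    sent, drained = run_experiment()
    print(f"client blocked after {sent} bytes")
    print(f"server drained {drained} bytes before the client could send again")
```

Running tcpdump on the loopback interface while the script runs should show whether the server advertises win 0 during the drain phase.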
The tcpdump capture is as follows:
11:42:40.217984 IP localhost.6379 > localhost.28944: . ack 65665 win 0 <nop,nop,timestamp 1816613366 1816613366>
0x0000: 4500 0034 5e08 4000 4006 deb9 7f00 0001 E..4^.@.@.......
0x0010: 7f00 0001 18eb 7110 7c79 0efb 7c5f 2ff1 ......q.|y..|_/.
0x0020: 8010 0000 3a7f 0000 0101 080a 6c47 51f6 ....:.......lGQ.
0x0030: 6c47 51f6 lGQ.
11:42:40.425034 IP localhost.28944 > localhost.6379: . ack 1 win 257 <nop,nop,timestamp 1816613573 1816613366>
0x0000: 4500 0034 7f94 4000 4006 bd2d 7f00 0001 E..4..@.@..-....
0x0010: 7f00 0001 7110 18eb 7c5f 2ff0 7c79 0efb ....q...|_/.|y..
0x0020: 8010 0101 38b0 0000 0101 080a 6c47 52c5 ....8.......lGR.
0x0030: 6c47 51f6 lGQ.
11:42:40.425047 IP localhost.6379 > localhost.28944: . ack 65665 win 0 <nop,nop,timestamp 1816613573 1816613366>
0x0000: 4500 0034 5e09 4000 4006 deb8 7f00 0001 E..4^.@.@.......
0x0010: 7f00 0001 18eb 7110 7c79 0efb 7c5f 2ff1 ......q.|y..|_/.
0x0020: 8010 0000 39b0 0000 0101 080a 6c47 52c5 ....9.......lGR.
0x0030: 6c47 51f6 lGQ.
11:42:40.838967 IP localhost.28944 > localhost.6379: . ack 1 win 257 <nop,nop,timestamp 1816613987 1816613573>
0x0000: 4500 0034 7f95 4000 4006 bd2c 7f00 0001 E..4..@.@..,....
0x0010: 7f00 0001 7110 18eb 7c5f 2ff0 7c79 0efb ....q...|_/.|y..
0x0020: 8010 0101 3643 0000 0101 080a 6c47 5463 ....6C......lGTc
0x0030: 6c47 52c5 lGR.
11:42:40.838983 IP localhost.6379 > localhost.28944: . ack 65665 win 0 <nop,nop,timestamp 1816613987 1816613366>
0x0000: 4500 0034 5e0a 4000 4006 deb7 7f00 0001 E..4^.@.@.......
0x0010: 7f00 0001 18eb 7110 7c79 0efb 7c5f 2ff1 ......q.|y..|_/.
0x0020: 8010 0000 3812 0000 0101 080a 6c47 5463 ....8.......lGTc
0x0030: 6c47 51f6 lGQ.
11:42:41.666922 IP localhost.28944 > localhost.6379: . ack 1 win 257 <nop,nop,timestamp 1816614815 1816613987>
0x0000: 4500 0034 7f96 4000 4006 bd2b 7f00 0001 E..4..@.@..+....
0x0010: 7f00 0001 7110 18eb 7c5f 2ff0 7c79 0efb ....q...|_/.|y..
0x0020: 8010 0101 3169 0000 0101 080a 6c47 579f ....1i......lGW.
0x0030: 6c47 5463 lGTc
11:42:41.666939 IP localhost.6379 > localhost.28944: . ack 65665 win 0 <nop,nop,timestamp 1816614815 1816613366>
0x0000: 4500 0034 5e0b 4000 4006 deb6 7f00 0001 E..4^.@.@.......
0x0010: 7f00 0001 18eb 7110 7c79 0efb 7c5f 2ff1 ......q.|y..|_/.
0x0020: 8010 0000 34d6 0000 0101 080a 6c47 579f ....4.......lGW.
0x0030: 6c47 51f6 lGQ.
11:42:43.322908 IP localhost.28944 > localhost.6379: . ack 1 win 257 <nop,nop,timestamp 1816616471 1816614815>
0x0000: 4500 0034 7f97 4000 4006 bd2a 7f00 0001 E..4..@.@..*....
0x0010: 7f00 0001 7110 18eb 7c5f 2ff0 7c79 0efb ....q...|_/.|y..
0x0020: 8010 0101 27b5 0000 0101 080a 6c47 5e17 ....'.......lG^.
0x0030: 6c47 579f lGW.
11:42:43.322921 IP localhost.6379 > localhost.28944: . ack 65665 win 0 <nop,nop,timestamp 1816616471 1816613366>
0x0000: 4500 0034 5e0c 4000 4006 deb5 7f00 0001 E..4^.@.@.......
0x0010: 7f00 0001 18eb 7110 7c79 0efb 7c5f 2ff1 ......q.|y..|_/.
0x0020: 8010 0000 2e5e 0000 0101 080a 6c47 5e17 .....^......lG^.
0x0030: 6c47 51f6 lGQ.
11:42:46.634889 IP localhost.28944 > localhost.6379: . ack 1 win 257 <nop,nop,timestamp 1816619783 1816616471>
0x0000: 4500 0034 7f98 4000 4006 bd29 7f00 0001 E..4..@.@..)....
0x0010: 7f00 0001 7110 18eb 7c5f 2ff0 7c79 0efb ....q...|_/.|y..
0x0020: 8010 0101 144d 0000 0101 080a 6c47 6b07 .....M......lGk.
0x0030: 6c47 5e17 lG^.
As the capture shows, the server kept replying with "ack 65665 win 0" packets.
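The advertised window can also be read directly from the hex dump, since it is a 16-bit field in the TCP header. The sketch below decodes the first server packet and the first client packet from the capture; the function name `parse_tcp_header` is made up for this example, and it assumes each dump starts at the IPv4 header, as the ones above do.

```python
import struct


def parse_tcp_header(hex_dump):
    """Return (src_port, dst_port, window) from an IPv4+TCP packet in hex."""
    packet = bytes.fromhex("".join(hex_dump.split()))
    ip_header_len = (packet[0] & 0x0F) * 4   # IHL field counts 32-bit words
    tcp_fixed = packet[ip_header_len:ip_header_len + 20]
    # Fixed TCP header: ports, seq, ack, offset/flags, window, checksum, urgent
    fields = struct.unpack("!HHIIHHHH", tcp_fixed)
    src_port, dst_port, _seq, _ack, _off_flags, window, _cksum, _urg = fields
    return src_port, dst_port, window


# First server -> client packet in the capture (11:42:40.217984).
SERVER_ACK = """
4500 0034 5e08 4000 4006 deb9 7f00 0001
7f00 0001 18eb 7110 7c79 0efb 7c5f 2ff1
8010 0000 3a7f 0000 0101 080a 6c47 51f6
6c47 51f6
"""

# First client -> server packet in the capture (11:42:40.425034).
CLIENT_ACK = """
4500 0034 7f94 4000 4006 bd2d 7f00 0001
7f00 0001 7110 18eb 7c5f 2ff0 7c79 0efb
8010 0101 38b0 0000 0101 080a 6c47 52c5
6c47 51f6
"""

print(parse_tcp_header(SERVER_ACK))   # (6379, 28944, 0)
print(parse_tcp_header(CLIENT_ACK))   # (28944, 6379, 257)
```

The window bytes are 0x0000 in the server's packet and 0x0101 (257) in the client's, matching the win values tcpdump printed.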
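After reading up on it, I found that this behavior comes from TCP flow control. The topic is broad, so only the key points are summarized here:
1) The win 0 in "ack 65665 win 0" is the server telling the client: my TCP receive window is full and there is no space left. When the client receives such a packet it stops sending data.
2) Why does the server keep returning "ack 65665 win 0" after it has read some data and its window is no longer full?
This is required by the TCP specification. Once the window has closed, the receiver does not reopen it by a small amount, because it would be filled again almost immediately. It only tells the client to resume, that is, sends an ACK whose win is no longer 0, once the free space reaches half the buffer size or one MSS, whichever is smaller. The relevant passage reads: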
To avoid SWS, we simply make the rule that the receiver may not update its advertised receive window in such a way that this leaves too little usable window space on the part of the sender. In other words, we restrict the receiver from moving the right edge of the window by too small an amount. The usual minimum that the edge may be moved is either the value of the MSS parameter, or one-half the buffer size, whichever is less.
Testing and a look at the code suggest that on Linux the threshold is the MSS.
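The rule quoted above can be sketched as a small model. This is a simplified illustration of the textbook rule, not the Linux kernel's actual window selection code; the function names and the buffer and MSS values are assumptions chosen for the example.

```python
def advertised_window(free_space, buffer_size, mss):
    """Receiver-side SWS avoidance: advertise 0 until enough space is free."""
    threshold = min(mss, buffer_size // 2)
    return free_space if free_space >= threshold else 0


def reads_until_window_opens(buffer_size, mss, chunk=1024):
    """Count chunk-sized reads needed before a full buffer advertises win > 0."""
    free_space = 0                    # the buffer starts completely full
    reads = 0
    while advertised_window(free_space, buffer_size, mss) == 0:
        free_space += chunk           # the application reads one chunk
        reads += 1
    return reads


# Hypothetical Ethernet-like case: MSS 1460, 87380-byte buffer.
print(reads_until_window_opens(buffer_size=87380, mss=1460))     # 2

# Hypothetical loopback-like case: a large MSS of 65483 bytes.
print(reads_until_window_opens(buffer_size=262144, mss=65483))   # 64
```

With a large MSS, as is typical on the loopback interface, the server has to read many 1 KB chunks before the window reopens, which is consistent with the long run of win 0 ACKs in the capture.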
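Working through this problem touched on a good deal of TCP knowledge, for example MSS, the sliding window system, silly window syndrome (SWS), TCP buffers and the ACK mechanism. Interested readers can look these up.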
For the full explanation, see the following link:
http://www.tcpipguide.com/free/t_TCPWindowManagementIssues.htm