Spring Zuul Performance Tuning: How to Improve Average Response Time by 200%?

Date: 2020-06-03


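I have recently been responsible for our company's Gateway project, where we use Spring Zuul for HTTP forwarding. We found that under heavy request load the AWS health check failed, even though the process was still running and nothing was logged as an error. The performance metrics we report ourselves showed that backend_processing_time was very high. Under normal conditions this metric is roughly equal to the downstream service's response time, but the downstream services were all responding in about 500 ms, so the problem had to be in Zuul itself.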

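Our backend_processing_time is simply the execution time of Zuul's own SimpleHostRoutingFilter. If little of that time is spent on network communication, the filter itself must be getting stuck somewhere. Reading the filter's source code, I found that it uses Apache HTTP Client to execute the network requests, and that it uses pooled connections.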

Zuul has a setting, `zuul.host.max-per-route-connections`, which corresponds to DefaultMaxPerRoute in Apache HTTP Client. The documentation says:

A request for a route for which the manager already has a persistent connection available in the pool will be serviced by leasing a connection from the pool rather than creating a brand new connection.

PoolingHttpClientConnectionManager maintains a maximum limit of connections on a per route basis and in total. Per default this implementation will create no more than 2 concurrent connections per given route and no more 20 connections in total. For many real-world applications these limits may prove too constraining, especially if they use HTTP as a transport protocol for their services.
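For readers who have not used the pooling manager directly, here is a minimal sketch of where these two limits live. It assumes Apache HttpClient 4.x (`org.apache.httpcomponents:httpclient`) is on the classpath. The class name `PoolLimitsSketch` and the values 300 and 600 are illustrative assumptions, not Zuul's own wiring.

```java
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.impl.conn.PoolingHttpClientConnectionManager;

// Hypothetical demo class; requires Apache HttpClient 4.x on the classpath.
public class PoolLimitsSketch {
    public static void main(String[] args) throws Exception {
        PoolingHttpClientConnectionManager cm = new PoolingHttpClientConnectionManager();

        // Print the library defaults that the quoted documentation describes.
        System.out.println("default per route: " + cm.getDefaultMaxPerRoute());
        System.out.println("default total:     " + cm.getMaxTotal());

        // Raise both limits; the numbers are illustrative, not recommendations.
        cm.setDefaultMaxPerRoute(300);
        cm.setMaxTotal(600);

        // Any client built on this manager shares the same pool and limits.
        try (CloseableHttpClient client = HttpClients.custom()
                .setConnectionManager(cm)
                .build()) {
            System.out.println("per route now: " + cm.getDefaultMaxPerRoute());
        }
    }
}
```

The per-route limit only helps if the total limit is at least as large, which is why the sketch raises both together.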

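This is similar to a database connection pool. Normally, taking a connection from a pool comes with a timeout, and an exception is thrown once it expires. Such a timeout does exist here: it corresponds to `getConnectionRequestTimeout` in Apache HTTP Client. The problem is that in Zuul this timeout is -1 and cannot be configured. According to the Apache HTTP Client documentation: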

Returns the timeout in milliseconds used when requesting a connection from the connection manager. A timeout value of zero is interpreted as an infinite timeout. A negative value is interpreted as undefined (system default).
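To make the difference between a bounded and an unbounded lease wait concrete, here is a toy model built only on the JDK. It is not Apache HttpClient code: `LeaseTimeoutSketch` is a hypothetical stand-in for a per-route pool, and treating a non-positive timeout as "wait forever" is an assumption chosen to match the hang described in this post.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Toy stand-in for a per-route connection pool (hypothetical, JDK only).
public class LeaseTimeoutSketch {
    private final Semaphore permits;

    public LeaseTimeoutSketch(int maxPerRoute) {
        this.permits = new Semaphore(maxPerRoute, true);
    }

    // Assumption: timeoutMillis <= 0 means "no deadline", so the caller blocks
    // until some other thread releases a connection.
    public void lease(long timeoutMillis) throws InterruptedException, TimeoutException {
        if (timeoutMillis <= 0) {
            permits.acquire();
            return;
        }
        if (!permits.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS)) {
            throw new TimeoutException("timed out waiting for a pooled connection");
        }
    }

    public void release() {
        permits.release();
    }

    public int available() {
        return permits.availablePermits();
    }

    public static void main(String[] args) throws Exception {
        LeaseTimeoutSketch pool = new LeaseTimeoutSketch(2);
        pool.lease(100);
        pool.lease(100); // the pool is now exhausted

        // With a positive timeout, the caller fails fast and can log the error.
        try {
            pool.lease(100);
        } catch (TimeoutException e) {
            System.out.println("bounded lease: " + e.getMessage());
        }

        // With no deadline, the caller just parks and nothing is logged.
        Thread waiter = new Thread(() -> {
            try {
                pool.lease(-1);
            } catch (Exception ignored) {
                // not reached in this demo
            }
        });
        waiter.setDaemon(true);
        waiter.start();
        waiter.join(300);
        System.out.println("unbounded waiter still blocked: " + waiter.isAlive());
    }
}
```

In the bounded case the failure surfaces as an exception that can be logged. In the unbounded case the request thread simply parks, which is consistent with a gateway that looks alive but stops answering health checks.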

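So once the HTTP connections are used up, threads simply wait there, and the whole system hangs. The fix is to raise the value of max-per-route-connections. Below are the results of two load tests: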

max-per-route-connections = 20

max-per-route-connections = 300

The load tests show that the average response time improved by 200% and P90 improved by 100%.
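As a rough sanity check on why the per-route limit matters so much, the ceiling can be estimated with simple arithmetic. The sketch assumes each in-flight request holds one pooled connection for the whole downstream call, and it uses the roughly 500 ms downstream latency mentioned earlier. The class and method names are hypothetical, and the 100 requests per second in the last line is an illustrative figure, not a measurement from this post.

```java
// Back-of-the-envelope capacity estimate (hypothetical helper, JDK only).
public class PoolCapacitySketch {

    // Upper bound on requests per second that one route can sustain when every
    // request occupies a pooled connection for latencyMillis.
    public static double maxRequestsPerSecond(int maxPerRoute, double latencyMillis) {
        return maxPerRoute * 1000.0 / latencyMillis;
    }

    // Connections needed to sustain a target rate at a given latency.
    public static int connectionsNeeded(double requestsPerSecond, double latencyMillis) {
        return (int) Math.ceil(requestsPerSecond * latencyMillis / 1000.0);
    }

    public static void main(String[] args) {
        System.out.println(maxRequestsPerSecond(20, 500));   // 40.0
        System.out.println(maxRequestsPerSecond(300, 500));  // 600.0
        System.out.println(connectionsNeeded(100, 500));     // 50
    }
}
```

Under these assumptions, 20 connections per route cap a single route at about 40 requests per second. Anything beyond that queues for a connection, and with no lease timeout it queues indefinitely.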