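HTTP is a stateless protocol: every request is independent of the others. In its original implementation, each HTTP request therefore opened its own TCP socket connection and closed it once the exchange was complete.
HTTP runs on top of TCP, which is a full-duplex protocol, so setting up and tearing down a connection takes a three-way handshake and a four-way close. In this design every HTTP request clearly pays a considerable extra cost for creating and destroying the connection.
HTTP therefore evolved to reuse socket connections by means of persistent connections.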
As the figure shows:
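Persistent connections come in two forms: HTTP/1.0+ keep-alive and HTTP/1.1 persistent connections.
Starting in 1996, many HTTP/1.0 browsers and servers extended the protocol with what became known as the "keep-alive" extension.
Note that this extension appeared as an "experimental persistent connection" supplementing HTTP/1.0. Keep-alive is no longer in use and the current HTTP/1.1 specification does not describe it, but many applications have carried it forward.
An HTTP/1.0 client adds the header "Connection: Keep-Alive" to ask the server to keep a connection open. If the server is willing to do so, it includes the same header in its response. If the response contains no "Connection: Keep-Alive" header, the client assumes that the server does not support keep-alive and that the server will close the connection once it has sent the response.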
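The handshake described above can be sketched in a few lines. This is only an illustration under stated assumptions: the class KeepAliveHandshake and its methods are hypothetical names, the messages are handled as plain strings, and no socket is opened.

```java
/** Hypothetical helper modelling the HTTP/1.0 keep-alive handshake. */
final class KeepAliveHandshake {

    /** Builds an HTTP/1.0 request that asks the server to keep the connection open. */
    static String buildRequest(String host, String path) {
        return "GET " + path + " HTTP/1.0\r\n"
                + "Host: " + host + "\r\n"
                + "Connection: Keep-Alive\r\n"
                + "\r\n";
    }

    /** Returns true only if the response head echoes Connection: Keep-Alive. */
    static boolean serverAgreed(String responseHead) {
        for (String line : responseHead.split("\r\n")) {
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue; // status line or blank line
            }
            String name = line.substring(0, colon).trim();
            if (!name.equalsIgnoreCase("Connection")) {
                continue;
            }
            // A Connection header may carry several comma-separated tokens.
            for (String token : line.substring(colon + 1).split(",")) {
                if (token.trim().equalsIgnoreCase("Keep-Alive")) {
                    return true;
                }
            }
        }
        return false;
    }
}
```

If serverAgreed returns false, the client should expect the server to close the connection after the response and must not try to send another request on it.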
With the keep-alive extension, client and server can maintain a persistent connection, but some problems remain:
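HTTP/1.1 replaced Keep-Alive with persistent connections.
In HTTP/1.1, connections are persistent by default. To close one explicitly, a message must carry the Connection: Close header. In other words, every HTTP/1.1 connection is eligible for reuse unless one side explicitly asks to close it.
As with Keep-Alive, however, the client or the server may close an idle persistent connection at any time. Not sending Connection: Close does not mean the server promises to keep the connection open forever.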
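HttpClient uses a connection pool to manage the connections it holds; a connection over the same TCP link can be reused. The pool is how HttpClient keeps connections persistent.
Pooling is in fact a general-purpose design, and the idea behind it is not complicated: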
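As a rough illustration of the pooling idea in general (not of HttpClient's own classes), a pool hands out an idle object when one exists and creates a new one only when none is free. SimplePool and its method names are hypothetical, and the sketch leaves out size limits and expiry.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

/** Minimal, hypothetical object pool: reuse idle objects, create only when none is free. */
final class SimplePool<T> {
    private final Deque<T> idle = new ArrayDeque<>();
    private final Supplier<T> factory;
    private int created;

    SimplePool(Supplier<T> factory) {
        this.factory = factory;
    }

    /** Hands out an idle object if there is one, otherwise creates a new one. */
    synchronized T borrow() {
        T obj = idle.pollFirst();
        if (obj == null) {
            obj = factory.get();
            created++;
        }
        return obj;
    }

    /** Returns an object to the pool instead of destroying it. */
    synchronized void release(T obj) {
        idle.addFirst(obj);
    }

    /** Number of objects the factory has been asked to create so far. */
    synchronized int createdCount() {
        return created;
    }
}
```

Borrowing, releasing and borrowing again yields the same instance, so the cost of creation is paid once rather than per use.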
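Every connection pool follows this idea, but when reading the HttpClient source we focus on two points:
The code below shows in one place how HttpClient handles persistent connections. It is taken from MainClientExec, keeping the parts related to the connection pool and leaving out the rest: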
public class MainClientExec implements ClientExecChain {

    @Override
    public CloseableHttpResponse execute(
            final HttpRoute route,
            final HttpRequestWrapper request,
            final HttpClientContext context,
            final HttpExecutionAware execAware) throws IOException, HttpException {
        // Obtain a ConnectionRequest from the HttpClientConnectionManager
        final ConnectionRequest connRequest = connManager.requestConnection(route, userToken);
        final HttpClientConnection managedConn;
        final int timeout = config.getConnectionRequestTimeout();
        // Obtain a managed HttpClientConnection from the ConnectionRequest
        managedConn = connRequest.get(timeout > 0 ? timeout : 0, TimeUnit.MILLISECONDS);
        // Hand the connection manager and the managed connection to a ConnectionHolder
        final ConnectionHolder connHolder = new ConnectionHolder(this.log, this.connManager, managedConn);
        try {
            HttpResponse response;
            if (!managedConn.isOpen()) {
                // The managed connection is not open, so the route has to be established again
                establishRoute(proxyAuthState, managedConn, route, request, context);
            }
            // Send the request over the HttpClientConnection
            response = requestExecutor.execute(request, managedConn, context);
            // Ask the connection reuse strategy whether the connection can be reused
            if (reuseStrategy.keepAlive(response, context)) {
                // Get the period for which the connection stays valid
                final long duration = keepAliveStrategy.getKeepAliveDuration(response, context);
                // Set the validity period on the holder
                connHolder.setValidFor(duration, TimeUnit.MILLISECONDS);
                // Mark the connection as reusable
                connHolder.markReusable();
            } else {
                connHolder.markNonReusable();
            }
            final HttpEntity entity = response.getEntity();
            if (entity == null || !entity.isStreaming()) {
                // Release the connection back to the pool for the next caller
                connHolder.releaseConnection();
                return new HttpResponseProxy(response, null);
            } else {
                return new HttpResponseProxy(response, connHolder);
            }
        }
        // catch blocks omitted in this excerpt
    }
}
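Here we can see that the way a connection is handled during an HTTP request is consistent with the protocol specification. The concrete implementation deserves a closer look.
PoolingHttpClientConnectionManager is HttpClient's default connection manager. The first step is requestConnection(), which returns a request for a connection; note that it does not return the connection itself.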
public ConnectionRequest requestConnection(
        final HttpRoute route,
        final Object state) {
    // Ask the pool for a lease; the result is a Future, not yet a connection
    final Future<CPoolEntry> future = this.pool.lease(route, state, null);
    return new ConnectionRequest() {

        @Override
        public boolean cancel() {
            return future.cancel(true);
        }

        @Override
        public HttpClientConnection get(
                final long timeout,
                final TimeUnit tunit) throws InterruptedException, ExecutionException, ConnectionPoolTimeoutException {
            // The connection is actually obtained here, when the caller asks for it
            final HttpClientConnection conn = leaseConnection(future, timeout, tunit);
            if (conn.isOpen()) {
                // A reused connection is already open: apply the socket timeout for its host
                final HttpHost host;
                if (route.getProxyHost() != null) {
                    host = route.getProxyHost();
                } else {
                    host = route.getTargetHost();
                }
                final SocketConfig socketConfig = resolveSocketConfig(host);
                conn.setSocketTimeout(socketConfig.getSoTimeout());
            }
            return conn;
        }
    };
}
As we can see, the returned ConnectionRequest object actually holds a Future<CPoolEntry>, and CPoolEntry is the real connection instance managed by the pool.
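The "request that wraps a Future" shape can be illustrated with the standard library alone. The sketch below assumes nothing about HttpClient internals: LeaseRequest is a hypothetical stand-in for ConnectionRequest, and the connection type is left generic.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical stand-in for ConnectionRequest: a handle that yields a connection later. */
final class LeaseRequest<C> {
    private final Future<C> future;

    LeaseRequest(Future<C> future) {
        this.future = future;
    }

    /** Blocks until the pool supplies a connection or the timeout elapses. */
    C get(long timeout, TimeUnit unit)
            throws InterruptedException, ExecutionException, TimeoutException {
        return future.get(timeout, unit);
    }

    /** Gives up waiting; returns true if the pending request was cancelled. */
    boolean cancel() {
        return future.cancel(true);
    }
}
```

Separating "ask" from "get" lets the caller bound the wait with a timeout or abandon the request, instead of blocking inside the pool with no way out.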
From the code above, the points to note are:
Now let's see how CPool hands out a Future<CPoolEntry>. The core code of AbstractConnPool is as follows:
private E getPoolEntryBlocking(
        final T route, final Object state,
        final long timeout, final TimeUnit tunit,
        final Future<E> future) throws IOException, InterruptedException, TimeoutException {
    // Absolute deadline for waiting; null means wait without a time limit
    Date deadline = null;
    if (timeout > 0) {
        deadline = new Date(System.currentTimeMillis() + tunit.toMillis(timeout));
    }
    // First lock the whole pool; the lock is a reentrant ReentrantLock
    this.lock.lock();
    try {
        // Get the pool for this HttpRoute. The overall pool has a size limit and each
        // route has its own pool of connections, so this is a "pool within a pool"
        final RouteSpecificPool<T, C, E> pool = getPool(route);
        E entry;
        for (;;) {
            Asserts.check(!this.isShutDown, "Connection pool shut down");
            // Loop until a usable connection is found or the route pool has none free
            for (;;) {
                // Take a connection from the route pool; it may be null or a valid connection
                entry = pool.getFree(state);
                // Nothing free: leave the loop
                if (entry == null) {
                    break;
                }
                // An expired or closed connection is released and the loop tries again
                if (entry.isExpired(System.currentTimeMillis())) {
                    entry.close();
                }
                if (entry.isClosed()) {
                    this.available.remove(entry);
                    pool.free(entry, false);
                } else {
                    // A valid connection: leave the loop
                    break;
                }
            }
            // A valid connection was found: lease it and return
            if (entry != null) {
                this.available.remove(entry);
                this.leased.add(entry);
                onReuse(entry);
                return entry;
            }
            // Reaching this point means no valid connection was found, so one has to be created
            final int maxPerRoute = getMax(route);
            // The maximum number of connections per route is configurable; if it would be
            // exceeded, least-recently-used connections of this route are evicted
            final int excess = Math.max(0, pool.getAllocatedCount() + 1 - maxPerRoute);
            if (excess > 0) {
                for (int i = 0; i < excess; i++) {
                    final E lastUsed = pool.getLastUsed();
                    if (lastUsed == null) {
                        break;
                    }
                    lastUsed.close();
                    this.available.remove(lastUsed);
                    pool.remove(lastUsed);
                }
            }
            // The route pool has not reached its limit
            if (pool.getAllocatedCount() < maxPerRoute) {
                final int totalUsed = this.leased.size();
                final int freeCapacity = Math.max(this.maxTotal - totalUsed, 0);
                // The overall pool still has free capacity
                if (freeCapacity > 0) {
                    final int totalAvailable = this.available.size();
                    // If the idle connections already fill the remaining capacity, evict one of them
                    if (totalAvailable > freeCapacity - 1) {
                        if (!this.available.isEmpty()) {
                            final E lastUsed = this.available.removeLast();
                            lastUsed.close();
                            final RouteSpecificPool<T, C, E> otherpool = getPool(lastUsed.getRoute());
                            otherpool.remove(lastUsed);
                        }
                    }
                    // Create a connection for the route
                    final C conn = this.connFactory.create(route);
                    // Add it to the route's "small pool"
                    entry = pool.add(conn);
                    // Record it as leased in the "big pool"
                    this.leased.add(entry);
                    return entry;
                }
            }
            // Reaching this point means no valid connection was found in the route pool and
            // a new one could not be created because a limit has been reached: connections
            // exist but are in use and not available to the current thread
            boolean success = false;
            try {
                if (future.isCancelled()) {
                    throw new InterruptedException("Operation interrupted");
                }
                // Queue the future in the route pool
                pool.queue(future);
                // Queue the future in the overall pool
                this.pending.add(future);
                // success is true if the condition was signalled before the deadline
                if (deadline != null) {
                    success = this.condition.awaitUntil(deadline);
                } else {
                    this.condition.await();
                    success = true;
                }
                if (future.isCancelled()) {
                    throw new InterruptedException("Operation interrupted");
                }
            } finally {
                // Remove the future from the waiting queues
                pool.unqueue(future);
                this.pending.remove(future);
            }
            // Not signalled and the deadline has passed: leave the loop
            if (!success && (deadline != null && deadline.getTime() <= System.currentTimeMillis())) {
                break;
            }
        }
        // No signal arrived and no connection became available: throw
        throw new TimeoutException("Timeout waiting for connection");
    } finally {
        // Release the lock on the overall pool
        this.lock.unlock();
    }
}
The logic above has several key points:
At this point the program has either obtained a usable CPoolEntry instance or terminated with an exception.
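The control flow of getPoolEntryBlocking (reuse a free connection, else create one within the limits, else wait for a release until the deadline) can be condensed into a small stand-alone sketch. Everything here is an assumption made for illustration: RoutedPool is a hypothetical class, connections are plain strings, the wait uses Condition.awaitNanos instead of awaitUntil, and the eviction of idle connections that the real code performs is left out.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical two-level pool: a per-route limit nested inside a total limit. */
final class RoutedPool {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition released = lock.newCondition();
    private final Map<String, Deque<String>> freeByRoute = new HashMap<>();
    private final Map<String, Integer> allocatedByRoute = new HashMap<>();
    private final int maxPerRoute;
    private final int maxTotal;
    private int totalAllocated;
    private int nextId;

    RoutedPool(int maxPerRoute, int maxTotal) {
        this.maxPerRoute = maxPerRoute;
        this.maxTotal = maxTotal;
    }

    /** Leases a connection for the route, waiting up to timeoutMs for a free slot. */
    String lease(String route, long timeoutMs) throws InterruptedException, TimeoutException {
        long remaining = TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        lock.lock();
        try {
            for (;;) {
                // 1. Reuse a free connection of this route if there is one.
                Deque<String> free = freeByRoute.computeIfAbsent(route, r -> new ArrayDeque<>());
                String conn = free.pollFirst();
                if (conn != null) {
                    return conn;
                }
                // 2. Otherwise create one, if both limits allow it.
                int allocated = allocatedByRoute.getOrDefault(route, 0);
                if (allocated < maxPerRoute && totalAllocated < maxTotal) {
                    allocatedByRoute.put(route, allocated + 1);
                    totalAllocated++;
                    return route + "#" + (nextId++);
                }
                // 3. Otherwise wait until a connection is released or time runs out.
                if (remaining <= 0) {
                    throw new TimeoutException("Timeout waiting for connection");
                }
                remaining = released.awaitNanos(remaining);
            }
        } finally {
            lock.unlock();
        }
    }

    /** Puts a connection back into its route's free list and wakes up waiters. */
    void release(String route, String conn) {
        lock.lock();
        try {
            freeByRoute.computeIfAbsent(route, r -> new ArrayDeque<>()).addFirst(conn);
            released.signalAll();
        } finally {
            lock.unlock();
        }
    }
}
```

Re-checking the free list after every wake-up, rather than assuming the signal was meant for this thread, is what makes the loop safe against spurious wake-ups and against another waiter taking the connection first.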
protected HttpClientConnection leaseConnection(
        final Future<CPoolEntry> future,
        final long timeout,
        final TimeUnit tunit) throws InterruptedException, ExecutionException, ConnectionPoolTimeoutException {
    final CPoolEntry entry;
    try {
        // Obtain the CPoolEntry from the asynchronous Future<CPoolEntry>
        entry = future.get(timeout, tunit);
        if (entry == null || future.isCancelled()) {
            throw new InterruptedException();
        }
        Asserts.check(entry.getConnection() != null, "Pool entry with no connection");
        if (this.log.isDebugEnabled()) {
            this.log.debug("Connection leased: " + format(entry) + formatStats(entry.getRoute()));
        }
        // Return a proxy for the CPoolEntry; every operation on it goes to the same
        // underlying HttpClientConnection
        return CPoolProxy.newProxy(entry);
    } catch (final TimeoutException ex) {
        throw new ConnectionPoolTimeoutException("Timeout waiting for connection from pool");
    }
}
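In the previous chapter we saw that HttpClient obtains connections through the pool, taking one from the pool whenever a connection is needed.
This corresponds to the questions raised in chapter 3:
Chapter 4 showed how HttpClient handles questions 1 and 3. How does it handle question 2?
That is, how does HttpClient decide whether a connection should be closed after use or returned to the pool for others to reuse? Let's look at the MainClientExec code again: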
// Send the HTTP request
response = requestExecutor.execute(request, managedConn, context);
// Ask the reuse strategy whether this connection should be reused
if (reuseStrategy.keepAlive(response, context)) {
    // The connection is to be reused: get its keep-alive duration, taken from the timeout in the response
    final long duration = keepAliveStrategy.getKeepAliveDuration(response, context);
    if (this.log.isDebugEnabled()) {
        final String s;
        // duration is in milliseconds; it is -1 when no timeout is set, meaning no expiry
        if (duration > 0) {
            s = "for " + duration + " " + TimeUnit.MILLISECONDS;
        } else {
            s = "indefinitely";
        }
        this.log.debug("Connection can be kept alive " + s);
    }
    // Set the validity period; when the request finishes, the connection manager uses it
    // to decide whether to close the connection or return it to the pool
    connHolder.setValidFor(duration, TimeUnit.MILLISECONDS);
    // Mark the connection as reusable
    connHolder.markReusable();
} else {
    // Mark the connection as not reusable
    connHolder.markNonReusable();
}
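As we can see, once a request has been sent over a connection, the connection reuse strategy decides whether that connection should be reused. If so, it is handed to the HttpClientConnectionManager and put back into the pool when the request finishes.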
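The duration taken "from the timeout in the response" can be pictured with a small parser. This is a sketch under assumptions, not the library's code: KeepAliveTimeout is a hypothetical class, and it assumes the response carries a header of the form Keep-Alive: timeout=<seconds>, returning -1 when no usable timeout is present.

```java
/** Hypothetical parser modelling a keep-alive strategy driven by the Keep-Alive response header. */
final class KeepAliveTimeout {

    /**
     * Returns the timeout in milliseconds taken from a header value such as
     * "timeout=5, max=100", or -1 if no usable timeout parameter is present.
     */
    static long durationMillis(String keepAliveHeaderValue) {
        if (keepAliveHeaderValue == null) {
            return -1;
        }
        for (String param : keepAliveHeaderValue.split(",")) {
            String[] kv = param.trim().split("=", 2);
            if (kv.length == 2 && kv[0].trim().equalsIgnoreCase("timeout")) {
                try {
                    // The header carries seconds; callers work in milliseconds.
                    return Long.parseLong(kv[1].trim()) * 1000;
                } catch (NumberFormatException ignored) {
                    // Malformed value: treat it as if no timeout was given.
                }
            }
        }
        return -1;
    }
}
```

A result of -1 corresponds to the "indefinitely" branch in the logging code: the connection has no expiry of its own and stays in the pool until something else closes it.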
So what is the logic of the connection reuse strategy?
public class DefaultClientConnectionReuseStrategy extends DefaultConnectionReuseStrategy {

    public static final DefaultClientConnectionReuseStrategy INSTANCE = new DefaultClientConnectionReuseStrategy();

    @Override
    public boolean keepAlive(final HttpResponse response, final HttpContext context) {
        // Get the request from the context
        final HttpRequest request = (HttpRequest) context.getAttribute(HttpCoreContext.HTTP_REQUEST);
        if (request != null) {
            // Get the request's Connection headers
            final Header[] connHeaders = request.getHeaders(HttpHeaders.CONNECTION);
            if (connHeaders.length != 0) {
                final TokenIterator ti = new BasicTokenIterator(new BasicHeaderIterator(connHeaders, null));
                while (ti.hasNext()) {
                    final String token = ti.nextToken();
                    // A Connection: Close header means the request does not intend to keep the
                    // connection, whatever the response says; this header is part of HTTP/1.1
                    if (HTTP.CONN_CLOSE.equalsIgnoreCase(token)) {
                        return false;
                    }
                }
            }
        }
        // Fall back to the parent class's reuse strategy
        return super.keepAlive(response, context);
    }
}
Now let's look at the parent class's reuse strategy:
// Excerpt from the parent class; request, ver and headerIterator are set up earlier in the method
if (canResponseHaveBody(request, response)) {
    final Header[] clhs = response.getHeaders(HTTP.CONTENT_LEN);
    // If the response's Content-Length is not set correctly, the connection is not reused.
    // On a persistent connection successive messages share one connection, so
    // Content-Length is needed to tell where one response ends and the next begins.
    // A response without a correct Content-Length therefore cannot be reused.
    if (clhs.length == 1) {
        final Header clh = clhs[0];
        try {
            final int contentLen = Integer.parseInt(clh.getValue());
            if (contentLen < 0) {
                return false;
            }
        } catch (final NumberFormatException ex) {
            return false;
        }
    } else {
        return false;
    }
}
if (headerIterator.hasNext()) {
    try {
        final TokenIterator ti = new BasicTokenIterator(headerIterator);
        boolean keepalive = false;
        while (ti.hasNext()) {
            final String token = ti.nextToken();
            // The response has a Connection: Close header: it explicitly asks to close, so do not reuse
            if (HTTP.CONN_CLOSE.equalsIgnoreCase(token)) {
                return false;
            // The response has a Connection: Keep-Alive header: it explicitly asks to persist, so reuse
            } else if (HTTP.CONN_KEEP_ALIVE.equalsIgnoreCase(token)) {
                keepalive = true;
            }
        }
        if (keepalive) {
            return true;
        }
    } catch (final ParseException px) {
        return false;
    }
}
// If the response has no Connection directive, reuse the connection for any version above HTTP/1.0
return !ver.lessEquals(HttpVersion.HTTP_1_0);
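To summarize:
As the code shows, the implementation strategy is consistent with the protocol-level constraints described in chapters 2 and 3.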
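The order of the checks in the two strategy classes can be condensed into one function. The sketch below is a simplified model rather than the library code: ReuseDecision is a hypothetical class, headers are passed in as plain strings, and it assumes the response always has a body, so the Content-Length check always applies.

```java
/** Hypothetical, simplified model of the reuse decision made by the two strategy classes. */
final class ReuseDecision {

    /** Returns true if the comma-separated header value contains the wanted token. */
    static boolean hasToken(String connectionHeader, String wanted) {
        if (connectionHeader == null) {
            return false;
        }
        for (String token : connectionHeader.split(",")) {
            if (token.trim().equalsIgnoreCase(wanted)) {
                return true;
            }
        }
        return false;
    }

    /**
     * @param requestConnection  value of the request's Connection header, or null
     * @param responseConnection value of the response's Connection header, or null
     * @param contentLength      value of the response's Content-Length header, or null
     * @param aboveHttp10        true if the response version is higher than HTTP/1.0
     */
    static boolean keepAlive(String requestConnection, String responseConnection,
                             String contentLength, boolean aboveHttp10) {
        // 1. The request asked to close: never reuse.
        if (hasToken(requestConnection, "close")) {
            return false;
        }
        // 2. Without a valid Content-Length the end of the body cannot be located.
        try {
            if (contentLength == null || Integer.parseInt(contentLength.trim()) < 0) {
                return false;
            }
        } catch (NumberFormatException e) {
            return false;
        }
        // 3. An explicit response directive decides; close takes precedence.
        if (hasToken(responseConnection, "close")) {
            return false;
        }
        if (hasToken(responseConnection, "keep-alive")) {
            return true;
        }
        // 4. No directive: persistent by default only above HTTP/1.0.
        return aboveHttp10;
    }
}
```

For example, an HTTP/1.0 response with Connection: Keep-Alive and a valid Content-Length is reusable, while an HTTP/1.1 response is not reusable if the request itself carried Connection: close.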
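Before HttpClient 4.4, a connection taken from the pool for reuse was checked for expiry at that moment and cleaned up if it had expired.
Later versions differ: a separate thread can scan the connections in the pool and clean up any that have been idle, since their last use, for longer than the configured time. As the code below shows, this thread is started only when eviction is enabled, and its scan interval defaults to 10 seconds. The 2-second default often quoted in this context belongs to the connection manager's validateAfterInactivity setting, under which a connection idle for longer than that is re-validated before it is leased again.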
public CloseableHttpClient build() {
    // The cleanup thread is started only if eviction of expired or idle connections
    // has been requested; by default it is not started
    if (evictExpiredConnections || evictIdleConnections) {
        // Create the cleanup thread for the connection pool
        final IdleConnectionEvictor connectionEvictor = new IdleConnectionEvictor(cm,
                maxIdleTime > 0 ? maxIdleTime : 10, maxIdleTimeUnit != null ? maxIdleTimeUnit : TimeUnit.SECONDS,
                maxIdleTime, maxIdleTimeUnit);
        closeablesCopy.add(new Closeable() {

            @Override
            public void close() throws IOException {
                connectionEvictor.shutdown();
                try {
                    connectionEvictor.awaitTermination(1L, TimeUnit.SECONDS);
                } catch (final InterruptedException interrupted) {
                    Thread.currentThread().interrupt();
                }
            }

        });
        // Start the cleanup thread
        connectionEvictor.start();
    }
    // rest of build() omitted in this excerpt
}
As we can see, when HttpClientBuilder builds the client and eviction has been enabled, it creates a pool cleanup thread and starts it.
public IdleConnectionEvictor(
        final HttpClientConnectionManager connectionManager,
        final ThreadFactory threadFactory,
        final long sleepTime, final TimeUnit sleepTimeUnit,
        final long maxIdleTime, final TimeUnit maxIdleTimeUnit) {
    this.connectionManager = Args.notNull(connectionManager, "Connection manager");
    this.threadFactory = threadFactory != null ? threadFactory : new DefaultThreadFactory();
    this.sleepTimeMs = sleepTimeUnit != null ? sleepTimeUnit.toMillis(sleepTime) : sleepTime;
    this.maxIdleTimeMs = maxIdleTimeUnit != null ? maxIdleTimeUnit.toMillis(maxIdleTime) : maxIdleTime;
    this.thread = this.threadFactory.newThread(new Runnable() {
        @Override
        public void run() {
            try {
                // Keep running until the thread is interrupted
                while (!Thread.currentThread().isInterrupted()) {
                    // Sleep between sweeps; the default is 10 seconds
                    Thread.sleep(sleepTimeMs);
                    // Close expired connections
                    connectionManager.closeExpiredConnections();
                    // If a maximum idle time is set, close idle connections as well
                    if (maxIdleTimeMs > 0) {
                        connectionManager.closeIdleConnections(maxIdleTimeMs, TimeUnit.MILLISECONDS);
                    }
                }
            } catch (final Exception ex) {
                exception = ex;
            }
        }
    });
}
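The sleep-then-sweep loop can be reproduced with the standard library to make the mechanism concrete. This is only a sketch under assumptions: IdleEvictorSketch is a hypothetical class, connections are represented by string ids with a last-used timestamp, and "closing" a connection is reduced to removing its entry.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical evictor: a background thread that drops entries idle for too long. */
final class IdleEvictorSketch {
    private final Map<String, Long> lastUsedMs = new ConcurrentHashMap<>();
    private final long maxIdleMs;
    private final long sleepMs;
    private final Thread thread;

    IdleEvictorSketch(long sleepMs, long maxIdleMs) {
        this.sleepMs = sleepMs;
        this.maxIdleMs = maxIdleMs;
        this.thread = new Thread(() -> {
            try {
                // Run until interrupted, sleeping between sweeps.
                while (!Thread.currentThread().isInterrupted()) {
                    Thread.sleep(this.sleepMs);
                    evictIdle(System.currentTimeMillis());
                }
            } catch (InterruptedException stop) {
                Thread.currentThread().interrupt(); // shutdown requested
            }
        }, "idle-evictor-sketch");
        this.thread.setDaemon(true);
    }

    /** Records that a connection was used at the given time. */
    void touch(String connId, long nowMs) {
        lastUsedMs.put(connId, nowMs);
    }

    /** Removes every connection idle for at least maxIdleMs; returns how many were removed. */
    int evictIdle(long nowMs) {
        int before = lastUsedMs.size();
        lastUsedMs.values().removeIf(last -> nowMs - last >= maxIdleMs);
        return before - lastUsedMs.size();
    }

    boolean contains(String connId) {
        return lastUsedMs.containsKey(connId);
    }

    void start() {
        thread.start();
    }

    void shutdown() {
        thread.interrupt();
    }
}
```

Passing the current time into evictIdle keeps the sweep itself deterministic, so it can be exercised without starting the thread or waiting for real time to pass.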
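To summarize:
The analysis above is my personal understanding based on the HttpClient source code. If anything is wrong, please leave a comment so we can discuss it.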