Hadoop RPC Client-Side Source Code Analysis

Date: 2019-11-11


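I am a programmer working my way down the big-data road. I have recently taken a keen interest in reading source code, so I wrote this article to share with fellow coders. My abilities are limited, so if you find mistakes, please point them out. Enough talk; on to the article.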

The RPC implementation rests on three key classes: Client, Server, and RPC. It is built mainly on Java NIO, Java dynamic proxies, and Java reflection.

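This article covers only the Client and RPC parts; the Server side will be added later.
RPC
The RPC class builds Hadoop's IPC on top of Client and Server. It has two parts:
the client-related RpcInvoker and the server-related Server (an inner class of RPC, not the standalone Server class mentioned above). RPC also contains a class tied to the RPC engine, the RpcKind enum, shown below: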

public enum RpcKind {
  RPC_BUILTIN ((short) 1), // Used for testing
  RPC_WRITABLE ((short) 2), // Use WritableRpcEngine
  RPC_PROTOCOL_BUFFER ((short) 3); // Use ProtobufRpcEngine
  final static short MAX_INDEX = RPC_PROTOCOL_BUFFER.value; // used for array size
  public final short value; //TODO make it private

  RpcKind(short val) {
    this.value = val;
  }
}

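As the enum shows, since the introduction of YARN, Writable is no longer Hadoop's only serialization engine: Google's Protocol Buffers was added as well. This is why there is an RpcEngine interface with two implementations, ProtobufRpcEngine and WritableRpcEngine. RpcEngine is the single place where both the client and the server obtain their IPC connections (the RPC class contains related code too, and ultimately uses RpcKind to select the appropriate engine implementation). The client obtains its connection through getProxy, and the server obtains its connection through getServer.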

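Let's start with getProxy, which is also the client's IPC entry point.
getProxy uses Java dynamic proxies: every call to a protocol interface method is intercepted, and the invoke method hands the client's request to the Client class.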

getProxy in RpcEngine
<T> ProtocolProxy<T> getProxy(Class<T> protocol,
long clientVersion, InetSocketAddress addr,
UserGroupInformation ticket, Configuration conf,
SocketFactory factory, int rpcTimeout,
RetryPolicy connectionRetryPolicy,
AtomicBoolean fallbackToSimpleAuth) throws IOException


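Here is what each parameter means (only the important ones; security-related parameters are skipped).
Class<T> protocol: the protocol between Hadoop roles (since 2.0, Hadoop's protocol interfaces have all moved to Protocol Buffers and no longer use Writable), such as the protocol between the client and the NameNode or between the NameNode and the DataNode. Each protocol is defined as an interface that declares the available methods, and an IPC remote call is really a call to a method on the class that implements that interface. Below is the protocol interface between the client and the DataNode (shown only to illustrate how protocol interfaces are used; skip it if you are already familiar with them):

--------------------------------------------------------Protocol interface-------------------------------------------------------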

public interface ClientDatanodeProtocol {
  
  public static final long versionID = 9L;

  /** Return the visible length of a replica. */
  long getReplicaVisibleLength(ExtendedBlock b) throws IOException;
  
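  /**
   * Refresh the list of federated namenodes from the updated configuration:
   * add the newly configured namenodes and stop the ones that were removed.
   * (Hadoop 2.x introduced namenode federation: the namenode is no longer a
   * single node but is spread over several nodes, each managing a different
   * directory, e.g. namenode1 manages /application1 and namenode2 manages
   * /application2. The directories are independent: if one namenode goes
   * down, only the applications under its directory become unavailable and
   * the other nodes are unaffected. The datanodes do not change, and any
   * namenode can control all datanodes.)
   * 
   * @throws IOException on error
   **/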
  void refreshNamenodes() throws IOException;

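  /**
   * Delete the block pool directory. If force is false, the directory is
   * deleted only when it is empty; otherwise it is deleted along with its
   * contents. (This method relates to the new HDFS datanode data management,
   * which the next chapter covers.)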
   * 
   * @param bpid Blockpool id to be deleted.
   * @param force If false blockpool directory is deleted only if it is empty 
   *          i.e. if it doesn't contain any block files, otherwise it is 
   *          deleted along with its contents.
   * @throws IOException
   */
  void deleteBlockPool(String bpid, boolean force) throws IOException;
  
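  /**
   * Retrieve the path names of the block file and the metadata file stored
   * on the local file system.
   * 
   * For this method to work, one of the following should be satisfied:
   * the client user must be configured at the datanode to be able to use
   * this method;
   * when security is enabled, Kerberos authentication must be used to
   * connect to the datanode.
   * 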
   * @param block
   *          the specified block on the local datanode
   * @param token
   *          the block access token.
   * @return the BlockLocalPathInfo of a block
   * @throws IOException
   *           on error
   */
  BlockLocalPathInfo getBlockLocalPathInfo(ExtendedBlock block,
      Token<BlockTokenIdentifier> token) throws IOException;
  
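  /**
   * Retrieve volume location information about a list of blocks on a datanode.
   * It is returned as an opaque {@link org.apache.hadoop.fs.VolumeId} for each
   * configured data directory, which is not guaranteed to be the same across
   * DN restarts.
   * 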
   * @param blockPoolId the pool to query
   * @param blockIds
   *          list of blocks on the local datanode
   * @param tokens
   *          block access tokens corresponding to the requested blocks
   * @return an HdfsBlocksMetadata that associates {@link ExtendedBlock}s with
   *         data directories
   * @throws IOException
   *           if datanode is unreachable, or replica is not found on datanode
   */
  HdfsBlocksMetadata getHdfsBlocksMetadata(String blockPoolId,
      long []blockIds, List<Token<BlockTokenIdentifier>> tokens) throws IOException; 

  /**
   * Shut down a datanode.
   *
   * @param forUpgrade If true, data node does extra prep work before shutting
   *          down. The work includes advising clients to wait and saving
   *          certain states for quick restart. This should only be used when
   *          the stored data will remain the same during upgrade/restart.
   * @throws IOException 
   */
  void shutdownDatanode(boolean forUpgrade) throws IOException;  

  /**
   * Obtain metadata information about the datanode.
   *
   * @return software/config version and uptime of the datanode
   */
  DatanodeLocalInfo getDatanodeInfo() throws IOException;

  /**
   * Asynchronously reload configuration on disk and apply changes.
   */
  void startReconfiguration() throws IOException;

  /**
   * Get the status of the previously issued reconfiguration task.
   * @see {@link org.apache.hadoop.conf.ReconfigurationTaskStatus}.
   */
  ReconfigurationTaskStatus getReconfigurationStatus() throws IOException;

  /**
   * Trigger a new block report.
   */
  void triggerBlockReport(BlockReportOptions options)
    throws IOException;
}


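---------------------------------------------Protocol interface---------------------------------------------------
long clientVersion: the client's protocol version
InetSocketAddress addr: the address of the server to connect to
UserGroupInformation ticket: user and group information
Configuration conf: the configuration
SocketFactory factory: the socket factory used to create socket connections (IPC communicates over TCP sockets)
int rpcTimeout: the timeout
RetryPolicy connectionRetryPolicy: the connection retry policy (fail immediately, retry, or switch to another machine and retry; see the RetryPolicy class for details)
AtomicBoolean fallbackToSimpleAuth: whether to fall back to simple authentication

This method ultimately dispatches to the corresponding method of the concrete engine implementation. Taking ProtobufRpcEngine as an example: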

public <T> ProtocolProxy<T> getProxy(Class<T> protocol, long clientVersion,
      InetSocketAddress addr, UserGroupInformation ticket, Configuration conf,
      SocketFactory factory, int rpcTimeout, RetryPolicy connectionRetryPolicy,
      AtomicBoolean fallbackToSimpleAuth) throws IOException {

    // Invoker implements InvocationHandler
    final Invoker invoker = new Invoker(protocol, addr, ticket, conf, factory,
        rpcTimeout, connectionRetryPolicy, fallbackToSimpleAuth);
    // Create the proxy object (read up on Java dynamic proxies if this is unfamiliar)
    return new ProtocolProxy<T>(protocol, (T) Proxy.newProxyInstance(
        protocol.getClassLoader(), new Class[]{protocol}, invoker), false);
  }
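To make the proxy mechanism concrete, here is a minimal, self-contained sketch of a Java dynamic proxy. It is not Hadoop code: EchoProtocol and RecordingInvoker are hypothetical stand-ins for a protocol interface and for the Invoker, and the handler simply returns a string where the real Invoker would forward the call to Client.call.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

class ProxySketch {
    // Hypothetical protocol interface, standing in for a real Hadoop protocol.
    public interface EchoProtocol {
        String echo(String message);
    }

    // Hypothetical handler. A real Invoker would forward the call to Client.call.
    static class RecordingInvoker implements InvocationHandler {
        @Override
        public Object invoke(Object proxy, Method method, Object[] args) {
            // Every call on the proxy lands here with the method and its arguments.
            return method.getName() + ":" + args[0];
        }
    }

    // Build a proxy that implements EchoProtocol and routes calls to the handler.
    static EchoProtocol newProxy() {
        return (EchoProtocol) Proxy.newProxyInstance(
            EchoProtocol.class.getClassLoader(),
            new Class<?>[] {EchoProtocol.class},
            new RecordingInvoker());
    }

    public static void main(String[] args) {
        System.out.println(newProxy().echo("hello"));
    }
}
```

The caller only ever sees the EchoProtocol interface; which method was called and with what arguments is recovered inside invoke, which is the hook the Invoker uses to build the RPC request.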


Invoker
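The Invoker class diagram is as follows:

isClosed: related to closing the connection
remoteId: the ID of the connection from the Client to the Server; analyzed further in the Client section
client: the Client object
clientProtocolVersion: protocol versions are not consistent across Hadoop versions, so, for example, version 2.1 cannot talk to version 2.5
protocolName: the protocol name
returnTypes: caches the return type of each method in the protocol interface (wrapped as a Message; Message is the message class of Google Protocol Buffers)

Invoker constructor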

private Invoker(Class<?> protocol, Client.ConnectionId connId,
        Configuration conf, SocketFactory factory) {
      this.remoteId = connId;
      // CLIENTS is a ClientCache object that caches every Client used so far;
      // for a new client it constructs a new Client object and caches it.
      this.client = CLIENTS.getClient(conf, factory, RpcResponseWrapper.class);
      this.protocolName = RPC.getProtocolName(protocol);
      this.clientProtocolVersion = RPC
          .getProtocolVersion(protocol);
    }


Invoke
Now for the key invoke method. It is triggered whenever a method of the protocol interface is called.

@Override
    public Object invoke(Object proxy, Method method, Object[] args)
        throws ServiceException {
      long startTime = 0;
      if (LOG.isDebugEnabled()) {
        startTime = Time.now(); // current time in milliseconds
      }
      
      if (args.length != 2) { // exactly two arguments are required: RpcController + Message
        throw new ServiceException("Too many parameters for request. Method: ["
            + method.getName() + "]" + ", Expected: 2, Actual: "
            + args.length);
      }
      if (args[1] == null) {
        throw new ServiceException("null param while calling Method: ["
            + method.getName() + "]");
      }

      // Tracing-related
      TraceScope traceScope = null;
      // if Tracing is on then start a new span for this rpc.
      // guard it in the if statement to make sure there isn't
      // any extra string manipulation.
      if (Trace.isTracing()) {
        traceScope = Trace.startSpan(RpcClientUtil.methodToTraceString(method));
      }
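      // RPC request header, similar to an HTTP request header: the header is sent
      // first, then the body. Note that building the header puts the method into
      // the request, so the server knows which method to call when it receives it.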
      RequestHeaderProto rpcRequestHeader = constructRpcRequestHeader(method);
      
      if (LOG.isTraceEnabled()) {
        LOG.trace(Thread.currentThread().getId() + ": Call -> " +
            remoteId + ": " + method.getName() +
            " {" + TextFormat.shortDebugString((Message) args[1]) + "}");
      }

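      // The method's argument message; used when the method is invoked on the server.
      Message theRequest = (Message) args[1];
      // Result returned by the server
      final RpcResponseWrapper val;
      try {
        // Call Client.call (the Client object was created in the constructor; call is
        // analyzed in the Client section) and get the result returned by the server.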
        val = (RpcResponseWrapper) client.call(RPC.RpcKind.RPC_PROTOCOL_BUFFER,
            new RpcRequestWrapper(rpcRequestHeader, theRequest), remoteId,
            fallbackToSimpleAuth);

      } catch (Throwable e) {
        if (LOG.isTraceEnabled()) {
          LOG.trace(Thread.currentThread().getId() + ": Exception <- " +
              remoteId + ": " + method.getName() +
                " {" + e + "}");
        }
        if (Trace.isTracing()) {
          traceScope.getSpan().addTimelineAnnotation(
              "Call got exception: " + e.getMessage());
        }
        throw new ServiceException(e);
      } finally {
        if (traceScope != null) traceScope.close();
      }

      if (LOG.isDebugEnabled()) {
        long callTime = Time.now() - startTime;
        LOG.debug("Call: " + method.getName() + " took " + callTime + "ms");
      }
      
      Message prototype = null;
      try {
        // Get the prototype of the method's return type
        prototype = getReturnProtoType(method);
      } catch (Exception e) {
        throw new ServiceException(e);
      }
      Message returnMessage;
      try {
        // Deserialize the response bytes into the return Message
        returnMessage = prototype.newBuilderForType()
            .mergeFrom(val.theResponseRead).build();

        if (LOG.isTraceEnabled()) {
          LOG.trace(Thread.currentThread().getId() + ": Response <- " +
              remoteId + ": " + method.getName() +
                " {" + TextFormat.shortDebugString(returnMessage) + "}");
        }

      } catch (Throwable e) {
        throw new ServiceException(e);
      }
      return returnMessage;
}

Getting the method's return type (as a Message prototype)
private Message getReturnProtoType(Method method) throws Exception {
      if (returnTypes.containsKey(method.getName())) {
        return returnTypes.get(method.getName());
      }
      
      Class<?> returnType = method.getReturnType();
      Method newInstMethod = returnType.getMethod("getDefaultInstance");
      newInstMethod.setAccessible(true);
      Message prototype = (Message) newInstMethod.invoke(null, (Object[]) null);
      returnTypes.put(method.getName(), prototype);
      return prototype;
}
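The lookup above can be sketched without a protobuf dependency. In this sketch Reply and DemoProtocol are hypothetical: Reply only imitates the static getDefaultInstance factory that generated protobuf classes provide. As in getReturnProtoType, the cache is keyed by method name, so overloaded methods would share one entry.

```java
import java.lang.reflect.Method;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class PrototypeCacheSketch {
    // Hypothetical message type with a protobuf-style static factory.
    public static class Reply {
        private static final Reply DEFAULT = new Reply();
        public static Reply getDefaultInstance() { return DEFAULT; }
    }

    // Hypothetical protocol interface whose method returns Reply.
    public interface DemoProtocol {
        Reply ping();
    }

    private static final Map<String, Object> CACHE = new ConcurrentHashMap<>();
    // Counts how often the reflective lookup actually runs.
    static final AtomicInteger lookups = new AtomicInteger();

    // Return the cached prototype, doing the reflective lookup only on a miss.
    static Object prototypeFor(Method method) {
        return CACHE.computeIfAbsent(method.getName(), name -> {
            lookups.incrementAndGet();
            try {
                return method.getReturnType().getMethod("getDefaultInstance").invoke(null);
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException(e);
            }
        });
    }

    // Helper: find a public method of the protocol by name.
    static Method findMethod(Class<?> protocol, String name) {
        for (Method m : protocol.getMethods()) {
            if (m.getName().equals(name)) return m;
        }
        throw new IllegalArgumentException("Unknown method " + name);
    }

    public static void main(String[] args) {
        Method ping = findMethod(DemoProtocol.class, "ping");
        prototypeFor(ping);
        prototypeFor(ping);
        System.out.println("reflective lookups: " + lookups.get());
    }
}
```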
Closing the client's IPC connection
public void close() throws IOException {
      if (!isClosed) {
        isClosed = true;
        CLIENTS.stopClient(client);
      }
}


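In short, the Invoker class intercepts calls to protocol interface methods and forwards them to Client.call. The call method takes care of sending the invocation to the server and receiving the result, which is then wrapped in a Message and returned as the final result of the call.
The RpcInvoker interface
This interface has nothing to do with the Invoker above. It has a single call method, invoked on the server side, which is where the request is finally processed, that is, where the corresponding method of the protocol implementation class is called, mainly through reflection. Both WritableRpcEngine and ProtobufRpcEngine provide an implementation. The reason for this extra step, instead of implementing call directly in Server, is that the current Hadoop version has two serialization mechanisms. The Hadoop developers implemented the parsing logic of each one separately so that other classes can call it, which improves code reuse.
ProtoBufRpcInvoker.call
The following walks through the call method, using ProtobufRpcEngine.ProtoBufRpcInvoker as the example.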

public Writable call(RPC.Server server, String protocol,
          Writable writableRequest, long receiveTime) throws Exception {
        RpcRequestWrapper request = (RpcRequestWrapper) writableRequest;
        RequestHeaderProto rpcRequest = request.requestHeader;
        // Get the name of the method being called
        String methodName = rpcRequest.getMethodName();
        // Get the protocol interface name
        String protoName = rpcRequest.getDeclaringClassProtocolName();
        // Get the client version
        long clientVersion = rpcRequest.getClientProtocolVersion();
        if (server.verbose)
          LOG.info("Call: protocol=" + protocol + ", method=" + methodName);
        // Get the protocol implementation
        ProtoClassProtoImpl protocolImpl = getProtocolImpl(server, protoName,
            clientVersion);
        BlockingService service = (BlockingService) protocolImpl.protocolImpl;
        // Look up the method descriptor by method name
        MethodDescriptor methodDescriptor = service.getDescriptorForType()
            .findMethodByName(methodName);
        if (methodDescriptor == null) {
          String msg = "Unknown method " + methodName + " called on " + protocol
              + " protocol.";
          LOG.warn(msg);
          throw new RpcNoSuchMethodException(msg);
        }
        // Get the request prototype for this method (Protocol Buffers serializes
        // data as Message objects).
        Message prototype = service.getRequestPrototype(methodDescriptor);
        // Parse the method argument from the request bytes
        Message param = prototype.newBuilderForType()
            .mergeFrom(request.theRequestRead).build();
        
        Message result;
        long startTime = Time.now();
        int qTime = (int) (startTime - receiveTime);
        Exception exception = null;
        try {
          server.rpcDetailedMetrics.init(protocolImpl.protocolClass);
          // Invoke the method and get the result; the protobuf service dispatches
          // to the corresponding method of the protocol interface.
          result = service.callBlockingMethod(methodDescriptor, null, param);
        } catch (ServiceException e) {
          exception = (Exception) e.getCause();
          throw (Exception) e.getCause();
        } catch (Exception e) {
          exception = e;
          throw e;
        } finally {
          int processingTime = (int) (Time.now() - startTime);
          if (LOG.isDebugEnabled()) {
            String msg = "Served: " + methodName + " queueTime= " + qTime +
                " procesingTime= " + processingTime;
            if (exception != null) {
              msg += " exception= " + exception.getClass().getSimpleName();
            }
            LOG.debug(msg);
          }
          String detailedMetricsName = (exception == null) ?
              methodName :
              exception.getClass().getSimpleName();
          server.rpcMetrics.addRpcQueueTime(qTime);
          server.rpcMetrics.addRpcProcessingTime(processingTime);
          server.rpcDetailedMetrics.addProcessingTime(detailedMetricsName,
              processingTime);
        }
        // Return the final result
        return new RpcResponseWrapper(result);
      }
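The server-side step of resolving a method by the name carried in the request header, and failing on an unknown name, can be illustrated with plain Java reflection. This is only a sketch under assumptions: Greeter and GreeterImpl are hypothetical, and the real ProtoBufRpcInvoker goes through the protobuf BlockingService and MethodDescriptor rather than java.lang.reflect.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

class DispatchSketch {
    // Hypothetical protocol interface and implementation.
    public interface Greeter {
        String greet(String name);
    }

    public static class GreeterImpl implements Greeter {
        @Override
        public String greet(String name) { return "hi " + name; }
    }

    // Find the protocol method by name and invoke it on the implementation.
    static Object dispatch(Object impl, Class<?> protocol, String methodName, Object arg) {
        for (Method m : protocol.getMethods()) {
            if (!m.getName().equals(methodName)) continue;
            try {
                return m.invoke(impl, arg);
            } catch (InvocationTargetException e) {
                // Surface the exception thrown by the implementation itself.
                throw new IllegalStateException(e.getCause());
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
        // Mirrors the "Unknown method" branch in the call method.
        throw new IllegalArgumentException("Unknown method " + methodName);
    }

    public static void main(String[] args) {
        System.out.println(dispatch(new GreeterImpl(), Greeter.class, "greet", "bob"));
    }
}
```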

Client

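Client contains many inner classes, which fall roughly into two groups: classes related to IPC connections, such as Connection and ConnectionId, and classes related to remote interface calls, such as Call and ParallelCall.

The Client class diagram is roughly as follows (inner classes are not included; the final summary will include all classes):

callIDCounter: a generator of unique IDs for Client.Call objects
callId: the ID of the Call object that belongs to the current thread
retryCount: the number of retries (on connection failure, an erroneous result, or a timeout)
connections: all connections this client is currently handling
running: whether the client is running
conf: the configuration
socketFactory: the factory that creates sockets
clientId: the unique ID of this client
CONNECTION_CONTEXT_CALL_ID: a special call ID used to carry the connection context
valueClass (Class<? extends Writable>): the type of the result the server returns for a Call
sendParamsExecutor: handles sending over connections on multiple threads
Client constructor
First, the Client constructor, which the Invoker above called.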

public Client(Class<? extends Writable> valueClass, Configuration conf, 
      SocketFactory factory) {
    this.valueClass = valueClass;
    this.conf = conf;
    this.socketFactory = factory;
    // Get the connection timeout
    this.connectionTimeout = conf.getInt(CommonConfigurationKeys.IPC_CLIENT_CONNECT_TIMEOUT_KEY,
        CommonConfigurationKeys.IPC_CLIENT_CONNECT_TIMEOUT_DEFAULT);
    this.fallbackAllowed = conf.getBoolean(CommonConfigurationKeys.IPC_CLIENT_FALLBACK_TO_SIMPLE_AUTH_ALLOWED_KEY,
        CommonConfigurationKeys.IPC_CLIENT_FALLBACK_TO_SIMPLE_AUTH_ALLOWED_DEFAULT);
    // Generate the clientId from a UUID
    this.clientId = ClientId.getClientId();
    // Get a cached-thread-pool ExecutorService (analyzed later)
    this.sendParamsExecutor = clientExcecutorFactory.refAndGetInstance();
  }

call
Next, let's see how the call method, invoked from Invoker.invoke, sends the method call to the server.

public Writable call(RPC.RpcKind rpcKind, Writable rpcRequest,
      ConnectionId remoteId, int serviceClass,
      AtomicBoolean fallbackToSimpleAuth) throws IOException {
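    // Create a Call object. As noted above, Client has many inner classes; Call is
    // one of them and represents a remote interface call. It is detailed below.
    final Call call = createCall(rpcKind, rpcRequest);
    // Get a Connection. Hadoop optimizes here: if a connection to this remote was
    // used before and is still alive, it is reused, which saves memory and the cost
    // of setting up another socket. Connection is detailed below.
    Connection connection = getConnection(remoteId, call, serviceClass,
      fallbackToSimpleAuth);
    try {
      // The Call already wraps the invocation; the Connection sends it to the server.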
      connection.sendRpcRequest(call);                 // send the rpc request
    } catch (RejectedExecutionException e) {
      throw new IOException("connection has been closed", e);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      LOG.warn("interrupted waiting to send rpc request to server", e);
      throw new IOException(e);
    }

    boolean interrupted = false;
    synchronized (call) {
      while (!call.done) {
        try {
          // Block the current thread here until the call has a result; it is woken by notify.
          call.wait();                           // wait for the result
        } catch (InterruptedException ie) {
          // save the fact that we were interrupted
          interrupted = true;
        }
      }
      // Restore the interrupt status if the wait was interrupted
      if (interrupted) {
        // set the interrupt flag now that we are done waiting
        Thread.currentThread().interrupt();
      }
      // Handle an error returned by the call
      if (call.error != null) {
        if (call.error instanceof RemoteException) {
          call.error.fillInStackTrace();
          throw call.error;
        } else { // local exception
          InetSocketAddress address = connection.getRemoteAddress();
          throw NetUtils.wrapException(address.getHostName(),
                  address.getPort(),
                  NetUtils.getHostname(),
                  0,
                  call.error);
        }
      } else {
        // Return the successful result to the Invoker
        return call.getRpcResponse();
      }
    }
}

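The main steps of this method are: create the Call object that wraps the invocation, obtain the Connection, send the Call to the server through the Connection, and wait for the result. Various errors can occur along the way (timeouts, connection errors, thread interruption, and so on). Finally, the successful result is returned to the Invoker.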
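The blocking part of call follows the classic wait/notify pattern. The sketch below isolates it: PendingCall is a hypothetical, simplified stand-in for Client.Call, and a short-lived thread plays the role of the Connection thread that receives the response.

```java
class CallWaitSketch {
    // Hypothetical, simplified stand-in for Client.Call.
    static class PendingCall {
        private Object response;
        private boolean done;

        // Called by the receiving thread once the response has arrived.
        synchronized void complete(Object value) {
            response = value;
            done = true;
            notifyAll(); // wake the caller blocked in await()
        }

        // Called by the caller thread; blocks until complete() has run.
        synchronized Object await() {
            boolean interrupted = false;
            while (!done) { // loop guards against spurious wakeups
                try {
                    wait();
                } catch (InterruptedException e) {
                    interrupted = true; // remember it and keep waiting
                }
            }
            if (interrupted) {
                Thread.currentThread().interrupt(); // restore the interrupt flag
            }
            return response;
        }
    }

    // Send a value through a second thread and wait for it to come back.
    static Object roundTrip(Object value) {
        PendingCall call = new PendingCall();
        Thread receiver = new Thread(() -> call.complete(value));
        receiver.start();
        return call.await();
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("pong"));
    }
}
```

Waiting in a loop on the done flag, rather than calling wait once, is what makes the pattern safe against spurious wakeups and against a response that arrives before the caller starts waiting.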
getConnection
getConnection obtains the Connection.

private Connection getConnection(ConnectionId remoteId,
      Call call, int serviceClass, AtomicBoolean fallbackToSimpleAuth)
      throws IOException {
    // Make sure this client is running
    if (!running.get()) {
      // the client is stopped
      throw new IOException("The client is stopped");
    }
    Connection connection;
    /* we could avoid this allocation for each RPC by having a  
     * connectionsId object and with set() method. We need to manage the
     * refs for keys in HashMap properly. For now its ok.
     */
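    do {
      // Synchronize because several threads may ask for a connection at the same
      // time; this prevents the same connection from being created more than once.
      synchronized (connections) {
        connection = connections.get(remoteId);
        // Create a new connection if the pool does not contain the one we want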
        if (connection == null) {
          connection = new Connection(remoteId, serviceClass);
          connections.put(remoteId, connection);
        }
      }
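    } while (!connection.addCall(call)); // Add the newly created call to this connection; one connection can serve many calls.
    
    // Set up the connection's I/O streams, which includes building and sending the header.
    // This code is deliberately outside the synchronized block: if the server is
    // slow, establishing a connection takes a long time, and holding the lock
    // would make every other thread that needs a connection wait, stalling the
    // whole system.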
    connection.setupIOstreams(fallbackToSimpleAuth);
    return connection;
  }
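The get-or-create loop can be reduced to the following sketch. Everything here is hypothetical and simplified: connections are keyed by a plain string instead of a ConnectionId, calls are strings, and close() stands in for the logic by which a real Connection marks itself closed and removes itself from the map. That removal is the reason the do/while loop terminates: a connection that refuses the call is no longer in the map on the next iteration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ConnectionCacheSketch {
    // Hypothetical, simplified connection.
    static class Conn {
        final String remote;
        private final List<String> calls = new ArrayList<>();
        private boolean closing;

        Conn(String remote) { this.remote = remote; }

        // Refuse new calls once the connection is closing.
        synchronized boolean addCall(String call) {
            if (closing) return false;
            calls.add(call);
            return true;
        }

        synchronized void markClosing() { closing = true; }

        synchronized int callCount() { return calls.size(); }
    }

    private final Map<String, Conn> connections = new HashMap<>();

    Conn getConnection(String remote, String call) {
        Conn conn;
        do {
            // Hold the lock only for the map lookup and insert.
            synchronized (connections) {
                conn = connections.get(remote);
                if (conn == null) {
                    conn = new Conn(remote);
                    connections.put(remote, conn);
                }
            }
        } while (!conn.addCall(call)); // retry if the connection was closing
        return conn;
    }

    // Remove the connection from the map first, then refuse further calls.
    void close(Conn conn) {
        synchronized (connections) {
            connections.remove(conn.remote, conn);
        }
        conn.markClosing();
    }

    public static void main(String[] args) {
        ConnectionCacheSketch cache = new ConnectionCacheSketch();
        Conn a = cache.getConnection("nn:8020", "call-1");
        Conn b = cache.getConnection("nn:8020", "call-2");
        System.out.println("reused: " + (a == b));
    }
}
```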

createCall
createCall is simple: it directly calls the Call constructor.

Call createCall(RPC.RpcKind rpcKind, Writable rpcRequest) {
    return new Call(rpcKind, rpcRequest);
}

Connection
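Now for Client's inner classes.
Before discussing Connection, a word on how Hadoop IPC passes messages. It uses a variable-length message format, so the total length of the message (including the message header) is sent before each message. This length is usually called dataLength, and Hadoop stores it in 4 bytes.
When a Connection is first established, Hadoop sends a connection header and a connection context (two methods below handle these). How, then, does Hadoop tell that the incoming data is from a newly established connection?
Like Java class files, Hadoop has a magic number, 'hrpc', stored in the connection header. It occupies exactly the 4 bytes where dataLength would otherwise be, which is a deliberate design. If the dataLength field reads 'hrpc', the data was sent by a client in the cluster that is opening a connection. The connection header carries no payload, only header fields, so handling it does not require a length: its length is simply the fixed size of the header.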
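The framing idea can be sketched as follows: a 4-byte length in front of each payload, and a reader that checks whether the first 4 bytes are the 'hrpc' magic instead of a length. This is an illustration under assumptions, not the Hadoop server code; the class and method names are hypothetical and the payload is plain text rather than a serialized request.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class FramingSketch {
    // The bytes 'h', 'r', 'p', 'c' read as one big-endian int.
    static final int MAGIC = 0x68727063;

    // Prefix the payload with its length as a 4-byte big-endian int.
    static byte[] frame(byte[] payload) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(payload.length);
            out.write(payload);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Read the first 4 bytes and decide: connection header or framed message.
    static String describe(byte[] wire) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(wire));
            int first = in.readInt();
            if (first == MAGIC) {
                return "connection header";
            }
            byte[] payload = new byte[first];
            in.readFully(payload);
            return "message:" + new String(payload, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(frame("ping".getBytes(StandardCharsets.UTF_8))));
        System.out.println(describe("hrpc".getBytes(StandardCharsets.US_ASCII)));
    }
}
```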
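The Connection class diagram is roughly as follows (important fields only; security- and permission-related fields are omitted):

server: the address and port of the server
remoteId: the ConnectionId, which uniquely identifies the connection
socket: the socket connection to the server
in: the input stream, used to read the results returned by the server
out: the output stream, used to send data to the server
lastActivity: the time of the most recent I/O, used to detect timeouts
rpcTimeout: the timeout
calls: all calls this connection is currently handling
maxIdleTime: the maximum idle time; beyond it, the connection is removed from the Client's connections map, freeing resources for busier connections
connectionRetryPolicy: the retry policy for connection failures
maxRetriesOnSocketTimeouts: the maximum number of retries on socket timeouts
shouldCloseConnection: whether the connection should be closed (true means close)
sendRpcRequestLock: the object used as a synchronization lock
tcpNoDelay: whether to disable Nagle's algorithm (relates to how TCP packets are sent)
closeException: the error that caused the connection to close, if any
doPing: whether to send a ping at regular intervals so that the server does not mistake the client for dead
pingInterval: the interval between pings
pingRequest: the content sent as a ping

In getConnection above, if no matching Connection object exists yet, a new one is created.
Many of Connection's fields already exist in the ConnectionId class; the constructor mainly initializes the fields listed above.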

 public Connection(ConnectionId remoteId, int serviceClass) throws IOException {
      this.remoteId = remoteId;
      this.server = remoteId.getAddress();
      if (server.isUnresolved()) {
        throw NetUtils.wrapException(server.getHostName(),
            server.getPort(),
            null,
            0,
            new UnknownHostException());
      }
      this.rpcTimeout = remoteId.getRpcTimeout();
      this.maxIdleTime = remoteId.getMaxIdleTime();
      this.connectionRetryPolicy = remoteId.connectionRetryPolicy;
      this.maxRetriesOnSasl = remoteId.getMaxRetriesOnSasl();
      this.maxRetriesOnSocketTimeouts = remoteId.getMaxRetriesOnSocketTimeouts();
      this.tcpNoDelay = remoteId.getTcpNoDelay();
      this.doPing = remoteId.getDoPing();
      if (doPing) {
        // construct a RPC header with the callId as the ping callId
        pingRequest = new ByteArrayOutputStream();
        RpcRequestHeaderProto pingHeader = ProtoUtil
            .makeRpcRequestHeader(RpcKind.RPC_PROTOCOL_BUFFER,
                OperationProto.RPC_FINAL_PACKET, PING_CALL_ID,
                RpcConstants.INVALID_RETRY_COUNT, clientId);
        pingHeader.writeDelimitedTo(pingRequest);
      }
      this.pingInterval = remoteId.getPingInterval();
      this.serviceClass = serviceClass;
      if (LOG.isDebugEnabled()) {
        LOG.debug("The ping interval is " + this.pingInterval + " ms.");
      }

      UserGroupInformation ticket = remoteId.getTicket();
      // try SASL if security is enabled or if the ugi contains tokens.
      // this causes a SIMPLE client with tokens to attempt SASL
      boolean trySasl = UserGroupInformation.isSecurityEnabled() ||
                        (ticket != null && !ticket.getTokens().isEmpty());
      this.authProtocol = trySasl ? AuthProtocol.SASL : AuthProtocol.NONE;
      
      this.setName("IPC Client (" + socketFactory.hashCode() +") connection to " +
          server.toString() +
          " from " + ((ticket==null)?"an unknown user":ticket.getUserName()));
      this.setDaemon(true);
    }

setupIOstreams
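Next, setupIOstreams, which getConnection calls. This is where the Connection initializes its I/O and sends the header. Note that the synchronized keyword here serves a different purpose from the synchronized block in getConnection: that block locks the connections map shared by all Connections, whereas this lock is held on a single Connection and only matters when that Connection is reused.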

private synchronized void setupIOstreams(
        AtomicBoolean fallbackToSimpleAuth) {
      // If the connection already exists or should be closed, return immediately;
      // in both cases the Connection no longer needs to be initialized.
      if (socket != null || shouldCloseConnection.get()) {
        return;
      } 
      try {
        if (LOG.isDebugEnabled()) {
          LOG.debug("Connecting to "+server);
        }
        if (Trace.isTracing()) {
          Trace.addTimelineAnnotation("IPC client connecting to " + server);
        }
        short numRetries = 0;
        Random rand = null;
        while (true) {
          // Initialize the connection
          setupConnection();
          // Get the socket's I/O streams
          InputStream inStream = NetUtils.getInputStream(socket);
          OutputStream outStream = NetUtils.getOutputStream(socket);
          // Send the connection header
          writeConnectionHeader(outStream);
          // ---------------- Security and authentication ----------------
          if (authProtocol == AuthProtocol.SASL) {
            final InputStream in2 = inStream;
            final OutputStream out2 = outStream;
            UserGroupInformation ticket = remoteId.getTicket();
            if (ticket.getRealUser() != null) {
              ticket = ticket.getRealUser();
            }
            try {
              authMethod = ticket
                  .doAs(new PrivilegedExceptionAction<AuthMethod>() {
                    @Override
                    public AuthMethod run()
                        throws IOException, InterruptedException {
                      return setupSaslConnection(in2, out2);
                    }
                  });
            } catch (Exception ex) {
              authMethod = saslRpcClient.getAuthMethod();
              if (rand == null) {
                rand = new Random();
              }
              handleSaslConnectionFailure(numRetries++, maxRetriesOnSasl, ex,
                  rand, ticket);
              continue;
            }
            if (authMethod != AuthMethod.SIMPLE) {
              // Sasl connect is successful. Let's set up Sasl i/o streams.
              inStream = saslRpcClient.getInputStream(inStream);
              outStream = saslRpcClient.getOutputStream(outStream);
              // for testing
              remoteId.saslQop =
                  (String)saslRpcClient.getNegotiatedProperty(Sasl.QOP);
              LOG.debug("Negotiated QOP is :" + remoteId.saslQop);
              if (fallbackToSimpleAuth != null) {
                fallbackToSimpleAuth.set(false);
              }
            } else if (UserGroupInformation.isSecurityEnabled()) {
              if (!fallbackAllowed) {
                throw new IOException("Server asks us to fall back to SIMPLE " +
                    "auth, but this client is configured to only allow secure " +
                    "connections.");
              }
              if (fallbackToSimpleAuth != null) {
                fallbackToSimpleAuth.set(true);
              }
            }
          }
          // ---------------- Security and authentication ----------------
          // If ping is enabled
          if (doPing) {
            // Wrap the input stream so that a read timeout sends a ping
            inStream = new PingInputStream(inStream);
          }
          this.in = new DataInputStream(new BufferedInputStream(inStream));

          // SASL may have already buffered the stream
          if (!(outStream instanceof BufferedOutputStream)) {
            outStream = new BufferedOutputStream(outStream);
          }
          this.out = new DataOutputStream(outStream);
          // Send the connection context
          writeConnectionContext(remoteId, authMethod);

          // Update the last-activity time
          touch();

          if (Trace.isTracing()) {
            Trace.addTimelineAnnotation("IPC client connected to " + server);
          }

          // Start the thread; its run method receives the responses from the server.
          start();
          return;
        }
      } catch (Throwable t) {
        // Close the connection on error
        if (t instanceof IOException) {
          // This call sets shouldCloseConnection to true
          markClosed((IOException)t);
        } else {
          markClosed(new IOException("Couldn't set up IO streams", t));
        }
        close();
      }
}

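This method initializes the Connection: it opens the connection, sends the connection header and the connection context, and updates the last-activity time. At the end it starts the thread that receives the results returned by the server. markClosed sets shouldCloseConnection to true, which marks the Connection as due to be closed. Other methods that see this flag return without doing anything, and eventually the run method (Connection extends Thread) detects the flag through waitForWork and calls the Connection's close method.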
markClosed

private synchronized void markClosed(IOException e) {
      // Set the flag to true with a compare-and-set (CAS)
      if (shouldCloseConnection.compareAndSet(false, true)) {
        closeException = e;
        // Wake up all threads blocked on this connection.
        notifyAll();
      }
}
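The effect of the compare-and-set is that only the first caller records its exception; later callers see the flag already set and do nothing. A minimal sketch, assuming a hypothetical MarkClosedSketch class in place of Connection:

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicBoolean;

class MarkClosedSketch {
    private final AtomicBoolean shouldClose = new AtomicBoolean(false);
    private IOException closeException;

    // Returns true only for the caller that actually flipped the flag.
    synchronized boolean markClosed(IOException cause) {
        if (shouldClose.compareAndSet(false, true)) {
            closeException = cause; // keep the first cause only
            notifyAll();            // wake threads waiting on this object
            return true;
        }
        return false;
    }

    synchronized IOException closeException() {
        return closeException;
    }

    public static void main(String[] args) {
        MarkClosedSketch conn = new MarkClosedSketch();
        System.out.println(conn.markClosed(new IOException("first")));
        System.out.println(conn.markClosed(new IOException("second")));
        System.out.println(conn.closeException().getMessage());
    }
}
```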

setupConnection
Now let's see how the connection is initialized.

private synchronized void setupConnection() throws IOException {
      // Number of I/O failures
      short ioFailures = 0;
      // Number of timeout failures
      short timeoutFailures = 0;
      // Loop until the socket connection is established
      while (true) {
        try {
          // Create the socket
          this.socket = socketFactory.createSocket();
          this.socket.setTcpNoDelay(tcpNoDelay);
          this.socket.setKeepAlive(true);
          // ---------------- Security and authentication ----------------
          /*
           * Bind the socket to the host specified in the principal name of the
           * client, to ensure Server matching address of the client connection
           * to host name in principal passed.
           */
          UserGroupInformation ticket = remoteId.getTicket();
          if (ticket != null && ticket.hasKerberosCredentials()) {
            KerberosInfo krbInfo = 
              remoteId.getProtocol().getAnnotation(KerberosInfo.class);
            if (krbInfo != null && krbInfo.clientPrincipal() != null) {
              String host = 
                SecurityUtil.getHostFromPrincipal(remoteId.getTicket().getUserName());
              
              // If host name is a valid local address then bind socket to it
              InetAddress localAddr = NetUtils.getLocalInetAddress(host);
              if (localAddr != null) {
                this.socket.bind(new InetSocketAddress(localAddr, 0));
              }
            }
          }
          // ---------------- Security and authentication ----------------
          // Connect the socket to the server
          NetUtils.connect(this.socket, server, connectionTimeout);
          // If rpcTimeout is set, it replaces the ping interval
          if (rpcTimeout > 0) {
            pingInterval = rpcTimeout;  // rpcTimeout overwrites pingInterval
          }
          // Set the socket read timeout
          this.socket.setSoTimeout(pingInterval);
          return;
        } catch (ConnectTimeoutException toe) {
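          /* A connect timeout may mean that the server address has changed.
           * Call updateAddress; if it returns true the address really did
           * change, so reconnect with the new address.
           */
          if (updateAddress()) {
            // Reset the timeout and I/O failure counters to 0
            timeoutFailures = ioFailures = 0;
          }
          // This method closes the socket connection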
          handleConnectionTimeout(timeoutFailures++,
              maxRetriesOnSocketTimeouts, toe);
        } catch (IOException ie) {
          if (updateAddress()) {
            timeoutFailures = ioFailures = 0;
          }
          handleConnectionFailure(ioFailures++, ie);
        }
      }
}

updateAddress
Updates the server address.

private synchronized boolean updateAddress() throws IOException {
      // Do a fresh lookup with the old host name.
      InetSocketAddress currentAddr = NetUtils.createSocketAddrForHost(
                               server.getHostName(), server.getPort());
      // Update if the address differs from the previous one
      if (!server.equals(currentAddr)) {
        LOG.warn("Address change detected. Old: " + server.toString() +
                                 " New: " + currentAddr.toString());
        // Switch to the new address
        server = currentAddr;
        return true;
      }
      return false;
}

writeConnectionHeader
Sends the connection header. It is fairly simple, so it needs no further explanation.

/**
     * Write the connection header - this is sent when connection is established
     * +----------------------------------+
     * |  "hrpc" 4 bytes                  |      
     * +----------------------------------+
     * |  Version (1 byte)                |
     * +----------------------------------+
     * |  Service Class (1 byte)          |
     * +----------------------------------+
     * |  AuthProtocol (1 byte)           |      
     * +----------------------------------+
     */
    private void writeConnectionHeader(OutputStream outStream)
        throws IOException {
      DataOutputStream out = new DataOutputStream(new BufferedOutputStream(outStream));
      // Write out the header, version and authentication method
      out.write(RpcConstants.HEADER.array());
      out.write(RpcConstants.CURRENT_VERSION);
      out.write(serviceClass);
      out.write(authProtocol.callId);
      out.flush();
}
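The 7-byte layout in the Javadoc above can be reproduced in isolation. In this sketch the version, service class, and auth protocol are passed in as parameters, and the values used in main (9, 0, 0) are illustrative assumptions rather than constants read from Hadoop; the class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class HeaderSketch {
    // Build the header: 4 magic bytes followed by three single-byte fields.
    static byte[] connectionHeader(int version, int serviceClass, int authProtocol) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.write("hrpc".getBytes(StandardCharsets.US_ASCII)); // magic, 4 bytes
            out.write(version);      // write(int) emits only the low 8 bits
            out.write(serviceClass); // 1 byte
            out.write(authProtocol); // 1 byte
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Illustrative values only.
        byte[] header = connectionHeader(9, 0, 0);
        System.out.println("header length: " + header.length);
    }
}
```

Because every field has a fixed size, the receiver can read the header without a length prefix, which is the point made earlier about the magic number taking the place of dataLength.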

writeConnectionContext
Sends the connection context.
Neither this method nor the one above is synchronized, because each is called only once, during initialization.

 private void writeConnectionContext(ConnectionId remoteId,
                                        AuthMethod authMethod)
                                            throws IOException {
      // Write out the ConnectionHeader
      IpcConnectionContextProto message = ProtoUtil.makeIpcConnectionContext(
          RPC.getProtocolName(remoteId.getProtocol()),
          remoteId.getTicket(),
          authMethod);
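      // Build the context header; it carries only the connection context, no call data.
      // Arguments: RPC engine kind, packet operation, the context call ID (-3 by default),
      // the retry count (INVALID_RETRY_COUNT, -1), and the client ID.
      RpcRequestHeaderProto connectionContextHeader = ProtoUtil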
          .makeRpcRequestHeader(RpcKind.RPC_PROTOCOL_BUFFER,
              OperationProto.RPC_FINAL_PACKET, CONNECTION_CONTEXT_CALL_ID,
              RpcConstants.INVALID_RETRY_COUNT, clientId);
      RpcRequestMessageWrapper request =
          new RpcRequestMessageWrapper(connectionContextHeader, message);
      
      // Write out the packet length
      out.writeInt(request.getLength());
      request.write(out);
    }

sendRpcRequest
The call method of Client sends the remote call through Connection.sendRpcRequest, shown below.

/** Initiates a rpc call by sending the rpc request to the remote server.
*/
    public void sendRpcRequest(final Call call)
        throws InterruptedException, IOException {
      // Return immediately if the connection is marked for closing
      if (shouldCloseConnection.get()) {
        return;
      }

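      // Serialize the call to be sent. This is done in the caller's thread
      // rather than in the sendParamsExecutor thread,
      // so that a serialization error is reported to the right caller.
      // It also lets different callers serialize in parallel.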
      //
      // Format of a call on the wire:
      // 0) Length of rest below (1 + 2)
      // 1) RpcRequestHeader  - is serialized Delimited hence contains length
      // 2) RpcRequest
      //
      // Items '1' and '2' are prepared here. 
      final DataOutputBuffer d = new DataOutputBuffer();
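      // Build the request header, much like the one sent when the connection was set up
      RpcRequestHeaderProto header = ProtoUtil.makeRpcRequestHeader(
          call.rpcKind, OperationProto.RPC_FINAL_PACKET, call.id, call.retry,
          clientId);
      // Write the header and then the request into the in-memory output buffer
      header.writeDelimitedTo(d);
      call.rpcRequest.write(d);
      synchronized (sendRpcRequestLock) {
        // Hand the actual socket write to the sendParamsExecutor thread pool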
        Future<?> senderFuture = sendParamsExecutor.submit(new Runnable() {
          @Override
          public void run() {
            try {
              // Lock out so that concurrent senders cannot interleave bytes on the stream
              synchronized (Connection.this.out) {
                if (shouldCloseConnection.get()) {
                  return;
                }
                
                if (LOG.isDebugEnabled())
                  LOG.debug(getName() + " sending #" + call.id);
                // Send the buffered request to the server through the Connection's output stream
                byte[] data = d.getData();
                // Number of valid bytes in the buffer
                int totalLength = d.getLength();
                // Write the length prefix
                out.writeInt(totalLength); // Total Length
                // Write the payload
                out.write(data, 0, totalLength);// RpcRequestHeader + RpcRequest
                out.flush();
              }
            } catch (IOException e) {
              // exception at this point would leave the connection in an
              // unrecoverable state (eg half a call left on the wire).
              // So, close the connection, killing any outstanding calls
              markClosed(e);
            } finally {
              //the buffer is just an in-memory buffer, but it is still polite to
              // close early
              IOUtils.closeStream(d);
            }
          }
        });
      
        try {
          // Block until the send has finished; the RPC result itself arrives later through the Call object
          senderFuture.get();
        } catch (ExecutionException e) {
          Throwable cause = e.getCause();
          
          // cause should only be a RuntimeException as the Runnable above
          // catches IOException
          if (cause instanceof RuntimeException) {
            throw (RuntimeException) cause;
          } else {
            throw new RuntimeException("unexpected checked exception", cause);
          }
        }
      }
    }
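
The wire format described in the comments above (a 4-byte total length, then a length-delimited header, then the request) can be sketched without Hadoop or protobuf on the classpath. In this sketch the header and body are opaque byte arrays, writeVarint is a hand-written stand-in for the varint length prefix that protobuf's writeDelimitedTo emits, and every name is hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class FramingSketch {
    // Writes a base-128 varint, low 7 bits first, high bit set on all but the last byte.
    static void writeVarint(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    // Frame = 4-byte big-endian total length, then varint(header length) + header, then body.
    static byte[] frame(byte[] header, byte[] body) {
        ByteArrayOutputStream payload = new ByteArrayOutputStream();
        writeVarint(payload, header.length);
        payload.write(header, 0, header.length);
        payload.write(body, 0, body.length);
        byte[] data = payload.toByteArray();
        // ByteBuffer is big-endian by default, matching DataOutputStream.writeInt.
        return ByteBuffer.allocate(4 + data.length).putInt(data.length).put(data).array();
    }

    public static void main(String[] args) {
        byte[] header = "HDR".getBytes(StandardCharsets.US_ASCII);
        byte[] body = "BODY".getBytes(StandardCharsets.US_ASCII);
        byte[] framed = frame(header, body);
        int totalLength = ByteBuffer.wrap(framed).getInt();
        System.out.println("frame bytes: " + framed.length + ", total length field: " + totalLength);
    }
}
```

The receiver reads the 4-byte length first, so it knows how many bytes belong to the message before it starts parsing the header.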

Connection.run
Connection is a subclass of Thread, so every Connection has its own thread, which speeds up request handling. The last step of setupIOStreams calls start() on the Connection, after which the Connection thread waits for responses.

public void run() {
      if (LOG.isDebugEnabled())
        LOG.debug(getName() + ": starting, having connections " 
            + connections.size());

      try {
        // Wait for pending calls; the loop ends once the Connection should be closed
        while (waitForWork()) {//wait here for work - read or close connection
          // Read one response
          receiveRpcResponse();
        }
      } catch (Throwable t) {
        // This truly is unexpected, since we catch IOException in receiveResponse
        // -- this is only to be really sure that we don't leave a client hanging
        // forever.
        LOG.warn("Unexpected error reading responses on connection " + this, t);
        markClosed(new IOException("Error reading responses", t));
      }
      // The loop has ended because shouldCloseConnection is true, so close the Connection
      close();
      
      if (LOG.isDebugEnabled())
        LOG.debug(getName() + ": stopped, remaining connections "
            + connections.size());
}

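While there are pending calls, the Connection is usable, and the client is still running, this method stays in the while loop and handles responses. Once shouldCloseConnection becomes true, it closes the connection. The waitForWork method follows.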
waitForWork

private synchronized boolean waitForWork() {
      // If the connection is usable but no call is pending, park this thread until the maximum idle time is reached
      if (calls.isEmpty() && !shouldCloseConnection.get()  && running.get())  {
        long timeout = maxIdleTime-
              (Time.now()-lastActivity.get());
        if (timeout>0) {
          try {
            wait(timeout);
          } catch (InterruptedException e) {}
        }
      }
      // Pending calls exist, the connection is usable and the client is running: return true
      if (!calls.isEmpty() && !shouldCloseConnection.get() && running.get()) {
        return true;
      // Every other case returns false; the last two also mark the connection as closed
      } else if (shouldCloseConnection.get()) {
        return false;
      } else if (calls.isEmpty()) { // idle connection closed or stopped
        markClosed(null);
        return false;
      } else { // get stopped but there are still pending requests 
        markClosed((IOException)new IOException().initCause(
            new InterruptedException()));
        return false;
      }
}

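The main job of waitForWork is to check whether there is a call to process while everything else is healthy. If there is, it returns true. If not, it waits for up to the maximum idle time; during that wait it can be woken by the notify in addCall, because a new call has arrived. If there is still no call to process after the wait, it returns false. Every other case also returns false and leaves shouldCloseConnection set to true.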

addCall

private synchronized boolean addCall(Call call) {
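      // Refuse the call if this connection is already marked for closing
      if (shouldCloseConnection.get())
        return false;
      // Register the call in the Connection's table of in-flight calls
      calls.put(call.id, call);
      // Wake the Connection thread if it is waiting in waitForWork; otherwise this has no effect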
      notify();
      return true;
}

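addCall is called from the getConnection method covered in the Client walkthrough above. Because connections are reused, the method has to check whether the connection is still usable.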
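
The interplay between waitForWork and addCall is a plain monitor pattern: the reader thread waits with a timeout, and a producer calls notify after registering work. A minimal sketch follows, assuming a single reader thread. IdleConnectionSketch and its fields are hypothetical names, and the running flag of the real client is left out for brevity.

```java
import java.util.HashMap;
import java.util.Map;

final class IdleConnectionSketch {
    private final Map<Integer, String> calls = new HashMap<>();
    private final long maxIdleMillis;
    private final long lastActivity = System.currentTimeMillis();
    private boolean closed = false;

    IdleConnectionSketch(long maxIdleMillis) {
        this.maxIdleMillis = maxIdleMillis;
    }

    // Registers a call and wakes the reader thread if it is parked in waitForWork().
    synchronized boolean addCall(int id, String call) {
        if (closed) {
            return false; // a closed connection must not be reused
        }
        calls.put(id, call);
        notify();
        return true;
    }

    // Returns true when there is work, false once the connection has idled out.
    synchronized boolean waitForWork() {
        if (calls.isEmpty() && !closed) {
            long timeout = maxIdleMillis - (System.currentTimeMillis() - lastActivity);
            if (timeout > 0) {
                try {
                    wait(timeout); // releases the monitor so addCall() can run
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }
        if (!calls.isEmpty() && !closed) {
            return true;
        }
        closed = true; // idle for too long: later addCall() attempts will be refused
        return false;
    }

    public static void main(String[] args) {
        IdleConnectionSketch idle = new IdleConnectionSketch(20);
        System.out.println("work available: " + idle.waitForWork());
        System.out.println("late addCall accepted: " + idle.addCall(1, "late call"));
    }
}
```

Both methods synchronize on the same object, so a call can never be added between the emptiness check and the wait without the waiter being notified.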
receiveRpcResponse
Next is receiveRpcResponse, the method the Connection uses to read results. Hadoop IPC messages are variable-length, so each message is sent as its length first, followed by its content.

private void receiveRpcResponse() {
      if (shouldCloseConnection.get()) {
        return;
      }
      touch();
      
      try {
        // Read the total length of the response
        int totalLen = in.readInt();
        // Read the response header
        RpcResponseHeaderProto header = 
            RpcResponseHeaderProto.parseDelimitedFrom(in);
        // Validate the header
        checkResponse(header);
        
        int headerLen = header.getSerializedSize();
        headerLen += CodedOutputStream.computeRawVarint32Size(headerLen);
        // Id of the call this response belongs to
        int callId = header.getCallId();
        if (LOG.isDebugEnabled())
          LOG.debug(getName() + " got value #" + callId);
        // Look up the pending call with that id
        Call call = calls.get(callId);
        // Check whether the server reported success
        RpcStatusProto status = header.getStatus();
        if (status == RpcStatusProto.SUCCESS) {
          // On success, read the return value so it can be stored in the call
          Writable value = ReflectionUtils.newInstance(valueClass, conf);
          value.readFields(in);                 // read value
          // The call has been served: remove it from calls
          calls.remove(callId);
          call.setRpcResponse(value);
          
          // verify that length was correct
          // only for ProtobufEngine where len can be verified easily
          // For a ProtobufRpcEngine response, check that the total length equals header length plus body length
          if (call.getRpcResponse() instanceof ProtobufRpcEngine.RpcWrapper) {
            ProtobufRpcEngine.RpcWrapper resWrapper = 
                (ProtobufRpcEngine.RpcWrapper) call.getRpcResponse();
            if (totalLen != headerLen + resWrapper.getLength()) { 
              throw new RpcClientException(
                  "RPC response length mismatch on rpc success");
            }
          }
        } else { // the RPC returned an error
          // Verify that length was correct
          if (totalLen != headerLen) {
            throw new RpcClientException(
                "RPC response length mismatch on rpc error");
          }
          // Collect the error details
          final String exceptionClassName = header.hasExceptionClassName() ?
                header.getExceptionClassName() : 
                  "ServerDidNotSetExceptionClassName";
          final String errorMsg = header.hasErrorMsg() ? 
                header.getErrorMsg() : "ServerDidNotSetErrorMsg" ;
          final RpcErrorCodeProto erCode = 
                    (header.hasErrorDetail() ? header.getErrorDetail() : null);
          if (erCode == null) {
             LOG.warn("Detailed error code not set by server on rpc error");
          }
          RemoteException re = 
              ( (erCode == null) ? 
                  new RemoteException(exceptionClassName, errorMsg) :
              new RemoteException(exceptionClassName, errorMsg, erCode));
          if (status == RpcStatusProto.ERROR) {
            // On ERROR, remove the call from calls and store the exception in it
            calls.remove(callId);
            call.setException(re);
          } else if (status == RpcStatusProto.FATAL) {
            // On FATAL, close the connection; the error may have been caused by a broken connection
            // Close the connection
            markClosed(re);
          }
        }
      } catch (IOException e) {
        // Close the connection on any I/O error
        markClosed(e);
      }
}
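
The dispatch logic of receiveRpcResponse, stripped of I/O, comes down to a table of pending calls keyed by call id plus a three-way branch on the status. The sketch below uses CompletableFuture as a stand-in for Hadoop's Call class, which is this example's substitution and not what the client does. All names are hypothetical, and failing every pending call on FATAL is a simplification of closing the connection.

```java
import java.io.IOException;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

final class ResponseDispatchSketch {
    enum Status { SUCCESS, ERROR, FATAL }

    private final Map<Integer, CompletableFuture<String>> calls = new ConcurrentHashMap<>();
    private volatile boolean closed = false;

    // Registers a pending call and returns the handle the caller will wait on.
    CompletableFuture<String> register(int callId) {
        CompletableFuture<String> pending = new CompletableFuture<>();
        calls.put(callId, pending);
        return pending;
    }

    // Routes one response to the call with the matching id.
    void onResponse(int callId, Status status, String payload) {
        switch (status) {
            case SUCCESS: {
                CompletableFuture<String> call = calls.remove(callId);
                if (call != null) {
                    call.complete(payload);
                }
                break;
            }
            case ERROR: {
                // Only this call fails; the connection stays usable.
                CompletableFuture<String> call = calls.remove(callId);
                if (call != null) {
                    call.completeExceptionally(new IOException(payload));
                }
                break;
            }
            case FATAL: {
                // The connection is unusable: fail everything still pending.
                closed = true;
                IOException cause = new IOException(payload);
                for (CompletableFuture<String> call : calls.values()) {
                    call.completeExceptionally(cause);
                }
                calls.clear();
                break;
            }
        }
    }

    boolean isClosed() {
        return closed;
    }

    public static void main(String[] args) {
        ResponseDispatchSketch connection = new ResponseDispatchSketch();
        CompletableFuture<String> first = connection.register(1);
        connection.onResponse(1, Status.SUCCESS, "42");
        System.out.println("call 1 result: " + first.getNow(null));
    }
}
```

Keying responses by call id is what lets several callers share one connection: responses may come back in any order and still reach the right caller.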

Call
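The last inner class of Client to look at is Call. Its fields are roughly as follows:

id: the call's unique id, taken from the Client's callId
retry: the retry count, taken from the Client's retryCount
rpcRequest: the serialized request
rpcResponse: the response read back from the server
error: the error, if the call failed
rpcKind: the RPC engine kind
done: whether this call has completed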
setRpcResponse
Next is setRpcResponse, which receiveRpcResponse in Connection calls. It shows how the result is stored and handed back to the call method of Client (covered earlier).

// The method is simple: it stores the value that receiveRpcResponse deserialized in the call's rpcResponse field, then calls callComplete.
public synchronized void setRpcResponse(Writable rpcResponse) {
    this.rpcResponse = rpcResponse;
    callComplete();
}

callComplete
So what does callComplete do?

protected synchronized void callComplete() {
    // Mark this call as done
    this.done = true;
    notify(); // notify caller
}

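Recall that the call method of Client checks whether the call's done field is true, as shown below.
If the call has not finished, the caller waits until the notify in callComplete wakes it. If the thread is interrupted in the meantime, the interrupt is recorded and the wait resumes.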

while (!call.done) {
    try {
        call.wait(); // wait for the result
    } catch (InterruptedException ie) {
        // save the fact that we were interrupted
        interrupted = true;
    }
}
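
The handoff between the reader thread (setRpcResponse and callComplete) and the caller thread (the wait loop above) can be reproduced in a few lines. This is a minimal sketch with hypothetical names, and it uses a String where the real Call holds a Writable.

```java
final class CallSketch {
    private String response;
    private boolean done = false;

    // Reader-thread side: store the result, mark the call done, wake the caller.
    synchronized void setResponse(String value) {
        this.response = value;
        this.done = true;
        notify();
    }

    // Caller side: block until done is true, remembering any interrupt on the way.
    synchronized String await() {
        boolean interrupted = false;
        while (!done) { // the loop also guards against spurious wakeups
            try {
                wait();
            } catch (InterruptedException ie) {
                interrupted = true; // keep waiting, restore the flag afterwards
            }
        }
        if (interrupted) {
            Thread.currentThread().interrupt();
        }
        return response;
    }

    public static void main(String[] args) {
        CallSketch call = new CallSketch();
        Thread reader = new Thread(() -> call.setResponse("result"));
        reader.start();
        System.out.println(call.await()); // prints: result
    }
}
```

Checking done inside the synchronized method means the result is never missed, even when the reader thread finishes before the caller starts waiting.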

Client diagram
That covers the whole client side. An overall diagram of the client follows.